With the recent releases of parallelly 1.33.0 (2022-12-13) and 1.34.0 (2023-01-13), availableCores() and availableWorkers() gained better support for Linux CGroups, options for avoiding running out of R connections when setting up parallel-style clusters, and killNode() for forcefully terminating one or more parallel workers. I summarize these updates below. For other updates, please see the NEWS. Added support for CGroups v2 availableCores() and availableWorkers() gained support for Linux Control Groups v2 (CGroups v2), besides CGroups v1, which has been supported since parallelly 1.
The detectCores() function of the parallel package is probably one of the most used functions when it comes to setting the number of parallel workers to use in R. In this blog post, I’ll try to explain why using it is not always a good idea. Already now, I am going to make a bold request and ask you to: Please avoid using parallel::detectCores() in your package! By reading this blog post, I hope you become more aware of the different problems that arise from using detectCores() and how they might affect you and the users of your code.
parallelly 1.32.0 is now on CRAN. One of the major updates is that availableCores() and availableWorkers(), and therefore also the future framework, gained support for the ‘Fujitsu Technical Computing Suite’ job scheduler. For other updates, please see NEWS. The parallelly package enhances the parallel package - our built-in R package for parallel processing - by improving on existing features and by adding new ones. Somewhat simplified, parallelly provides the things that you would otherwise expect to find in the parallel package.
parallelly 1.32.0 is on CRAN. This release fixes an important bug that affected users running with the Simplified Chinese, Traditional Chinese (Taiwan), or Korean locale. The bug caused makeClusterPSOCK(), and therefore also future::plan("multisession"), to fail with an error. For other updates, please see NEWS. The parallelly package enhances the parallel package - our built-in R package for parallel processing - by improving on existing features and by adding new ones.
progressr 0.10.1 is on CRAN. I dedicate this release to all plyr users and developers out there. The progressr package provides a minimal API for reporting progress updates in R. The design is to separate the representation of progress updates from how they are presented. What type of progress to signal is controlled by the developer. How these progress updates are rendered is controlled by the end user.
parallelly 1.31.1 is on CRAN. The parallelly package enhances the parallel package - our built-in R package for parallel processing - by improving on existing features and by adding new ones. Somewhat simplified, parallelly provides the things that you would otherwise expect to find in the parallel package. The future package relies on the parallelly package internally for local and remote parallelization. Since my previous post on parallelly in November 2021, I’ve fixed a few bugs and added some new features to the package;
future 1.24.0 is on CRAN. It comes with one significant update related to random number generation, further deprecation of legacy future strategies, a slight improvement to plan() and tweaks(), and some bug fixes. Below are the most important changes. One of many possible random number generators. This one was carefully designed by XKCD [CC BY-NC 2.5]. future(…, seed = TRUE) updates RNG state In future (< 1.
Happy New Year! I made some updates to the future framework during 2021 that involve overall improvements and essential preparations to go forward with some exciting new features that I’m keen to work on during 2022. The future framework makes it easy to parallelize existing R code - often with only a minor change of code. The goal is to lower the barriers so that anyone can quickly and safely speed up their existing R code in a worry-free manner.
parallelly 1.29.0 is on CRAN. The parallelly package enhances the parallel package - our built-in R package for parallel processing - by improving on existing features and by adding new ones. Somewhat simplified, parallelly provides the things that you would otherwise expect to find in the parallel package. The future package relies on the parallelly package internally for local and remote parallelization. Since my previous post on parallelly five months ago, the parallelly package had some bugs fixed, and it gained a few new features;
progressr 0.8.0 is on CRAN. It comes with some new features: a new ‘rstudio’ handler that reports on progress via the RStudio job interface in RStudio; withProgressShiny() now updates the detail part, instead of the message part; and, in addition to signalling relative amounts of progress, it’s now also possible to signal total amounts. If you’re curious what progressr is about, have a look at my e-Rum 2020 presentation.
parallelly 1.26.0 is on CRAN. It comes with one major improvement and one new function: the setup of parallel workers is now much faster, which comes from using a concurrent, instead of sequential, setup strategy; and the new freePort() can be used to find a TCP port that is currently available. Faster setup of local, parallel workers In R 4.0.0, which was released in May 2020, parallel::makeCluster(n) gained the power of setting up the n local cluster nodes all at the same time, which greatly reduces the total setup time.
A piece of an ice core - more pleasing to look at than yet another illustration of a CPU core (Image credit: Ludovic Brucker, NASA’s Goddard Space Flight Center) parallelly 1.25.0 is on CRAN. It comes with two major improvements: you can now use availableCores(omit = n) to ask for all but n CPU cores; and makeClusterPSOCK() can finally use the built-in SSH client on MS Windows 10 to set up remote workers.
future 1.20.1 is on CRAN. It adds some new features, deprecates old and unwanted behaviors, adds a couple of vignettes, and fixes a few bugs. Interactive debugging First out among the new features, and a long-running feature request, is the addition of argument split to plan(), which allows us to split, or “tee”, any output produced by futures. The default is split = FALSE for which standard output and conditions are captured by the future and only relayed after the future has been resolved, i.
parallelly adverb par·al·lel·ly | \ ˈpa-rə-le(l)li \ Definition: in a parallel manner future noun fu·ture | \ ˈfyü-chər \ Definition: existing or occurring at a later time I’ve cleaned up around the house - with the recent release of future 1.20.1, the package gained a dependency on the new parallelly package. Now, if you’re like me and concerned about bloating package dependencies, I’m sure you immediately wondered why I chose to introduce a new dependency.
Each time we use R to analyze data, we rely on the assumption that the functions we use produce correct results. If we can’t make this assumption, we have to spend a lot of time validating every nitty-gritty detail. Luckily, we don’t have to do this. There are many reasons why we can comfortably use R for our analyses, and some of them are unique to R. Here are some I could think of while writing this blog post - I’m sure I forgot something:
Parallel ‘Digital Rain’ by Jahobr After two-and-a-half months, future 1.19.1 is now on CRAN. As usual, there are some bug fixes and minor improvements here and there (NEWS), including things needed by the next version of furrr. For those of you who use Slurm or LSF/OpenLava as a scheduler on your high-performance compute (HPC) cluster, future::availableCores() will now do a better job respecting the CPU resources that those schedulers allocate for your R jobs.
There are new versions of future and future.apply - your friends in the parallelization business - on CRAN. These updates are mostly maintenance updates with bug fixes, some improvements, and preparations for upcoming changes. It’s been some time since I blogged about these packages, so here is a summary of the main updates so far since early 2020: future: values() for lists and other containers was renamed to value() to simplify the API [future 1.
No dogs were harmed while making this release future 1.15.0 is now on CRAN, accompanied by a recent, related update of future.callr 0.5.0. The main update is a change to the Future API: resolved() will now also launch lazy futures. Although this change does not look like much to the world, I’d like to think of this as part of a young person slowly finding themselves. This change in behavior helps us in cases where we create lazy futures upfront;
future 1.8.0 is available on CRAN. This release lays the foundation for being able to capture outputs from futures, perform automated timing and memory benchmarking (profiling) on futures, and more. These features are not yet available out of the box, but thanks to this release we will be able to make some headway on many of the feature requests related to this - hopefully already by the next release.
The future package defines the Future API, which is a unified, generic, friendly API for parallel processing. The Future API follows the principle of write code once and run anywhere - the developer chooses what to parallelize and the user how and where. The nature of a future is such that it lends itself to be used with several of the existing map-reduce frameworks already available in R. In this post, I’ll give an example of how to apply a function over a set of elements concurrently using plain sequential R, the parallel package, the future package alone, as well as future in combination with the foreach, the plyr, and the purrr packages.