parallelly: Querying, Killing and Cloning Parallel Workers Running Locally or Remotely

July 1, 2023 in R

parallelly 1.36.0 has been on CRAN since May 2023. The parallelly package is part of the Futureverse and enhances the parallel package of base R, e.g. it adds several features you’d otherwise expect to see in parallel. The parallelly package is one of the internal workhorses for the future package, but it can also be used outside of the future ecosystem. In this most recent release, parallelly gained several new skills in how cluster nodes (a.

Continue reading

%dofuture% - a Better foreach() Parallelization Operator than %dopar%

June 26, 2023 in R

doFuture 1.0.0 has been on CRAN since March 2023. It introduces a new foreach operator %dofuture%, which makes it even easier to use foreach() to parallelize via the future ecosystem. This new operator is designed to be an alternative to the existing %dopar% operator for foreach() - an alternative that works in similar ways but better. If you already use foreach() together with futures, or plan on doing so, I recommend using %dofuture% instead of %dopar%.
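As a minimal sketch of what the excerpt above describes, the following assumes doFuture (>= 1.0.0) is installed and that attaching it also attaches foreach and future. The input vector and the two-worker multisession backend are illustrative choices, not taken from the post.

```r
library(doFuture)  # assumed to attach foreach and future as well

# Illustrative backend: two background R sessions on the local machine
plan(multisession, workers = 2)

# Same foreach() call shape as with %dopar%, but using %dofuture%;
# no registerDoFuture() call is needed for this operator
roots <- foreach(x = c(1, 4, 9, 16), .combine = c) %dofuture% {
  sqrt(x)
}
print(roots)

# Return to sequential processing, which shuts down the workers
plan(sequential)
```

Because the backend is chosen with plan(), the loop body does not change if the end user later switches to another future backend.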

Continue reading

Edmonton R User Group Meetup: Futureverse - A Unifying Parallelization Framework in R for Everyone

May 22, 2023 in R

Below are the slides from my presentation at the Edmonton R User Group Meetup (YEGRUG) on May 22, 2023:

Title: Futureverse - A Unifying Parallelization Framework in R for Everyone
Speaker: Henrik Bengtsson
Slides: HTML, PDF (46 slides)
Video: official recording (~60 minutes)

Thank you Péter Sólymos and the YEGRUG for the invitation and the opportunity!

/Henrik

Links
YEGRUG: https://yegrug.github.io/
Futureverse website: https://www.futureverse.org/
future package: CRAN, GitHub, pkgdown

Continue reading

parallelly 1.34.0: Support for CGroups v2, Killing Parallel Workers, and more

January 18, 2023 in R

With the recent releases of parallelly 1.33.0 (2022-12-13) and 1.34.0 (2023-01-13), availableCores() and availableWorkers() gained better support for Linux CGroups, options for avoiding running out of R connections when setting up parallel-style clusters, and killNode() for forcefully terminating one or more parallel workers. I summarize these updates below. For other updates, please see the NEWS.

Added support for CGroups v2

availableCores() and availableWorkers() gained support for Linux Control Groups v2 (CGroups v2), besides CGroups v1, which has been supported since parallelly 1.

Continue reading

Please Avoid detectCores() in your R Packages

December 5, 2022 in R

The detectCores() function of the parallel package is probably one of the most used functions when it comes to setting the number of parallel workers to use in R. In this blog post, I’ll try to explain why using it is not always a good idea. Already now, I am going to make a bold request and ask you to: Please avoid using parallel::detectCores() in your package! By reading this blog post, I hope you become more aware of the different problems that arise from using detectCores() and how they might affect you and the users of your code.
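One alternative, sketched here under the assumption that the parallelly package described in the other posts on this page is installed, is to ask availableCores() for the number of workers. The helper name pick_workers() and the cap of four workers are hypothetical and only for illustration.

```r
library(parallelly)

# Hypothetical helper: choose a worker count without parallel::detectCores().
# availableCores() always returns at least 1 and takes settings such as
# job-scheduler and CGroups limits into account.
pick_workers <- function(max_workers = 4L) {
  n <- availableCores()
  min(as.integer(n), max_workers)
}

n_workers <- pick_workers()
print(n_workers)
```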

Continue reading

useR! 2022: My 'Futureverse: Profile Parallel Code' Slides

June 23, 2022 in R

Figure 1: A time chart of logged events for two futures resolved by two parallel workers. This is a screenshot of Slide #18 in my talk.

Below are the slides for my Futureverse: Profile Parallel Code talk that I presented at the useR! 2022 conference, which was held online and hosted by the Department of Biostatistics at Vanderbilt University Medical Center.

Title: Futureverse: Profile Parallel Code
Speaker: Henrik Bengtsson

Continue reading

parallelly: Support for Fujitsu Technical Computing Suite High-Performance Compute (HPC) Environments

June 9, 2022 in R

parallelly 1.32.0 is now on CRAN. One of the major updates is that availableCores() and availableWorkers(), and therefore also the future framework, gained support for the ‘Fujitsu Technical Computing Suite’ job scheduler. For other updates, please see NEWS. The parallelly package enhances the parallel package - our built-in R package for parallel processing - by improving on existing features and by adding new ones. Somewhat simplified, parallelly provides the things that you would otherwise expect to find in the parallel package.

Continue reading

parallelly 1.32.0: makeClusterPSOCK() Didn't Work with Chinese and Korean Locales

June 8, 2022 in R

parallelly 1.32.0 is on CRAN. This release fixes an important bug that affected users running with the Simplified Chinese, Traditional Chinese (Taiwan), or Korean locale. The bug caused makeClusterPSOCK(), and therefore also future::plan("multisession"), to fail with an error. For other updates, please see NEWS. The parallelly package enhances the parallel package - our built-in R package for parallel processing - by improving on existing features and by adding new ones.

Continue reading

progressr 0.10.1: Plyr Now Supports Progress Updates also in Parallel

June 3, 2022 in R

progressr 0.10.1 is on CRAN. I dedicate this release to all plyr users and developers out there. The progressr package provides a minimal API for reporting progress updates in R. The design is to separate the representation of progress updates from how they are presented. What type of progress to signal is controlled by the developer. How these progress updates are rendered is controlled by the end user.
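The developer/end-user split described above can be sketched as follows, assuming progressr is installed. The function slow_squares() and its artificial delay are hypothetical; progressor(), handlers(), and with_progress() are part of progressr.

```r
library(progressr)

# Developer side: decide what progress to signal, not how it is shown
slow_squares <- function(xs) {
  p <- progressor(along = xs)  # one progress step per element
  vapply(xs, function(x) {
    Sys.sleep(0.05)            # stand-in for real work
    p()                        # signal one step of progress
    x^2
  }, numeric(1))
}

# End-user side: decide how progress is rendered, and opt in to seeing it
handlers("txtprogressbar")
with_progress({
  squares <- slow_squares(1:5)
})
print(squares)
```

Calling slow_squares(1:5) outside of with_progress() returns the same values and simply shows no progress.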

Continue reading

parallelly 1.31.1: Better at Inferring Number of CPU Cores with Cgroups and Linux Containers

April 22, 2022 in R

parallelly 1.31.1 is on CRAN. The parallelly package enhances the parallel package - our built-in R package for parallel processing - by improving on existing features and by adding new ones. Somewhat simplified, parallelly provides the things that you would otherwise expect to find in the parallel package. The future package relies on the parallelly package internally for local and remote parallelization. Since my previous post on parallelly in November 2021, I’ve fixed a few bugs and added some new features to the package;

Continue reading

future 1.24.0: Forwarding RNG State also for Stand-Alone Futures

February 22, 2022 in R

future 1.24.0 is on CRAN. It comes with one significant update related to random number generation, further deprecation of legacy future strategies, a slight improvement to plan() and tweaks(), and some bug fixes. Below are the most important changes.

One of many possible random number generators. This one was carefully designed by XKCD [CC BY-NC 2.5].

future(…, seed = TRUE) updates RNG state

In future (< 1.

Continue reading

Future Improvements During 2021

January 7, 2022 in R

Happy New Year! I made some updates to the future framework during 2021 that involve overall improvements and essential preparations to go forward with some exciting new features that I’m keen to work on during 2022. The future framework makes it easy to parallelize existing R code - often with only a minor change of code. The goal is to lower the barriers so that anyone can quickly and safely speed up their existing R code in a worry-free manner.

Continue reading

parallelly 1.29.0: New Skills and Less Communication Latency on Linux

November 22, 2021 in R

parallelly 1.29.0 is on CRAN. The parallelly package enhances the parallel package - our built-in R package for parallel processing - by improving on existing features and by adding new ones. Somewhat simplified, parallelly provides the things that you would otherwise expect to find in the parallel package. The future package relies on the parallelly package internally for local and remote parallelization. Since my previous post on parallelly five months ago, the parallelly package had some bugs fixed, and it gained a few new features;

Continue reading

parallelly 1.26.0: Fast, Concurrent Setup of Parallel Workers (Finally)

June 10, 2021 in R

parallelly 1.26.0 is on CRAN. It comes with one major improvement and one new function:

- The setup of parallel workers is now much faster, which comes from using a concurrent, instead of sequential, setup strategy
- The new freePort() can be used to find a TCP port that is currently available

Faster setup of local, parallel workers

In R 4.0.0, which was released in May 2020, parallel::makeCluster(n) gained the power of setting up the n local cluster nodes all at the same time, which greatly reduces the total setup time.
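To illustrate the freePort() function mentioned above, here is a small sketch assuming parallelly (>= 1.26.0) is installed. Passing the port on to makeClusterPSOCK() for a single local worker is an illustrative use, not something this excerpt prescribes.

```r
library(parallelly)

# Ask for a TCP port that is currently available on this machine
port <- freePort()
print(port)

# Illustrative use: launch one local worker that connects on that port
cl <- makeClusterPSOCK(1L, port = port)
print(cl)

# Always shut the worker down when done
parallel::stopCluster(cl)
```

A port reported as free can be taken by another process before it is used, so this is best treated as a convenience rather than a guarantee.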

Continue reading

parallelly 1.25.0: availableCores(omit=n) and, Finally, Built-in SSH Support for MS Windows 10 Users

April 30, 2021 in R

A piece of an ice core - more pleasing to look at than yet another illustration of a CPU core (Image credit: Ludovic Brucker, NASA’s Goddard Space Flight Center)

parallelly 1.25.0 is on CRAN. It comes with two major improvements:

- You can now use availableCores(omit = n) to ask for all but n CPU cores
- makeClusterPSOCK() can finally use the built-in SSH client on MS Windows 10 to set up remote workers
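The first improvement can be sketched as below, assuming parallelly (>= 1.25.0) is installed. Leaving exactly one core free is an arbitrary choice for illustration.

```r
library(parallelly)

# All cores available to this R session
all_cores <- availableCores()

# All but one core, e.g. to keep the machine responsive;
# the result is never less than 1, even on a single-core machine
spare_one <- availableCores(omit = 1L)

print(c(all = all_cores, spare_one = spare_one))
```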

Continue reading

Using Kubernetes and the Future Package to Easily Parallelize R in the Cloud

April 8, 2021 in R

This is a guest post by Chris Paciorek, Department of Statistics, University of California at Berkeley. In this post, I’ll demonstrate that you can easily use the future package in R on a cluster of machines running in the cloud, specifically on a Kubernetes cluster. This allows you to easily do parallel computing in R in the cloud. One advantage of doing this in the cloud is the ability to easily scale the number and type of (virtual) machines across which you run your parallel computation.

Continue reading

future.BatchJobs - End-of-Life Announcement

January 8, 2021 in R

This is an announcement that future.BatchJobs - A Future API for Parallel and Distributed Processing using BatchJobs has been archived on CRAN. The package has been deprecated for years with a recommendation of using future.batchtools instead. The latter has been on CRAN since June 2017 and builds upon the batchtools package, which itself supersedes the BatchJobs package. To wrap up the three-and-a-half-year-long life of future.

Continue reading

future 1.20.1 - The Future Just Got a Bit Brighter

November 6, 2020 in R

future 1.20.1 is on CRAN. It adds some new features, deprecates old and unwanted behaviors, adds a couple of vignettes, and fixes a few bugs.

Interactive debugging

First out among the new features, and a long-running feature request, is the addition of argument split to plan(), which allows us to split, or “tee”, any output produced by futures. The default is split = FALSE for which standard output and conditions are captured by the future and only relayed after the future has been resolved, i.

Continue reading

parallelly, future - Cleaning Up Around the House

November 4, 2020 in R

parallelly
adverb
par·al·lel·ly | \ ˈpa-rə-le(l)li \
Definition: in a parallel manner

future
noun
fu·ture | \ ˈfyü-chər \
Definition: existing or occurring at a later time

I’ve cleaned up around the house - with the recent release of future 1.20.1, the package gained a dependency on the new parallelly package. Now, if you’re like me and concerned about bloating package dependencies, I’m sure you immediately wondered why I chose to introduce a new dependency.

Continue reading

Trust the Future

November 4, 2020 in R

Each time we use R to analyze data, we rely on the assumption that the functions we use produce correct results. If we can’t make this assumption, we have to spend a lot of time validating every nitty-gritty detail. Luckily, we don’t have to do this. There are many reasons why we can comfortably use R for our analyses and some of them are unique to R. Here are some I could think of while writing this blog post - I’m sure I forgot something:

Continue reading


Henrik Bengtsson

MSc CS | PhD Math Stat | Associate Professor | R Foundation | R Consortium
