Are you a good R citizen who preallocates your matrices? If you are allocating a numeric matrix in one of the following two ways, then you are doing it the wrong way!

    x <- matrix(nrow = 500, ncol = 100)

or

    x <- matrix(NA, nrow = 500, ncol = 100)

Why? Because it is counterproductive. And why is that? In the above, x becomes a logical matrix, and not a numeric matrix as intended.
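The type issue described above can be checked directly. The following is a minimal sketch, not taken from the full post: it assumes that filling with NA_real_ is an acceptable way to get a double matrix from the start, and the variable names are for illustration only.

```r
# NA on its own is a logical value, so this allocates a logical matrix
x <- matrix(NA, nrow = 500, ncol = 100)
type_before <- typeof(x)
print(type_before)   # "logical"

# The first numeric assignment makes R coerce the whole matrix to double
x[1, 1] <- 3.14
type_after <- typeof(x)
print(type_after)    # "double"

# Assumption: NA_real_ as the fill value, giving a double matrix immediately
y <- matrix(NA_real_, nrow = 500, ncol = 100)
print(typeof(y))     # "double"
```

With the second form no coercion is needed when numeric values are assigned later, which is the point of preallocating in the first place.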

The R function capture.output() can be used to “collect” the output of functions such as cat() and print() to strings. For example,

    > s <- capture.output({
    +   cat("Hello\nworld!\n")
    +   print(pi)
    + })
    > s
    [1] "Hello"        "world!"       "[1] 3.141593"

More precisely, it captures all output sent to the standard output and returns a character vector where each element corresponds to a line of output. By the way, it does not capture the output sent to the standard error, e.
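The difference between standard output and standard error can be illustrated with a small sketch. It assumes message() as the example of a function that writes to standard error; the excerpt above is cut off before it names one.

```r
s <- capture.output({
  cat("sent to stdout\n")      # captured
  message("sent to stderr")    # not captured; still shown on the console
})

print(s)
print(length(s))   # 1
```

Only the line written by cat() ends up in s, while the message passes straight through to the console.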

When processing large data sets in R, you often end up creating large temporary objects as well. To keep the memory footprint small, it is good practice to remove those temporary objects as soon as possible. Removed objects are deallocated from memory (RAM) the next time the garbage collector runs.

Better: Use rm(list = "x") instead of rm(x), if using rm()

To remove an object in R, one can use the rm() function (with alias remove()).
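A minimal sketch of the pattern follows. The object names are made up for illustration, and the explicit gc() call is optional, since R runs the garbage collector on its own when needed.

```r
x <- rnorm(1e6)        # large temporary object
total <- sum(x)        # the only result we want to keep

# Remove the temporary object by name, passed as a character string
rm(list = "x")
print(exists("x", inherits = FALSE))   # FALSE

# The list argument also accepts several names at once
tmp_a <- 1:10
tmp_b <- letters
rm(list = c("tmp_a", "tmp_b"))

# Optional: trigger garbage collection right away
invisible(gc())
```

Because the names are ordinary strings, the same call works when the names of the temporary objects are computed at run time.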

Today, 16 years and 367,496 messages have passed since Martin Mächler started the R-help (321,119 msgs), R-devel (45,830 msgs) and R-announce (547 msgs) mailing lists [1], a great benefit to all of us. Special thanks to Martin, and thanks also to everyone else contributing to these forums.

[1] https://stat.ethz.ch/pipermail/r-help/1997-April/001490.html

Sometimes a minor change to your R code can make a big difference in processing time. Here is an example showing that if you don’t care about the names attribute when calling unlist() on a list, specifying argument use.names = FALSE can speed up the processing a lot!

    > x <- split(sample(1000, size = 1e6, rep = TRUE), rep(1:1e5, times = 10))
    > t1 <- system.time(y1 <- unlist(x))
    > t2 <- system.time(y2 <- unlist(x, use.
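The comparison can be reproduced with a self-contained sketch along the same lines. This is not the post's own benchmark: the data are ten times smaller so that it runs quickly, and timings will differ between machines, so none are quoted here.

```r
# A list of 10,000 integer vectors with 10 elements each
x <- split(sample(1000, size = 1e5, replace = TRUE), rep(1:1e4, times = 10))

# Default: unlist() constructs a names attribute for every element
t1 <- system.time(y1 <- unlist(x))

# Skipping the names avoids that extra work
t2 <- system.time(y2 <- unlist(x, use.names = FALSE))

print(is.null(names(y2)))          # TRUE
print(identical(unname(y1), y2))   # TRUE: same values, only the names differ
print(rbind(with_names = t1, without_names = t2))
```

The values are the same either way, so dropping the names is safe whenever later code does not look them up.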

The code below shows how to configure the help.ports option in R such that the built-in R help server always uses the same URL port. Just add it to the .Rprofile file in your home directory (if missing, create it). For more details, see help("Startup").

    # Force the URL of the help to http://127.0.0.1:21510
    options(help.ports = 21510)

A slightly fancier version is to use an environment variable to set the port(s):
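One way this could look is sketched below. This is not the version from the full post, which is cut off here. The environment variable name R_HELP_PORTS is hypothetical, as is the convention of separating several ports with commas.

```r
local({
  # Hypothetical variable name; falls back to 21510 when it is not set
  ports <- Sys.getenv("R_HELP_PORTS", unset = "21510")

  # Allow a comma-separated list of ports, e.g. "21510,21511"
  ports <- as.integer(strsplit(ports, split = ",", fixed = TRUE)[[1]])

  options(help.ports = ports)
})

print(getOption("help.ports"))
```

Wrapping the code in local() keeps the temporary variable ports out of the global environment when the code is run from .Rprofile.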

The code below shows how to configure the repos option in R such that install.packages() etc. will locate the packages without having to explicitly specify the repository. Just add it to the .Rprofile file in your home directory (if missing, create it). For more details, see help("Startup").

    local({
      repos <- getOption("repos")

      # http://cran.r-project.org/
      # For a list of CRAN mirrors, see getCRANmirrors().
      repos["CRAN"] <- "http://cran.stat.ucla.edu"

      # http://www.stats.ox.ac.uk/pub/RWin/ReadMe
      if (.Platform$OS.type == "windows") {
        repos["CRANextra"] <- "http://www.

Henrik Bengtsson

MSc CS | PhD Math Stat | Associate Professor | R
