In this post, I test r2u, a fast and efficient way to install R packages on Ubuntu. It currently provides 19,066 and 18,921 binary CRAN packages for Ubuntu 20.04 (“focal”) and 22.04 (“jammy”), respectively, plus 207 (focal) and 215 (jammy) Bioconductor packages from the 3.15 release. The Bioconductor selection is limited to the packages used by CRAN packages. Everything is shipped as “.deb” binary files with full dependency resolution, served from a proper apt repository with a signed Release file.
From their web page, here are its key features:
Below, we measure the time needed to install dplyr and DESeq2, first from R and then using r2u. For this, we use a Singularity container. If you do not have Singularity installed, please look at the procedure here.
Let’s build two Singularity containers in sandbox (writable) mode. Create a recipe in a file named Singularity:
BootStrap: docker
From: ubuntu:focal
%post
# ~~~~~~ General setup ~~~~~~ #
# See https://cloud.r-project.org/bin/linux/ubuntu/
apt update -qq
export DEBIAN_FRONTEND=noninteractive
apt-get install --assume-yes --no-install-recommends software-properties-common dirmngr \
wget build-essential libblas-dev liblapack-dev gcc-10 g++-10 gfortran-10 emacs \
libcurl4-openssl-dev libxml2-dev libsodium-dev libssl-dev
# ~~~~~~ R 4.2.0 ~~~~~~ #
wget -q -O- https://cloud.r-project.org/bin/linux/ubuntu/marutter_pubkey.asc \
| tee -a /etc/apt/trusted.gpg.d/cran_ubuntu_key.asc
echo "deb [arch=amd64] https://cloud.r-project.org/bin/linux/ubuntu focal-cran40/" \
> /etc/apt/sources.list.d/cran-ubuntu.list
apt update && apt upgrade --yes
apt install --yes r-base r-base-core
Put the following code in a script named buildSingularity.sh:
#!/usr/bin/bash
singularity build --sandbox sandbox1 Singularity
singularity build --sandbox sandbox2 Singularity
Build the two images with sudo and measure the execution time with time (4:25.84 elapsed, 68% CPU):
sudo time bash buildSingularity.sh
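The measurements in this post mix the shell's time command and R's Sys.time(), which report durations in different formats. As a minimal sketch, a small wrapper can print wall-clock seconds for any command in one consistent format. The name timeit is hypothetical, and the sketch assumes GNU date (for %N) and awk are available, as they are on Ubuntu.

```bash
#!/usr/bin/bash
# Hypothetical helper (not part of Singularity or r2u): run a command and
# report its wall-clock duration in seconds on stderr, prefixed by a label.
timeit() {
    local label=$1 start end status=0
    shift
    start=$(date +%s.%N)          # GNU date: seconds.nanoseconds
    "$@" || status=$?             # run the command, remember its exit status
    end=$(date +%s.%N)
    # awk does the floating-point subtraction that bash arithmetic cannot
    awk -v l="$label" -v s="$start" -v e="$end" \
        'BEGIN { printf "%s: %.3f s\n", l, e - s }' >&2
    return "$status"
}
```

For example, `timeit "build sandboxes" sudo bash buildSingularity.sh` would run the build and then print a line of the form `build sandboxes: <seconds> s`.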
Run sandbox1:
sudo singularity shell --writable sandbox1
Open R and install dplyr and DESeq2:
> start_time1<-Sys.time();install.packages("dplyr");end_time1<-Sys.time()
> install.packages("BiocManager")
> library("BiocManager")
> start_time2<-Sys.time();install("DESeq2");end_time2<-Sys.time()
> end_time1-start_time1
## 1.51152 mins
> end_time2-start_time2
## 12.23746 mins
Next, test the r2u Docker image. After installing Docker, pull the image (0m18.790s):
time docker pull eddelbuettel/r2u:focal
Start a container from the image:
docker run -it eddelbuettel/r2u:focal
Install dplyr from R:
> start_time<-Sys.time();install.packages("dplyr");end_time<-Sys.time()
> end_time-start_time
## 12.99056 secs
Install DESeq2 with apt:
time apt install --yes r-bioc-deseq2
## 0m23.676s
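The apt command above uses r-bioc-deseq2 rather than the R package name DESeq2. A minimal sketch of that mapping, assuming the usual Debian naming convention (an r-cran- or r-bioc- prefix followed by the lowercased package name); deb_name is a hypothetical helper, not something r2u provides:

```bash
#!/usr/bin/bash
# Hypothetical helper: map an R package name to its .deb name, assuming the
# Debian convention "r-cran-<name>" / "r-bioc-<name>" with <name> lowercased.
deb_name() {
    local repo=$1 pkg=$2
    case $repo in
        cran|bioc)
            printf 'r-%s-%s\n' "$repo" \
                "$(printf '%s' "$pkg" | tr '[:upper:]' '[:lower:]')"
            ;;
        *)
            echo "unknown repo: $repo" >&2
            return 1
            ;;
    esac
}
```

With this, `apt install --yes "$(deb_name bioc DESeq2)"` expands to the command used above.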
Finally, install r2u by hand in a plain Ubuntu container. Open a shell in sandbox2 in another terminal:
sudo singularity shell --writable sandbox2
Copy the following script to install-r2u.sh:
#!/usr/bin/bash
## First: update apt and get gpg-agent and key
apt update -qq
apt install --yes --no-install-recommends gpg-agent # to add the key
apt-key adv --keyserver keyserver.ubuntu.com --recv-keys A1489FE2AB99A21A
## Second: add the repo
echo "deb [arch=amd64] https://dirk.eddelbuettel.com/cranapt focal main" > /etc/apt/sources.list.d/cranapt.list
apt update
## Third: ensure R 4.2.0 is used
echo "deb [arch=amd64] https://cloud.r-project.org/bin/linux/ubuntu focal-cran40/" > /etc/apt/sources.list.d/edd-misc.list
apt-key adv --keyserver keyserver.ubuntu.com --recv-keys 67C2D66C4B1D4339
## Fourth: add pinning to ensure package sorting
echo "Package: *" > /etc/apt/preferences.d/99cranapt
echo "Pin: origin \"dirk.eddelbuettel.com\"" >> /etc/apt/preferences.d/99cranapt
echo "Pin-Priority: 700" >> /etc/apt/preferences.d/99cranapt
## Fifth: install bspm and enable it
Rscript -e 'install.packages("bspm")'
RHOME=$(R RHOME)
echo "suppressMessages(bspm::enable())" >> ${RHOME}/etc/Rprofile.site
echo "options(bspm.sudo=TRUE)" >> ${RHOME}/etc/Rprofile.site
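The fourth step of the script builds the pin file with three separate echo calls. As a sketch, the same file can be written in one go with a heredoc, which avoids escaping the inner quotes. The function name write_cranapt_pin is hypothetical, and the target path is a parameter so the function can be tried on a temporary file first.

```bash
#!/usr/bin/bash
# Hypothetical helper: write the apt pin for the r2u repository.
# Defaults to the path used in install-r2u.sh; pass another path to test.
write_cranapt_pin() {
    local target=${1:-/etc/apt/preferences.d/99cranapt}
    # Quoted 'EOF' keeps the content literal (no variable expansion).
    cat > "$target" <<'EOF'
Package: *
Pin: origin "dirk.eddelbuettel.com"
Pin-Priority: 700
EOF
}
```

After an apt update, `apt-cache policy r-cran-dplyr` shows which repository a package would be installed from and with which priority, which is a quick way to check that the pin is in effect.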
Run the script (0m38.782s):
bash install-r2u.sh
Open R and install dplyr:
> start_time<-Sys.time();install.packages("dplyr");end_time<-Sys.time()
> end_time-start_time
## 33.09381 secs
Install DESeq2 from the command line:
time apt install --yes r-bioc-deseq2
## 4m51.869s
In this post, we compared the time needed to install two R packages (dplyr and DESeq2) with and without r2u. Here is a summary:
Method | Package | Time |
---|---|---|
From R (sandbox1) | dplyr | 1 min 30.7 s |
From R (sandbox1) | DESeq2 | 12 min 14.2 s |
r2u Docker image | dplyr | 13.0 s |
r2u Docker image | DESeq2 | 23.7 s |
r2u installed in sandbox2 | dplyr | 33.1 s |
r2u installed in sandbox2 | DESeq2 | 4 min 51.9 s |
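The raw timings reported earlier come in three formats: R's "mins" and "secs", and the shell's "XmY.Zs". To compare them directly, a minimal sketch that converts each format to seconds and computes a speedup ratio; to_seconds and speedup are hypothetical helper names, and only these three formats are handled.

```bash
#!/usr/bin/bash
# Hypothetical helper: normalise "1.51152 mins", "33.09381 secs" or
# "4m51.869s" to seconds with three decimals. Fails on other formats.
to_seconds() {
    awk -v t="$1" 'BEGIN {
        if (t ~ /mins$/)      { split(t, a, " "); s = a[1] * 60 }
        else if (t ~ /secs$/) { split(t, a, " "); s = a[1] }
        else if (t ~ /^[0-9]+m[0-9.]+s$/) {
            split(t, a, "m"); sub(/s$/, "", a[2]); s = a[1] * 60 + a[2]
        }
        else { exit 1 }
        printf "%.3f\n", s
    }'
}

# Hypothetical helper: how many times faster the second timing is
# compared with the first (baseline) timing.
speedup() {
    awk -v b="$(to_seconds "$1")" -v f="$(to_seconds "$2")" \
        'BEGIN { printf "%.1f\n", b / f }'
}
```

For example, `speedup "12.23746 mins" "0m23.676s"` compares the DESeq2 install from R with the one in the r2u Docker image. On the numbers measured here, the Docker image comes out roughly 7 times faster for dplyr and 31 times faster for DESeq2, while r2u installed in the sandbox is about 2.7 and 2.5 times faster, respectively.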