Package 'mcprogress' reference manual

Title:	Progress Bars and Messages for Parallel Processes
Description:	Tools for monitoring progress during parallel processing. Lightweight package which acts as a wrapper around mclapply() and adds a progress bar to it in 'RStudio' or 'Linux' environments. Simply replace your original call to mclapply() with pmclapply(). A progress bar can also be displayed during parallelisation via the 'foreach' package. Also included are functions to safely print messages (including error messages) from within parallelised code, which can be useful for debugging parallelised R code.
Authors:	Myles Lewis [aut, cre]
Maintainer:	Myles Lewis <[email protected]>
License:	GPL (>= 3)
Version:	0.1.1
Built:	2025-03-27 05:49:31 UTC
Source:	https://github.com/myles-lewis/mcprogress

Versions of `cat()` and `message()` for parallel processing

Description

Prints messages to the console using echo, so that messages can be printed during parallel processing. Text is only printed if the RStudio environment is detected.

Usage

cat_parallel(...)

message_parallel(...)

Arguments

`...`	Zero or more objects which can be coerced to character and which are pasted together.

Value

Prints a message to the console. cat_parallel() uses no line feed, while message_parallel() always adds a newline.
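The following is a minimal sketch of how printing via echo can work in general. It is not the package source: echo_cmd(), cat_sketch() and message_sketch() are hypothetical names introduced for illustration, the use of printf for the no-newline case is an assumption, and the RStudio detection step described above is omitted.

```r
# Hypothetical helper (not part of 'mcprogress'): build a shell command that
# prints the pasted arguments. shQuote() protects spaces and special characters.
echo_cmd <- function(..., newline = TRUE) {
  text <- paste0(..., collapse = "")
  if (newline) {
    sprintf("echo %s", shQuote(text, type = "sh"))
  } else {
    # Assumption: printf is used here because `echo -n` is not portable
    sprintf("printf '%%s' %s", shQuote(text, type = "sh"))
  }
}

# Hypothetical wrappers mirroring the documented newline behaviour
cat_sketch <- function(...) invisible(system(echo_cmd(..., newline = FALSE)))
message_sketch <- function(...) invisible(system(echo_cmd(..., newline = TRUE)))

# The shell command is only run on Unix-alikes
if (.Platform$OS.type == "unix") {
  message_sketch("worker ", 3, " finished")
}
```

Because the text is written by a separate shell command rather than through R's own output connection, it can reach the console even when called from a forked worker process.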

Catch error messages during parallel processing

Description

Allows an expression to be wrapped in try() to catch error messages. Any error messages are printed to the console using mcstop().

Usage

catchError(expr, ...)

Arguments

`expr`	An expression to be wrapped in `try()` to allow execution and catch error messages.
`...`	Optional objects to be tracked if you want to know state of objects at the point error messages are generated.

Value

Prints error messages during parallel processing. If there is no error, the result of the evaluated expression is returned.
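The general pattern of wrapping an expression in try() and reporting the state of tracked objects can be sketched in base R as below. This is an illustration under stated assumptions, not the package's implementation: catch_error_sketch() is a hypothetical name, it prints with message() rather than mcstop(), and the format of the tracked-object report is invented for the example.

```r
# Hypothetical sketch (not the 'mcprogress' source)
catch_error_sketch <- function(expr, ...) {
  # 'expr' is a promise, so it is first evaluated here, inside try()
  res <- try(expr, silent = TRUE)
  if (inherits(res, "try-error")) {
    tracked <- list(...)  # named objects whose state should be reported
    state <- paste(
      names(tracked),
      vapply(tracked, function(v) paste(format(v), collapse = " "), character(1)),
      sep = " = ", collapse = ", "
    )
    message("Error: ", conditionMessage(attr(res, "condition")),
            if (length(tracked) > 0) paste0(" [", state, "]"))
  }
  res  # the evaluated result, or the try-error object if evaluation failed
}

ok  <- catch_error_sketch(sqrt(16))
bad <- catch_error_sketch(stop("negative input"), i = 3)
```

Passing the loop index (here `i`) as a tracked object is one way to find out which element of a parallel loop triggered the error.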

Show progress bar during parallel processing

Description

Uses echo to safely output a progress bar to the RStudio or Linux console during parallel processing.

Usage

mcProgressBar(
  val,
  len = 1L,
  cores = 1L,
  subval = NULL,
  title = "",
  spinner = FALSE,
  eta = TRUE,
  start = NULL,
  sensitivity = 0.01
)

closeProgress(start = NULL, title = "", eta = TRUE)

Arguments

`val`	Integer measuring progress
`len`	Total number of processes to be executed overall.
`cores`	Number of cores used for parallel processing.
`subval`	Optional subvalue ranging from 0 to 1 to enable granularity during long processes. Especially useful if `len` is small relative to `cores`.
`title`	Optional title for the progress bar.
`spinner`	Logical whether to show a spinner which moves when each core completes a process. More useful for relatively long processes where the length of time for each process to complete is variable. Not shown if `subval` is used. Can add significant overhead if `len` is large and each process is very fast.
`eta`	Logical whether to show estimated time to completion. `start` system time must be supplied with each call to `mcProgressBar()` in order to estimate the time to completion.
`start`	Used to pass the system time from the start of the call to show a total time elapsed. See the example below.
`sensitivity`	Determines maximum sensitivity with which to report progress for situations where `len` is large, to reduce overhead. Default 0.01 refers to 1%. Not used if `subval` is invoked.

Details

This package provides 2 main methods to show progress during parallelised code using mclapply(). If X (the list object looped over in a call to mclapply()) has many elements compared to the number of cores, then it is easiest to use pmclapply(). However, in some use cases the length of X is comparable to the number of cores and each process may take a long time. For example, machine learning applied to each of 8 folds on an 8-core machine will open 8 processes from the outset. Each process will often complete at roughly the same time. In this case pmclapply() is much less informative as it only shows completion at the end of 1 round of processes so it will go from 0% to 100%. In this example, if each process code is long and subprogress can be reported along the way, for example during nested loops, then mcProgressBar() provides a way to show the subprogress during the inner loop. The example below shows how to write code involving an outer call to mclapply() and an inner loop whose subprogress is tracked via calls to mcProgressBar().

Technically only 1 process can be tracked. If cores is set to 4 and subval is invoked, then the 1st, 5th, 9th, 13th etc process is tracked. Subprogress of this process is computed as part of the number of blocks of processes required. ETA is approximate. As part of minimising overhead, it is only updated with each change in progress (i.e. each time a block of processes completes) or when subprogress changes. It is not updated by interrupt.
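The block arithmetic described above can be made concrete with a small worked example. The helpers below are hypothetical and only illustrate one plausible reading of the description (progress = completed blocks plus the subvalue of the tracked process, divided by the number of blocks); the package's internal formula may differ.

```r
# Hypothetical helpers (not part of 'mcprogress')

# Indices of the processes that are tracked when subval is used
tracked_processes <- function(len, cores) seq(1, len, by = cores)

# Overall progress (0 to 1) implied by a tracked process reporting `subval`
sub_progress <- function(val, len, cores, subval) {
  n_blocks <- ceiling(len / cores)    # rounds of processes needed overall
  blocks_done <- (val - 1) %/% cores  # complete rounds before this process
  (blocks_done + subval) / n_blocks
}

tracked_processes(len = 16, cores = 4)
sub_progress(val = 5, len = 16, cores = 4, subval = 0.5)
```

With 16 processes on 4 cores there are 4 blocks, so the 5th process (the first of the second block) reporting a subvalue of 0.5 corresponds to (1 + 0.5) / 4 = 37.5% overall in this sketch.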

Value

No return value. Prints a progress bar to the console if called within an RStudio or Linux environment.

Author(s)

Myles Lewis

Examples

if (Sys.info()["sysname"] != "Windows") {

## Example function with mclapply wrapped around another nested function
library(parallel)

my_fun <- function(x, cores) {
  start <- Sys.time()
  mcProgressBar(0, title = "my_fun")  # initialise progress bar
  res <- mclapply(seq_along(x), function(i) {
    # inner loop of calculation
    y <- 1:4
    inner <- lapply(seq_along(y), function(j) {
      Sys.sleep(0.2 + runif(1) * 0.1)
      mcProgressBar(val = i, len = length(x), cores, subval = j / length(y),
                    title = "my_fun")
      rnorm(4)
    })
    inner
  }, mc.cores = cores)
  closeProgress(start, title = "my_fun")  # finalise the progress bar
  res
}

res <- my_fun(letters[1:4], cores = 2)

## Example of long function
longfun <- function(x, cores) {
  start <- Sys.time()
  mcProgressBar(0, title = "longfun")  # initialise progress bar
  res <- mclapply(seq_along(x), function(i) {
    # long sequential calculation in parallel with 3 major steps
    Sys.sleep(0.2)
    mcProgressBar(val = i, len = length(x), cores, subval = 0.33,
                  title = "longfun")  # 33% complete
    Sys.sleep(0.2)
    mcProgressBar(val = i, len = length(x), cores, subval = 0.66,
                  title = "longfun")  # 66% complete
    Sys.sleep(0.2)
    mcProgressBar(val = i, len = length(x), cores, subval = 1,
                  title = "longfun")  # 100% complete
    return(rnorm(4))
  }, mc.cores = cores)
  closeProgress(start, title = "longfun")  # finalise the progress bar
  res
}

res <- longfun(letters[1:2], cores = 2)

}

Stop and print error message during parallel processing

Description

mcstop() is a multicore version of stop() which prints to the console using 'echo' during parallel commands such as mclapply(), to allow error messages to be more visible.

Usage

mcstop(...)

Arguments

`...`	Objects coerced to character and pasted together and printed to the console using echo.

Value

Prints an error message.

mclapply with progress bar

Description

pmclapply() adds a progress bar to mclapply() in RStudio or Linux environments using output to the console. It is designed to add very little overhead.

Usage

pmclapply(
  X,
  FUN,
  ...,
  progress = TRUE,
  spinner = FALSE,
  title = "",
  eta = TRUE,
  mc.preschedule = TRUE,
  mc.set.seed = TRUE,
  mc.silent = FALSE,
  mc.cores = getOption("mc.cores", 2L),
  mc.cleanup = TRUE,
  mc.allow.recursive = TRUE,
  affinity.list = NULL
)

Arguments

`X`	a vector (atomic or list) or an expressions vector. Other objects (including classed objects) will be coerced by `as.list()`.
`FUN`	the function to be applied via `mclapply()` to each element of `X` in parallel.
`...`	Optional arguments passed to `FUN`.
`progress`	Logical whether to show the progress bar.
`spinner`	Logical whether to show a spinner which moves each time a parallel process is completed. More useful if the length of time for each process to complete is variable.
`title`	Title for the progress bar.
`eta`	Logical whether to show estimated time to completion.
`mc.preschedule`, `mc.set.seed`, `mc.silent`, `mc.cleanup`, `mc.allow.recursive`, `affinity.list`	See `mclapply()`.
`mc.cores`	The number of cores to use, i.e. at most how many child processes will be run simultaneously. The option is initialized from environment variable `MC_CORES` if set. Must be at least one, and parallelization requires at least two cores.

Details

This function can be used in an identical manner to mclapply(). It is ideal for use if the length of X is large compared with the number of cores. As processes are spawned in a block and most code for each process completes at roughly the same time, processes move along in blocks as determined by mc.cores. To track progress, pmclapply() only tracks every nth process, where n = mc.cores. For example, with 4 cores, pmclapply() reports progress when the 4th, 8th, 12th, 16th etc. process has completed. If the length of X is very large (e.g. in the 1000s), then the progress bar will only update for each 1% of progress in order to reduce overhead.
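The reporting schedule described above can be illustrated with a short sketch. report_points() and throttled_updates() are hypothetical helpers written for this illustration, not functions from the package, and the sketch assumes that the length of X is at least the number of cores.

```r
# Hypothetical helpers (not part of 'mcprogress')

# Processes whose completion triggers a progress report: every cores-th one
report_points <- function(len, cores) seq(cores, len, by = cores)

# Keep only the reports at which the whole-number percentage increases,
# mimicking an update for each 1% of progress
throttled_updates <- function(len, cores) {
  vals <- report_points(len, cores)
  pct <- (100 * vals) %/% len  # integer percentage, avoids rounding error
  vals[diff(c(0, pct)) > 0]
}

report_points(16, 4)                # 4 8 12 16
length(report_points(2000, 4))      # 500 candidate updates
length(throttled_updates(2000, 4))  # 100 updates actually drawn
```

In this sketch, a run over 2000 elements on 4 cores would redraw the bar 100 times rather than 500, which is the kind of saving the 1% rule is aimed at.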

However, in some scenarios the length of X is comparable to the number of cores and each process may take a long time. For example, machine learning applied to each of 8 cross-validation folds on an 8-core machine will open 8 processes from the outset. Each process will often complete at roughly the same time. In this case pmclapply() is much less informative as it only shows completion at the end of 1 round of processes, so it will go from 0% straight to 100%. For this scenario, we recommend users use mcProgressBar() which allows more fine-grained reporting of subprogress from within a block of parallel processes.

ETA is approximate. As part of minimising overhead, it is only updated with each change in progress (i.e. each time a block of processes completes). It is not updated by interrupt.
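One common way to produce such an estimate is linear extrapolation from the time elapsed so far. The helper below is a hypothetical sketch of that idea only; it is an assumption that the package uses a comparable calculation, and eta_seconds() is not a function from the package.

```r
# Hypothetical helper (not part of 'mcprogress'): remaining time in seconds,
# assuming the rest of the work proceeds at the average rate seen so far
eta_seconds <- function(start, now, fraction_done) {
  stopifnot(fraction_done > 0, fraction_done <= 1)
  elapsed <- as.numeric(difftime(now, start, units = "secs"))
  elapsed * (1 - fraction_done) / fraction_done
}

start <- as.POSIXct("2025-01-01 00:00:00", tz = "UTC")
now <- start + 30                 # 30 seconds later
eta_seconds(start, now, 0.25)     # 90
```

Because the estimate is only recomputed when the progress fraction changes, it stays fixed between block completions, which is why the displayed ETA is approximate.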

Value

A list of the same length as X and named by X.

Author(s)

Myles Lewis

Examples

if (Sys.info()["sysname"] != "Windows") {

res <- pmclapply(letters[1:20], function(i) {
                 Sys.sleep(0.2 + runif(1) * 0.1)
                 setNames(rnorm(5), paste0(i, 1:5))
                 }, mc.cores = 2, title = "Working")

}
