Apply an arbitrary R function across a collection of minc files, distributing the computation to multiple cores or workers on a grid computing environment
Usage
pMincApply(
filenames,
fun,
...,
mask = NULL,
tinyMask = FALSE,
batches = 4,
slab_sizes = NULL,
method = NULL,
local = FALSE,
cores = NULL,
resources = list(),
packages = NULL,
vmem = NULL,
walltime = NULL,
workers = batches,
temp_dir = getwd(),
cleanup = TRUE,
collate = simplify2minc,
conf_file = getOption("RMINC_BATCH_CONF"),
registry_name = new_file("pMincApply_registry"),
registry_dir = getwd()
)Arguments
- filenames
Paths to the minc files to be applied accross
- fun
The function to apply
- ...
Additional arguments to fun through mincApplyRCPP see notes for a warnings
- mask
The path to a mask for the minc files
- tinyMask
whether to use a small subset of voxels to test the computation
- batches
The number of jobs to break the computation into, ignored for snowfall/local mode
- slab_sizes
A 3 element vector indicating how large a chunk of data to read from each minc file at a time defaults to one slice along the first dimension.
- method
A deprecated argument formerly used to configure how to parallelize the jobs now this is handled with
conf_file- local
boolean whether to run the jobs locally (with the parallel package) or with batchtools
- cores
defaults to 1 or
max(getOption("mc.cores"), parallel::detectCores() - 1)if running locally see qMincApply for details.- resources
A list of resources to request from the queueing system common examples including memory, walltime, and nodes see
system.file("parallel/pbs_script.tmpl", package = "RMINC")andsystem.file("parallel/sge_script.tmpl", package = "RMINC")for more examples- packages
Character vector of packages to load for all jobs
- vmem
The number of gigabytes of memory to request for each batched job. It is a compatibility argument and will overload
vmemset in the resource list (if it is defined)- walltime
The amount of walltime to request for each batched job. It is a compatibility argument and will overwrite
walltimeset in the resource list (if it is defined)- workers
The number of workers to use. It is a compatibility option and will overwrite
batchesif it is supplied.- temp_dir
A path to a temporary directory to hold the job registry created when using a true queuing system and for writing temporary mask files. This must be a location read/writable by all nodes when using a true queuing system (so /tmp will not work).
- cleanup
Whether to clean up registry after a queue parallelization job
- collate
A function to be applied to collapse the results of the the pMincApply. Defaults to simplify2minc.
- conf_file
A batchtools config file, defaults to
option("RMINC_BATCH_CONF")- registry_name
a name for the registry
- registry_dir
where batchtools should create the registry
Details
This is a convenience wrapper for two underlying functions qMincApply
and mcMincApply for queueing and multicore processing respectively. Each of
these functions divides all of the voxels that are masked by mask into
batches. Batches are processed in parallel, with calling
mincApplyRCPP to process the voxels in the batch. Arguments passed in through
... will be bound by mincApplyRCPP before fun, so be wary
of potential partial matches. When in doubt, partially apply your function before
hand, and do not rely on positional matching.