Using the parallel R package, I can run things in parallel like this:
    library(parallel)
    cl <- makeCluster(2)  # Create a cluster with 2 workers
    ...                   # do some parallel stuff
    stopCluster(cl)
However, the variable cl that references the cluster can be lost, for example when a function fails partway through:
    do.something <- function() {
      library(parallel)
      cl <- makeCluster(detectCores())
      parLapply(cl, 1:10, function(x) {
        stop("An error occurred")
      })
      stopCluster(cl)  # never reached, because parLapply raises an error
    }
    do.something()
Here, stopCluster is never executed. When this happens, I am left with workers that keep running, as shown by ps:
    501 53300  9225 0 2:16PM ttys003 0:00.27 /opt/local/Library/Frameworks/R.framework/Resources/bin/exec/R
    501 53390     1 0 2:19PM ttys003 0:00.16 /opt/local/Library/Frameworks/R.framework/Resources/bin/exec/R --slave --no-restore -e parallel:::.slaveRSOCK() --args MASTER=localhost PORT=11099 OUT=/dev/null TIMEOUT=2592000 XDR=TRUE
    501 53399     1 0 2:19PM ttys003 0:00.16 /opt/local/Library/Frameworks/R.framework/Resources/bin/exec/R --slave --no-restore -e parallel:::.slaveRSOCK() --args MASTER=localhost PORT=11099 OUT=/dev/null TIMEOUT=2592000 XDR=TRUE
    501 53408     1 0 2:19PM ttys003 0:00.16 /opt/local/Library/Frameworks/R.framework/Resources/bin/exec/R --slave --no-restore -e parallel:::.slaveRSOCK() --args MASTER=localhost PORT=11099 OUT=/dev/null TIMEOUT=2592000 XDR=TRUE
    501 53417     1 0 2:19PM ttys003 0:00.16 /opt/local/Library/Frameworks/R.framework/Resources/bin/exec/R --slave --no-restore -e parallel:::.slaveRSOCK() --args MASTER=localhost PORT=11099 OUT=/dev/null TIMEOUT=2592000 XDR=TRUE
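Of course, I can kill the workers by hand one by one, or restart R. But sometimes that is impractical, for example when several instances of R are each running their own pool. Is there any way to stop the workers from within R once cl has been lost? How do people usually handle this situation?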
There are a few mechanisms that make code run every time, even when an error occurs:
try
Wrap the error-prone part in a try or a tryCatch block. You can then inspect the result to see whether an error occurred.
    do.something <- function() {
      library(parallel)
      cl <- makeCluster(detectCores())
      # try() catches the error, so execution continues past parLapply
      result <- try({
        parLapply(cl, 1:10, function(x) {
          stop("An error occurred")
        })
      })
      if (inherits(result, "try-error")) print("there was an error!")
      stopCluster(cl)
      result
    }
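The tryCatch variant mentioned above is not shown in the original answer, so here is a minimal sketch of how it might look. It assumes a fixed pool of 2 workers, and the function name do.something.finally is hypothetical. The finally argument of tryCatch is evaluated whether or not the expression fails, so stopCluster runs in both cases.

```r
library(parallel)

# Hypothetical helper name, used only for this illustration.
do.something.finally <- function() {
  cl <- makeCluster(2)  # assumption: 2 workers are enough for the demo
  tryCatch({
    parLapply(cl, 1:10, function(x) stop("An error occurred"))
  }, error = function(e) {
    # Report the failure and hand the condition object back to the caller
    message("there was an error: ", conditionMessage(e))
    e
  }, finally = {
    # Evaluated on success and on error alike
    stopCluster(cl)
  })
}

result <- do.something.finally()
```

Compared with try, this keeps the cleanup next to the code it protects, and the caller receives a condition object that can be tested with inherits(result, "error").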
on.exit
Code inside an on.exit call always runs when the function ends, whether it finishes cleanly or exits because of an error.
    do.something <- function() {
      library(parallel)
      cl <- makeCluster(detectCores())
      on.exit(stopCluster(cl))  # registered right after the cluster is created
      parLapply(cl, 1:10, function(x) {
        stop("An error occurred")
      })
    }
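As a rough way to convince yourself that the on.exit handler really fires on error, the sketch below sets a flag inside the handler and checks it afterwards. The flag cleanup.ran and the function name do.something.checked are made up for this illustration, and the pool size of 2 is an assumption. Passing add = TRUE appends to any handlers already registered instead of replacing them, which matters if the function registers more than one cleanup step.

```r
library(parallel)

cleanup.ran <- FALSE  # illustrative flag, not part of the original answer

# Hypothetical helper name, used only for this illustration.
do.something.checked <- function() {
  cl <- makeCluster(2)  # assumption: 2 workers are enough for the demo
  on.exit({
    stopCluster(cl)
    cleanup.ran <<- TRUE  # record that the handler was executed
  }, add = TRUE)
  parLapply(cl, 1:10, function(x) stop("An error occurred"))
}

# The call fails, but the handler has run by the time try() returns
outcome <- try(do.something.checked(), silent = TRUE)
```

After this runs, outcome holds a try-error object and cleanup.ran has been set by the handler, so no worker processes from this call should remain.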