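I'm trying to use joblib in Python to speed up some data processing, but I'm having trouble working out how to assign the output into the required format. I've tried to produce a (perhaps overly simplistic) piece of code that shows the issue I'm running into: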
    from joblib import Parallel, delayed
    import numpy as np

    def main():
        print "Nested loop array assignment:"
        regular()
        print "Parallel nested loop assignment using a single process:"
        par2(1)
        print "Parallel nested loop assignment using multiple process:"
        par2(2)

    def regular():
        # Define variables
        a = [0,1,2,3,4]
        b = [0,1,2,3,4]
        # Set array variable to global and define size and shape
        global ab
        ab = np.zeros((2,np.size(a),np.size(b)))

        # Iterate to populate array
        for i in range(0,np.size(a)):
            for j in range(0,np.size(b)):
                func(i,j,a,b)

        # Show array output
        print ab

    def par2(process):
        # Define variables
        a2 = [0,1,2,3,4]
        b2 = [0,1,2,3,4]
        # Set array variable to global and define size and shape
        global ab2
        ab2 = np.zeros((2,np.size(a2),np.size(b2)))

        # Parallel process in order to populate array
        Parallel(n_jobs=process)(delayed(func2)(i,j,a2,b2) for i in xrange(0,np.size(a2)) for j in xrange(0,np.size(b2)))

        # Show array output
        print ab2

    def func(i,j,a,b):
        # Populate array
        ab[0,i,j] = a[i]+b[j]
        ab[1,i,j] = a[i]*b[j]

    def func2(i,j,a2,b2):
        # Populate array
        ab2[0,i,j] = a2[i]+b2[j]
        ab2[1,i,j] = a2[i]*b2[j]

    # Run script
    main()
The output of which is as follows:
    Nested loop array assignment:
    [[[  0.   1.   2.   3.   4.]
      [  1.   2.   3.   4.   5.]
      [  2.   3.   4.   5.   6.]
      [  3.   4.   5.   6.   7.]
      [  4.   5.   6.   7.   8.]]

     [[  0.   0.   0.   0.   0.]
      [  0.   1.   2.   3.   4.]
      [  0.   2.   4.   6.   8.]
      [  0.   3.   6.   9.  12.]
      [  0.   4.   8.  12.  16.]]]
    Parallel nested loop assignment using a single process:
    [[[  0.   1.   2.   3.   4.]
      [  1.   2.   3.   4.   5.]
      [  2.   3.   4.   5.   6.]
      [  3.   4.   5.   6.   7.]
      [  4.   5.   6.   7.   8.]]

     [[  0.   0.   0.   0.   0.]
      [  0.   1.   2.   3.   4.]
      [  0.   2.   4.   6.   8.]
      [  0.   3.   6.   9.  12.]
      [  0.   4.   8.  12.  16.]]]
    Parallel nested loop assignment using multiple process:
    [[[ 0.  0.  0.  0.  0.]
      [ 0.  0.  0.  0.  0.]
      [ 0.  0.  0.  0.  0.]
      [ 0.  0.  0.  0.  0.]
      [ 0.  0.  0.  0.  0.]]

     [[ 0.  0.  0.  0.  0.]
      [ 0.  0.  0.  0.  0.]
      [ 0.  0.  0.  0.  0.]
      [ 0.  0.  0.  0.  0.]
      [ 0.  0.  0.  0.  0.]]]
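From Google and the StackOverflow search function, it appears that when using joblib the global array isn't shared between the subprocesses. I'm unsure whether this is a limitation of joblib, or whether there is a way around it?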
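To make the behaviour above concrete: with more than one process, each worker appears to write into its own copy of the global, so the parent's array stays at zero. A minimal sketch of one possible way around that, assuming the per-cell work can simply return its values, is to let `Parallel` collect the results and do the assignment in the parent. The names `compute_cell` and `fill_parallel` are made up for this illustration and are not part of joblib.

```python
from joblib import Parallel, delayed
import numpy as np


def compute_cell(i, j, a, b):
    # Hypothetical worker: returns its results instead of writing to a
    # global array. The indices travel with the values so the parent
    # process knows where to place them.
    return i, j, a[i] + b[j], a[i] * b[j]


def fill_parallel(a, b, n_jobs=2):
    # The output array exists only in the parent process.
    ab = np.zeros((2, len(a), len(b)))
    results = Parallel(n_jobs=n_jobs)(
        delayed(compute_cell)(i, j, a, b)
        for i in range(len(a))
        for j in range(len(b))
    )
    # Parallel hands back one result per submitted task; assign them
    # here, in the parent, where the array actually lives.
    for i, j, total, product in results:
        ab[0, i, j] = total
        ab[1, i, j] = product
    return ab


if __name__ == "__main__":
    a = [0, 1, 2, 3, 4]
    b = [0, 1, 2, 3, 4]
    print(fill_parallel(a, b, n_jobs=2))
```

With this layout no global is needed at all: the function returns the filled array, and the same code path is used whether `n_jobs` is 1 or larger.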
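In reality my script is surrounded by other bits of code that rely on the final output of this global array being in a (4,x,x) format, where x is variable (but typically ranges from 100 to several thousand). This is my current reason for looking at parallel processing, as the whole process can take up to 2 hours for x = 2400.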
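At x = 2400 there are 5,760,000 (i, j) pairs, so dispatching one task per cell would probably be dominated by overhead. A rough sketch of a coarser split, assuming each row can be computed independently, is to submit one task per row and reassemble the rows in the parent. It uses the toy 2-layer calculation from the example code rather than the real 4-layer one, and `compute_row` and `fill_by_rows` are hypothetical names chosen for this illustration.

```python
from joblib import Parallel, delayed
import numpy as np


def compute_row(i, a, b):
    # Hypothetical per-row worker: computes every j for a single i and
    # returns a (2, len(b)) block instead of touching a global array.
    row = np.zeros((2, len(b)))
    for j in range(len(b)):
        row[0, j] = a[i] + b[j]
        row[1, j] = a[i] * b[j]
    return row


def fill_by_rows(a, b, n_jobs=2):
    # One task per row, so len(a) tasks instead of len(a) * len(b).
    rows = Parallel(n_jobs=n_jobs)(
        delayed(compute_row)(i, a, b) for i in range(len(a))
    )
    # np.array(rows) has shape (len(a), 2, len(b)); swapping the first
    # two axes gives the (2, len(a), len(b)) layout the rest of the
    # script expects.
    return np.array(rows).transpose(1, 0, 2)


if __name__ == "__main__":
    values = list(range(5))
    print(fill_by_rows(values, values, n_jobs=2).shape)
```

The first axis would grow from 2 to 4 in the real script; the reassembly step stays the same as long as each row block puts that axis first.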
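Using joblib isn't necessary (but I like the nomenclature and simplicity), so feel free to suggest simple alternative methods, ideally keeping in mind the requirements of the final array. I'm using python 2.7.3 and joblib 0.7.1.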