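Python multiprocessing - ApplyResult.get() NameError: global name 'self' is not defined
I am currently trying to use the Python multiprocessing package to make a CPU-bound process run faster. I have a very large numpy matrix and want to split up the work with Pool and apply_async to compute the values in the matrix. However, when I run a unit test on the function to check that it works, I get the error "NameError: global name 'self' is not defined". I couldn't find anything helpful on Google or StackOverflow. Any idea why this might be happening?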
Pytest output:
_____________________ TestBuildEMMatrix.test_build_em_matrix_simple _____________________
self = <mixemt_master.mixemt2.preprocess_test.TestBuildEMMatrix testMethod=test_build_em_matrix_simple>
    def test_build_em_matrix_simple(self):
        reads = ["1:A,2:C", "1:T,2:C", "3:T,4:T", "2:A,4:T"]
        in_mat = preprocess.build_em_matrix(self.ref, self.phy,
>                                           reads, self.haps, self.args)
preprocess_test.py:272:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
preprocess.py:239: in build_em_matrix
    results[i] = results[i].get()
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = <multiprocessing.pool.ApplyResult object at 0x7f4218ea07d0>, timeout = None
    def get(self, timeout=None):
        self.wait(timeout)
        if not self._ready:
            raise TimeoutError
        if self._success:
            return self._value
        else:
>           raise self._value
E           NameError: global name 'self' is not defined
/vol/hpc/apps/python-anaconda2-4.3.1-abat/install/lib/python2.7/multiprocessing/pool.py:567: NameError
--------------------------------- Captured stdout call ----------------------------------
False
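The traceback ends inside `ApplyResult.get()` because that method re-raises whatever exception the worker function raised, so the NameError comes from code run by the worker rather than from `pool.py` itself. Below is a minimal sketch of that behaviour. The worker `broken_worker` and the helper `run_broken_worker` are hypothetical, and `multiprocessing.pool.ThreadPool` (which inherits `apply_async` from `Pool`) is swapped in for `Pool` so the sketch runs without pickling concerns.

```python
from multiprocessing.pool import ThreadPool


def broken_worker(x):
    # Hypothetical worker: 'self' does not exist in this scope, so the
    # NameError is raised only when the function is actually called.
    return self.scale * x


def run_broken_worker():
    """Submit the broken worker and report what the result object says."""
    pool = ThreadPool(processes=1)
    try:
        result = pool.apply_async(broken_worker, (3,))
        result.wait()
        ok = result.successful()  # False, because the worker raised
        try:
            result.get()  # re-raises the worker's exception in the caller
            message = None
        except NameError as exc:
            message = str(exc)
    finally:
        pool.close()
        pool.join()
    return ok, message


if __name__ == "__main__":
    print(run_broken_worker())
```

The `False` under "Captured stdout call" fits this pattern: `successful()` reports that the call raised, and `get()` then re-raises the stored exception.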
And the relevant Python functions:
import math
import sys
from multiprocessing import Pool

import numpy

# HapVarBaseMatrix, prob_for_vars and pos_obs_from_sig are defined elsewhere
# in the project and are not shown here.

def build_em_matrix_process(markers, haplogroups, pos_obs, mut_prob, column_length, start_index, end_index):
    columns = [[prob_for_vars(markers, haplogroups[j], pos_obs, mut_prob) for j in xrange(column_length)]
               for i in xrange(start_index, end_index)]
    return columns

def build_em_matrix(refseq, phylo, reads, haplogroups, args):
    """
    Returns the matrix that describes the probability of each read
    originating in each haplotype.
    """
    hvb_mat = HapVarBaseMatrix(refseq, phylo)
    read_hap_mat = numpy.empty((len(reads), len(haplogroups)))
    if args.verbose:
        sys.stderr.write('Building EM input matrix...\n')
    num_processors = args.p
    pool = Pool(processes=num_processors)
    results = []
    partition_size = int(math.ceil(len(reads)/float(num_processors)))
    for i in xrange(num_processors):
        start_index = i * partition_size
        end_index = (i + 1) * partition_size
        pos_obs = pos_obs_from_sig(reads[i])
        results.append(pool.apply_async(build_em_matrix_process, (hvb_mat.markers, haplogroups, pos_obs, hvb_mat.mut_prob, len(haplogroups), start_index, end_index)))
    column = 0
    for i in xrange(num_processors):
        results[i].wait()
        print results[i].successful()
        results[i] = results[i].get()
        # xrange must be called, not subscripted; iterate over this worker's rows
        for j in xrange(len(results[i])):
            read_hap_mat[column] = results[i][j]
            column += 1
    if args.verbose:
        sys.stderr.write('Done.\n\n')
    return read_hap_mat
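Separately from the NameError, the partitioning in `build_em_matrix` rounds `partition_size` up, so the last `end_index` can run past `len(reads)` (4 reads on 3 processes gives ranges ending at 2, 4 and 6). Below is a minimal sketch of clamped bounds. The helper name `chunk_bounds` is hypothetical, and it assumes `n_chunks` is at least 1.

```python
import math


def chunk_bounds(n_items, n_chunks):
    """Return (start, end) index pairs covering range(n_items) without overrun."""
    size = int(math.ceil(n_items / float(n_chunks)))
    bounds = []
    for k in range(n_chunks):
        # Clamp both ends so no chunk reaches past the last item.
        start = min(k * size, n_items)
        end = min((k + 1) * size, n_items)
        if start < end:  # skip empty trailing chunks
            bounds.append((start, end))
    return bounds


if __name__ == "__main__":
    print(chunk_bounds(4, 3))
```

Each worker would then presumably loop over its own `reads[start:end]` and derive `pos_obs` for each read. In the code as posted, `pos_obs` is taken from `reads[i]`, where `i` is the process index, and the worker's loop variable `i` is never used.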
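After the call to 'results[i].wait()' I added the statement 'print results[i].successful()', which prints False to stdout. I don't know why this doesn't return True, since I can't find any errors in build_em_matrix_process.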
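`successful()` returns False whenever the submitted call raised an exception, wherever in the call chain that happened. Under Python 2.7 the exception re-raised by `get()` does not carry the worker's own traceback, which is why the failing line is not visible in the pytest output. One way to locate it is to catch the exception inside the worker and return the formatted traceback as a string. The sketch below does this with hypothetical names (`call_with_traceback`, `worker`, `inner_helper`).

```python
import traceback


def call_with_traceback(func, *args):
    """Run func(*args); return ('ok', value) or ('error', traceback text)."""
    try:
        return ("ok", func(*args))
    except Exception:
        # Format the traceback in the worker, where it is still available.
        return ("error", traceback.format_exc())


def inner_helper(x):
    # Hypothetical helper that still refers to 'self'.
    return self.offset + x


def worker(x):
    # Hypothetical worker: it looks fine itself, but the helper it calls fails.
    return [inner_helper(x)]


if __name__ == "__main__":
    status, detail = call_with_traceback(worker, 1)
    print(status)
    print(detail)
```

With a pool, the wrapper would be submitted in place of the worker, for example `pool.apply_async(call_with_traceback, (build_em_matrix_process,) + worker_args)`, where `worker_args` stands for the existing argument tuple. The returned text then names the function and line that raised, even when it is a helper called by the worker.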
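Where is the unit test code? The error indicates a problem in TestBuildEMMatrix.test_build_em_matrix_simple, not in the code under test. – hpaulj
The unit test code is fine. This is an existing application that I am refactoring to take advantage of parallel processing. The unit test worked before, I haven't changed the method signature, and the method's result should be the same once it is correct. – AWeston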
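Since the code was being refactored for parallel processing, one plausible source of the error is a function that was lifted out of a class but still refers to `self` in its body. This is an assumption: `prob_for_vars` is not shown in the question. Python resolves names when a function runs, so such a function defines without error and fails only when it is called. Here is a sketch with hypothetical names throughout:

```python
class Matrix(object):
    """Hypothetical class standing in for the original method's owner."""

    def __init__(self, mut_prob):
        self.mut_prob = mut_prob

    def prob(self, n_mismatches):
        return self.mut_prob ** n_mismatches


def prob_leftover_self(n_mismatches):
    # Lifted out of the class, but the body still says 'self':
    # defining this is fine, calling it raises NameError.
    return self.mut_prob ** n_mismatches


def prob_fixed(mut_prob, n_mismatches):
    # Every value formerly read from 'self' is now an explicit argument.
    return mut_prob ** n_mismatches


if __name__ == "__main__":
    print(Matrix(0.5).prob(2), prob_fixed(0.5, 2))
```

Passing plain arguments, as `prob_fixed` does, also keeps the function easy to pickle when it is handed to `apply_async`.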