2017-09-23 131 views

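python multiprocessing, pathos slow

Today I had some code that I wanted to run on my multicore CPU, so wherever I had written map I changed it to pool.map. Surprisingly, the code ran slower, even though it used far more processing power and memory (as far as I could tell). So I wrote this test, which uses both pathos and multiprocessing.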

from multiprocessing import Pool
import time

from pathos.pools import ProcessPool
from pathos.pools import ThreadPool
# from pathos.pools import ParallelPool
from pathos.pools import SerialPool


def timeit(method):
    """Print the wall-clock time of every call to the wrapped function."""
    def timed(*args, **kw):
        ts = time.time()
        result = method(*args, **kw)
        te = time.time()
        print('%r (%r, %r) %2.2f sec' % (method.__name__, args, kw, te - ts))
        return result

    return timed


def times2(x):
    # Deliberately trivial task: almost no work per item.
    return 2 * x


@timeit
def test(n, p):
    # Map times2 over range(n) with pool p; the result is discarded.
    p.map(times2, range(n))


def main():
    # One pool of each kind, all sized for 4 workers.
    ppool = ProcessPool(4)
    tpool = ThreadPool(4)
    # parapool = ParallelPool(4)
    spool = SerialPool(4)
    pool = Pool(4)
    for i in range(8):
        n = 10**i  # input sizes from 1 to 10,000,000
        print(n)
        print('ThreadPool')
        test(n, tpool)
        # print('ParallelPool')
        # test(n, parapool)
        print('SerialPool')
        test(n, spool)
        print('Pool')
        test(n, pool)
        print('ProcessPool')
        test(n, ppool)
        print('===============')


if __name__ == '__main__':
    main()

These are the results:

1 
ThreadPool 
'test' ((1, <pool ThreadPool(nthreads=4)>), {}) 0.00 sec 
SerialPool 
'test' ((1, <pool SerialPool()>), {}) 0.00 sec 
Pool 
'test' ((1, <multiprocessing.pool.Pool object at 0x0000011E63D276A0>), {}) 0.17 sec 
ProcessPool 
'test' ((1, <pool ProcessPool(ncpus=4)>), {}) 0.00 sec 
=============== 
10 
ThreadPool 
'test' ((10, <pool ThreadPool(nthreads=4)>), {}) 0.00 sec 
SerialPool 
'test' ((10, <pool SerialPool()>), {}) 0.00 sec 
Pool 
'test' ((10, <multiprocessing.pool.Pool object at 0x0000011E63D276A0>), {}) 0.00 sec 
ProcessPool 
'test' ((10, <pool ProcessPool(ncpus=4)>), {}) 0.01 sec 
=============== 
100 
ThreadPool 
'test' ((100, <pool ThreadPool(nthreads=4)>), {}) 0.00 sec 
SerialPool 
'test' ((100, <pool SerialPool()>), {}) 0.00 sec 
Pool 
'test' ((100, <multiprocessing.pool.Pool object at 0x0000011E63D276A0>), {}) 0.00 sec 
ProcessPool 
'test' ((100, <pool ProcessPool(ncpus=4)>), {}) 0.01 sec 
=============== 
1000 
ThreadPool 
'test' ((1000, <pool ThreadPool(nthreads=4)>), {}) 0.00 sec 
SerialPool 
'test' ((1000, <pool SerialPool()>), {}) 0.00 sec 
Pool 
'test' ((1000, <multiprocessing.pool.Pool object at 0x0000011E63D276A0>), {}) 0.00 sec 
ProcessPool 
'test' ((1000, <pool ProcessPool(ncpus=4)>), {}) 0.02 sec 
=============== 
10000 
ThreadPool 
'test' ((10000, <pool ThreadPool(nthreads=4)>), {}) 0.00 sec 
SerialPool 
'test' ((10000, <pool SerialPool()>), {}) 0.00 sec 
Pool 
'test' ((10000, <multiprocessing.pool.Pool object at 0x0000011E63D276A0>), {}) 0.00 sec 
ProcessPool 
'test' ((10000, <pool ProcessPool(ncpus=4)>), {}) 0.09 sec 
=============== 
100000 
ThreadPool 
'test' ((100000, <pool ThreadPool(nthreads=4)>), {}) 0.04 sec 
SerialPool 
'test' ((100000, <pool SerialPool()>), {}) 0.00 sec 
Pool 
'test' ((100000, <multiprocessing.pool.Pool object at 0x0000011E63D276A0>), {}) 0.01 sec 
ProcessPool 
'test' ((100000, <pool ProcessPool(ncpus=4)>), {}) 0.74 sec 
=============== 
1000000 
ThreadPool 
'test' ((1000000, <pool ThreadPool(nthreads=4)>), {}) 0.42 sec 
SerialPool 
'test' ((1000000, <pool SerialPool()>), {}) 0.00 sec 
Pool 
'test' ((1000000, <multiprocessing.pool.Pool object at 0x0000011E63D276A0>), {}) 0.17 sec 
ProcessPool 
'test' ((1000000, <pool ProcessPool(ncpus=4)>), {}) 7.54 sec 
=============== 
10000000 
ThreadPool 
'test' ((10000000, <pool ThreadPool(nthreads=4)>), {}) 4.57 sec 
SerialPool 
'test' ((10000000, <pool SerialPool()>), {}) 0.00 sec 
Pool 
'test' ((10000000, <multiprocessing.pool.Pool object at 0x0000011E63D276A0>), {}) 2.25 sec 
ProcessPool 
'test' ((10000000, <pool ProcessPool(ncpus=4)>), {}) 81.51 sec 
=============== 

As you can see, multiprocessing's Pool often beats ProcessPool, yet it is still slower than SerialPool. I'm running an i5-2500, and I installed pathos today via pip:

>pip freeze
colorama==0.3.9
decorator==4.1.2
dill==0.2.7.1
helper-htmlparse==0.1
htmldom==2.0
lxml==4.0.0
multiprocess==0.70.5
pathos==0.2.1
pox==0.2.3
ppft==1.6.4.7.1
py==1.4.34
pyfs==0.0.8
pyreadline==2.1
pytest==3.2.2
six==1.11.0

Why is this happening?

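One thing I'm sure of is that the more threads you use, the more time Python code takes. The latest Python has a better GIL implementation, so recent Python 3.x versions may show some performance gain over older ones.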

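Also, Python doesn't really run threads in parallel. Only one thread executes Python bytecode at a time, and the lock (the GIL) is handed back and forth between them.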

Answer

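You only benefit from parallelization when each task takes a while to execute. Your task is extremely quick compared with the communication that multiprocessing or multithreaded code requires. Try a function that runs for 1 second and you will see the effect. Also, keep in mind that in Python, because of the GIL, you only benefit from multithreading if you are IO bound. For CPU-bound tasks you need multiprocessing.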
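As a rough illustration of the suggestion above, here is a minimal sketch that swaps the trivial task for one that waits. It uses the standard library's `multiprocessing.pool.ThreadPool` in place of the pathos pools so that it runs without extra packages. The names `slow_times2`, `timed_map` and `compare`, and the 0.05 second delay, are assumptions made for this example and do not come from the question.

```python
import time
from multiprocessing.pool import ThreadPool  # stdlib pool, standing in for pathos' ThreadPool


def slow_times2(x, delay=0.05):
    """Hypothetical task that waits (as blocking I/O would) before returning."""
    time.sleep(delay)  # sleeping releases the GIL, so other threads can run
    return 2 * x


def timed_map(map_func, func, items):
    """Run map_func(func, items) and return (results, elapsed seconds)."""
    start = time.perf_counter()
    results = list(map_func(func, items))  # list() forces a lazy map to finish
    return results, time.perf_counter() - start


def compare(n_items=8, n_workers=4):
    """Time the same slow task with the builtin map and with a thread pool."""
    items = range(n_items)
    serial_results, serial_time = timed_map(map, slow_times2, items)
    with ThreadPool(n_workers) as pool:
        pool_results, pool_time = timed_map(pool.map, slow_times2, items)
    return serial_results, serial_time, pool_results, pool_time


if __name__ == "__main__":
    s_res, s_time, p_res, p_time = compare()
    print("serial:     %.2f sec" % s_time)
    print("ThreadPool: %.2f sec" % p_time)
```

With eight items and four workers the pooled run should take roughly a quarter of the serial time, because the waiting overlaps. For a CPU-bound task the same comparison would need a process pool instead, and each item would have to cost far more than the work of shipping it to a worker.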

See this talk by Raymond.
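To get a feel for the communication cost mentioned in the answer, the sketch below compares calling `times2` directly with calling it after a pickle round trip of the argument and the result, which is the kind of serialization a process pool has to do. This is only an imitation under stated assumptions: real pools batch items into chunks and also pay for pipes and scheduling, and the helper names `compute_only`, `compute_with_roundtrip` and `measure` are made up for this example. pathos serializes with dill rather than pickle, which may be one reason ProcessPool trails Pool in the timings above, though that is not verified here.

```python
import pickle
import time


def times2(x):
    return 2 * x


def compute_only(items):
    """Apply times2 directly, as a plain serial map would."""
    return [times2(x) for x in items]


def compute_with_roundtrip(items):
    """Apply times2, pickling each argument and result along the way.

    Imitates only the serialization step of a process pool; real pools
    send chunks of items rather than single values.
    """
    results = []
    for x in items:
        arg = pickle.loads(pickle.dumps(x))   # parent -> worker
        out = pickle.dumps(times2(arg))       # worker -> parent
        results.append(pickle.loads(out))
    return results


def measure(func, items):
    """Return (results, elapsed seconds) for func(items)."""
    start = time.perf_counter()
    results = func(items)
    return results, time.perf_counter() - start


if __name__ == "__main__":
    items = list(range(100000))
    plain, t_plain = measure(compute_only, items)
    shipped, t_shipped = measure(compute_with_roundtrip, items)
    print("compute only:        %.3f sec" % t_plain)
    print("with pickle round trip: %.3f sec" % t_shipped)
```

When the per-item work is as small as `2 * x`, the serialization alone usually costs more than the computation, so adding workers cannot make up the difference.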