Possible duplicate:
Multiprocessing launching too many instances of Python VM

Python multiprocessing caller (and callee) invoked multiple times on Windows XP
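I am trying to use Python multiprocessing to parallelise web fetching, but I find that the application calling multiprocessing is instantiated multiple times, not just the function I want called. This is a problem for me because the caller depends on a library that is slow to instantiate, so I lose most of the performance gain from parallelism.
What am I doing wrong, or how can I avoid this?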
my_app.py:
from url_fetcher import url_fetch, parallel_fetch
import my_slow_stuff

if __name__ == '__main__':
    import datetime
    urls = ['http://www.microsoft.com'] * 20
    results = parallel_fetch(urls, fn=url_fetch)
    print([x[:20] for x in results])

my_slow_stuff.py:
class MySlowStuff(object):
    import time
    print('doing slow stuff')
    time.sleep(0)
    print('done slow stuff')

url_fetcher.py:
import multiprocessing
import urllib

def url_fetch(url):
    #return urllib.urlopen(url).read()
    return url

def parallel_fetch(urls, fn):
    PROCESSES = 10
    CHUNK_SIZE = 1
    pool = multiprocessing.Pool(PROCESSES)
    results = pool.imap(fn, urls, CHUNK_SIZE)
    return results

if __name__ == '__main__':
    import datetime
    urls = ['http://www.microsoft.com'] * 20
    results = parallel_fetch(urls, fn=url_fetch)
    print([x[:20] for x in results])

Partial output:
$ python my_app.py
doing slow stuff
done slow stuff
doing slow stuff
done slow stuff
doing slow stuff
done slow stuff
doing slow stuff
done slow stuff
doing slow stuff
done slow stuff
...
Are you observing this on Windows? –
Yes, I am. In fact, on my Linux server it does not exhibit this behaviour. – RuiDC
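
The Windows/Linux difference fits how multiprocessing starts workers. Windows has no fork, so each worker is a fresh interpreter that re-imports the main module, and every top-level statement in my_app.py (including `import my_slow_stuff`) runs again per worker. Below is a minimal sketch of one possible way around it, assuming the slow setup is only needed in the parent process. `load_slow_stuff` and `main` are hypothetical names standing in for the real import and entry point, and `Pool.map` is swapped in for `imap` so the pool can be closed before returning.

```python
import multiprocessing
import time


def load_slow_stuff():
    # Hypothetical stand-in for `import my_slow_stuff`: expensive setup
    # that only the parent process needs.
    print('doing slow stuff')
    time.sleep(0)
    print('done slow stuff')


def url_fetch(url):
    # Placeholder for the real download, as in url_fetcher.py.
    return url


def parallel_fetch(urls, fn, processes=2):
    # map() blocks until all results are ready and returns a list;
    # the with-block then shuts the pool down.
    with multiprocessing.Pool(processes) as pool:
        return pool.map(fn, urls, chunksize=1)


def main():
    # Worker processes never call main(), so the slow setup runs once.
    load_slow_stuff()
    urls = ['http://www.microsoft.com'] * 20
    results = parallel_fetch(urls, fn=url_fetch)
    print([x[:20] for x in results])


if __name__ == '__main__':
    main()
```

With this layout the module top level contains only cheap definitions, so a re-import in each worker costs little. If the workers themselves need the slow library, this sketch does not help and the setup cost would still be paid once per worker.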