2012-03-16 74 views
10

Parallel Python subprocesses

I want to have several processes running in parallel at any given time, and be able to read their stdout. How should I do it? Do I need to run a thread for each subprocess.Popen() call, or what?

+0

Possible duplicate of [How to run several executables using python?](http://stackoverflow.com/questions/9724499/how-to-run-several-executable-using-python) – 2012-03-17 01:28:07

+0

Related: here is how to [run several shell commands (and capture their output simultaneously)](http://stackoverflow.com/a/23616229/4279) – jfs 2014-07-26 14:16:55

Answers

13

You can do it in a single thread.

Suppose you have a script that prints lines at random times:

#!/usr/bin/env python 
#file: child.py 
import os 
import random 
import sys 
import time 

for i in range(10): 
    print("%2d %s %s" % (int(sys.argv[1]), os.getpid(), i)) 
    sys.stdout.flush() 
    time.sleep(random.random()) 

And you want to collect the output as soon as it becomes available; on POSIX systems you could use select, as @zigg suggested:

#!/usr/bin/env python 
from __future__ import print_function 
from select import select
from subprocess import Popen, PIPE 

# start several subprocesses
processes = [Popen(['./child.py', str(i)], stdout=PIPE,
                   bufsize=1, close_fds=True,
                   universal_newlines=True)
             for i in range(5)]

# read output
timeout = 0.1  # seconds
while processes:
    # remove finished processes from the list (O(N**2))
    for p in processes[:]:
        if p.poll() is not None:  # process ended
            print(p.stdout.read(), end='')  # read the rest
            p.stdout.close()
            processes.remove(p)

    # wait until there is something to read
    rlist = select([p.stdout for p in processes], [], [], timeout)[0]

    # read a line from each process that has output ready
    for f in rlist:
        print(f.readline(), end='')  # NOTE: it can block

A more portable solution (that works on Windows, Linux, OS X) is to use a reader thread for each process; see Non-blocking read on a subprocess.PIPE in python.
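A minimal sketch of that reader-thread approach, using Python 3 names (`queue`): one thread per child does blocking reads and funnels lines through a single Queue. The inline `-c` child command here is an illustrative stand-in for child.py, used only to keep the sketch self-contained:

```python
import sys
from queue import Queue
from subprocess import Popen, PIPE
from threading import Thread

# stand-in for child.py: prints three lines tagged with its argv[1]
CHILD = "import sys\nfor i in range(3): print(sys.argv[1], i)"

def reader(proc, q):
    # runs in its own thread, so a blocking readline is fine here
    for line in proc.stdout:
        q.put(line)
    q.put(None)  # sentinel: this child is done

processes = [Popen([sys.executable, '-c', CHILD, str(i)],
                   stdout=PIPE, universal_newlines=True)
             for i in range(5)]
q = Queue()
threads = [Thread(target=reader, args=(p, q)) for p in processes]
for t in threads:
    t.start()

lines, finished = [], 0
while finished < len(processes):
    item = q.get()  # blocks until any child produces a line
    if item is None:
        finished += 1
    else:
        lines.append(item)
        print(item, end='')

for t in threads:
    t.join()
for p in processes:
    p.wait()
```

Unlike the select loop above, this works on Windows too, because only blocking reads are used and each pipe has a dedicated thread.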

Here's an os.pipe()-based solution that works on both Unix and Windows:

#!/usr/bin/env python 
from __future__ import print_function 
import io 
import os 
import sys 
from subprocess import Popen 

ON_POSIX = 'posix' in sys.builtin_module_names 

# create a pipe to get data
input_fd, output_fd = os.pipe()

# start several subprocesses
processes = [Popen([sys.executable, 'child.py', str(i)], stdout=output_fd,
                   close_fds=ON_POSIX)  # close input_fd in children
             for i in range(5)]
os.close(output_fd)  # close unused end of the pipe

# read output line by line as soon as it is available
with io.open(input_fd, 'r', buffering=1) as file:
    for line in file:
        print(line, end='')

for p in processes:
    p.wait()
+2

You seem to multiplex all the children's stdout into a single fd (output_fd) in the last solution. If two children print at the same time, won't that garble the output (e.g. 'AAA\n' + 'BBB\n' -> 'ABBB\nAA\n')? – dan3 2013-11-15 07:09:43

+1

@dan3: it is a valid concern. `write`s that are less than `PIPE_BUF` bytes are atomic; otherwise, data from multiple processes may be interleaved. POSIX requires at least 512 bytes; on Linux, `PIPE_BUF` is 4096 bytes. – jfs 2013-11-15 19:55:53
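As a quick check, the platform's atomic-write limit for pipes is exposed in Python (Unix only) as `select.PIPE_BUF`:

```python
import select

# the largest write to a pipe that POSIX guarantees is atomic
# (POSIX requires at least 512 bytes; Linux uses 4096)
print(select.PIPE_BUF)
```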

+0

Here's a similar question that I posted recently: http://stackoverflow.com/questions/36624056/running-a-secondary-script-in-a-new-terminal. It would be great if you could help; thanks in any case. – 2016-04-14 14:42:16

4

You don't need to run a thread for each process. You can peek at each process's stdout stream without blocking on it, and only read from it if there is data available to be read.

You do have to be careful not to accidentally block on them, though, if you don't intend to.
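One way to sketch that peek-without-blocking idea (POSIX only; `os.set_blocking` requires Python 3.5+): put each child's stdout pipe into non-blocking mode, then poll the pipes with `os.read`, which raises `BlockingIOError` instead of stalling when nothing is ready. The inline `-c` children are illustrative stand-ins:

```python
import os
import sys
import time
from subprocess import Popen, PIPE

# three children that each produce one line after a short delay
procs = [Popen([sys.executable, '-c',
                "import time; time.sleep(0.1); print('done %d')" % i],
               stdout=PIPE) for i in range(3)]
for p in procs:
    os.set_blocking(p.stdout.fileno(), False)

output = b''
while procs:
    for p in procs[:]:
        try:
            chunk = os.read(p.stdout.fileno(), 4096)  # b'' at EOF
        except BlockingIOError:
            continue  # nothing to read yet; don't block
        if chunk:
            output += chunk
        else:  # EOF: the child closed its stdout
            p.stdout.close()
            procs.remove(p)
            p.wait()
    time.sleep(0.01)  # avoid a tight busy loop
print(output.decode(), end='')
```

The polling loop trades a little latency (the sleep) for never blocking on any single child; `select` (as in the accepted answer) avoids the busy-wait entirely.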

+0

I did `p = subprocess.Popen(...)` and then `print p.communicate()[0]` several times, but `communicate()` waits until the process ends. – sashab 2012-03-16 20:26:43

+1

Yes, which is why you can't use `communicate()` if you want to use a single thread. There are other ways to get stdout besides `communicate()`. – Amber 2012-03-16 20:27:26

+2

You may want to look at the [select](http://docs.python.org/library/select.html) module to wait on multiple subprocesses at once. – zigg 2012-03-16 20:28:55

6

You can also collect stdout from multiple subprocesses concurrently using twisted:

#!/usr/bin/env python 
import sys 
from twisted.internet import protocol, reactor 

class ProcessProtocol(protocol.ProcessProtocol): 
    def outReceived(self, data): 
        print data,  # received chunk of stdout from child 

    def processEnded(self, status): 
        global nprocesses 
        nprocesses -= 1 
        if nprocesses == 0:  # all processes ended 
            reactor.stop() 

# start subprocesses 
nprocesses = 5 
for i in xrange(nprocesses): 
    reactor.spawnProcess(ProcessProtocol(), sys.executable, 
                         args=[sys.executable, 'child.py', str(i)], 
                         usePTY=True)  # can change how child buffers stdout 
reactor.run() 

Using Processes in Twisted