2012-03-16 74 views
10

Parallel Python subprocesses

I want to have several processes running in parallel at any given time, and be able to read their stdout. How should I do it? Do I need to run a thread for each subprocess.Popen() call, or what?

+0

Possible duplicate of [How to run several executables using python?](http://stackoverflow.com/questions/9724499/how-to-run-several-executable-using-python) – 2012-03-17 01:28:07

+0

Related: here is how to [run several shell commands (and capture their output simultaneously)](http://stackoverflow.com/a/23616229/4279) – jfs 2014-07-26 14:16:55

Answers

13

You can do it in a single thread.

Suppose you have a script that prints lines at random times:

#!/usr/bin/env python 
#file: child.py 
import os 
import random 
import sys 
import time 

for i in range(10): 
    print("%2d %s %s" % (int(sys.argv[1]), os.getpid(), i)) 
    sys.stdout.flush() 
    time.sleep(random.random()) 

And you want to collect the output as soon as it becomes available; on POSIX systems you could use select, as @zigg suggested:

#!/usr/bin/env python 
from __future__ import print_function 
from select import select
from subprocess import Popen, PIPE 

# start several subprocesses
processes = [Popen(['./child.py', str(i)], stdout=PIPE,
                   bufsize=1, close_fds=True,
                   universal_newlines=True)
             for i in range(5)]

# read output
timeout = 0.1  # seconds
while processes:
    # remove finished processes from the list (O(N**2))
    for p in processes[:]:
        if p.poll() is not None:  # process ended
            print(p.stdout.read(), end='')  # read the rest
            p.stdout.close()
            processes.remove(p)

    # wait until there is something to read
    rlist = select([p.stdout for p in processes], [], [], timeout)[0]

    # read a line from each process that has output ready
    for f in rlist:
        print(f.readline(), end='')  # NOTE: it can block

A more portable solution (that works on Windows, Linux, OS X) is to use a reader thread for each process; see Non-blocking read on a subprocess.PIPE in python.
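A minimal sketch of that reader-thread approach, using Python 3 names (`queue`): one thread per child does blocking reads and funnels lines through a single Queue. The inline `-c` child command here is an illustrative stand-in for child.py, used only to keep the sketch self-contained:

```python
import sys
from queue import Queue
from subprocess import Popen, PIPE
from threading import Thread

# stand-in for child.py: prints three lines tagged with its argv[1]
CHILD = "import sys\nfor i in range(3): print(sys.argv[1], i)"

def reader(proc, q):
    # runs in its own thread, so a blocking readline is fine here
    for line in proc.stdout:
        q.put(line)
    q.put(None)  # sentinel: this child is done

processes = [Popen([sys.executable, '-c', CHILD, str(i)],
                   stdout=PIPE, universal_newlines=True)
             for i in range(5)]
q = Queue()
threads = [Thread(target=reader, args=(p, q)) for p in processes]
for t in threads:
    t.start()

lines, finished = [], 0
while finished < len(processes):
    item = q.get()  # blocks until any child produces a line
    if item is None:
        finished += 1
    else:
        lines.append(item)
        print(item, end='')

for t in threads:
    t.join()
for p in processes:
    p.wait()
```

Unlike the select loop above, this works on Windows too, because only blocking reads are used and each pipe has a dedicated thread.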

Here's an os.pipe()-based solution that works on both Unix and Windows:

#!/usr/bin/env python 
from __future__ import print_function 
import io 
import os 
import sys 
from subprocess import Popen 

ON_POSIX = 'posix' in sys.builtin_module_names 

# create a pipe to get data
input_fd, output_fd = os.pipe()

# start several subprocesses
processes = [Popen([sys.executable, 'child.py', str(i)], stdout=output_fd,
                   close_fds=ON_POSIX)  # close input_fd in children
             for i in range(5)]
os.close(output_fd)  # close unused end of the pipe

# read output line by line as soon as it is available
with io.open(input_fd, 'r', buffering=1) as file:
    for line in file:
        print(line, end='')

for p in processes:
    p.wait()
+2

You seem to multiplex all the children's stdout into a single fd (output_fd) in the last solution. If two children print at the same time, won't that garble the output (e.g. 'AAA\n' + 'BBB\n' -> 'ABBB\nAA\n')? – dan3 2013-11-15 07:09:43

+1

@dan3: it is a valid concern. `write`s that are less than `PIPE_BUF` bytes are atomic; otherwise, data from multiple processes may be interleaved. POSIX requires at least 512 bytes; on Linux, `PIPE_BUF` is 4096 bytes. – jfs 2013-11-15 19:55:53
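As a quick check, the platform's atomic-write limit for pipes is exposed in Python (Unix only) as `select.PIPE_BUF`:

```python
import select

# the largest write to a pipe that POSIX guarantees is atomic
# (POSIX requires at least 512 bytes; Linux uses 4096)
print(select.PIPE_BUF)
```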

+0

Here's a similar question that I posted recently: http://stackoverflow.com/questions/36624056/running-a-secondary-script-in-a-new-terminal. It would be great if you could help; thanks in any case. – 2016-04-14 14:42:16

4

You don't need to run a thread for each process. You can peek at each process's stdout stream without blocking on it, and only read from it if there is data available to be read.

You do have to be careful not to accidentally block on them, though, if you don't intend to.
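One way to sketch that peek-without-blocking idea (POSIX only; `os.set_blocking` requires Python 3.5+): put each child's stdout pipe into non-blocking mode, then poll the pipes with `os.read`, which raises `BlockingIOError` instead of stalling when nothing is ready. The inline `-c` children are illustrative stand-ins:

```python
import os
import sys
import time
from subprocess import Popen, PIPE

# three children that each produce one line after a short delay
procs = [Popen([sys.executable, '-c',
                "import time; time.sleep(0.1); print('done %d')" % i],
               stdout=PIPE) for i in range(3)]
for p in procs:
    os.set_blocking(p.stdout.fileno(), False)

output = b''
while procs:
    for p in procs[:]:
        try:
            chunk = os.read(p.stdout.fileno(), 4096)  # b'' at EOF
        except BlockingIOError:
            continue  # nothing to read yet; don't block
        if chunk:
            output += chunk
        else:  # EOF: the child closed its stdout
            p.stdout.close()
            procs.remove(p)
            p.wait()
    time.sleep(0.01)  # avoid a tight busy loop
print(output.decode(), end='')
```

The polling loop trades a little latency (the sleep) for never blocking on any single child; `select` (as in the accepted answer) avoids the busy-wait entirely.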

+0

I did `p = subprocess.Popen(...)` and then `print p.communicate()[0]` several times, but `communicate()` waits until the process ends. – sashab 2012-03-16 20:26:43

+1

Yes, which is why you can't use `communicate()` if you want to use a single thread. There are other ways to get stdout besides `communicate()`. – Amber 2012-03-16 20:27:26

+2

You may want to look at the [select](http://docs.python.org/library/select.html) module to wait on multiple subprocesses at once. – zigg 2012-03-16 20:28:55

6

You can also collect stdout from multiple subprocesses concurrently using twisted:

#!/usr/bin/env python 
import sys 
from twisted.internet import protocol, reactor 

class ProcessProtocol(protocol.ProcessProtocol): 
    def outReceived(self, data): 
        print data,  # received chunk of stdout from child 

    def processEnded(self, status): 
        global nprocesses 
        nprocesses -= 1 
        if nprocesses == 0:  # all processes ended 
            reactor.stop() 

# start subprocesses 
nprocesses = 5 
for i in xrange(nprocesses): 
    reactor.spawnProcess(ProcessProtocol(), sys.executable, 
                         args=[sys.executable, 'child.py', str(i)], 
                         usePTY=True)  # can change how child buffers stdout 
reactor.run() 

Using Processes in Twisted