2013-02-15 127 views
2

我想在python中使用多線程技術在項目euler中解決Problem 8python循環多線程

查找1000位數字中連續五位數字的最大積。該號碼可以找到here

我的方法是從原始列表中生成5個塊,然後重複此過程5次,每個過程的起始索引右移一位。

這裏是我的線程類

class pThread(threading.Thread): 
    def __init__(self, l): 
     threading.Thread.__init__(self) 
     self.l = l 
     self.p = 0 

    def run(self): 

     def greatest_product(l): 
     """ 
     Divide the list into chunks of 5 and find the greatest product 
     """ 
      def product(seq): 
       return reduce(lambda x,y : x*y, seq) 

      def chunk_product(l, n=5): 
       for i in range(0, len(l), n): 
        yield product(l[i:i+n]) 

      result = 0 
      for p in chunk_product(num): 
       result = result > p and result or p 

      return result 

     self.p = greatest_product(self.l) 

當我嘗試創建5個線程,以覆蓋在我原來的列表中的所有5位塊,手動方法下面給出了正確的答案,與num作爲個位數的號碼清單,我從文字解析:

thread1 = pThread(num) 
del num[0] 
thread2 = pThread(num) 
del num[0] 
thread3 = pThread(num) 
del num[0] 
thread4 = pThread(num) 
del num[0] 
thread5 = pThread(num) 

thread1.start() 
thread2.start() 
thread3.start() 
thread4.start() 
thread5.start() 

thread1.join() 
thread2.join() 
thread3.join() 
thread4.join() 
thread5.join() 

def max(*args): 
    result = 0 
    for i in args: 
     result = i > result and i or result 
    return result 

print max(thread1.p, thread2.p, thread3.p, thread4.p, thread5.p) 

但是這並不能給出正確的結果:

threads = [] 
for i in range(0, 4): 
    tmp = num[:] 
    del tmp[0:i+1] 
    thread = pThread(tmp) 
    thread.start() 
    threads.append(thread) 

for i in range(0, 4): 
    threads[i].join() 

我做了什麼錯在這裏?我對多線程很陌生,所以請溫和。

+1

提示:重寫代碼,而無需使用德爾,這將是很容易理解爲什麼它沒有工作,爲什麼你原來的代碼也是不正確的。另外,考慮到GIL,這個代碼不可能以比單線程版本更快的速度運行。 – 2013-02-15 01:37:10

+0

我已經重寫了沒有del的代碼,是的,它打破了兩個版本。我今天晚些時候再學習一遍。謝謝你的建議。 – 2013-02-15 01:47:37

+1

線程這是嚴重矯枉過正。你可以在1行中用'max(reduce(op.mul,n_list [i:i + 5])爲1行在xrange(1000)中)' – wim 2013-02-15 02:08:30

回答

4

有3個問題:

  1. 第一個是「手動」方法沒有沒有給出正確的答案。只是發生問題的正確答案是在距列表開頭的偏移量4處。您可以通過使用看到:

    你的「手冊」的方法
    import operator as op 
    print max(reduce(op.mul, num[i:i+5]) for i in range(1000)) 
    for k in range(5): 
        print max(reduce(op.mul, num[i:i+5]) for i in range(k, 1000, 5)) 
    

    的一個問題是,線程共享num變量,每個人都有相同的列表。所以當你做del num[0]時,所有的threadX.l都會受到影響。你始終得到相同答案的原因是由於第二個問題。

  2. for p in chunk_product(num): 
    

    應該是:

    for p in chunk_product(l): 
    

    ,因爲您要使用的功能greatest_product(l)的參數,而不是全局變量num

  3. 在第二種方法中,由於循環範圍超過[0, 1, 2, 3],所以只產生4個線程。此外,您要刪除值tmp[0:i]而不是tmp[0:i+1]。下面是代碼:

    threads = [] 
    for i in range(5): 
        tmp = num[:] 
        del tmp[0:i] 
        thread = pThread(tmp) 
        thread.start() 
        threads.append(thread) 
    
    for i in range(5): 
        threads[i].join() 
    
    print len(threads), map(lambda th: th.p, threads) 
    print max(map(lambda th: th.p, threads)) 
    
+0

@JasonWhite這是預期的功能,因爲每個線程以不同的偏移量在列表上循環。在你的例子中(2線程),線程1將計算[1,2] [3,4] [5,6] [7,8] [9,10]的產品,線程2將計算[2, 3] [4,5] [6,7] [8,9]。這是因爲線程2具有self.l = [2,3,4,5,6,7,8,9,10]。我認爲這是OP想要實現的。 – wasserfeder 2013-02-15 08:16:28

+0

啊我明白你在說什麼了。刪除了我以前的評論,因爲我讀錯了; P – 2013-02-15 08:26:43

+0

謝謝你SOOOOOO很多。你碰巧知道如何獎勵賞金?我真的想給你100代表你的答案。 – 2013-02-15 14:31:50

1

我對此進行了一次嘗試,主要是爲了獲得一些練習多處理,並學習如何使用argparse。

這花了大約4-5演出的RAM,以防萬一你的機器沒有太多。

python euler.py -l 50000000 -n 100 -p 8 

Took 5.836833333969116 minutes 
The largest product of 100 consecutive numbers is: a very large number 

如果在命令行中輸入python euler.py -h你:

usage: euler.py [-h] -l L [L ...] -n N [-p P] 

Calculates the product of consecutive numbers and return the largest product. 

optional arguments: 
    -h, --help show this help message and exit 
    -l L [L ...] A single number or list of numbers, where each # is seperated 
       by a space 
    -n N   A number that specifies how many consecutive numbers should be 
       multiplied together. 
    -p P   Number of processes to create. Optional, defaults to the # of 
       cores on the pc.   

,代碼:

"""A multiprocess iplementation for calculation the maximum product of N consecutive 
numbers in a given range (list of numbers).""" 

import multiprocessing 
import math 
import time 
import operator 
from functools import reduce 
import argparse 

def euler8(alist,lenNums): 
    """Returns the largest product of N consecutive numbers in a given range""" 
    return max(reduce(operator.mul, alist[i:i+lenNums]) for i in range(len(alist))) 

def split_list_multi(listOfNumbers,numLength,threads): 
    """Split a list into N parts where N is the # of processes.""" 
    fullLength = len(listOfNumbers) 
    single = math.floor(fullLength/threads) 
    results = {} 
    counter = 0 
    while counter < threads: 
     if counter == (threads-1): 
      temp = listOfNumbers[single*counter::] 
      if counter == 0: 
       results[str(counter)] = listOfNumbers[single*counter::] 
      else: 
       prevListIndex = results[str(counter-1)][-int('{}'.format(numLength-1))::] 
       newlist = prevListIndex + temp 
       results[str(counter)] = newlist 
     else: 
      temp = listOfNumbers[single*counter:single*(counter+1)] 
      if counter == 0: 
       newlist = temp 
      else: 
       prevListIndex = results[str(counter-1)][-int('{}'.format(numLength-1))::] 
       newlist = prevListIndex + temp 
      results[str(counter)] = newlist 
     counter += 1 
    return results,threads 

def worker(listNumbers,number,output): 
    """A worker. Used to run seperate processes and put the results in the queue""" 
    result = euler8(listNumbers,number) 
    output.put(result) 

def main(listOfNums,lengthNumbers,numCores=multiprocessing.cpu_count()): 
    """Runs the module. 
    listOfNums must be a list of ints, or single int 
    lengthNumbers is N (an int) where N is the # of consecutive numbers to multiply together 
    numCores (an int) defaults to however many the cpu has, can specify a number if you choose.""" 

    if isinstance(listOfNums,list): 
     if len(listOfNums) == 1: 
      valuesToSplit = [i for i in range(int(listOfNums[0]))] 
     else: 
      valuesToSplit = [int(i) for i in listOfNums] 
    elif isinstance(listOfNums,int): 
     valuesToSplit = [i for i in range(listOfNums)] 
    else: 
     print('First arg must be a number or a list of numbers') 

    split = split_list_multi(valuesToSplit,lengthNumbers,numCores) 
    done_queue = multiprocessing.Queue() 
    jobs = [] 
    startTime = time.time() 

    for num in range(split[1]): 
     numChunks = split[0][str(num)] 
     thread = multiprocessing.Process(target=worker, args=(numChunks,lengthNumbers,done_queue)) 
     jobs.append(thread) 
     thread.start() 

    resultlist = [] 
    for i in range(split[1]): 
     resultlist.append(done_queue.get()) 

    for j in jobs: 
     j.join() 

    resultlist = max(resultlist) 
    endTime = time.time() 
    totalTime = (endTime-startTime)/60 
    print("Took {} minutes".format(totalTime)) 

    return print("The largest product of {} consecutive numbers is: {}".format(lengthNumbers, resultlist))    

if __name__ == '__main__': 
    #To call the module from the commandline with arguments 
    parser = argparse.ArgumentParser(description="""Calculates the product of consecutive numbers \ 
    and return the largest product.""") 
    parser.add_argument('-l', nargs='+', required=True, 
         help='A single number or list of numbers, where each # is seperated by a space') 
    parser.add_argument('-n', required=True, type=int, 
         help = 'A number that specifies how many consecutive numbers should be \ 
         multiplied together.') 
    parser.add_argument('-p', default=multiprocessing.cpu_count(), type=int, 
         help='Number of processes to create. Optional, defaults to the # of cores on the pc.') 
    args = parser.parse_args() 
    main(args.l, args.n, args.p) 
+0

謝謝,請不要刪除這個答案。一旦我能理解你寫的所有內容,我會盡快回復你。 – 2013-02-15 15:29:20