2013-05-09 135 views

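Python: multithreading does not work properly

I wanted to start a project to learn Python, and I chose to write a simple web proxy.

In some cases, a thread seems to receive an empty request, and Python raises an exception: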

first_line: GET http://racket-lang.org/ HTTP/1.1 
Connect to: racket-lang.org 80 
first_line: 
Exception in thread Thread-2: 
Traceback (most recent call last): 
    File "C:\Python27\lib\threading.py", line 551, in __bootstrap_inner 
    self.run() 
    File "C:\Python27\lib\threading.py", line 504, in run 
    self.__target(*self.__args, **self.__kwargs) 
    File "fakespider.py", line 37, in proxy 
    url = first_line.split(' ')[1] 
IndexError: list index out of range 

first_line: first_line: GET http://racket-lang.org/plt.css HTTP/1.1GET http://racket-lang.org/more.css HTTP/1.1 

Connect to:Connect to: racket-lang.orgracket-lang.org 8080 

My code is quite simple. I don't know what is going on; any help would be appreciated :)

from threading import Thread 
from time import time, sleep 
import socket 
import sys 

RECV_BUFFER = 8192 
DEBUG = True 

def recv_timeout(socks, timeout=2):
    socks.setblocking(0)
    total_data = []
    data = ''
    begin = time()
    while True:
        if total_data and time() - begin > timeout:
            break
        elif time() - begin > timeout * 2:
            break
        try:
            data = socks.recv(RECV_BUFFER)
            if data:
                total_data.append(data)
                begin = time()
            else:
                sleep(0.1)
        except:
            pass
    return ''.join(total_data)

def proxy(conn, client_addr):
    request = recv_timeout(conn)

    first_line = request.split('\r\n')[0]
    if (DEBUG):
        print "first_line: ", first_line
    url = first_line.split(' ')[1]

    http_pos = url.find("://")
    if (http_pos == -1):
        temp = url
    else:
        temp = url[(http_pos + 3):]

    port_pos = temp.find(":")
    host_pos = temp.find("/")
    if host_pos == -1:
        host_pos = len(temp)

    host = ""
    if (port_pos == -1 or host_pos < port_pos):
        port = 80
        host = temp[:host_pos]
    else:
        port = int((temp[(port_pos + 1):])[:host_pos - port_pos - 1])
        host = temp[:port_pos]

    print "Connect to:", host, port

    try:
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        s.connect((host, port))
        s.send(request)

        data = recv_timeout(s)
        if len(data) > 0:
            conn.send(data)
        s.close()
        conn.close()
    except socket.error, (value, message):
        if s:
            s.close()
        if conn:
            conn.close()
        print "Runtime error:", message
        sys.exit(1)



def main():
    if len(sys.argv) < 2:
        print "Usage: python fakespider.py <port>"
        return sys.stdout

    host = ""  # blank for localhost
    port = int(sys.argv[1])

    try:
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        s.bind((host, port))
        s.listen(50)

    except socket.error, (value, message):
        if s:
            s.close()
        print "Could not open socket:", message
        sys.exit(1)

    while 1:
        conn, client_addr = s.accept()
        t = Thread(target=proxy, args=(conn, client_addr))
        t.start()

    s.close()

if __name__ == "__main__": 
    main() 

Answers


The stack trace you are seeing says it all:

    url = first_line.split(' ')[1]
IndexError: list index out of range

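Evidently, splitting first_line does not produce a list with more than one element, as you assumed it would. The variable therefore holds something different from what you expect. To see what it actually contains, just print it: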

print first_line 

Or use a debugger.
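
One way to keep a thread from crashing on such input is to validate the request line before indexing into it. The following is only a minimal sketch, written for Python 3 (the question's code is Python 2); `parse_request_line` is a hypothetical helper, not something from the original code:

```python
def parse_request_line(first_line):
    """Return (method, url, version), or None if the line is malformed.

    Hypothetical helper: an empty string (for example, a client that
    connected but sent nothing before the timeout) yields None instead
    of raising IndexError.
    """
    parts = first_line.split()
    if len(parts) != 3:
        return None
    method, url, version = parts
    return method, url, version


if __name__ == "__main__":
    print(parse_request_line("GET http://racket-lang.org/ HTTP/1.1"))
    print(parse_request_line(""))
```

In `proxy`, the thread could then close `conn` and return early whenever the helper gives back None, rather than going on to parse a URL that is not there.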

Thanks a lot. I do have 'print first_line' in my code. I don't know why some lines print empty while others print twice, like 'first_line:GET http://racket-lang.org/plt.css HTTP/1.1GET http://racket-lang.org/more.css HTTP/1.1' – liweijian 2013-05-10 01:47:38
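
The doubled lines are most likely interleaved output rather than duplicated requests: two threads print at the same moment and their writes get mixed together. Below is a sketch of one way to serialize the output, assuming Python 3; `print_lock`, `log`, `worker`, and `run_workers` are illustrative names and the URL is a placeholder, none of them from the original code:

```python
import threading

print_lock = threading.Lock()  # illustrative name: guards writes to stdout


def log(*args):
    """Print one whole line at a time, even when called from several threads."""
    with print_lock:
        print(*args)


def worker(n, results):
    # Placeholder request line, only here to produce some output.
    log("first_line:", "GET http://example.invalid/%d HTTP/1.1" % n)
    results.append(n)


def run_workers(count):
    """Start `count` threads, wait for all of them, return the ids seen."""
    results = []
    threads = [threading.Thread(target=worker, args=(i, results))
               for i in range(count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(results)


if __name__ == "__main__":
    run_workers(4)
```

The lock only tidies up the log; it does not change which requests arrive, so the empty request still has to be handled separately.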

I would try 'request.splitlines()' instead of 'request.split('\r\n')'. If you do that, you will have '\n' at the end of each line, so you would need to call '.rstrip()' on each line, I think. – piokuc 2013-05-10 16:14:44
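
For reference, `str.splitlines()` drops the line terminators by default and keeps them only when called with `keepends=True`, so the extra `.rstrip()` is usually unnecessary. A quick check, shown in Python 3 with a made-up request string:

```python
# Made-up request, for illustration only.
request = "GET http://racket-lang.org/ HTTP/1.1\r\nHost: racket-lang.org\r\n\r\n"

lines = request.splitlines()     # terminators removed
kept = request.splitlines(True)  # keepends=True retains the terminators

# An empty request splits into an empty list, so guard before indexing.
first_line = lines[0] if lines else ""
empty_first_line = "".splitlines()[0] if "".splitlines() else ""

if __name__ == "__main__":
    print(lines)
    print(kept)
```

Note that an empty request yields an empty list here, so `lines[0]` would still need a guard like the one above.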