2014-10-16 79 views

How can I get the content of this web page? My code is:

import urllib 
response = urllib.urlopen("https://namepal.com/").read() 
print response 

I want the content of this page, but it throws an exception:

Traceback (most recent call last): 
    File "C:/python27/tcl.py", line 3, in <module> 
    response = urllib.urlopen("https://namepal.com/").read() 
    File "C:\python27\lib\urllib.py", line 84, in urlopen 
    return opener.open(url) 
    File "C:\python27\lib\urllib.py", line 205, in open 
    return getattr(self, name)(url) 
    File "C:\python27\lib\urllib.py", line 435, in open_https 
    h.endheaders(data) 
    File "C:\python27\lib\httplib.py", line 951, in endheaders 
    self._send_output(message_body) 
    File "C:\python27\lib\httplib.py", line 811, in _send_output 
    self.send(msg) 
    File "C:\python27\lib\httplib.py", line 773, in send 
    self.connect() 
    File "C:\python27\lib\httplib.py", line 1158, in connect 
    self.sock = ssl.wrap_socket(sock, self.key_file, self.cert_file) 
    File "C:\python27\lib\ssl.py", line 372, in wrap_socket 
    ciphers=ciphers) 
    File "C:\python27\lib\ssl.py", line 134, in __init__ 
    self.do_handshake() 
    File "C:\python27\lib\ssl.py", line 296, in do_handshake 
    self._sslobj.do_handshake() 
IOError: [Errno socket error] [Errno 10054] 

So I tried to fetch it with a raw socket instead, but that still fails:

import socket 
import ssl 

sock = ssl.wrap_socket(socket.socket()) 
#sock=socket.socket() 
sock.connect(('namepal.com',80)) 
sock.sendall('GET/HTTP/1.1\r\n' 
      'Host: namepal.com\r\n' 
      'User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:32.0) Gecko/20100101 Firefox/32.0\r\n' 
      'Connection: keep-alive\r\n' 
      '\r\n') 
response = sock.recv(4096) 

print response 

It throws a new exception:

Traceback (most recent call last): 
    File "C:\Users\Administrator\Desktop\test.py", line 6, in <module> 
    sock.connect(('namepal.com',80)) 
    File "C:\Python27\lib\ssl.py", line 322, in connect 
    self._real_connect(addr, False) 
    File "C:\Python27\lib\ssl.py", line 315, in _real_connect 
    raise e 
SSLError: [Errno 1] _ssl.c:503: error:140770FC:SSL routines:SSL23_GET_SERVER_HELLO:unknown protocol 

I just want to get the content of this page: https://namepal.com/
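A note on the second attempt: the `unknown protocol` error is most likely caused by starting a TLS handshake against port 80, which normally serves plain HTTP, while HTTPS normally listens on port 443. The request line also needs spaces (`GET / HTTP/1.1`, not `GET/HTTP/1.1`). Below is a minimal sketch of the corrected idea, written for Python 3 rather than the Python 2.7 used above. The helper names `build_request` and `fetch_https` are hypothetical, and the sketch uses `Connection: close` so that reading until end-of-stream terminates. It may still fail against this particular server if the local OpenSSL build cannot negotiate a protocol version the server accepts.

```python
import socket
import ssl


def build_request(host, path="/"):
    """Return a minimal HTTP/1.1 GET request as bytes (hypothetical helper)."""
    lines = [
        "GET {} HTTP/1.1".format(path),  # note the spaces around the path
        "Host: {}".format(host),
        "Connection: close",             # ask the server to close when done
        "",
        "",                              # blank line terminates the headers
    ]
    return "\r\n".join(lines).encode("ascii")


def fetch_https(host, path="/", port=443, timeout=10):
    """Fetch a page over TLS on port 443 and return the raw response bytes."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as raw:
        # server_hostname enables SNI and certificate hostname checking
        with context.wrap_socket(raw, server_hostname=host) as tls:
            tls.sendall(build_request(host, path))
            chunks = []
            while True:
                data = tls.recv(4096)
                if not data:
                    break
                chunks.append(data)
    return b"".join(chunks)
```

Calling `fetch_https("namepal.com")` would return the status line, headers, and body as one byte string, provided the handshake succeeds.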

Your first implementation works for me on Python 2.7.7. What is your Python version? The second implementation does give me an error, though! – Dataman 2014-10-16 12:45:43

Verified, the same thing happens with `curl`. – whereswalden 2014-10-16 13:03:03

I think there's something funky with their SSL. With everything except a browser (curl, urllib/urllib2, and httplib, in Python 2 and 3), I get "connection reset by peer". – whereswalden 2014-10-16 13:53:37

Answers

That doesn't seem to work in Python 3.3.2 or curl 7.21.4 (libcurl/7.21.4 OpenSSL/0.9.8z). Is TLS 1.1/1.2 supported in Python 3? – whereswalden 2014-10-16 14:56:01

@whereswalden According to https://docs.python.org/3/library/ssl.html, "TLSv1.1 and TLSv1.2 come with openssl version 1.0.1", so neither your Python nor your curl will work. – Binux 2014-10-16 14:59:52
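To check whether a given Python installation is in the situation described in the comment above, one can inspect the OpenSSL build that its `ssl` module is linked against. The following is a rough sketch for Python 3. The function names `tls12_available` and `openssl_at_least` are hypothetical, and the sketch assumes that the presence of `ssl.HAS_TLSv1_2` (newer Pythons) or `ssl.PROTOCOL_TLSv1_2` (older ones) is a reasonable indicator of TLS 1.2 support.

```python
import ssl


def tls12_available():
    """Best-effort check that this Python's ssl module exposes TLS 1.2."""
    # HAS_TLSv1_2 only exists on newer Pythons; fall back to the older
    # protocol constant, which is defined only when OpenSSL supports it.
    return bool(getattr(ssl, "HAS_TLSv1_2", hasattr(ssl, "PROTOCOL_TLSv1_2")))


def openssl_at_least(major, minor, patch):
    """Compare the linked OpenSSL version against (major, minor, patch)."""
    return ssl.OPENSSL_VERSION_INFO[:3] >= (major, minor, patch)


if __name__ == "__main__":
    print("Linked against:", ssl.OPENSSL_VERSION)
    print("OpenSSL >= 1.0.1:", openssl_at_least(1, 0, 1))
    print("TLS 1.2 available:", tls12_available())
```

If the reported OpenSSL version is older than 1.0.1 (for example the 0.9.8z build mentioned above), TLS 1.1/1.2 handshakes are not available from that interpreter, whatever the Python version.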
