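I need to scrape a few websites for a university project, and I have hit a dead end with a site that requires a login. I am using the urllib, urllib2 and cookielib modules in Python to log in. It does not work for http://www.cafemom.com. I save the HTTP response I receive to a .txt file, and it corresponds to the 'unsuccessful login' page. I need help logging in to the website using Python.
I also tried the package "twill" for this purpose, and that did not work for me either. Can anyone suggest what I should do?
Below is the main login() method I use for this.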
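import os
import urllib, urllib2, cookielib

def urlopen(req):
    """Open req, report HTTP or connection errors, then re-raise them."""
    try:
        r = urllib2.urlopen(req)
    except IOError, e:
        if hasattr(e, 'code'):
            # HTTPError: the server answered with an error status
            print 'The server couldn\'t fulfill the request.'
            print 'Error code: ', e.code
        elif hasattr(e, 'reason'):
            # URLError: the server could not be reached at all
            print 'We failed to reach a server.'
            print 'Reason: ', e.reason
        raise
    return r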
class Cafemom:
    """Communication with Cafemom"""
    def __init__(self, cookieFile = 'cookie.jar', debug = 0):
        self.cookieFile = cookieFile
        self.debug = debug
        self.loggedIn = 0
        self.uid = ''
        self.email = ''
        self.passwd = ''
        # Reuse cookies saved by an earlier run, if there are any
        self.cj = cookielib.LWPCookieJar()
        if os.path.isfile(cookieFile):
            self.cj.load(cookieFile)
        # Send every urllib2 request through the cookie-aware opener
        opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(self.cj))
        urllib2.install_opener(opener)
    def __del__(self):
        self.cj.save(self.cookieFile)
    def login(self, email, password):
        """Logging in Cafemom"""
        self.email = email
        self.passwd = password
        url='http://www.cafemom.com/lohin.php?'
        cnt='http://www.cafemom.com'
        headers = {'Content-Type': 'application/x-www-form-urlencoded'}
        body = {'identifier': email, 'password': password }
        if self.debug == 1:
            print "Logging in..."
        req = urllib2.Request(url, urllib.urlencode(body), headers)
        print urllib.urlencode(body)
        #print req.group()
        handle = urlopen(req)
        h = handle.read()
        # Save the response body so it can be inspected afterwards
        f = open("responseCafemom.txt","w")
        f.write(h)
        f.close()
I also tried the following code, and it failed as well:
import urllib, urllib2, cookielib
username = 'myusername'  # placeholder for the real account name
password = 'mypassword'  # placeholder for the real password
cj = cookielib.CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
login_data = urllib.urlencode({'identifier' : username, 'password' : password})
opener.open('http://www.cafemom.com/login.php', login_data)
resp = opener.open('http://www.cafemom.com')
print resp.read()
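One way to see what the server did with a login attempt like this is to look at which cookies ended up in the jar. The following is only a minimal sketch, not a fix: the helper names `make_session` and `cookie_names` are made up for illustration, and the imports are written to run on either Python 2 (urllib2, cookielib) or Python 3 (urllib.request, http.cookiejar). It makes no network request by itself.

```python
try:
    from http.cookiejar import CookieJar                          # Python 3
    from urllib.request import build_opener, HTTPCookieProcessor
except ImportError:
    from cookielib import CookieJar                               # Python 2
    from urllib2 import build_opener, HTTPCookieProcessor


def make_session():
    """Return (opener, jar); requests made through opener share jar."""
    jar = CookieJar()
    opener = build_opener(HTTPCookieProcessor(jar))
    return opener, jar


def cookie_names(jar):
    """List the names of the cookies currently stored in the jar."""
    return sorted(cookie.name for cookie in jar)


opener, jar = make_session()
# No request has been made yet, so the jar is still empty here.
# After opener.open(login_url, login_data), a site that uses cookie-based
# sessions would typically have added at least one cookie.
print(cookie_names(jar))
```

If the list is still empty after posting the credentials, the login request was probably rejected before a session was created, which points at the URL or the posted fields rather than at the cookie handling.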
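You misspelled "login.php" as "lohin.php", and that is causing the error. Also, take a look at http://cl.ly/272Q2o2q3P2p1g1B1K44 - note that there are more fields than just 'identifier' and 'password'. – 2012-04-16 01:52:22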
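The point about extra fields can be sketched as follows: read every named `<input>` from the login page, hidden ones included, and then fill in the credentials on top. This is a rough illustration under assumptions, not the site's actual form. The markup in `SAMPLE_FORM` and the `token` field are hypothetical, and `FormFieldParser` and `build_login_body` are names invented here. The imports are written to run on either Python 2 or Python 3.

```python
try:
    from html.parser import HTMLParser        # Python 3
    from urllib.parse import urlencode
except ImportError:
    from HTMLParser import HTMLParser         # Python 2
    from urllib import urlencode


class FormFieldParser(HTMLParser):
    """Collect name/value pairs from every <input> tag that has a name."""

    def __init__(self):
        HTMLParser.__init__(self)
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != 'input':
            return
        attrs = dict(attrs)
        name = attrs.get('name')
        if name:
            # Inputs without a value attribute are stored as empty strings
            self.fields[name] = attrs.get('value') or ''


def build_login_body(page_html, email, password):
    """Start from the fields found in the page, then add the credentials."""
    parser = FormFieldParser()
    parser.feed(page_html)
    fields = dict(parser.fields)
    fields['identifier'] = email
    fields['password'] = password
    return fields


# Hypothetical markup: the real login page will have different fields.
SAMPLE_FORM = '''
<form action="/login.php" method="post">
  <input type="hidden" name="token" value="abc123">
  <input type="text" name="identifier">
  <input type="password" name="password">
</form>
'''

body = build_login_body(SAMPLE_FORM, 'me@example.com', 'secret')
encoded = urlencode(sorted(body.items()))
print(encoded)
```

In real use, `page_html` would be the text returned by opening the login page with the same cookie-aware opener, and `encoded` would be posted to the form's action URL (on Python 3 it has to be encoded to bytes first). Note that this simple parser collects inputs from every form on the page, so a page with several forms would need extra filtering.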