2011-09-22 76 views
1

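Automatic login in Python

I have almost-working code that connects to a server and then runs a search, but it has a problem. Here is what the Python script below currently does: it logs in to the site, builds the URL for the search results, and opens that search page in a browser. The problem is that the browser shows a "session expired" page. If I log in manually in the browser first, however, the same constructed URL gives me the output I want. So my question is: how can I keep the session that the script logged in with alive, so that opening the constructed URL in the browser produces the desired output?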

#!/usr/bin/env python 

import urllib, urllib2, cookielib, mechanize, webbrowser, subprocess 
from mechanize import ParseResponse, urlopen, urljoin 

def main(): 
    usr = 'sg092350' 
    pwd = 'gk530911' 
    login_url = 'http://int15.sla.findhere.net/logininq.act?&site=superarms' 

    search_url_1 = '&need_air=yes&need_rail=no&need_train=no&need_hotel=no&need_car=no&origin_request=regular+booking&monAbbrList[0]=9&monAbbrList[1]=9&dateList[0]=29&dateList[1]=30&pickUpCity=&pickUpTime=&dropOffCity=&dropOffTime=&dispAOptions=&dispADestinations=&checkSurroundingAirports=false&doEncodeDecodeForSurrArpt=false&checkPlusMinusDays=N&tripType=roundTrip&itinType=on&departList[0]=BLR&destinationList[0]=DEL&date0=9%2F29%2F11&travelMethodList[0]=departs&timeList[0]=8&date1=9%2F30%2F11&travelMethodList[1]=departs&timeList[1]=8&numPassengers=1&cabinClass=1&pricingType=1&preferredCarrier[0]=&preferredCarrier[1]=&preferredCarrier[2]=&userRequestedWebFares=true' 



    br = mechanize.Browser() 
    cj = cookielib.CookieJar() 

    br.set_cookiejar(cj) 
    br.set_handle_robots(False) 

    br.addheaders = [('User-agent', 'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:6.0.2) Gecko/20100101 Firefox/6.0.2')] 

    br.open(login_url) 
    br.select_form('loginForm') 

    br.form['user'] = usr 
    br.form['pwd'] = pwd 

    br.submit() 
    print br.geturl() 

    response = urlopen(login_url) 
    forms = ParseResponse(response, backwards_compat=False) 
    form = forms[0] 
    token = forms[0]['token'] 
    site_id = forms[0]['siteID'] 
    site = forms[0]['site'] 
    water_mark = forms[0]['watermark'] 
    trans_index = forms[0]['transIndex'] 


    search_url_0 = 'http://int15.sla.findhere.net/pwairavail.act;' + '?site=' + site + '&sid=4' + '&siteID=' + site_id + '&watermark=' + water_mark + '&token=' + token + '&transIndex=' + trans_index + search_url_1 
    print search_url_0 

    print token 
    print site_id 
    print site 
    print water_mark 
    print trans_index 

    print form 

    response.read() 

    #Inserting code to generate html and display the overlay 

    htmlString = """ 
    <html> 
    <head> 
    <title>DirectBook 1.0</title> 
    <script type="text/javascript" src="http://ajax.googleapis.com/ajax/libs/jquery/1.4/jquery.min.js"></script> 
    <script type="text/javascript" src="./fx/jquery.fancybox-1.3.4.pack.js"></script> 
    <link rel="stylesheet" type="text/css" href="./fx/jquery.fancybox-1.3.4.css" media="screen"/> 
    <script type="text/javascript"> 
    $(document).ready(function() { 
    $("#urlLink").fancybox 
    ({ 
    'width'  : '100%', 
    'height'  : '100%', 
    'autoScale'  : false, 
    'transitionIn' : 'fade', 
    'transitionOut' : 'fade', 
    'type'  : 'iframe' 
    }); 
    }); 
    </script> 
    </head> 
    <body onload="document.getElementById('urlLink').click()"> 
    <div id="content"> 
    <script type="text/javascript"> 
    var search_url = " """ + search_url_0 + """ "; 
    document.write('<a id="urlLink"' + 'href="' + search_url + '"></a>'); 
    </script> 
    </div> 
    </body> 
    </html>""" 


    # write the html file to the working folder 
    fout = open("search.html", "w") 
    fout.write(htmlString) 
    fout.close() 

    subprocess.Popen('"C:\\Program Files\\Mozilla Firefox\\firefox.exe" "C:\\Python27\\mechanize-0.2.5\\search.html"') 


if __name__ == '__main__': 
    main() 

Answers

1

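You need to share the session (or the session identifier, which is probably stored in a cookie) between Mechanize and the browser. This is not easy, and it is definitely not portable across browsers (if you need that).

However, mechanize appears to have support for the SQLite database format that Firefox has used for its cookies since version 3: https://github.com/jjlee/mechanize/blob/master/mechanize/_firefox3cookiejar.py

You may want to check the documentation.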
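To make the idea of sharing cookies through Firefox's SQLite store more concrete, here is a minimal sketch that does not use mechanize's own class at all. It reads a Firefox-style `cookies.sqlite` file with the standard `sqlite3` module and copies the matching cookies into a cookie jar, so it covers the direction "log in in the browser, reuse the session in the script". It is written for Python 3, where `http.cookiejar` is the successor of `cookielib`. The function name `load_firefox_cookies` is hypothetical, and the table and column names (`moz_cookies`, `host`, `path`, `isSecure`, `expiry`, `name`, `value`) as well as the unit of `expiry` (seconds) are assumptions about the Firefox format that may differ between versions.

```python
import sqlite3
from http.cookiejar import Cookie, CookieJar


def load_firefox_cookies(db_path, host_suffix):
    """Copy cookies whose host ends in host_suffix from a Firefox-style
    cookies.sqlite file into a new CookieJar.

    Assumes a moz_cookies table with the columns selected below and an
    expiry column expressed in seconds since the epoch.
    """
    jar = CookieJar()
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            "SELECT host, path, isSecure, expiry, name, value "
            "FROM moz_cookies WHERE host LIKE ?",
            ("%" + host_suffix,),
        ).fetchall()
    finally:
        conn.close()

    for host, path, is_secure, expiry, name, value in rows:
        jar.set_cookie(Cookie(
            version=0, name=name, value=value,
            port=None, port_specified=False,
            domain=host, domain_specified=True,
            domain_initial_dot=host.startswith("."),
            path=path, path_specified=True,
            secure=bool(is_secure), expires=expiry,
            discard=False, comment=None, comment_url=None, rest={},
        ))
    return jar
```

The returned jar can be handed to `urllib.request.HTTPCookieProcessor` so that later requests from the script carry the browser's session cookie. A running Firefox may keep the database locked, so working on a copy of the file is probably safer.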

1

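This may help with the login part.

(You can use Wireshark to capture the fields that are actually sent as data when you log in. Also, 'user' may be called something else, for example 'username', and the same goes for 'password'; again, Wireshark will help with this. You can also look at the source of the login page. Good luck!)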

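from urllib import urlencode
from urllib2 import Request, urlopen

# Field names are guesses; check the login form for the real ones.
data = urlencode({'user': 'userhere', 'password': 'passwordhere'})

# urllib2 needs a full URL including the scheme; passing data makes this a POST.
req = Request('http://www.site.com', data)

response = urlopen(req)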
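As a complement to looking at the login page source by hand, the field names can also be pulled out of the HTML programmatically. The following is a small sketch, assuming Python 3 and only the standard library; the class `InputNameCollector`, the helper `input_names`, and the sample form are all hypothetical, and the real login page will have different fields.

```python
from html.parser import HTMLParser


class InputNameCollector(HTMLParser):
    """Collect the name attribute of every <input> tag, in document order."""

    def __init__(self):
        super().__init__()
        self.names = []

    def handle_starttag(self, tag, attrs):
        if tag == "input":
            name = dict(attrs).get("name")
            if name:  # skip inputs without a name, e.g. a bare submit button
                self.names.append(name)


def input_names(html):
    """Return the names of all named <input> fields found in html."""
    parser = InputNameCollector()
    parser.feed(html)
    return parser.names


# Hypothetical login page used only for illustration.
SAMPLE = """
<form name="loginForm" action="/login" method="post">
  <input type="text" name="username">
  <input type="password" name="password">
  <input type="hidden" name="token" value="abc123">
  <input type="submit" value="Log in">
</form>
"""

print(input_names(SAMPLE))
```

Hidden fields such as the `token` in this sample are easy to miss when reading the page by eye, yet a server may expect them in the POST data along with the user name and password.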