2012-02-19 58 views
0

My PHP + cURL code fills in the form but does not post it

<?php 

$url='Search.jsp'; 
// disguises the curl using fake headers and a fake user agent. 
function disguise_curl($url) 
{ 
    $curl = curl_init(); 

    // Setup headers - I used the same headers from Firefox version 2.0.0.6 
    // below was split up because php.net said the line was too long. :/ 
    $header[0] = "Accept: text/xml,application/xml,application/xhtml+xml,"; 
    $header[0] .= "text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5"; 
    $header[] = "Cache-Control: max-age=0"; 
    $header[] = "Connection: keep-alive"; 
    $header[] = "Keep-Alive: 300"; 
    $header[] = "Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7"; 
    $header[] = "Accept-Language: en-us,en;q=0.5"; 
    $header[] = "Pragma: "; // browsers keep this blank. 


    curl_setopt($curl, CURLOPT_URL, $url); 
    curl_setopt($curl, CURLOPT_USERAGENT, 'Googlebot/2.1 (+http://www.google.com/bot.html)'); 
    curl_setopt($curl, CURLOPT_HTTPHEADER, $header); 
    curl_setopt($curl, CURLOPT_REFERER, 'https://lalpacweb.blackpool.gov.uk/protected/wca/publicRegisterVehicleSearch.jsp'); 
    curl_setopt($curl, CURLOPT_ENCODING, 'gzip,deflate'); 
    curl_setopt($curl, CURLOPT_AUTOREFERER, 1); 
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1); 
    curl_setopt($curl, CURLOPT_COOKIESESSION, false); 

    curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, false); 

    curl_setopt($curl, CURLOPT_FOLLOWLOCATION, true); 
    curl_setopt($curl, CURLOPT_COOKIEJAR, "cookies.txt"); 
    curl_setopt($curl, CURLOPT_COOKIEFILE, "cookies.txt"); 
    curl_setopt($curl, CURLOPT_HEADER, 1); 
    curl_setopt($curl, CURLOPT_POST, 1); 
    curl_setopt($curl, CURLOPT_POSTFIELDS, 'search.licenceTypeID=34&search.licenceLinkFileID=2&search.vehicleRegNumber=5&publicRegisterVehicle=Search'); 
    $html = curl_exec($curl); // execute the curl command 
    echo curl_getinfo($curl, CURLINFO_HTTP_CODE); 
    curl_close($curl); // close the connection 
    return $html; // and finally, return $html 
} 

// uses the function and displays the text off the website 
$text = disguise_curl($url); 
echo $text; 
?> 

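It returns a page with the form filled in, but the form is not posted. The curl_getinfo response I get is:

200HTTP/1.1 200 OK
Pragma: no-cache
Cache-Control: no-cache, no-store, must-revalidate
Expires: Thu, 01 Jan 1970 00:00:00 GMT
Content-Type: text/html;charset=ISO-8859-1
Content-Language: en-GB
Content-Length: 5901
Date: Sun, 19 Feb 2012 12:24:08 GMT
Server: Apache

Any ideas?

Thanks for your help.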

+0

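Your code looks correct, so make sure the fields you are sending are correct. The 200 response shows that the request itself went through, but your fields may be wrong. You could write a test page on your own server and use the code above to test against it. – Sean 2012-02-19 12:44:18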

+0

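Does the 200 status code show that the form was posted, or would the server return that status anyway? The form has two submit buttons, and I specify one of them by sending 'publicRegisterVehicle=Search'. Is that right? – Tom 2012-02-19 12:49:24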
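A 200 status only says that the server returned a page; it can just as well be the form page again. Because the script enables CURLOPT_HEADER, the returned string holds the headers followed by the body, so one way to see what actually came back is to separate the two. This is a minimal sketch: the helper name split_response and the sample response are made up for illustration, and in the real script the header size would come from curl_getinfo($curl, CURLINFO_HEADER_SIZE) before curl_close().

```php
<?php
// Hypothetical helper: split a raw cURL response (fetched with CURLOPT_HEADER
// enabled) into its header block and its body.
function split_response(string $raw, int $headerSize): array
{
    return array(
        'headers' => substr($raw, 0, $headerSize),
        'body'    => substr($raw, $headerSize),
    );
}

// Made-up response, for illustration only.
$headers = "HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n";
$raw     = $headers . '<html>form page</html>';

$parts = split_response($raw, strlen($headers));
echo $parts['body'], "\n";
```

Inspecting the body (for example, checking whether it still contains the search form) says more about whether the post was accepted than the status code does.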

+0

There are 2 hidden form fields on the page that may need to be set: _sourcePage and __fp – ben 2012-02-24 19:09:56
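Hidden fields such as these could be read from the form page and merged into the POST body rather than typed in by hand. The following is only a sketch under assumptions: extract_hidden_fields is a hypothetical helper, the sample markup and its values are invented, and it assumes the fields are plain input elements with type="hidden".

```php
<?php
// Hypothetical helper: collect name => value pairs for every hidden <input>
// found in an HTML document.
function extract_hidden_fields(string $html): array
{
    $fields = array();
    $doc = new DOMDocument();

    // Real-world markup is rarely valid; keep libxml warnings out of the output.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($doc->getElementsByTagName('input') as $input) {
        $name = $input->getAttribute('name');
        if ($name !== '' && strtolower($input->getAttribute('type')) === 'hidden') {
            $fields[$name] = $input->getAttribute('value');
        }
    }
    return $fields;
}

// Invented sample markup; real values must come from the fetched form page.
$sample = '<form>'
        . '<input type="hidden" name="_sourcePage" value="abc">'
        . '<input type="hidden" name="__fp" value="xyz">'
        . '<input type="text" name="search.vehicleRegNumber">'
        . '</form>';

$hidden = extract_hidden_fields($sample);

// Merge the hidden fields with the visible ones and encode them as a form body.
$post = http_build_query($hidden + array(
    'search.licenceTypeID'     => 34,
    'search.licenceLinkFileID' => 2,
    'search.vehicleRegNumber'  => 5,
    'publicRegisterVehicle'    => 'Search',
), '', '&');
echo $post, "\n";
```

The resulting string can be passed to CURLOPT_POSTFIELDS in place of the hard-coded one.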

Answers

3

There are a few things you might want to do. First, I believe that giving the cookie jar an absolute path works more reliably across operating systems:

curl_setopt($curl, CURLOPT_COOKIEJAR, dirname(__FILE__) . "/cookies.txt"); 
curl_setopt($curl, CURLOPT_COOKIEFILE, dirname(__FILE__) . "/cookies.txt"); 

Also, you can use the script to visit the site first and pick up the session cookie:

disguise_curl("https://lalpacweb.blackpool.gov.uk"); 

Then you can post the form to https://lalpacweb.blackpool.gov.uk/protected/actions/PublicRegister.action (make sure cookies.txt exists):

<?php 

// disguises the curl using fake headers and a fake user agent. 
function disguise_curl($url, $post = false) 
{ 
    $curl = curl_init(); 

    // Setup headers - I used the same headers from Firefox version 2.0.0.6 
    // below was split up because php.net said the line was too long. :/ 
    $header[0] = "Accept: text/xml,application/xml,application/xhtml+xml,"; 
    $header[0] .= "text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5"; 
    $header[] = "Cache-Control: max-age=0"; 
    $header[] = "Connection: keep-alive"; 
    $header[] = "Keep-Alive: 300"; 
    $header[] = "Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7"; 
    $header[] = "Accept-Language: en-us,en;q=0.5"; 
    $header[] = "Pragma: "; // browsers keep this blank. 


    curl_setopt($curl, CURLOPT_URL, $url); 
    curl_setopt($curl, CURLOPT_USERAGENT, 'Googlebot/2.1 (+http://www.google.com/bot.html)'); 
    curl_setopt($curl, CURLOPT_HTTPHEADER, $header); 
    curl_setopt($curl, CURLOPT_REFERER, 'https://lalpacweb.blackpool.gov.uk/protected/wca/publicRegisterVehicleSearch.jsp'); 
    curl_setopt($curl, CURLOPT_ENCODING, 'gzip,deflate'); 
    curl_setopt($curl, CURLOPT_AUTOREFERER, 1); 
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1); 
    curl_setopt($curl, CURLOPT_COOKIESESSION, false); 

    curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, false); 

    curl_setopt($curl, CURLOPT_FOLLOWLOCATION, true); 
    curl_setopt($curl, CURLOPT_COOKIEJAR, dirname(__FILE__) . "/cookies.txt"); 
    curl_setopt($curl, CURLOPT_COOKIEFILE, dirname(__FILE__) . "/cookies.txt"); 
    curl_setopt($curl, CURLOPT_HEADER, 1); 
    if ($post) 
    { 
        curl_setopt($curl, CURLOPT_POST, 1); 
        curl_setopt($curl, CURLOPT_POSTFIELDS, 'search.licenceTypeID=34&search.licenceLinkFileID=2&search.vehicleRegNumber=5&publicRegisterVehicle=Search'); 
    } 
    $html = curl_exec($curl); // execute the curl command 
    //echo curl_getinfo($curl, CURLINFO_HTTP_CODE); 
    curl_close($curl); // close the connection 
    return $html; // and finally, return $html 
} 

// Visit the home-page first to get the session cookie 
disguise_curl("https://lalpacweb.blackpool.gov.uk"); 

// uses the function and displays the text off the website 

$url = 'https://lalpacweb.blackpool.gov.uk/protected/actions/PublicRegister.action'; 

$text = disguise_curl($url, true); 
echo $text; 
?> 
+0

Absolutely brilliant! Thank you very much for your help. – Tom 2012-02-24 20:33:54

1

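When I open https://lalpacweb.blackpool.gov.uk/protected/wca/publicRegisterVehicleSearch.jsp in my browser, I am redirected to https://lalpacweb.blackpool.gov.uk/sessiontimeout.jsp and shown a "session timeout" error. You may have to make two requests: one to log in (and possibly obtain a session cookie) and one to actually perform the search. Within the same session, curl automatically sends the cookies it received from earlier requests, provided cookie handling is enabled (for example via CURLOPT_COOKIEFILE). Otherwise, set the cookie yourself with curl_setopt($curl, CURLOPT_COOKIE, 'CookieName=CookieValue');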
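The two-request idea could look roughly like this when both requests share one cURL handle. It is a sketch, not the site's required flow: new_session_handle and session_request are hypothetical helpers, and whether a plain visit to the home page is enough to start a session is an assumption.

```php
<?php
// Hypothetical helper: one handle shared by every request, so cookies set by
// an earlier response are sent automatically with later requests.
function new_session_handle()
{
    $curl = curl_init();
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($curl, CURLOPT_FOLLOWLOCATION, true);
    // An empty file name loads no cookies but switches cookie handling on.
    curl_setopt($curl, CURLOPT_COOKIEFILE, '');
    return $curl;
}

// Hypothetical helper: GET when $postFields is null, otherwise a form POST.
function session_request($curl, string $url, ?array $postFields = null)
{
    curl_setopt($curl, CURLOPT_URL, $url);
    if ($postFields === null) {
        curl_setopt($curl, CURLOPT_HTTPGET, true);
    } else {
        curl_setopt($curl, CURLOPT_POST, true);
        curl_setopt($curl, CURLOPT_POSTFIELDS, http_build_query($postFields, '', '&'));
    }
    return curl_exec($curl);
}

// Usage (left commented out because it contacts the live site):
// $curl = new_session_handle();
// session_request($curl, 'https://lalpacweb.blackpool.gov.uk');   // obtain the session cookie
// $html = session_request(
//     $curl,
//     'https://lalpacweb.blackpool.gov.uk/protected/actions/PublicRegister.action',
//     array('search.vehicleRegNumber' => 5, 'publicRegisterVehicle' => 'Search')
// );
// curl_close($curl);
```

Keeping the handle open between requests avoids depending on a cookies.txt file being written and re-read in between.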

+0

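Hi, thanks for the reply. After doing some more research myself, I found that if I go to the URL manually and submit the form, I get a jsessionID cookie. When I then update my cookies.txt with that jsessionID, my script works and loads the page. However, it only seems to load information for as long as that session lasts. It looks like my script saves a jsessionID when it posts the form, but the ones created on my website are invalid and don't work. If I use one created on the local website and update my cookies.txt, it works. – Tom 2012-02-24 20:03:48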

+0

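The problem is that the sessionID expires after 10 minutes, so I have to keep going to the website, posting the form, getting a session ID, and then updating my cookies.txt file. Any idea why the sessionID created when the form is posted via curl doesn't work? It updates cookies.txt and a session is created, but it means nothing to the server. I don't understand why not: the server thinks the referring page is its own, and that this is a regular request. – Tom 2012-02-24 20:09:12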
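Since the session expires, a script could check whether a request ended up on the timeout page and, if so, fetch a fresh session before retrying. The sketch below covers only the detection step. The helper name is_session_timeout is hypothetical, and it assumes an expired session always redirects to sessiontimeout.jsp, as described in the answer above.

```php
<?php
// Hypothetical helper: report whether the final URL of a request is the
// site's session-timeout page.
function is_session_timeout(string $effectiveUrl): bool
{
    $path = parse_url($effectiveUrl, PHP_URL_PATH);
    return is_string($path) && basename($path) === 'sessiontimeout.jsp';
}

// In the real script the URL would come from
// curl_getinfo($curl, CURLINFO_EFFECTIVE_URL) after curl_exec().
$expired = is_session_timeout('https://lalpacweb.blackpool.gov.uk/sessiontimeout.jsp');
$normal  = is_session_timeout('https://lalpacweb.blackpool.gov.uk/protected/actions/PublicRegister.action');
var_dump($expired, $normal);
```

When the check is true, the script would visit the start page again to obtain a new session cookie and then repeat the search.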

0
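// Build the POST fields as an array instead of URL-encoding the whole string
// (urlencode() on the full string would also encode the & and = separators).
$post = array(
    'search.licenceTypeID' => 34, 
    'search.licenceLinkFileID' => 2, 
    'search.vehicleRegNumber' => 5, 
    'publicRegisterVehicle' => 'Search' 
);

// http_build_query() encodes each key and value; passing the array directly
// would make cURL send multipart/form-data instead.
curl_setopt($curl, CURLOPT_POSTFIELDS, http_build_query($post));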