2017-07-31 91 views
1

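POST request gives an empty result

I have written some Python code that uses a POST request to fetch specific data from a web page. However, when I run it I get nothing but a blank console. I tried to fill in the request parameters accordingly, but perhaps I failed to notice which ones should be included. The page I'm dealing with contains several images in its right-hand panel. When an image is clicked, the request I'm talking about is sent to the server, and the result comes back and displays new information about the flavors under it. My goal is to parse all the flavors connected to each image. I have tried to attach everything necessary to find out what I'm missing. Thanks in advance.

This is what I collected from Chrome developer tools to prepare the POST request: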

=================================================================================== 
General: 
Request URL:https://www.optigura.com/product/ajax/details.php 
Request Method:POST 
Status Code:200 OK 

Response Headers: 
Cache-Control:no-store, no-cache, must-revalidate 
Cache-Control:max-age=0, no-cache, no-store, must-revalidate 
Connection:Keep-Alive 
Content-Encoding:gzip 
Content-Length:782 
Content-Type:text/html; charset=utf-8 

Request Headers: 
Accept:application/json, text/javascript, */*; q=0.01 
Accept-Encoding:gzip, deflate, br 
Accept-Language:en-US,en;q=0.8 
Connection:keep-alive 
Content-Length:34 
Content-Type:application/x-www-form-urlencoded 
Cookie:OGSESSID=s1qqd0euokbfrdub9pf2efubh1; _ga=GA1.2.449310094.1501502802; _gid=GA1.2.791686763.1501502802; _gat=1; __atuvc=1%7C31; __atuvs=597f1d5241db0352000; beyable-TrackingId=499b4c5b-2939-479b-aaf0-e5cd79f078cc; aaaaaaaaa066e9a68e5654b829144016246e1a736=d5758131-71db-41e1-846d-6d719d381060.1501502805122.1501502805122.$bey$https%3a%2f%2fwww.optigura.com%2fuk%2fproduct%2fgold-standard-100-whey%2f$bey$1; aaaaaaaaa066e9a68e5654b829144016246e1a736_cs=; aaaaaaaaa066e9a68e5654b829144016246e1a736_v=1.1.0; checkloc-uk=n 
Host:www.optigura.com 
Origin:https://www.optigura.com 
Referer:https://www.optigura.com/uk/product/gold-standard-100-whey/ 
User-Agent:Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.115 Safari/537.36 
X-Requested-With:XMLHttpRequest 

Form Data: 
opt:flavor 
opt1:207 
opt2:47 
ip:105 
======================================================================================= 

Here is what I tried:

import requests 
from lxml import html 

payload = {"opt":"flavor","opt1":"207","opt2":"47","ip":"105"} 
headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.81 Safari/537.36'} 
response = requests.post("https://www.optigura.com/product/ajax/details.php", params = payload, headers = headers).text 
print(response) 

Here is the link to the web page: https://www.optigura.com/uk/product/gold-standard-100-whey/

+0

You are not sending the values in the POST body; 'params' sets the URL query parameters. Use 'data' instead.

Answers

2

You should try the following request structure:

  • Data to send:

    data = {'opt': 'flavor', 'opt1': '207', 'opt2': '47', 'ip': 105} 
    
  • Headers:

    headers = {'X-Requested-With': 'XMLHttpRequest'} 
    
  • URL:

    url = 'https://www.optigura.com/product/ajax/details.php' 
    
  • You also need to get the cookies, so a requests.session() is required:

    s = requests.session() 
    r = s.get('https://www.optigura.com/uk/product/gold-standard-100-whey/') 
    cookies = r.cookies 
    

The complete request:

response = s.post(url, cookies=cookies, headers=headers, data=data) 
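A side note on the cookies: a requests session keeps whatever cookies it has received and attaches them to later requests made through that same session, so passing `cookies=cookies` explicitly is probably redundant here. The sketch below illustrates this offline by preparing (not sending) the POST. The cookie value is a made-up stand-in for whatever the real GET request would set.

```python
import requests

url = "https://www.optigura.com/product/ajax/details.php"
data = {"opt": "flavor", "opt1": "207", "opt2": "47", "ip": 105}

s = requests.Session()
# Stand-in for the session cookie the real GET would store; the value is invented.
s.cookies.set("OGSESSID", "example", domain="www.optigura.com", path="/")

# Prepare the POST through the session without sending it over the network.
prepared = s.prepare_request(requests.Request("POST", url, data=data))

# The session has merged its stored cookie into the outgoing headers.
print(prepared.headers.get("Cookie"))
```

If the header shows up here, the session is already doing the cookie bookkeeping and the extra argument can be dropped.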

Now you can get the required piece of HTML:

print(response.json()['info2']) 

Output:

'<ul class="opt2"><li class="active"> 
        <label> 
         <input type="radio" name="ipr" value="1360" data-opt-sel="47" checked="checked" /> Delicious Strawberry - <span class="green">In Stock</span></label> 
       </li><li> 
        <label> 
         <input type="radio" name="ipr" value="1356" data-opt-sel="15" /> Double Rich Chocolate - <span class="green">In Stock</span></label> 
       </li><li> 
        <label> 
         <input type="radio" name="ipr" value="1169" data-opt-sel="16" /> Vanilla Ice Cream - <span class="green">In Stock</span></label> 
       </li></ul>' 

Then you can use lxml to scrape the flavor values:

from lxml import html 

flavors = response.json()['info2'] 
source = html.fromstring(flavors) 

# The second text node of each <label> holds the flavor name followed by ' - '.
for element in source.xpath('//label/text()[2]'):
    print(element.replace(' - ', '').strip())

Output:

Delicious Strawberry 
Double Rich Chocolate 
Vanilla Ice Cream 
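
If more than the flavor name is needed, the same fragment can be turned into one record per flavor. This is only a sketch: the `fragment` string is a shortened copy of the output shown above, and the dictionary keys (`name`, `value`, `opt_sel`, `stock`) are names chosen for illustration, not anything the site defines.

```python
from lxml import html

# Shortened copy of the 'info2' fragment shown earlier.
fragment = (
    '<ul class="opt2"><li class="active"><label>'
    '<input type="radio" name="ipr" value="1360" data-opt-sel="47" checked="checked" />'
    ' Delicious Strawberry - <span class="green">In Stock</span></label></li>'
    '<li><label>'
    '<input type="radio" name="ipr" value="1356" data-opt-sel="15" />'
    ' Double Rich Chocolate - <span class="green">In Stock</span></label></li></ul>'
)

source = html.fromstring(fragment)
flavors = []
for label in source.xpath("//label"):
    radio = label.find("input")
    # The flavor name is the text that trails the <input> element.
    name = (radio.tail or "").replace(" - ", "").strip()
    flavors.append({
        "name": name,
        "value": radio.get("value"),
        "opt_sel": radio.get("data-opt-sel"),
        "stock": label.findtext("span"),
    })

for flavor in flavors:
    print(flavor)
```

Reading the name from the `tail` of the `<input>` avoids relying on the position of the text node, which may be a little more robust if the whitespace inside the label changes.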
+0

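Oh my God! What a detailed answer! It did exactly what I was after. Sir Martijn Pieters also suggested the same thing, but I couldn't properly understand how it should be done. Thanks a lot, Mr. Anderson. You made my day. – SIM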

4

You are not sending the values in the POST body; params sets the URL query parameters. Use data instead:

response = requests.post(
    "https://www.optigura.com/product/ajax/details.php", 
    data=payload, 
    headers=headers) 
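
The difference can be checked without touching the network by building the request and inspecting it before it is sent. The sketch below uses the payload from the question and compares the two keyword arguments.

```python
import requests

URL = "https://www.optigura.com/product/ajax/details.php"
payload = {"opt": "flavor", "opt1": "207", "opt2": "47", "ip": "105"}

# Build (but do not send) the request both ways so they can be compared.
with_params = requests.Request("POST", URL, params=payload).prepare()
with_data = requests.Request("POST", URL, data=payload).prepare()

# params= appends the values to the URL and leaves the body empty.
print(with_params.url)
print(with_params.body)

# data= leaves the URL alone and form-encodes the values into the body.
print(with_data.url)
print(with_data.body)
print(with_data.headers["Content-Type"])
```

With `data=`, the encoded body is 34 characters long, which matches the Content-Length:34 request header captured in the developer tools.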

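You may also need to set a referrer header (add 'Referer': 'https://www.optigura.com/uk/product/gold-standard-100-whey/' to your headers dictionary) and use a session object to capture and manage cookies (issue a GET request to https://www.optigura.com/uk/product/gold-standard-100-whey/ first).

With a little experimentation, I noticed that the site also requires the X-Requested-With header to be set before it will respond with the contents.

The following works: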

with requests.session() as session: 
    # Visit the product page first so the session picks up the site's cookies.
    session.get('https://www.optigura.com/uk/product/gold-standard-100-whey/') 
    headers = { 
        'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.81 Safari/537.36', 
        'Referer': 'https://www.optigura.com/uk/product/gold-standard-100-whey/', 
        'X-Requested-With': 'XMLHttpRequest' 
    } 
    response = session.post(
        "https://www.optigura.com/product/ajax/details.php", 
        data=payload, headers=headers) 

The response comes back as JSON data:

data = response.json() 
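
For reuse, the steps can be wrapped in a small function. This is a sketch under assumptions: `fetch_details` is a hypothetical helper name, it takes the session as an argument so it can be exercised without network access, and it adds `raise_for_status()` calls that the snippets above do not have, so that HTTP errors surface instead of an empty or unparsable response.

```python
PRODUCT_URL = "https://www.optigura.com/uk/product/gold-standard-100-whey/"
AJAX_URL = "https://www.optigura.com/product/ajax/details.php"


def fetch_details(session, payload):
    """Hypothetical helper: load the product page, then POST the form data."""
    # The GET lets the session collect the site's cookies first.
    session.get(PRODUCT_URL).raise_for_status()
    headers = {
        "Referer": PRODUCT_URL,
        "X-Requested-With": "XMLHttpRequest",
    }
    response = session.post(AJAX_URL, data=payload, headers=headers)
    # Fail loudly on 4xx/5xx instead of trying to parse an error page.
    response.raise_for_status()
    return response.json()


# Usage (performs real network requests, so it is left commented out):
# import requests
# payload = {"opt": "flavor", "opt1": "207", "opt2": "47", "ip": "105"}
# with requests.Session() as session:
#     details = fetch_details(session, payload)
#     print(details.get("info2"))
```

Passing the session in, rather than creating it inside the function, also makes it easy to reuse one session for several payloads, for example one per image on the page.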
+0

I tried to follow what you suggested, sir Martijn Pieters. It's a bit too advanced for me, though! – SIM