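Trouble going to the next page

When I run my function to grab certain links from a site, it gets the links from the first page, but instead of moving on to the next page to do the same, it shows the following error.

Crawler: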

import requests 
from lxml import html 

def Startpoint(mpage): 
    page=4 
    while page<=mpage: 
     address = "https://www.katalystbusiness.co.nz/business-profiles/bindex"+str(page)+".html" 
     tail="https://www.katalystbusiness.co.nz/business-profiles/" 
     page = requests.get(address) 
     tree = html.fromstring(page.text) 
     titles = tree.xpath('//p/a/@href') 
     for title in titles: 
      if "bindex" not in title: 
       if "cdn-cgi" not in title: 
        print(tail + title) 


    page+=1 

Startpoint(5)

Error message:

Traceback (most recent call last): 
    File "C:\Users\ar\AppData\Local\Programs\Python\Python35-32\New.py", line 19, in <module> 
    Startpoint(5) 
    File "C:\Users\ar\AppData\Local\Programs\Python\Python35-32\New.py", line 6, in Startpoint 
    while page<=mpage: 
TypeError: unorderable types: Response() <= int()

Source

2017-04-21 SIM

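You assign the result of requests.get(address) to page. Python then cannot compare a requests.Response object with an int. Just call the result something other than page, such as response. The last line also had an indentation error.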

import requests
from lxml import html

def Startpoint(mpage):
    page = 4
    while page <= mpage:
        address = "https://www.katalystbusiness.co.nz/business-profiles/bindex" + str(page) + ".html"
        tail = "https://www.katalystbusiness.co.nz/business-profiles/"
        # Keep the HTTP response under its own name so the page counter stays an int.
        response = requests.get(address)
        tree = html.fromstring(response.text)
        titles = tree.xpath('//p/a/@href')
        for title in titles:
            # Skip links back to the index pages and cdn-cgi links.
            if "bindex" not in title:
                if "cdn-cgi" not in title:
                    print(tail + title)

        # Advance the counter inside the while loop.
        page += 1

Startpoint(5)
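
As a possible alternative to the manual counter, a for loop over range owns the page number itself, so there is nothing to overwrite and no increment to misplace. The following is only a sketch: the helper names page_addresses and profile_links and the sample hrefs are hypothetical, not from the original post, and it only builds and filters URLs so that it runs without network access.

```python
BASE_URL = "https://www.katalystbusiness.co.nz/business-profiles/"

def page_addresses(first_page, last_page):
    """Yield the index-page URL for each page number in the inclusive range."""
    # range() owns the counter, so no assignment in the body can clobber it.
    for page_number in range(first_page, last_page + 1):
        yield BASE_URL + "bindex" + str(page_number) + ".html"

def profile_links(hrefs):
    """Keep only profile links, using the same two filters as the original loop."""
    return [
        BASE_URL + href
        for href in hrefs
        if "bindex" not in href and "cdn-cgi" not in href
    ]

# Made-up hrefs standing in for the result of tree.xpath('//p/a/@href').
sample_hrefs = ["acme.html", "bindex5.html", "/cdn-cgi/l/email-protection"]

addresses = list(page_addresses(4, 5))
links = profile_links(sample_hrefs)
print(addresses)
print(links)
```

In the real crawler, each address from page_addresses would be passed to requests.get, and the hrefs extracted from that response would be passed to profile_links.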

Source

2017-04-21 17:27:23 bernie

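Thanks, sir bernie, for your sharp response. It worked like magic. I will accept your answer when the site allows me to. Thanks again. – SIM

You're very welcome! Happy coding to you. – bernie

It was a legendary mistake, and my head was spinning when I found it. That is why coding should not be done on the fly. Thanks again, sir bernie. – SIM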

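You overwrite the page variable on this line: page = requests.get(address)

So when execution gets back to while page<=mpage: on the second iteration, it tries to compare page (now a Response object) to mpage (an integer).

Also, page+=1 should be inside the while loop.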
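
The failure can be reproduced without touching the network. This is a minimal sketch, assuming a hypothetical FakeResponse class as a stand-in for requests.Response (any object that defines no ordering methods behaves the same way); the function name is also made up for illustration.

```python
class FakeResponse:
    """Hypothetical stand-in for requests.Response; defines no ordering methods."""

def compare_after_overwrite(mpage):
    page = 4                # page starts out as an int counter
    page = FakeResponse()   # same mistake: the counter's name is reused
    try:
        return page <= mpage
    except TypeError as exc:
        # The comparison is rejected because neither type knows how to order the other.
        return str(exc)

message = compare_after_overwrite(5)
print(message)
```

The exact wording of the message depends on the Python version, but in each case it names the two types that could not be compared.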

Source

2017-04-21 17:27:39 Stacktrace
