Web scraping with BeautifulSoup and Python

I'm trying to scrape this website, https://hidemy.name/es/proxy-list/#list, with BeautifulSoup and Python and print out all the IP addresses, but nothing happens.

The code, in Python 2.7:

import requests
from bs4 import BeautifulSoup

# go through the pages of the website
# (max_pages is not used; the loop below is hard-coded to 19 pages)
def trade_spider(max_pages):
    page = 0
    value = 0
    print('proxies')
    while page <= 18:
        value += 64
        # add the start offset to the link
        url = 'https://hidemy.name/es/proxy-list/?start=' + str(value) + '#list'
        source_code = requests.get(url)  # get the website's HTML
        plain_text = source_code.text
        soup = BeautifulSoup(plain_text, 'html.parser')

        # find every table cell with class "tdl"
        for link in soup.findAll('td', {'class': 'tdl'}):
            proxy = link.string  # get the text of the cell
            print(proxy)

        page += 1

trade_spider(1)

Source

2017-05-28 I' m not human

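You don't see any output because there are no matching elements in your soup. I dumped all the variables to the output stream and found that the website blocks scrapers. Try printing the plain_text variable. It most likely contains only a warning message, such as: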

It seems you are a bot. If so, please use the separate API interface. It is cheap and easy to use.
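
The check described above can be sketched roughly as follows. This is only an illustration, not the exact code I ran: describe_response and fetch_and_report are hypothetical helper names, and the 300-character preview length is an arbitrary choice. The idea is simply to count the matching cells and, when there are none, to look at what the server actually returned.

```python
def describe_response(status_code, body, match_count, preview_chars=300):
    """Build a short report explaining why a scrape printed nothing."""
    lines = [
        'HTTP status: {}'.format(status_code),
        'Body length: {} characters'.format(len(body)),
        'Matching <td class="tdl"> cells: {}'.format(match_count),
    ]
    if match_count == 0:
        # Nothing matched, so show the start of what the server really sent.
        lines.append('Start of body: ' + body[:preview_chars].strip())
    return '\n'.join(lines)


def fetch_and_report(url):
    """Fetch one page and print the report (needs requests and bs4)."""
    # Imported here so describe_response stays usable without these packages.
    import requests
    from bs4 import BeautifulSoup

    response = requests.get(url)
    soup = BeautifulSoup(response.text, 'html.parser')
    cells = soup.find_all('td', {'class': 'tdl'})
    print(describe_response(response.status_code, response.text, len(cells)))
```

Calling fetch_and_report with one of the URLs built in the question's loop should show whether the body is the proxy table or just a warning page. The code avoids f-strings so it should also run on Python 2.7.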

Source

2017-05-28 11:33:29
