How to extract URLs matching a pattern
I am trying to extract URLs from a web page that match the following pattern:
"http://www.realclearpolitics.com/epolls/????/governor/??/*-*.html"
My current code extracts all links. How can I change it so that it extracts only the URLs that match the pattern? Thanks!
import requests
from bs4 import BeautifulSoup

def find_governor_races(html):
    # Despite its name, `html` is the URL of the page to fetch.
    url = html
    base_url = 'http://www.realclearpolitics.com/'
    page = requests.get(url).text
    soup = BeautifulSoup(page, 'html.parser')
    links = []
    # Collect the href of every anchor tag, with no filtering yet.
    for a in soup.find_all('a', href=True):
        links.append(a['href'])
    return links

find_governor_races('http://www.realclearpolitics.com/epolls/2010/governor/2010_elections_governor_map.html')
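One possible way to keep only the matching links is to test each href against a regular expression. The sketch below is not taken from the original post. It assumes that each `?` in the pattern stands for exactly one non-slash character and that the last path segment is any file name that contains a hyphen and ends in `.html`. The names `GOVERNOR_RACE` and `filter_governor_races` are hypothetical, and relative hrefs are resolved against the site root with `urljoin` before matching.

```python
import re
from urllib.parse import urljoin

BASE_URL = 'http://www.realclearpolitics.com/'

# Assumed reading of the pattern: each '?' is one non-slash character, and the
# final path segment is any file name containing a hyphen and ending in .html.
GOVERNOR_RACE = re.compile(
    r'^http://www\.realclearpolitics\.com/epolls/'
    r'[^/]{4}/governor/[^/]{2}/[^/]*-[^/]*\.html$'
)

def filter_governor_races(hrefs, base_url=BASE_URL):
    """Return the absolute URLs in hrefs that match GOVERNOR_RACE.

    Order is preserved and duplicates are dropped.
    """
    matches = []
    for href in hrefs:
        absolute = urljoin(base_url, href)  # turn relative hrefs into full URLs
        if GOVERNOR_RACE.match(absolute) and absolute not in matches:
            matches.append(absolute)
    return matches
```

Applied to the `links` list built in the loop above, `filter_governor_races(links)` would return only the per-race poll pages. If the site uses a different URL shape than assumed here, the regular expression would need adjusting.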
Thank you so much. This really helps. – user6283465