2017-08-24 111 views

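Extracting the title attribute of a tag element with beautifulsoup4

I want to extract the rating percentage shown in the review rating popup, which is stored in the title attribute. Here is the HTML: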

<a class="a-link-normal" href="http://www.amazon.in/product-reviews/B01FM7GGFI/ref=cm_cr_dp_hist_one/261-4285111-5015802?ie=UTF8&amp;filterByStar=one_star&amp;reviewerType=all_reviews&amp;showViewpoints=0" title="11% of reviews have 1 stars">1 star</a>

BeautifulSoup Python script:

    from bs4 import BeautifulSoup
    import requests

    url = "http://www.amazon.in/Samsung-G-550FY-On5-Pro-Gold/dp/B01FM7GGFI/ref=lp_4363159031_1_1/261-4285111-5015802?s=electronics&ie=UTF8&qid=1503582445&sr=1-1"

    # Send a browser-like User-Agent with the request
    headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.71 Safari/537.36'}
    r = requests.get(url, headers=headers)
    soup = BeautifulSoup(r.content, "lxml")

    # Print every rating link found inside the left-hand grid columns
    for link in soup.find_all("div", attrs={"class": "a-fixed-left-grid-col a-col-left"}):
        for link1 in link.find_all("a", attrs={"class": "a-link-normal"}):
            print(link1)

Is it possible to extract the title attribute of the tag element? – aenish

Answer

from bs4 import BeautifulSoup

html = '<a class="a-link-normal" href="http://www.amazon.in/product-reviews/B01FM7GGFI/ref=cm_cr_dp_hist_one/261-4285111-5015802?ie=UTF8&amp;filterByStar=one_star&amp;reviewerType=all_reviews&amp;showViewpoints=0" title="11% of reviews have 1 stars">1 star</a>'
soup = BeautifulSoup(html, 'lxml')

# Restrict the search to <a> tags that carry the a-link-normal class
a_tags = soup.find_all('a', class_='a-link-normal')
for a in a_tags:
    # Skip links that have no title attribute
    if 'title' in a.attrs:
        print(a['title'])
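
Since the goal is the percentage itself rather than the whole title string, the number can be pulled out of the title text afterwards. The following is only a sketch under a few assumptions: the helper name `rating_percentages` is made up for illustration, the title is assumed to always start with a pattern like `11%`, the `href` is shortened to `#`, and the built-in `html.parser` is used instead of `lxml` so that nothing extra needs to be installed.

```python
import re
from bs4 import BeautifulSoup

# Shortened sample markup; the href is replaced with "#" for brevity
html = '<a class="a-link-normal" href="#" title="11% of reviews have 1 stars">1 star</a>'
soup = BeautifulSoup(html, "html.parser")

def rating_percentages(soup):
    """Map each link text (e.g. '1 star') to the percentage in its title.

    Hypothetical helper: assumes the title starts with digits followed by '%'.
    """
    result = {}
    # title=True keeps only the tags that actually have a title attribute
    for a in soup.find_all("a", class_="a-link-normal", title=True):
        match = re.match(r"(\d+)%", a["title"])
        if match:
            result[a.get_text(strip=True)] = int(match.group(1))
    return result

percentages = rating_percentages(soup)
print(percentages)
```

Passing `title=True` to `find_all` replaces the manual `'title' in a.attrs` check, which keeps the loop body focused on parsing.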

It works. Thanks, bro! – aenish


But it also matches all link tags that have no class attribute. – aenish


You can specify the class attribute inside `find_all`; I have updated the solution. –
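
As an alternative way to address the concern about unrelated links, a CSS selector can require both the class and the presence of a title in one expression. This is a sketch, not the answerer's method: the sample markup below is invented for illustration (only the first link comes from the question, with its `href` shortened to `#`), and `html.parser` is used in place of `lxml`.

```python
from bs4 import BeautifulSoup

# Invented sample: one matching link plus two that should be ignored
html = """
<div class="a-fixed-left-grid-col a-col-left">
  <a class="a-link-normal" href="#" title="11% of reviews have 1 stars">1 star</a>
  <a href="#" title="unrelated link">other</a>
  <a class="a-link-normal" href="#">no title</a>
</div>
"""
soup = BeautifulSoup(html, "html.parser")

# a.a-link-normal[title] matches <a> tags that have the class AND a title attribute
titles = [a["title"] for a in soup.select("a.a-link-normal[title]")]
print(titles)
```

The link without a class and the link without a title are both skipped, so no separate `if` check is needed.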