2016-11-21 64 views
1

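How to extract the text of an h1 tag with BeautifulSoup

I would like to know how to use Beautiful Soup to extract the text of an h1 tag that contains many other tags: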

<h1 class="listing-name"> 
 
Hôtel Vevey 
 
<span class="entry-feedbacks-summary-title-rating-stars-container bootstrap"> 
 
<span class="entry-feedbacks-summary-title-rating-stars entry-feedbacks-summary-title-rating-stars-empty" data-container=".entry-feedbacks-summary-title-rating-stars-container" data-content="Il n'y a pas encore d'avis de clients à propos de Astra Hôtel Vevey 4*sup. Cliquez pour évaluer." data-placement="right" data-toggle="popover" data-trigger="hover" data-original-title="" title=""> 
 
<a class="feedback-login-link entry-feedbacks-header-link" href="/auth/localch?origin=https%3A%2F%2Ftel.local.ch%2Ffr%2Fd%2FVevey%2F1800%2FHotel%2FAstra-Hotel-Vevey-4sup-SVGb8b5z-QdrzGTddmyAAg%3Fwhat%3DHotel%26where%3DVaud%2B%2528Canton%2529%23entry-feedbacks-bottom-rate-button"><span class="entry-feedback-rating-star"> 
 
<i class="icon-star-outline entry-feedback-rating-star-empty"></i> 
 
</span> 
 
<span class="entry-feedback-rating-star"> 
 
<i class="icon-star-outline entry-feedback-rating-star-empty"></i> 
 
</span> 
 
<span class="entry-feedback-rating-star"> 
 
<i class="icon-star-outline entry-feedback-rating-star-empty"></i> 
 
</span> 
 
<span class="entry-feedback-rating-star"> 
 
<i class="icon-star-outline entry-feedback-rating-star-empty"></i> 
 
</span> 
 
<span class="entry-feedback-rating-star"> 
 
<i class="icon-star-outline entry-feedback-rating-star-empty"></i> 
 
</span> 
 

 
</a></span> 
 

 
</span> 
 
</h1>

I'm trying to extract the text that comes right after the opening h1 tag, "Hôtel Vevey".

import requests 
from bs4 import BeautifulSoup 

url = "https://tel.local.ch/fr/d/Vevey/1800/Hotel/Astra-Hotel-Vevey-4sup-SVGb8b5z-QdrzGTddmyAAg?what=Hotel&where=Vaud+%28Canton%29" 
get_url = requests.get(url) 
get_text = get_url.text 
soup = BeautifulSoup(get_text, "html.parser") 

company = soup.find_next('h1', 'class:listing-name') 


print(company) 

It returns None.

Answers

3

For the link you provided, you can get it like this:

company = soup.select('h1.listing-name')[0].text.strip() 
print(company) 

Output:

Astra Hôtel Vevey 4*sup 
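
To try the same idea without fetching the live page, here is a minimal sketch that runs against an abbreviated copy of the markup from the question. The shortened HTML string and the variable names are assumptions for illustration, and select_one() is used in place of select(...)[0] so that a missing h1 yields None rather than an IndexError.

```python
from bs4 import BeautifulSoup

# Abbreviated, hypothetical copy of the markup shown in the question.
html = """
<h1 class="listing-name">

Hôtel Vevey

<span class="entry-feedbacks-summary-title-rating-stars-container bootstrap">
<span class="entry-feedback-rating-star">
<i class="icon-star-outline entry-feedback-rating-star-empty"></i>
</span>
</span>

</h1>
"""

soup = BeautifulSoup(html, "html.parser")

# select_one() returns the first element matching the CSS selector, or None.
heading = soup.select_one("h1.listing-name")

raw_text = heading.text      # still contains the newlines around the nested tags
company = raw_text.strip()   # strip() removes the leading/trailing whitespace

print(repr(company))
```

The nested spans here contain only whitespace, which is why strip() alone is enough to leave just the name.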
+1

Thanks! I did try this, but I hadn't thought of strip()! – jjyoh

2

Try using a dictionary:

company = soup.find('h1', {'class' : 'listing-name'}) 

Or the keyword-argument form:

company = soup.find('h1', class_ ='listing-name') 

Note the underscore after class. It is needed because class is a reserved word in Python.

More information can be found here: https://www.crummy.com/software/BeautifulSoup/bs4/doc/#attrs
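
Both forms return the whole Tag, so printing the result shows the nested markup too. The following is a rough sketch of two ways to get text out of that Tag. The one-line HTML string, including the "4 stars" text inside the span, is a made-up stand-in for the real page.

```python
from bs4 import BeautifulSoup

# Hypothetical, simplified markup: an h1 with its own text plus a nested span.
html = (
    '<h1 class="listing-name"> Hôtel Vevey '
    '<span class="stars"><i class="icon"></i> 4 stars</span></h1>'
)
soup = BeautifulSoup(html, "html.parser")

# The dictionary form and the class_ keyword form find the same element.
by_dict = soup.find("h1", {"class": "listing-name"})
by_keyword = soup.find("h1", class_="listing-name")

# get_text() gathers the text of the tag and of everything nested inside it.
all_text = by_keyword.get_text(" ", strip=True)

# recursive=False limits the search to direct children, so only the text
# that sits directly inside the h1 is kept.
own_text = "".join(by_keyword.find_all(string=True, recursive=False)).strip()

print(all_text)
print(own_text)
```

The second approach is useful when the nested tags carry text of their own that should not end up in the result.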

+0

I already tried that; it returns all the other tags as well – jjyoh

+0

soup.select(): 1000 loops, best of 3: 815 μs per loop. soup.find(): 1000 loops, best of 3: 1.61 ms per loop. select() is about 2x faster. – MYGz
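
A comparison like this can be reproduced with the standard timeit module. The sketch below assumes a tiny made-up document, so the absolute numbers will differ from those quoted above, and which call comes out faster may depend on the page, the parser, and the installed bs4 version.

```python
import timeit

from bs4 import BeautifulSoup

# Hypothetical, very small document; the real page is much larger.
html = '<div><h1 class="listing-name">Hôtel Vevey</h1></div>'
soup = BeautifulSoup(html, "html.parser")


def with_select():
    # CSS-selector lookup, as in the first answer.
    return soup.select("h1.listing-name")[0]


def with_find():
    # Keyword lookup, as in the second answer.
    return soup.find("h1", class_="listing-name")


# Three runs of 1000 calls each; keep the best total, as in "best of 3".
select_seconds = min(timeit.repeat(with_select, number=1000, repeat=3))
find_seconds = min(timeit.repeat(with_find, number=1000, repeat=3))

# Convert the total for 1000 calls to microseconds per call.
print(f"select(): {select_seconds / 1000 * 1e6:.1f} μs per loop")
print(f"find():   {find_seconds / 1000 * 1e6:.1f} μs per loop")
```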