BeautifulSoup confusion

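I am writing a Python script that uses BeautifulSoup to find some data. I am stuck and so confused that my brain has stopped working: I have no idea how to scrape the full address from these elements: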

<li class="spacer"> 
<span>Location:</span> 
<br>Some Sample Street<br> 
Abbeville, AL 00000 
</li>

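I have already tried something like location = info.find('li', 'spacer').text but I still only get the string "Location:". I have tried many parent-child relationships but still cannot figure out how to scrape this one.

Can anyone help me?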

Source

2016-08-15 Ukii

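Works for me on this sample HTML. Are you sure the full address is present in the actual HTML you are working with? – alecxe

It is the same as this part of my HTML, and for me it only prints "Location:" – Ukii

@Ukii, add a link to the site. What you posted, '.find('li', 'spacer').text', would get the address, so there is obviously more to this problem than you are telling us. –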

Try this:

# class_ must be lowercase; it filters on the CSS class
locations = info.find_all('span', class_="spacer")
for location in locations:
    print(location.text)
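
In the sample markup from the question the `spacer` class sits on the `li`, not on the `span`, so searching for the `li` may be closer to what is wanted. A minimal sketch, assuming the sample HTML from the question and the built-in `html.parser`; the names `sample_html` and `soup` are made up here for illustration:

```python
from bs4 import BeautifulSoup

# Assumption: the same sample markup as in the question.
sample_html = """
<li class="spacer">
<span>Location:</span>
<br>Some Sample Street<br>
Abbeville, AL 00000
</li>
"""

soup = BeautifulSoup(sample_html, "html.parser")

# class_ (lowercase, trailing underscore) filters on the CSS class.
locations = soup.find_all("li", class_="spacer")
for location in locations:
    # A separator plus strip=True joins the non-empty pieces of text.
    print(location.get_text(" ", strip=True))
```

On this sample the printed line still starts with the "Location:" label, because the label is part of the text of the `li`.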

Source

2016-08-15 16:22:47

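Even when I try locations = soup.find_all("span", class_="spacer") for location in locations: print(location.text) it does not print anything – Ukii

I do not need "Location:". As I said, I always get that; what I need is the other two elements. – Ukii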

You can use nextSibling to navigate to the elements that follow the span inside the li.

Example:

from bs4 import BeautifulSoup as Soup

html_text = """
<li class="spacer">
<span>Location:</span>
<br>Some Sample Street<br>
Abbeville, AL 00000
</li>
"""
location_address = ""

html_souped = Soup(html_text, 'html.parser')

# get the next sibling after the span
# (next_sibling is the current bs4 spelling of nextSibling):
sibling = html_souped.find('li', {'class': 'spacer'}).find('span').next_sibling

# iterate until the end of the li element:
while sibling is not None:
    # add the text to the location (<br> tags contribute an empty string):
    location_address += sibling.text
    sibling = sibling.next_sibling

# print the stripped location:
print('location: ' + location_address.strip())

This will work well for all of the list items, as long as they are formatted the same way as the example you gave.
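
As an alternative to walking the siblings by hand, `stripped_strings` yields each non-empty piece of text inside the `li`, so the label can be dropped and the rest joined. This is only a sketch: it assumes the "Location:" label is always the first piece of text, and `address_from_li` is a helper name made up for illustration.

```python
from bs4 import BeautifulSoup


def address_from_li(li):
    """Join the text of an <li>, skipping its first string (assumed to be the label)."""
    pieces = list(li.stripped_strings)
    return ", ".join(pieces[1:])


# Assumption: the same sample markup as in the question.
html_text = """
<li class="spacer">
<span>Location:</span>
<br>Some Sample Street<br>
Abbeville, AL 00000
</li>
"""

soup = BeautifulSoup(html_text, "html.parser")
for li in soup.find_all("li", class_="spacer"):
    print("location: " + address_from_li(li))
```

If some list items have no label, or more than one, the slice `pieces[1:]` would need adjusting.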

Source

2016-08-16 14:15:39
