2016-09-16 58 views

Store data from a tag with BeautifulSoup4 in Python

Using BeautifulSoup4, I can isolate this tag:

<a href="#" data-nutrition="{ 
    &quot;serving-name&quot;:&quot;Milk, 2%&quot;, 
    &quot;serving-size&quot;:&quot;16 FL OZ&quot;, 
    &quot;calories&quot;:&quot;267&quot;}"> 
Milk, 2% 
<i class="icon-leaf icon-hidden-text">Meatless</i> 
</a> 

by running:

for i in soup('a', attrs={'data-nutrition' : True}): 
    sample = i 
    break 
print(sample) 

I need to create this dictionary from it:

my_dict = { 
    'serving-name': 'Milk, 2%', 
    'serving-size': '16 FL OZ', 
    'calories': '267' 
} 

How can I do this with BeautifulSoup4 in Python?

Answer


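Locate the element, then use json.loads() to load the value of its data-nutrition attribute into a Python dictionary. The parser decodes the &quot; entities into plain double quotes, so the attribute value is already valid JSON: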

import json 
from bs4 import BeautifulSoup 


data = """ 
<a href="#" data-nutrition="{ 
    &quot;serving-name&quot;:&quot;Milk, 2%&quot;, 
    &quot;serving-size&quot;:&quot;16 FL OZ&quot;, 
    &quot;calories&quot;:&quot;267&quot;}"> 
Milk, 2% 
<i class="icon-leaf icon-hidden-text">Meatless</i> 
</a>""" 
soup = BeautifulSoup(data, "html.parser") 

a = soup.select_one("a[data-nutrition]") 
nutrition = json.loads(a["data-nutrition"]) 
print(nutrition) 

Prints:

{'serving-name': 'Milk, 2%', 'serving-size': '16 FL OZ', 'calories': '267'}
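
If the page contains several such links, the same idea extends to a loop. The following is only a minimal sketch, not part of the original answer: the helper name extract_nutrition and the choice to skip attributes that are not valid JSON are assumptions made for illustration.

```python
import json
from bs4 import BeautifulSoup


def extract_nutrition(html):
    """Return one dict per <a> tag that carries a data-nutrition attribute.

    Hypothetical helper: the name and the skip-on-error policy are
    illustrative assumptions, not part of the original answer.
    """
    soup = BeautifulSoup(html, "html.parser")
    results = []
    # select() returns every match, unlike select_one()
    for tag in soup.select("a[data-nutrition]"):
        try:
            results.append(json.loads(tag["data-nutrition"]))
        except json.JSONDecodeError:
            # Skip attribute values that do not hold valid JSON
            continue
    return results


if __name__ == "__main__":
    sample = (
        '<a href="#" data-nutrition="{&quot;calories&quot;:&quot;267&quot;}">Milk</a>'
        '<a href="#">No nutrition data</a>'
    )
    print(extract_nutrition(sample))
```

Links without the attribute are never selected, and malformed values are dropped silently; raising an error instead may be preferable if bad data should not go unnoticed.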