Extracting an attribute value with BeautifulSoup

I am trying to extract the content of a single "value" attribute from a specific "input" tag on a web page. I am using the following code:

import urllib 
f = urllib.urlopen("http://58.68.130.147") 
s = f.read() 
f.close() 

from BeautifulSoup import BeautifulStoneSoup 
soup = BeautifulStoneSoup(s) 

inputTag = soup.findAll(attrs={"name" : "stainfo"}) 

output = inputTag['value'] 

print str(output)

I get a TypeError: list indices must be integers, not str

From the BeautifulSoup documentation I understood that strings should not be a problem here, but I am no expert and I may have misunderstood.

Any suggestion is greatly appreciated! Thanks in advance.

2010-04-10 Barnabe

.findAll() returns a list of all the elements it finds, so after:

inputTag = soup.findAll(attrs={"name" : "stainfo"})

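inputTag is a list (probably containing only one element), which is why indexing it with a string raises the TypeError. Depending on what exactly you want, you should either index into the list first: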

output = inputTag[0]['value']

or use the .find() method, which returns only the first element found:

inputTag = soup.find(attrs={"name": "stainfo"}) 
output = inputTag['value']
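
As a minimal sketch of the difference described above, the snippet below runs both calls on a small inline document. It assumes the current bs4 package (where findAll is spelled find_all) rather than the BeautifulSoup 3 import used in the question, and the HTML string and the value "abc123" are made up for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real page.
html = '<form><input type="hidden" name="stainfo" value="abc123"/></form>'
soup = BeautifulSoup(html, "html.parser")

# find_all returns a list-like ResultSet, even when there is only one match.
matches = soup.find_all(attrs={"name": "stainfo"})

# find returns the first matching Tag, or None when nothing matches.
first = soup.find(attrs={"name": "stainfo"})

value_from_list = matches[0]["value"]
value_from_find = first["value"] if first is not None else None
print(value_from_list, value_from_find)
```

Checking for None before indexing guards against a page that does not contain the tag at all.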

2010-04-10 07:06:28

Great stuff! Thanks. Now I have a question about parsing the output, which is a long string of non-ASCII characters, but I will ask that in a separate question. – Barnabe 2010-04-10 07:33:30

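Shouldn't the 'value' be accessed as described in http://stackoverflow.com/questions/2616659/extracting-value-in-beautifulsoup? What makes the code above work in this case? I thought you would have to access the value with 'output = inputTag[0].contents' – Seth 2010-04-11 23:31:01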

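@Seth - no, because he is looking for the 'value' attribute of an input tag, while .contents returns what the tag encloses (<tag>I am .contents</tag>). (Only replying now because I had to double-check what was going on; I figure someone else may benefit.) – 2011-07-27 00:33:01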
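
To make the distinction in these comments concrete, here is a small sketch that reads an attribute and .contents side by side. It assumes the current bs4 package, and the markup is invented for the example.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: an input with a value attribute, and a span with text.
html = '<p><input name="stainfo" value="abc123"/><span>enclosed text</span></p>'
soup = BeautifulSoup(html, "html.parser")

input_tag = soup.find("input")
span_tag = soup.find("span")

# Attributes are read with dictionary-style access on the tag.
attr_value = input_tag["value"]

# .contents is the list of children between the opening and closing tag.
input_children = input_tag.contents  # an input tag encloses nothing
span_children = span_tag.contents    # one text node

print(attr_value, input_children, span_children)
```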

I would actually suggest a time-saving way to go about this, assuming you know which kind of tags have those attributes.

Suppose a tag xyz has an attribute named "staininfo":

full_tag = soup.findAll("xyz")

Keep in mind that full_tag is a list:

for each_tag in full_tag: 
    staininfo_attrb_value = each_tag["staininfo"] 
    print staininfo_attrb_value

This way you get the staininfo attribute value of every xyz tag.
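
One caveat with the loop above: each_tag["staininfo"] raises a KeyError for any xyz tag that lacks the attribute. A possible variation, sketched here with the current bs4 package and made-up markup, asks find_all to return only the tags that actually carry the attribute by passing True as the attribute filter.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the xyz tag and staininfo attribute are illustrative.
html = '<xyz staininfo="a"></xyz><xyz></xyz><xyz staininfo="b"></xyz>'
soup = BeautifulSoup(html, "html.parser")

# attrs={"staininfo": True} matches only tags that have the attribute,
# whatever its value, so the second xyz tag is skipped.
tags_with_attr = soup.find_all("xyz", attrs={"staininfo": True})
values = [tag["staininfo"] for tag in tags_with_attr]
print(values)
```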

2012-07-08 12:20:47 b1tchacked

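If you want to retrieve multiple attribute values from the source above, you can use findAll and a list comprehension to get everything you need: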

import urllib 
f = urllib.urlopen("http://58.68.130.147") 
s = f.read() 
f.close() 

from BeautifulSoup import BeautifulStoneSoup 
soup = BeautifulStoneSoup(s) 

inputTags = soup.findAll(attrs={"name" : "stainfo"}) 
### You may be able to do findAll("input", attrs={"name" : "stainfo"}) 

### The tags were matched by name="stainfo"; the attribute to read is "value".
output = [x["value"] for x in inputTags] 

print output 
### This will print a list of the values.

2012-08-28 15:35:55 Margath

In Python 3.x, simply call get(attr_name) on the tag objects that you get from find_all:

from bs4 import BeautifulSoup 

xmlData = None 

with open('conf//test1.xml', 'r') as xmlFile: 
    xmlData = xmlFile.read() 

xmlDecoded = xmlData 

xmlSoup = BeautifulSoup(xmlData, 'html.parser') 

repElemList = xmlSoup.find_all('repeatingelement') 

for repElem in repElemList: 
    print("Processing repElem...") 
    repElemID = repElem.get('id') 
    repElemName = repElem.get('name') 

    print("Attribute id = %s" % repElemID) 
    print("Attribute name = %s" % repElemName)

Run against an XML file conf//test1.xml that looks like:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?> 
<root> 
    <singleElement> 
     <subElementX>XYZ</subElementX> 
    </singleElement> 
    <repeatingElement id="11" name="Joe"/> 
    <repeatingElement id="12" name="Mary"/> 
</root>

it prints:

Processing repElem... 
Attribute id = 11 
Attribute name = Joe 
Processing repElem... 
Attribute id = 12 
Attribute name = Mary
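
A practical difference between get() and square-bracket access is what happens when the attribute is missing. The sketch below illustrates this with a variation of the XML above in which the second element has no name attribute; that variation and the "unknown" default are assumptions made for the example.

```python
from bs4 import BeautifulSoup

# Variation of the sample XML: the second element lacks a name attribute.
xml = '<root><repeatingElement id="11" name="Joe"/><repeatingElement id="12"/></root>'
soup = BeautifulSoup(xml, "html.parser")

rows = []
# html.parser lowercases tag names, hence the lowercase search term.
for elem in soup.find_all("repeatingelement"):
    elem_id = elem.get("id")
    name = elem.get("name")                         # None when missing
    name_or_default = elem.get("name", "unknown")   # fallback when missing
    rows.append((elem_id, name, name_or_default))

print(rows)
```

With square brackets, elem["name"] would raise a KeyError on the second element instead of returning None.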

2016-11-16 19:36:41 amphibient

You can also use this:

import requests 
from bs4 import BeautifulSoup 

url = "http://58.68.130.147/" 
r = requests.get(url) 
data = r.text 

soup = BeautifulSoup(data, "html.parser") 
get_details = soup.find_all("input", attrs={"name":"stainfo"}) 

for val in get_details: 
    get_val = val["value"] 
    print(get_val)

2017-10-18 18:40:15
