2017-10-05 81 views
1

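Extracting an attribute with Beautiful Soup in Python

I want to use the BeautifulSoup library in Python to extract JPG image names from an HTML page. Wherever srcset appears in the page source, it is always followed by a JPG file name, and I want to extract all the JPG files this way. However, whenever I run the code below it prints None, even though the HTML clearly contains entries such as 'srcset="https://img.shopstyle-cdn.com/pim/31/94/3194ec1ca5e3a56cb83f708533b9084d_best.jpg"'.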

import urllib2 
html = urllib2.urlopen("https://www.shopstyle.com/p/prada-notch-lapel-fitted-blazer/645742403").read() 

from bs4 import BeautifulSoup 
soup = BeautifulSoup(html, 'html.parser') 

print soup.find(attrs= {"img":"srcset"}) 

Can you post the whole tag in which the image is found?


Hope this helps: soup.find('img', attrs={'srcset': True}) – planet260

Answers

1

To find all the URLs in srcset attributes, you can do this:

import urllib2 
html = urllib2.urlopen("https://www.shopstyle.com/p/prada-notch-lapel-fitted-blazer/645742403").read() 

from bs4 import BeautifulSoup 
soup = BeautifulSoup(html, 'html.parser') 

for el in soup.findAll('img', attrs = {'srcset' : True}): 
    print el['srcset'] 

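Your query returned None because the attrs argument expects a dictionary whose keys are attribute names and whose values are filters on those attributes. attrs={"img": "srcset"} therefore searches for a tag that has an attribute named img with the value srcset, and no such tag exists on the page. See the bs4 docs.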
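To see the difference on a small scale, here is a minimal sketch that runs against an inline HTML string instead of the live page. The markup and the example.com URLs are made up for illustration, and the code is written for Python 3 rather than the Python 2 used above.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real page.
html = """
<div>
  <img src="logo.png">
  <img srcset="https://example.com/a_best.jpg">
  <img srcset="https://example.com/b_best.jpg">
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# Searches for any tag with an attribute named "img" whose value is "srcset".
# No such tag exists, so find() returns None.
wrong = soup.find(attrs={"img": "srcset"})

# Searches for <img> tags that carry a srcset attribute, whatever its value.
right = [el["srcset"] for el in soup.find_all("img", attrs={"srcset": True})]

print(wrong)
print(right)
```

The first argument of find_all is the tag name; the attrs dictionary only ever describes attributes of that tag.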

2

Try this:

soup.find('img')['srcset'] 
'https://img.shopstyle-cdn.com/pim/31/94/3194ec1ca5e3a56cb83f708533b9084d_best.jpg' 
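Note that this returns only the first img tag on the page, and it fails if that tag has no srcset attribute or if the page has no img tag at all. A more defensive variant might look like the sketch below; the markup is made up for illustration and the code assumes Python 3.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the first <img> has no srcset attribute.
html = '<img src="logo.png"><img srcset="https://example.com/a_best.jpg">'
soup = BeautifulSoup(html, "html.parser")

first_img = soup.find("img")
# Tag.get returns None instead of raising KeyError when the attribute is absent.
first_srcset = first_img.get("srcset") if first_img is not None else None

# Restricting the search to tags that actually carry srcset avoids the problem.
tagged = soup.find("img", srcset=True)
tagged_srcset = tagged["srcset"] if tagged is not None else None

print(first_srcset)
print(tagged_srcset)
```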
2

"I want to extract all the JPG files"

from bs4 import BeautifulSoup 
import requests 

html_doc = requests.get("https://www.shopstyle.com/p/prada-notch-lapel-fitted-blazer/645742403") 
soup = BeautifulSoup(html_doc.content, 'html.parser') 
imgs = [i.get('srcset') for i in soup.find_all('img', srcset=True)] 

print(imgs) 

Output:

['https://img.shopstyle-cdn.com/pim/31/94/3194ec1ca5e3a56cb83f708533b9084d_best.jpg', 'https://img.shopstyle-cdn.com/pim/16/c3/16c3e46d3547d6404ba29b61b8f229fd_best.jpg', 'https://img.shopstyle-cdn.com/pim/65/e6/65e6d0e3c0160f0aca361934b999f0c9_best.jpg', 'https://img.shopstyle-cdn.com/sim/31/94/3194ec1ca5e3a56cb83f708533b9084d/prada-notch-lapel-fitted-blazer.jpg', 'https://img.shopstyle-cdn.com/sim/16/c3/16c3e46d3547d6404ba29b61b8f229fd/prada-notch-lapel-fitted-blazer.jpg', 'https://img.shopstyle-cdn.com/sim/65/e6/65e6d0e3c0160f0aca361934b999f0c9/prada-notch-lapel-fitted-blazer.jpg', 'https://img.shopstyle-cdn.com/pim/73/76/737689fa284d6640f7619e5f2f3558a5_xlarge.jpg', 'https://img.shopstyle-cdn.com/pim/2c/b0/2cb0acb147bd20df78bc482d66d7218b_xlarge.jpg', 'https://img.shopstyle-cdn.com/pim/5c/20/5c20824543749df684f3264c5e976e8c_xlarge.jpg', 'https://img.shopstyle-cdn.com/pim/48/b8/48b81f60d61e5c23cdfa343940e43ce9_xlarge.jpg', 'https://img.shopstyle-cdn.com/pim/ff/08/ff081818581b0363d4c0ec02c2cba5d4_xlarge.jpg', 'https://img.shopstyle-cdn.com/pim/86/0a/860ae7abdde0bf40046d53668abbe126_xlarge.jpg', 'https://img.shopstyle-cdn.com/pim/2f/5c/2f5c78d017052b14fd2db0d886a2a326_xlarge.jpg', 'https://img.shopstyle-cdn.com/pim/49/d5/49d5de5b62e6ddc0864afee987dd5e67_xlarge.jpg', 'https://img.shopstyle-cdn.com/pim/50/04/5004bf25e97ac0e4564d8a219a3b34b4_xlarge.jpg', 'https://img.shopstyle-cdn.com/pim/a8/76/a876ac6696e140f34e4cf82b5dbcaadf_xlarge.jpg']
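Since the question asks for the JPG file names rather than the full URLs, the list above can be reduced further. The following is a rough sketch using only the Python 3 standard library; jpg_names is a hypothetical helper, not part of BeautifulSoup. It assumes that a srcset value may hold several comma-separated candidates with optional descriptors such as 2x, and that the URLs themselves contain no commas.

```python
import posixpath
from urllib.parse import urlparse


def jpg_names(srcset_values):
    """Return the .jpg file names found in a list of srcset strings."""
    names = []
    for value in srcset_values:
        # A srcset may list several candidates separated by commas,
        # each optionally followed by a descriptor such as "2x" or "480w".
        for candidate in value.split(","):
            parts = candidate.split()
            if not parts:
                continue
            # Keep only the last path segment of the candidate URL.
            name = posixpath.basename(urlparse(parts[0]).path)
            if name.lower().endswith(".jpg"):
                names.append(name)
    return names


# Two of the values from the output above.
imgs = [
    "https://img.shopstyle-cdn.com/pim/31/94/3194ec1ca5e3a56cb83f708533b9084d_best.jpg",
    "https://img.shopstyle-cdn.com/sim/31/94/3194ec1ca5e3a56cb83f708533b9084d/prada-notch-lapel-fitted-blazer.jpg",
]
print(jpg_names(imgs))
```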