Scraping data using Python

I am using Python to scrape data from this URL: https://www.jumia.com.ng/mobile-phones/

I wrote a script to scrape the phone names. Here is my script:

from urllib.request import urlopen as uReq
from bs4 import BeautifulSoup as soup
my_url = 'https://www.jumia.com.ng/mobile-phones/'
uClient = uReq(my_url)  # open the connection and grab the page
page_html = uClient.read()  # load the content into a variable
uClient.close()  # close the connection
page_soup = soup(page_html, "html.parser")  # parse the HTML
phone_name = page_soup.findAll("span", {"class": "name"})  # grabs each phone name
print(phone_name)

The result I expect looks like this:

Marathon M5 Mini 5.0-Inch IPS (2GB, 16GB ROM) Android 5.1 Lollipop, 13MP + 8MP Smartphone - Grey

But what I get is this:

<span class="name" dir="ltr">Marathon M5 Mini 5.0-Inch IPS (2GB, 16GB ROM) Android 5.1 Lollipop, 13MP + 8MP Smartphone - Grey</span>.

How do I extract the text from <span class="name" dir="ltr">Marathon M5 Mini 5.0-Inch IPS (2GB, 16GB ROM) Android 5.1 Lollipop, 13MP + 8MP Smartphone - Grey</span>?

Source

2017-05-31 AKP

What you get and what you expect are the same in your question. –

Have you looked at the [BeautifulSoup documentation](https://www.crummy.com/software/BeautifulSoup/bs4/doc/)? – errata

Use the Python requests library. It is better than urllib. –

To extract the names, use .text on each matched element:

>>> for phone_name in page_soup.findAll("span", {"class": "name"}):
...     print(phone_name.text)

Boom J8 5.5 Inch (2GB, 16GB ROM) Android Lollipop 5.1 13MP + 5MP Smartphone - White (MWFS) 
Marathon M5 Mini 5.0-Inch IPS (2GB, 16GB ROM) Android 5.1 Lollipop, 13MP + 8MP Smartphone - Grey

So your script should look like this:

from urllib.request import urlopen as uReq
from bs4 import BeautifulSoup as soup
my_url = 'https://www.jumia.com.ng/mobile-phones/'
uClient = uReq(my_url)  # open the connection and grab the page
page_html = uClient.read()  # load the content into a variable
uClient.close()  # close the connection
page_soup = soup(page_html, "html.parser")  # parse the HTML
# findAll returns Tag objects; .text gives the string inside each tag
for phone_name in page_soup.findAll("span", {"class": "name"}):
    print(phone_name.text)
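
If you want to try the extraction without hitting the site, here is a minimal sketch that runs on an inline HTML string. The sample markup and the helper name extract_names are made up for illustration (only the span format is taken from the question), and the live page may be structured differently. It uses get_text(strip=True), which behaves like .text here but also trims surrounding whitespace.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, modeled on the span shown in the question.
SAMPLE_HTML = """
<div>
  <span class="name" dir="ltr">Boom J8 5.5 Inch (2GB, 16GB ROM) Android Lollipop 5.1 13MP + 5MP Smartphone - White (MWFS)</span>
  <span class="name" dir="ltr"> Marathon M5 Mini 5.0-Inch IPS (2GB, 16GB ROM) Android 5.1 Lollipop, 13MP + 8MP Smartphone - Grey </span>
  <span class="price">not a phone name</span>
</div>
"""


def extract_names(html):
    """Return the stripped text of every <span class="name"> in html."""
    page_soup = BeautifulSoup(html, "html.parser")
    # find_all returns Tag objects, not strings
    spans = page_soup.find_all("span", {"class": "name"})
    # get_text(strip=True) drops the tag and trims leading/trailing whitespace
    return [span.get_text(strip=True) for span in spans]


names = extract_names(SAMPLE_HTML)
for name in names:
    print(name)
```

Collecting the names into a list rather than printing inside the loop makes it easier to write them to a file or filter them later.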

Source

2017-05-31 09:38:00
