從雅虎財務標題的廢料數據

我在本網站做了一些研究，找到一種方法來解決我的問題，但要麼太舊的線程（雅虎刷新它的頁面在幾年前），或者他們太複雜了（我還在新刮）。我想在此代碼創建的csv文件中搜索關鍵字。從雅虎財務標題的廢料數據

我用這個代碼，但在雅虎的頭條有點棘手，讓我解釋一下。

# import libraries 
import urllib2 
from bs4 import BeautifulSoup 
import csv 
from datetime import datetime 

quote_page = 'https://finance.yahoo.com/' 
page = urllib2.urlopen(quote_page) 
soup = BeautifulSoup(page, 'html.parser') 
name_box = soup.find('h1', attrs={'class': 'name'}) 
name = name_box.text.strip() 
print name 


with open('index.csv', 'a') as csv_file: 
writer = csv.writer(csv_file) 
writer.writerow([name, ])

正如你可以在圖中看到，標題是這兩者之間： - 反應文本：！3388 - > - /反應文本 - > 但我不知道如何將我的代碼轉換成能夠讀取這些代碼。

解決方案可能很簡單，但我嘗試了很多東西，似乎沒有任何工作。

我希望你能幫助我或者找到另一種方式來查找這些標題關鍵字。

非常感謝您提前。

來源

2017-04-03 VI55

你試過選擇的innerText對於剛剛StretchedBox上述錨標記？ –

我用requests而不是urllib2。從我聽到的，這是更多的人使用。

至於標題：

import requests 
from bs4 import BeautifulSoup 
a = requests.get('https://finance.yahoo.com/m/8bb0b8f6-9b97-32df-8f56-31690cd85cea/long-lines-are-killing.html') 
soup = BeautifulSoup(a.content, 'lxml') 
search = soup.find_all('h1', {'class':'Lh(36px) Fz(25px)--sm Fz(32px) Mb(17px)--sm Mb(20px) Mb(30px)--lg Ff($ff-primary) Lts($lspacing-md) Fw($fweight) Fsm($fsmoothing) Fsmw($fsmoothing) Fsmm($fsmoothing) Wow(bw)'}) 
print(search[0].text) # prints Long Lines Are Killing Starbucks, So Here's Its Bold New Solution to the Major Problem

來源

2017-04-04 20:50:30

從雅虎財務標題的廢料數據

回答

相關問題