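Using BeautifulSoup to extract a specific nested div
I have this HTML code that I'm writing a script for: http://imgur.com/a/dPNYI
I want to extract the highlighted text ("some text") and print it.
I'm trying to walk through each nested div on the way down to the div I need, like this: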
import requests
from bs4 import BeautifulSoup

url = "the url this is from"
r = requests.get(url)
# Parse the response body; without this line `soup` is undefined.
soup = BeautifulSoup(r.text, "html.parser")

# Walk down the nesting one div at a time.
for div in soup.find_all("div", {"id": "main"}):
    for div2 in div.find_all("div", {"id": "app"}):
        for div3 in div2.find_all("div", {"id": "right-sidebar"}):
            for div4 in div3.find_all("div", {"id": "chat"}):
                for div5 in div4.find_all("div", {"id": "chat-messages"}):
                    for div6 in div5.find_all("div", {"class": "chat-message"}):
                        for div7 in div6.find_all("div", {"class": "chat-message-content selectable"}):
                            print(div7.text.strip())
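I pieced this together from guides and similar questions I've seen online, but I bet it isn't even close, and there must be a simpler way.
This doesn't work. It prints nothing, and I'm a bit lost. How can I print the highlighted line (which is essentially inside the first child div of the div with id "chat-messages")?
HTML code: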
<!DOCTYPE html>
<html>
<head>
<title>
</title>
</head>
<body>
<div id="main">
<div data-reactroot="" id="app">
<div class="top-bar-authenticated" id="top-bar">
</div>
<div class="closed" id="navigation-bar">
</div>
<div id="right-sidebar">
<div id="chat">
<div id="chat-head">
</div>
<div id="chat-title">
</div>
<div id="chat-messages">
<div class="chat-message">
<div class="chat-message-avatar" style="background-image: url('https://steamcdn-a.akamaihd.net/steamcommunity/public/images/avatars/65/657dcec97cc00bc378629930ecae1776c0d981e0.jpg');">
</div>
<a class="chat-message-username clickable">
<div class="iron-color">
aloe
</div></a>
<div class="chat-message-content selectable">
<!-- react-text: 2532 -->some text<!-- /react-text -->
</div>
</div>
<div class="chat-message">
</div>
<div class="chat-message">
</div>
<div class="chat-message">
</div>
<div class="chat-message">
</div>
<div class="chat-message">
</div>
Please post the HTML as text rather than as an image; it helps everyone who is trying to help!
@ViníciusAguiar You're right, doing it now!
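For the question above (printing the first message under `chat-messages`), one possibly simpler route is a CSS selector instead of seven nested loops. The sketch below is only an illustration: it runs against a trimmed copy of the posted markup held in a hypothetical `html` string rather than the real response, and the variable names `first` and `message` are made up for the example. If `r.text` from the real site does not contain the chat markup at all (for instance because the page fills it in with JavaScript after loading, which the `data-reactroot` attribute may hint at), no selector will find it; that is an assumption about the site, not something the post confirms.

```python
from bs4 import BeautifulSoup

# Trimmed copy of the posted markup (assumption: in the real script
# this string would be r.text from requests.get(url)).
html = """
<div id="main">
  <div data-reactroot="" id="app">
    <div id="right-sidebar">
      <div id="chat">
        <div id="chat-messages">
          <div class="chat-message">
            <a class="chat-message-username clickable">
              <div class="iron-color">aloe</div>
            </a>
            <div class="chat-message-content selectable">
              <!-- react-text: 2532 -->some text<!-- /react-text -->
            </div>
          </div>
          <div class="chat-message"></div>
        </div>
      </div>
    </div>
  </div>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# An id identifies one element, so there is no need to walk every parent:
# select_one returns the first descendant of #chat-messages that carries
# the chat-message-content class, or None when nothing matches.
first = soup.select_one("#chat-messages .chat-message-content")

# get_text skips the HTML comments and, with strip=True, the surrounding whitespace.
message = first.get_text(strip=True) if first is not None else None
print(message)  # prints: some text
```

Checking for `None` before reading the text keeps the script from raising an `AttributeError` when the element is missing, which also makes it easier to tell "selector is wrong" apart from "the markup never arrived".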