How do I get the text from a span using Scrapy in Python?

Here is my HTML code:

<div class="rendering rendering_person rendering_short rendering_person_short"> 
    <h3 class="title"> 
    <a rel="Person" href="https://moh-it.pure.elsevier.com/en/persons/massimo-eraldo-abate" class="link person"><span>Massimo Eraldo Abate</span></a> 
    </h3> 
    <ul class="relations email"> 
    <li class="email"><a href="[email protected]" class="link"><span>[email protected]</span></a></li> 
    </ul> 
    <p class="type"><span class="family">Person: </span>Academic</p> 
</div>

How can I extract "Massimo Eraldo Abate" from the code above?

Please help me.

Source

2017-08-29 rajeshbojja

You can extract the name using:

response.xpath('//h3[@class="title"]/a/span/text()').extract_first()

Also, have a look at this Scrapinghub blog post introducing XPath.

Source

2017-08-29 06:58:36

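XPath and regex skills are a must-have for scraping. As the answer above shows, an in-depth understanding of XPath will save you a lot of hassle. Look up the XPath syntax, child nodes and so on; I found it especially helpful for locating pagination. – scriptso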

Please take a look at this page; it has many examples of extracting text: scrapy docs

>>> from scrapy.selector import Selector
>>> from scrapy.http import HtmlResponse
>>> body = '<html><body><span>good</span></body></html>'
>>> Selector(text=body).xpath('//span/text()').extract()
['good']

>>> response = HtmlResponse(url='http://example.com', body=body, encoding='utf-8')
>>> Selector(response=response).xpath('//span/text()').extract()
['good']

Source

2017-08-29 06:59:50 rafalf
