Web-Harvest - scraping a URL

I am using Web-Harvest and want to scrape data from this URL:

http://derstandard.at/anzeiger/immoweb/Suchergebnis.aspx?Regionen=9&Bezirke=&Arten=&AngebotTyp=&timestamp=1363305908912

My code is:

<?xml version="1.0" encoding="UTF-8"?> 

<config> 
    <var-def name="google"> 
    <html-to-xml> 
    <http url="http://derstandard.at/anzeiger/immoweb/Suchergebnis.aspx?Regionen=9&Bezirke=&Arten=&AngebotTyp=&timestamp=1363305908912"></http> 
    </html-to-xml> 
    </var-def> 
</config>

But I get:

The reference to entity "Bezirke" must end with the ';' delimiter.

I don't understand what Web-Harvest means by ';' here.

Source

2013-03-15 user2051347

I don't know how you are getting on with Web-Harvest, but I would suggest using Jsoup. It is very simple and practical. – cwhsu 2013-03-15 00:23:33

I don't know much about Web-Harvest, but their example has this:

<xpath expression="//a[@shape='rect']/@href"> 
    <html-to-xml> 
     <http url="http://www.somesite.com/"/> 
    </html-to-xml> 
</xpath> 

<http url =".." />

whereas your code has

<http url = ".."></http>

Maybe that is your problem? The closing tag should not be needed.

Source

2013-03-15 00:17:03 ObjectNameDisplay

You should encode the ampersands in your URL, i.e. replace every & with &amp;.
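
As a minimal sketch of that fix applied to the config from the question: every literal & in the url attribute is written as &amp;, so the XML parser no longer reads "&Bezirke" as the start of an entity reference that needs a closing ';'. This assumes nothing else in the configuration has to change; the variable name and URL are taken from the question unchanged.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<config>
    <!-- Each & in the query string is escaped as &amp; so the file is well-formed XML. -->
    <var-def name="google">
        <html-to-xml>
            <http url="http://derstandard.at/anzeiger/immoweb/Suchergebnis.aspx?Regionen=9&amp;Bezirke=&amp;Arten=&amp;AngebotTyp=&amp;timestamp=1363305908912"/>
        </html-to-xml>
    </var-def>
</config>
```

The parser turns each &amp; back into a plain & when it reads the attribute, so the request should still go to the original URL.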

Source

2013-04-26 11:04:09
