2011-11-26 91 views
1

Parsing an HTML page: I want to parse a URL and then access some data from the HTML page.

try {
    // abc.com is only a placeholder, as I can't post the real site name
    Document doc = Jsoup.connect("http://abc.com/en/currency/default.aspx").get();
    // this is the id of the table row in the HTML page; the HTML snippet is shown below
    Elements td = doc.select("ctl00_ContentPlaceHolder1_currencylist_rptCurrencyList_ctl01_trList");
    String temp = td.val();
    info.setText(temp);
} catch (IOException e) {
    // TODO Auto-generated catch block
    e.printStackTrace();
}

The snippet I want to parse is as follows:

 <tr id="ctl00_ContentPlaceHolder1_currencylist_rptCurrencyList_ctl01_trList"> 
<td width="400px" class="CurrencyListItems">    
     UK POUND 
     </td> 
<td width="60px;" class="CurrencyListItemsIN" align="center"> 
     5.72 
     </td> 
<td width="150px;" class="CurrencyListItemsLast"> 
      <table cellspacing ="0" cellpadding ="0" width="100%"> 
        <tr> 
        <td class="CurrencyListBANKNOTES" align="center">       
        5.625 
        </td> 
        <td class="CurrencyListBANKNOTES2" width="75px" align="center"> 

        5.75 
        </td> 
        </tr> 
      </table> 
     </td> 

From the HTML above I want UK POUND, 5.625, and 5.75. I tried the code above, but it does not parse the URL; the app just force closes when I try it.


See my answer now. – confucius

Answers

2

Try this:

Element tr = doc.getElementById("ctl00_ContentPlaceHolder1_currencylist_rptCurrencyList_ctl01_trList"); 

Try:

// text() returns the row's combined text with the tags already stripped
String contents = tr.text().trim();
contents = contents.replaceAll("\\s+", " ");
contents = contents.replaceAll("\\<.*?>", "-");
String[] values = contents.split("-");
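
A comment further down reports that the split did not work. One likely reason, offered here as an assumption rather than a confirmed diagnosis: `tr.text()` already returns tag-free text, so the tag regex has nothing to replace and no "-" ever appears in the string. Below is a minimal sketch that uses only `java.util.regex` and assumes the row text looks like `UK POUND 5.72 5.625 5.75`. The class name `RowTextSplitSketch`, the method `splitRowText`, and the sample string are all hypothetical. It also assumes the currency name itself contains no digits.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RowTextSplitSketch {
    // Matches an integer or a decimal number such as 5.72 or 5.625.
    private static final Pattern NUMBER = Pattern.compile("\\d+(?:\\.\\d+)?");

    // Splits row text into the leading name followed by every number found.
    static List<String> splitRowText(String rowText) {
        // Collapse runs of whitespace (including newlines) to single spaces.
        String normalized = rowText.trim().replaceAll("\\s+", " ");
        List<String> parts = new ArrayList<>();

        Matcher m = NUMBER.matcher(normalized);
        // Everything before the first number is treated as the currency name.
        int firstNumberStart = m.find() ? m.start() : normalized.length();
        String name = normalized.substring(0, firstNumberStart).trim();
        if (!name.isEmpty()) {
            parts.add(name);
        }

        // Rewind the matcher and collect each number in order.
        m.reset();
        while (m.find()) {
            parts.add(m.group());
        }
        return parts;
    }

    public static void main(String[] args) {
        // Hypothetical sample of what the row text might look like.
        System.out.println(splitRowText("  UK POUND   5.72 5.625\n 5.75 "));
    }
}
```

Splitting on the numbers rather than on a separator character avoids depending on any markup surviving in the text.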

Elements elements = tr.select("*"); 
for (Element element : elements) { 
    System.out.println(element.ownText()); 
} 
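
For reference, here is a self-contained sketch of the same idea that runs against a trimmed copy of the row from the question instead of the live site. It assumes jsoup is on the classpath. The class name `CurrencyRowSketch` and the method `cellValues` are hypothetical, and the row is wrapped in a `<table>` so the HTML parser keeps the `<tr>`. It selects only `td` elements and skips cells whose own text is empty, which is one possible way to avoid the blank entries that `select("*")` would print.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

class CurrencyRowSketch {
    static final String ROW_ID =
            "ctl00_ContentPlaceHolder1_currencylist_rptCurrencyList_ctl01_trList";

    // Trimmed copy of the row from the question, wrapped in <table>.
    static final String HTML =
            "<table><tr id=\"" + ROW_ID + "\">"
            + "<td class=\"CurrencyListItems\"> UK POUND </td>"
            + "<td class=\"CurrencyListItemsIN\"> 5.72 </td>"
            + "<td class=\"CurrencyListItemsLast\"><table><tr>"
            + "<td class=\"CurrencyListBANKNOTES\"> 5.625 </td>"
            + "<td class=\"CurrencyListBANKNOTES2\"> 5.75 </td>"
            + "</tr></table></td>"
            + "</tr></table>";

    // Returns the non-empty own text of every <td> under the row with the given id.
    static List<String> cellValues(String html, String rowId) {
        Document doc = Jsoup.parse(html);
        Element tr = doc.getElementById(rowId);
        List<String> values = new ArrayList<>();
        if (tr == null) {
            return values; // row not found: return an empty list instead of throwing
        }
        for (Element td : tr.select("td")) {
            // ownText() excludes text that belongs to child elements,
            // so the outer cell holding the nested table contributes nothing.
            String own = td.ownText().trim();
            if (!own.isEmpty()) {
                values.add(own);
            }
        }
        return values;
    }

    public static void main(String[] args) {
        System.out.println(cellValues(HTML, ROW_ID));
    }
}
```

Against the live page, `Jsoup.parse(html)` would be replaced by the `Jsoup.connect(...).get()` call from the question; the null check matters there because the row id may not be present in every response.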

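@Nammari Your earlier solution worked perfectly. The only thing is that the split didn't work, so all 4 items come through together inside the values array, but that's not a problem. Upvoting this answer. Thanks very much. – sups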


@Nammari The newly updated answer you posted gives an error on (Element element : elements), saying that elements is not found. – sups


@Nammari Hey, thanks a lot, it works. I figured out what the error was: instead of elements in the for loop, it should be allElementInTr. – sups