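For HTML parsing I always use HtmlCleaner: http://htmlcleaner.sourceforge.net/
It is an awesome lib that works great with XPath and, of course, on Android. :-)
This shows how you can download a document from a URL, parse it, and get a certain value from an attribute (also shown in the docs):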
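import android.content.Context;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import org.htmlcleaner.CleanerProperties;
import org.htmlcleaner.HtmlCleaner;
import org.htmlcleaner.TagNode;
import org.htmlcleaner.XPatherException;

public static String snapFromHtmlWithCookies(Context context, String xPath, String attrToSnap, String urlString,
        String cookies) throws IOException, XPatherException {
    String snap = "";
    // create an instance of HtmlCleaner
    HtmlCleaner cleaner = new HtmlCleaner();
    // take the default cleaner properties and adjust them
    CleanerProperties props = cleaner.getProperties();
    props.setAllowHtmlInsideAttributes(true);
    props.setAllowMultiWordAttributes(true);
    props.setRecognizeUnicodeChars(true);
    props.setOmitComments(true);

    URL url = new URL(urlString);
    HttpURLConnection connection = (HttpURLConnection) url.openConnection();
    // setDoOutput(true) is left out on purpose: on Android it turns this GET into a POST
    // optional cookies; R.string.cookie_prefix is an app resource holding the header name
    if (cookies != null) {
        connection.setRequestProperty(context.getString(R.string.cookie_prefix), cookies);
    }
    try {
        // use the cleaner to "clean" the HTML and return it as a TagNode object
        TagNode root = cleaner.clean(new InputStreamReader(connection.getInputStream()));
        Object[] foundNodes = root.evaluateXPath(xPath);
        // take the first match, if there is one and it is an element
        if (foundNodes.length > 0 && foundNodes[0] instanceof TagNode) {
            TagNode foundNode = (TagNode) foundNodes[0];
            snap = foundNode.getAttributeByName(attrToSnap);
        }
    } finally {
        // release the connection even if parsing fails
        connection.disconnect();
    }
    return snap;
}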
Just adapt it to your needs. :-)
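If you want to check an XPath expression and attribute name off the device before wiring them into the method above, here is a minimal sketch that uses only the JDK's javax.xml.xpath instead of HtmlCleaner. The class name XPathAttrDemo, the helper snapAttr and the sample markup are made up for illustration. Note the assumption: unlike HtmlCleaner, the JDK parser only accepts well-formed XML, and its XPath support may differ from what HtmlCleaner accepts, so treat this as a quick sanity check only.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class XPathAttrDemo {
    // Returns the value of attrToSnap on the first node matching xPath,
    // or "" if nothing matches (hypothetical helper, JDK only).
    static String snapAttr(String xml, String xPath, String attrToSnap) throws XPathExpressionException {
        XPath xp = XPathFactory.newInstance().newXPath();
        Node node = (Node) xp.evaluate(xPath, new InputSource(new StringReader(xml)), XPathConstants.NODE);
        if (node instanceof Element) {
            return ((Element) node).getAttribute(attrToSnap);
        }
        return "";
    }

    public static void main(String[] args) throws Exception {
        // made-up, well-formed sample markup
        String xml = "<html><body><div id=\"cover\"><img src=\"/img/a.png\" alt=\"A\"/></div></body></html>";
        System.out.println(snapAttr(xml, "//div[@id='cover']/img", "src"));
    }
}
```

Like the HtmlCleaner version, it takes the first match and falls back to an empty string when the XPath finds nothing.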
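http://jsoup.org/ should have a version for Android... and about your error/match failure... maybe you are loading the mobile version of the site on the device... – Selvin 2012-01-03 09:55:31
That is a very good point. However, I just checked the HTML, and the content I am looking for is the same in the mobile version of the site. I will look at that link now and reply later. Thanks – 2012-01-03 10:20:29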