I want to parse a web page using HtmlUnit. When I run the code below, I get the following error: unknown host www.google.com.
import java.net.URL;
import java.util.List;

import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlImage;
import com.gargoylesoftware.htmlunit.html.HtmlPage;

public class scrapImage {
    public static void main(String[] args) throws Exception {
        URL url = new URL("http://www.google.com");
        //WebClient webClient = new WebClient(Opera);
        WebClient webClient = new WebClient();
        HtmlPage currentPage = (HtmlPage) webClient.getPage(url);
        // get a list of all <img> elements on the page
        final List<?> images = currentPage.getByXPath("//img");
        for (Object imageObject : images) {
            HtmlImage image = (HtmlImage) imageObject;
            System.out.println(image.getSrcAttribute());
        }
        //webClient.closeAllWindows();
    }
}
Error message:
Exception in thread "main" java.net.UnknownHostException: www.google.com
at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:196)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:377)
at java.net.Socket.connect(Socket.java:530)
at java.net.Socket.connect(Socket.java:480)
at java.net.Socket.<init>(Socket.java:377)
at java.net.Socket.<init>(Socket.java:251)
at org.apache.commons.httpclient.protocol.DefaultProtocolSocketFactory.createSocket(DefaultProtocolSocketFactory.java:80)
at org.apache.commons.httpclient.protocol.DefaultProtocolSocketFactory.createSocket(DefaultProtocolSocketFactory.java:122)
at org.apache.commons.httpclient.HttpConnection.open(HttpConnection.java:707)
at org.apache.commons.httpclient.MultiThreadedHttpConnectionManager$HttpConnectionAdapter.open(MultiThreadedHttpConnectionManager.java:1361)
at org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry(HttpMethodDirector.java:387)
at org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:171)
at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:397)
at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:346)
at com.gargoylesoftware.htmlunit.HttpWebConnection.getResponse(HttpWebConnection.java:97)
at com.gargoylesoftware.htmlunit.WebClient.loadWebResponseFromWebConnection(WebClient.java:1430)
at com.gargoylesoftware.htmlunit.WebClient.loadWebResponse(WebClient.java:1388)
at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:325)
at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:386)
at htmlunit.scrapImage.main(scrapImage.java:16)
Can anyone tell me how to resolve the above exception?
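An UnknownHostException means the JVM could not resolve the host name to an IP address, which often happens when the machine can only reach the internet through a proxy. As a rough diagnostic, the following sketch checks whether a name resolves directly, using only java.net. The class name DnsCheck and the method canResolve are made up for illustration and are not part of HtmlUnit.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper for illustration; not part of HtmlUnit.
public class DnsCheck {

    /** Returns true if the JVM can resolve the given host name directly. */
    static boolean canResolve(String host) {
        try {
            InetAddress.getByName(host);
            return true;
        } catch (UnknownHostException e) {
            // Same exception type that appears in the stack trace above.
            return false;
        }
    }

    public static void main(String[] args) {
        String host = args.length > 0 ? args[0] : "www.google.com";
        System.out.println(host + " resolvable: " + canResolve(host));
    }
}
```

If this reports that the host is not resolvable while a browser on the same machine can open the page, the browser is probably going through a proxy that the Java code does not know about.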
When I try to get the proxy name with InetSocketAddress addr = (InetSocketAddress) proxy.address(); and System.out.println("proxy hostname: " + addr.getHostName());, I get a NullPointerException because addr itself is null and proxy is null. Please guide me. – developer 2012-04-11 05:45:44
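One possible reason for the null address: a java.net.Proxy of type DIRECT (such as Proxy.NO_PROXY) returns null from address(), so the cast-and-call pattern fails when Java has not detected any proxy. The sketch below assumes the proxy comes from ProxySelector.getDefault(), which the comment does not state; ProxyLookup and describe are hypothetical names.

```java
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.ProxySelector;
import java.net.URI;
import java.util.List;

// Hypothetical helper for illustration.
public class ProxyLookup {

    /** Describes a proxy without assuming that address() is non-null. */
    static String describe(Proxy proxy) {
        if (proxy == null || proxy.type() == Proxy.Type.DIRECT) {
            // DIRECT proxies have no address; address() returns null here.
            return "direct connection (no proxy)";
        }
        if (proxy.address() instanceof InetSocketAddress) {
            InetSocketAddress addr = (InetSocketAddress) proxy.address();
            return proxy.type() + " proxy " + addr.getHostName() + ":" + addr.getPort();
        }
        return proxy.type() + " proxy with unknown address";
    }

    public static void main(String[] args) throws Exception {
        ProxySelector selector = ProxySelector.getDefault();
        if (selector == null) {
            System.out.println("no default ProxySelector installed");
            return;
        }
        // Ask the JVM which proxies it would use for this URL.
        List<Proxy> proxies = selector.select(new URI("http://www.google.com"));
        for (Proxy p : proxies) {
            System.out.println(describe(p));
        }
    }
}
```

If this prints only "direct connection (no proxy)", the JVM has no proxy configured, and the proxy host and port have to be supplied explicitly.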
Specify your proxy server's name, since all requests are routed through it. Supply your proxy's IP address instead of the one above. – UVM 2012-04-11 05:55:33
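To illustrate the suggestion of naming the proxy explicitly, here is a minimal sketch using only java.net. The host proxy.example.com and port 8080 are placeholders, and ExplicitProxy and httpProxy are hypothetical names. For HtmlUnit itself, WebClient offers a constructor that takes a browser version plus a proxy host and port; check the API documentation of your HtmlUnit version for the exact signature.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.URL;
import java.net.URLConnection;

// Hypothetical helper for illustration.
public class ExplicitProxy {

    /** Builds an HTTP proxy from a host name (or IP address) and a port. */
    static Proxy httpProxy(String host, int port) {
        // createUnresolved avoids a DNS lookup at construction time.
        return new Proxy(Proxy.Type.HTTP, InetSocketAddress.createUnresolved(host, port));
    }

    public static void main(String[] args) throws IOException {
        // Placeholder values: replace with your real proxy host and port.
        Proxy proxy = httpProxy("proxy.example.com", 8080);

        URL url = new URL("http://www.google.com");
        // openConnection only prepares the connection; no network I/O happens yet.
        URLConnection connection = url.openConnection(proxy);
        System.out.println("Prepared " + connection.getURL() + " via " + proxy);
    }
}
```

With the real proxy details filled in, calling connection.getInputStream() would send the request through the proxy rather than resolving www.google.com locally.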