2012-04-11 99 views
1

I want to parse a web page using HtmlUnit. When I run the code below, I get this error: unknown host www.google.com.

import java.net.URL; 
import java.util.List; 

import com.gargoylesoftware.htmlunit.WebClient; 
import com.gargoylesoftware.htmlunit.html.HtmlImage; 
import com.gargoylesoftware.htmlunit.html.HtmlPage; 

public class scrapImage {

    public static void main(String[] args) throws Exception {
        URL url = new URL("http://www.google.com");
        // WebClient webClient = new WebClient(Opera);
        WebClient webClient = new WebClient();
        // getPage performs the HTTP request; this is the call that throws in the stack trace below
        HtmlPage currentPage = (HtmlPage) webClient.getPage(url);
        // Get a list of all <img> elements on the page
        final List<?> images = currentPage.getByXPath("//img");
        for (Object imageObject : images) {
            HtmlImage image = (HtmlImage) imageObject;
            System.out.println(image.getSrcAttribute());
        }
        // webClient.closeAllWindows();
    }
}

Error message:

Exception in thread "main" java.net.UnknownHostException: www.google.com 
    at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:196) 
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:377) 
    at java.net.Socket.connect(Socket.java:530) 
    at java.net.Socket.connect(Socket.java:480) 
    at java.net.Socket.<init>(Socket.java:377) 
    at java.net.Socket.<init>(Socket.java:251) 
    at org.apache.commons.httpclient.protocol.DefaultProtocolSocketFactory.createSocket(DefaultProtocolSocketFactory.java:80) 
    at org.apache.commons.httpclient.protocol.DefaultProtocolSocketFactory.createSocket(DefaultProtocolSocketFactory.java:122) 
    at org.apache.commons.httpclient.HttpConnection.open(HttpConnection.java:707) 
    at org.apache.commons.httpclient.MultiThreadedHttpConnectionManager$HttpConnectionAdapter.open(MultiThreadedHttpConnectionManager.java:1361) 
    at org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry(HttpMethodDirector.java:387) 
    at org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:171) 
    at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:397) 
    at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:346) 
    at com.gargoylesoftware.htmlunit.HttpWebConnection.getResponse(HttpWebConnection.java:97) 
    at com.gargoylesoftware.htmlunit.WebClient.loadWebResponseFromWebConnection(WebClient.java:1430) 
    at com.gargoylesoftware.htmlunit.WebClient.loadWebResponse(WebClient.java:1388) 
    at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:325) 
    at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:386) 
    at htmlunit.scrapImage.main(scrapImage.java:16) 

Can anyone tell me how to resolve the exception above?

Answers

1

I think the problem is with your network connection, or a firewall may be blocking the Java program from accessing the internet.
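One way to narrow this down is to check whether the JVM can resolve the host name at all, independent of HtmlUnit. The following is only a minimal sketch using the standard library; the class name DnsCheck and the method canResolve are made up for illustration.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper, not part of HtmlUnit: checks DNS resolution from inside the JVM.
class DnsCheck {

    // Returns true if the JVM can resolve the host name to at least one address.
    static boolean canResolve(String host) {
        try {
            InetAddress.getByName(host);
            return true;
        } catch (UnknownHostException e) {
            // Same exception type as in the stack trace from the question.
            return false;
        }
    }

    public static void main(String[] args) {
        // Defaults to the host from the question; pass another host as the first argument.
        String host = args.length > 0 ? args[0] : "www.google.com";
        System.out.println(host + " resolvable: " + canResolve(host));
    }
}
```

If this reports that the host is not resolvable while a browser on the same machine can open the page, the browser is probably going through a proxy that the Java program does not know about.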

1

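I think you are behind a proxy or a firewall. Check the current firewall status on your system. If the problem is related to a proxy server, you can modify the code like this: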

// Placeholder values: replace the host name and port with those of your proxy.
System.setProperty("proxySet", "true");
System.setProperty("proxyHost", "your proxy host name");
System.setProperty("proxyPort", "85");

Maybe this will help you.

+0

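When I try to get the proxy name with InetSocketAddress addr = (InetSocketAddress) proxy.address(); and System.out.println("proxy hostname: " + addr.getHostName());, I get a NullPointerException because addr itself is null and proxy is null. Could you please guide me? – developer 2012-04-11 05:45:44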

+0

Give the name of your proxy server, because all requests are routed through it. Use your proxy's IP address in place of the placeholder above. – UVM 2012-04-11 05:55:33
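On the null address mentioned in the comments: a java.net.Proxy of type DIRECT (such as Proxy.NO_PROXY) returns null from address(), so the cast succeeds but the later getHostName() call fails. A rough sketch of a null-safe way to list what the JVM's default ProxySelector reports is shown here; the class name ProxyInspector and the method describe are hypothetical.

```java
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.ProxySelector;
import java.net.URI;
import java.util.List;

// Hypothetical helper: prints the proxies the JVM would use for a given URL.
class ProxyInspector {

    // Describes a proxy without dereferencing the null address of a DIRECT proxy.
    static String describe(Proxy proxy) {
        if (proxy == null || proxy.type() == Proxy.Type.DIRECT) {
            return "direct connection (no proxy)";
        }
        InetSocketAddress addr = (InetSocketAddress) proxy.address();
        return proxy.type() + " proxy " + addr.getHostString() + ":" + addr.getPort();
    }

    public static void main(String[] args) {
        // Ask the default selector which proxies apply to the URL from the question.
        List<Proxy> proxies =
                ProxySelector.getDefault().select(URI.create("http://www.google.com"));
        for (Proxy proxy : proxies) {
            System.out.println(describe(proxy));
        }
    }
}
```

If this only prints a direct connection, the JVM has no proxy configured, and the host and port have to be supplied by hand (for example, taken from the browser's network settings).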

1

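There seems to be some trouble with your internet connection, or you are behind a proxy.

If that is the case, configure the proxy settings (host/port/username/password).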
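To check that a given set of proxy settings works at all, they can first be tried with the plain JDK classes, outside HtmlUnit. This is only a sketch: the host proxy.example.com, the port 8080, and the credentials are placeholders, and the names ProxySetup, httpProxy, and useCredentials are invented for this example. As far as I know, HtmlUnit's WebClient also has a constructor that takes a proxy host and port, but check the Javadoc of your version.

```java
import java.net.Authenticator;
import java.net.InetSocketAddress;
import java.net.PasswordAuthentication;
import java.net.Proxy;
import java.net.URL;
import java.net.URLConnection;

// Hypothetical helper: tests proxy host/port/username/password with the standard library.
class ProxySetup {

    // Builds an HTTP proxy description; the host is resolved later, at connect time.
    static Proxy httpProxy(String host, int port) {
        return new Proxy(Proxy.Type.HTTP, InetSocketAddress.createUnresolved(host, port));
    }

    // Registers credentials that are handed out only when a proxy asks for them.
    static void useCredentials(final String user, final String password) {
        Authenticator.setDefault(new Authenticator() {
            @Override
            protected PasswordAuthentication getPasswordAuthentication() {
                if (getRequestorType() == RequestorType.PROXY) {
                    return new PasswordAuthentication(user, password.toCharArray());
                }
                return null;
            }
        });
    }

    public static void main(String[] args) throws Exception {
        // Placeholder values: replace with your real proxy settings.
        Proxy proxy = httpProxy("proxy.example.com", 8080);
        useCredentials("user", "secret");

        URLConnection conn = new URL("http://www.google.com").openConnection(proxy);
        conn.setConnectTimeout(5000);
        // getContentType() returns null if the request could not be completed.
        System.out.println("Content type via proxy: " + conn.getContentType());
    }
}
```

If this prints a content type, the proxy settings are correct and the same host and port can then be passed on to HtmlUnit.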