2017-01-09 41 views
1

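I am using jsoup to scrape data from a website. I would like to know which exception is thrown when the website I am scraping is down.
Is it SocketException, NoHttpResponseException, or something else?
I have read that NoHttpResponseException is thrown when the server receives the request but does not respond. Is that correct? Which exception is thrown when the website is down?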

+0

I think it should be 'RequestTimeoutException', because the client cannot establish a connection within the given timeout.

+0

Point your program at a website that is down and see for yourself. Here is an example: http://cocacola.com:8989/

Answers

1

I tested this against our own website. After taking Tomcat down, I got the following java.net.SocketTimeoutException:

java.net.SocketTimeoutException: connect timed out 
    at java.net.DualStackPlainSocketImpl.waitForConnect(Native Method) 
    at java.net.DualStackPlainSocketImpl.socketConnect(DualStackPlainSocketImpl.java:85) 
    at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350) 
    at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206) 
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188) 
    at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:172) 
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) 
    at java.net.Socket.connect(Socket.java:589) 
    at sun.security.ssl.SSLSocketImpl.connect(SSLSocketImpl.java:668) 
    at sun.net.NetworkClient.doConnect(NetworkClient.java:175) 
    at sun.net.www.http.HttpClient.openServer(HttpClient.java:432) 
    at sun.net.www.http.HttpClient.openServer(HttpClient.java:527) 
    at sun.net.www.protocol.https.HttpsClient.<init>(HttpsClient.java:264) 
    at sun.net.www.protocol.https.HttpsClient.New(HttpsClient.java:367) 
    at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.getNewHttpClient(AbstractDelegateHttpsURLConnection.java:191) 
    at sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1138) 
    at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:1032) 
    at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect(AbstractDelegateHttpsURLConnection.java:177) 
    at sun.net.www.protocol.https.HttpsURLConnectionImpl.connect(HttpsURLConnectionImpl.java:153) 
    at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:563) 
    at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:540) 
    at org.jsoup.helper.HttpConnection.execute(HttpConnection.java:227) 
    at org.jsoup.helper.HttpConnection.get(HttpConnection.java:216) 
    at testing.Test.main(Test.java:19) 

This is the code I used:

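import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class Test {
    public static void main(String[] args) {
        try {
            // timeout(1000): give up after 1 second; TLS validation is disabled for this test only
            Document document = Jsoup.connect("https://example/folder").validateTLSCertificates(false).timeout(1000).get();
            System.out.println(document);
        } catch (Exception e) {
            // With the server down, this prints the SocketTimeoutException shown above
            e.printStackTrace();
        }
    }
}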

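NoHttpResponseException appears to be an Apache HttpClient exception (org.apache.commons.httpclient.NoHttpResponseException). jsoup has no Apache dependency (the stack trace above shows it going through the JDK's HttpURLConnection), so SocketTimeoutException is probably the answer.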
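
Since a "down" site can fail in more than one way, it may help to sort the resulting exceptions by type. The following is only a minimal sketch using the JDK's HttpURLConnection, not jsoup itself. The class name DownSiteDiagnosis, the methods classify and probe, and the category labels are hypothetical names made up for illustration. The default URL is the one suggested in the comment above.

```java
import java.io.IOException;
import java.net.ConnectException;
import java.net.HttpURLConnection;
import java.net.SocketTimeoutException;
import java.net.URL;
import java.net.UnknownHostException;

// Hypothetical helper, not part of jsoup or the original answer.
class DownSiteDiagnosis {

    /** Maps an I/O failure to a rough category label (labels are illustrative). */
    static String classify(IOException e) {
        if (e instanceof SocketTimeoutException) {
            return "TIMEOUT";          // no answer within the configured timeout
        }
        if (e instanceof ConnectException) {
            return "CONNECT_FAILED";   // e.g. host reachable but nothing listening on the port
        }
        if (e instanceof UnknownHostException) {
            return "UNKNOWN_HOST";     // the host name could not be resolved
        }
        return "OTHER_IO";             // any other I/O problem
    }

    /** Tries to fetch the address and returns either the HTTP status or a failure category. */
    static String probe(String address, int timeoutMillis) {
        try {
            HttpURLConnection conn = (HttpURLConnection) new URL(address).openConnection();
            conn.setConnectTimeout(timeoutMillis);
            conn.setReadTimeout(timeoutMillis);
            int status = conn.getResponseCode(); // triggers the actual connection
            conn.disconnect();
            return "HTTP_" + status;
        } catch (IOException e) {
            return classify(e);
        }
    }

    public static void main(String[] args) {
        // Default taken from the comment above; pass another URL as the first argument.
        String address = args.length > 0 ? args[0] : "http://cocacola.com:8989/";
        System.out.println(address + " -> " + probe(address, 1000));
    }
}
```

Which branch is hit depends on how the site is down: a host that silently drops packets tends to produce a timeout, while a reachable host with the server process stopped usually refuses the connection. With jsoup, the same java.net exceptions propagate out of get(), and a server that is up but returns an error status raises org.jsoup.HttpStatusException unless ignoreHttpErrors(true) is set.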