Unknown character when parsing text

I am reading a line of text from a website. Here is an example of what I read:

11:28;26.02.12;6.7°C;6.7°C;67;0.7m/s; 6:45;17:40; Warm ;84;0.9;0.0;;

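Once I have read the string, instead of 6.7°C I get 6.7C with an unknown character in place of the degree sign. It looks like the website is not UTF-8 encoded. How should I fix this so that I get ° instead of the unknown character? Is it possible to fix this while reading, or can I fix it while I am splitting the string?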

Here is the method I currently use to read from the website:

public static String getContentFromUrl(String url) throws ClientProtocolException, IOException { 

    HttpClient httpClient = new DefaultHttpClient(); 
    HttpGet httpGet = new HttpGet(url); 
    HttpResponse response; 

    response = httpClient.execute(httpGet); 
    HttpEntity entity = response.getEntity(); 

    if(entity != null) { 

     InputStream inStream = entity.getContent(); 

     String result = Weather.convertStreamToString(inStream); 
     inStream.close(); 

     return result; 
    } 

    return null; 

} 

private static String convertStreamToString(InputStream is) { 
    // No charset is passed here, so InputStreamReader decodes with the platform default charset.
    BufferedReader reader = new BufferedReader(new InputStreamReader(is)); 
    StringBuilder sb = new StringBuilder(); 

    String line = null; 

    try { 
     while ((line = reader.readLine()) != null) { 
      sb.append(line + "\n"); 
     } 
    } catch (IOException e) { 
     e.printStackTrace(); 
    } finally { 
     try { 
      is.close(); 
     } catch (IOException e) { 
      e.printStackTrace(); 
     } 
    } 
    return sb.toString(); 
}


2012-02-26 HyperX

What encoding does this server use? You could try:

sb.append((new String(line, "UTF-8")) + "\n");

或

sb.append((new String(line, "iso-8859-1")) + "\n");


2012-02-26 10:50:12 Knossos
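
On the question of which encoding the server uses: many servers state it in the Content-Type response header, for example "text/plain; charset=ISO-8859-1". The following is a minimal sketch of pulling the charset out of such a header value, assuming the header is present and well formed. The class name ContentTypeCharset and the helper charsetFromContentType are hypothetical and not part of the code in the question.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class ContentTypeCharset {

    // Hypothetical helper: returns the charset named in a Content-Type value,
    // or the given fallback if none is named or the name is not supported.
    static Charset charsetFromContentType(String contentType, Charset fallback) {
        if (contentType == null) {
            return fallback;
        }
        for (String part : contentType.split(";")) {
            String p = part.trim();
            // Case-insensitive check for a "charset=" parameter
            if (p.regionMatches(true, 0, "charset=", 0, 8)) {
                String name = p.substring(8).trim().replace("\"", "");
                try {
                    return Charset.forName(name);
                } catch (IllegalArgumentException e) {
                    // Illegal or unsupported charset name
                    return fallback;
                }
            }
        }
        return fallback;
    }

    public static void main(String[] args) {
        // prints ISO-8859-1
        System.out.println(charsetFromContentType(
                "text/plain; charset=ISO-8859-1", StandardCharsets.UTF_8));
        // prints UTF-8 (no charset parameter, so the fallback is used)
        System.out.println(charsetFromContentType("text/plain", StandardCharsets.UTF_8));
    }
}
```

If the header does not name a charset, the fallback is only a guess, so it is worth checking the result against a character you know should appear, such as the degree sign.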

Hmm, should I declare this new String in the constructor? – HyperX 2012-02-26 11:50:18

You just need to replace the line in your while loop with one of those. – Knossos 2012-02-26 13:34:17
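
One caveat about editing the while loop: by the time the loop runs, line is already a String, and java.lang.String has no constructor that takes (String, String), so the bytes have to be decoded with the right charset earlier, where the InputStreamReader is created. Below is a minimal sketch of that idea, assuming the site serves ISO-8859-1 (the thread does not confirm this). The class name CharsetDemo and the two-argument convertStreamToString are hypothetical variants of the method in the question.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class CharsetDemo {

    // Hypothetical variant of convertStreamToString that takes the charset explicitly
    // instead of relying on the platform default.
    static String convertStreamToString(InputStream is, Charset charset) {
        StringBuilder sb = new StringBuilder();
        // try-with-resources closes the reader and the underlying stream
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(is, charset))) {
            String line;
            while ((line = reader.readLine()) != null) {
                sb.append(line).append('\n');
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // "6.7°C" as ISO-8859-1 bytes: the degree sign is the single byte 0xB0
        byte[] raw = "6.7\u00B0C".getBytes(StandardCharsets.ISO_8859_1);

        // Decoding those bytes as UTF-8 turns 0xB0 into the replacement character U+FFFD
        String wrong = convertStreamToString(new ByteArrayInputStream(raw), StandardCharsets.UTF_8);
        // Decoding them as ISO-8859-1 keeps the degree sign
        String right = convertStreamToString(new ByteArrayInputStream(raw), StandardCharsets.ISO_8859_1);

        System.out.println(wrong.contains("\uFFFD"));      // prints true
        System.out.println(right.equals("6.7\u00B0C\n"));  // prints true
    }
}
```

In the question's code this would amount to passing a charset as the second argument to new InputStreamReader(is, ...); which charset is correct depends on what the server actually sends.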
