2011-12-29 109 views

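Unicode input in a console application in Java

I have been trying to retrieve "Unicode user input" in my Java application for a small utility snippet. It seems to work out of the box on Ubuntu, which I guess has an OS-wide encoding of UTF-8, but it does not work on Windows when run from "cmd". The code in question is as follows: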

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.PrintStream;
import java.nio.charset.Charset;
import java.util.Arrays;

public class SerTest {

    public static void main(String[] args) throws Exception {
        testUnicode();
    }

    public static void testUnicode() throws Exception {
        System.out.println("Default charset: " +
            Charset.defaultCharset().name());
        // Decode standard input explicitly as UTF-8.
        BufferedReader in =
            new BufferedReader(new InputStreamReader(System.in, "UTF-8"));
        System.out.printf("Enter 'абвгд эюя': ");
        String line = in.readLine();
        String s = "абвгд эюя";
        // getBytes() without arguments encodes with the default charset.
        byte[] sBytes = s.getBytes();
        System.out.println("strg bytes: " + Arrays.toString(sBytes));
        byte[] lineBytes = line.getBytes();
        System.out.println("line bytes: " + Arrays.toString(lineBytes));
        // Encode standard output explicitly as UTF-8, with auto-flush.
        PrintStream out = new PrintStream(System.out, true, "UTF-8");
        out.print("--->" + s + "<----\n");
        out.print("--->" + line + "<----\n");
    }

}

Output on Ubuntu (without any configuration changes):

[email protected]> javac SerTest.java && java SerTest 
Default charset: UTF-8 
Enter 'абвгд эюя': абвгд эюя 
strg bytes: [-48, -80, -48, -79, -48, -78, -48, -77, -48, -76, 32, -47, -115, -47, -114, -47, -113] 
line bytes: [-48, -80, -48, -79, -48, -78, -48, -77, -48, -76, 32, -47, -115, -47, -114, -47, -113] 
--->абвгд эюя<---- 
--->абвгд эюя<---- 

Output at the Windows CMD prompt (not affected in any way by JAVA_TOOL_OPTIONS):

E:\>chcp 65001 
Active code page: 65001 

E:\>java -Dfile.encoding=utf8 SerTest 
Picked up JAVA_TOOL_OPTIONS: -Dfile.encoding=utf8 
Default charset: UTF-8 
Enter 'абвгд эюя': юя': ': абвгд эюя 
strg bytes: [-48, -80, -48, -79, -48, -78, -48, -77, -48, -76, 32, -47, -115, -47, -114, -47, -113] 
Exception in thread "main" java.lang.NullPointerException 
     at SerTest.testUnicode(SerTest.java:26) # byte[] lineBytes = line.getBytes(); 
     at SerTest.main(SerTest.java:15) 
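
The NullPointerException in the trace above is consistent with `BufferedReader.readLine()` returning `null`, which it does when the stream reports end of input before any character has been read. Below is a minimal sketch, assuming a hypothetical helper class `SafeLineReader` that is not part of the original program. It decodes with an explicit charset and turns that `null` into an empty `Optional` instead of crashing. It does not by itself make the Windows console deliver the typed characters.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Optional;

// Hypothetical helper, for illustration only.
public class SafeLineReader {

    // Reads one line from `in`, decoding its bytes with `charset`.
    // Returns Optional.empty() when the stream ends before any line is read.
    public static Optional<String> readLine(InputStream in, Charset charset)
            throws IOException {
        BufferedReader reader =
            new BufferedReader(new InputStreamReader(in, charset));
        return Optional.ofNullable(reader.readLine());
    }

    public static void main(String[] args) throws IOException {
        // "абвгд эюя" written with Unicode escapes so the source file's
        // own encoding cannot affect the result.
        String text = "\u0430\u0431\u0432\u0433\u0434 \u044d\u044e\u044f";
        byte[] utf8 = (text + "\n").getBytes(StandardCharsets.UTF_8);

        Optional<String> line =
            readLine(new ByteArrayInputStream(utf8), StandardCharsets.UTF_8);
        System.out.println("round trip ok: " + text.equals(line.orElse(null)));

        // An empty stream yields no line instead of a null reference.
        Optional<String> none =
            readLine(new ByteArrayInputStream(new byte[0]), StandardCharsets.UTF_8);
        System.out.println("empty input present: " + none.isPresent());
    }
}
```

In the original program the same guard would mean checking `line` before calling `line.getBytes()`.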

Output in the Eclipse console (with JAVA_TOOL_OPTIONS set):

Default charset: UTF-8 
Enter 'абвгд эюя': абвгд эюя 
strg bytes: [-48, -80, -48, -79, -48, -78, -48, -77, -48, -76, 32, -47, -115, -47, -114, -47, -113] 
Picked up JAVA_TOOL_OPTIONS: -Dfile.encoding=utf8 
line bytes: [-48, -80, -48, -79, -48, -78, -48, -77, -48, -76, 32, -47, -115, -47, -114, -47, -113] 
--->абвгд эюя<---- 
--->абвгд эюя<---- 

It works in the Eclipse console because I have added a system-wide environment variable (JAVA_TOOL_OPTIONS), which I would like to avoid if possible.

Output in the Eclipse console (after removing JAVA_TOOL_OPTIONS):

Default charset: UTF-8 
Enter 'абвгд эюя': абвгд эюя 
strg bytes: [-48, -80, -48, -79, -48, -78, -48, -77, -48, -76, 32, -47, -115, -47, -114, -47, -113] 
line bytes: [-61, -112, -62, -80, -61, -112, -62, -79, -61, -112, -62, -78, -61, -112, -62, -77, -61, -112, -62, -76, 32, -61, -111, -17, -65, -67, -61, -111, -59, -67, -61, -111, -17, -65, -67] 
--->абвгд эюя<---- 
--->абвгд �ю�<---- 
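
The "line bytes" in this last run look like double encoding: they are what one gets when the UTF-8 bytes of the input are misread as windows-1252 and the resulting characters are encoded as UTF-8 again. The sketch below reproduces that transformation, assuming a hypothetical class `MojibakeDemo` and assuming windows-1252 is the charset involved. It does not show where in the Eclipse/JVM pipeline the misreading happens.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class, for illustration only.
public class MojibakeDemo {

    // Encodes `text` as UTF-8, misreads those bytes using `wrong`,
    // then encodes the misread string as UTF-8 again.
    public static byte[] doubleEncode(String text, Charset wrong) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        String misread = new String(utf8, wrong);
        return misread.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "абвгд эюя" written with Unicode escapes so the source file's
        // own encoding cannot affect the result.
        String s = "\u0430\u0431\u0432\u0433\u0434 \u044d\u044e\u044f";
        byte[] bytes = doubleEncode(s, Charset.forName("windows-1252"));
        System.out.println("line bytes: " + Arrays.toString(bytes));
    }
}
```

Bytes such as 0x8D and 0x8F have no windows-1252 mapping, so they decode to U+FFFD, which would explain why only some letters come back as replacement characters in the final output line.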

So my question is: what exactly is going on? What code changes are needed to make this snippet work for all kinds of "Unicode" input?

Sorry in advance for the long question, and thanks,
Sasuke
