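Unicode input in a console application in Java
I have been trying to retrieve "Unicode user input" in my Java application for a small utility snippet. The problem is that it seems to work "out of the box" on Ubuntu, which I guess is thanks to the OS-wide UTF-8 encoding, but it does not work on Windows when run from "cmd". The code under consideration is as follows: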
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.PrintStream;
import java.nio.charset.Charset;
import java.util.Arrays;

public class SerTest {
    public static void main(String[] args) throws Exception {
        testUnicode();
    }

    public static void testUnicode() throws Exception {
        System.out.println("Default charset: " +
                Charset.defaultCharset().name());
        // Decode standard input explicitly as UTF-8
        BufferedReader in =
                new BufferedReader(new InputStreamReader(System.in, "UTF-8"));
        System.out.printf("Enter 'абвгд эюя': ");
        String line = in.readLine();
        String s = "абвгд эюя";
        // getBytes() without an argument uses the platform default charset
        byte[] sBytes = s.getBytes();
        System.out.println("strg bytes: " + Arrays.toString(sBytes));
        byte[] lineBytes = line.getBytes();
        System.out.println("line bytes: " + Arrays.toString(lineBytes));
        // Encode standard output explicitly as UTF-8
        PrintStream out = new PrintStream(System.out, true, "UTF-8");
        out.print("--->" + s + "<----\n");
        out.print("--->" + line + "<----\n");
    }
}
Output on Ubuntu (without any configuration changes):
[email protected]> javac SerTest.java && java SerTest
Default charset: UTF-8
Enter 'абвгд эюя': абвгд эюя
strg bytes: [-48, -80, -48, -79, -48, -78, -48, -77, -48, -76, 32, -47, -115, -47, -114, -47, -113]
line bytes: [-48, -80, -48, -79, -48, -78, -48, -77, -48, -76, 32, -47, -115, -47, -114, -47, -113]
--->абвгд эюя<----
--->абвгд эюя<----
Output at the Windows CMD prompt (not affected in any way by JAVA_TOOL_OPTIONS):
E:\>chcp 65001
Active code page: 65001
E:\>java -Dfile.encoding=utf8 SerTest
Picked up JAVA_TOOL_OPTIONS: -Dfile.encoding=utf8
Default charset: UTF-8
Enter 'абвгд эюя': юя': ': абвгд эюя
strg bytes: [-48, -80, -48, -79, -48, -78, -48, -77, -48, -76, 32, -47, -115, -47, -114, -47, -113]
Exception in thread "main" java.lang.NullPointerException
at SerTest.testUnicode(SerTest.java:26) # byte[] lineBytes = line.getBytes();
at SerTest.main(SerTest.java:15)
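As far as I can tell, the NullPointerException in the trace above means that `readLine()` returned null, which `BufferedReader` does when it reaches end of input before reading any characters. Why the Windows console would report end of input here is not something this sketch explains. It only shows the null case being detected instead of dereferenced, with an empty `ByteArrayInputStream` standing in for `System.in` (the class and method names are hypothetical, made up for illustration):

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class NullLineCheck {
    // Hypothetical helper: reads one line decoded as UTF-8.
    // Returns null if the stream is already at end of input.
    static String readUtf8Line(InputStream in) {
        BufferedReader reader =
                new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
        try {
            return reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // An empty stream stands in for a System.in that reports end of input immediately.
        String line = readUtf8Line(new ByteArrayInputStream(new byte[0]));
        if (line == null) {
            System.out.println("No input could be read (readLine() returned null)");
        } else {
            byte[] lineBytes = line.getBytes(StandardCharsets.UTF_8);
            System.out.println("line bytes: " + Arrays.toString(lineBytes));
        }
    }
}
```

Guarding like this only avoids the crash; it does not make the Cyrillic input arrive.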
Output in the Eclipse console (with JAVA_TOOL_OPTIONS set):
Default charset: UTF-8
Enter 'абвгд эюя': абвгд эюя
strg bytes: [-48, -80, -48, -79, -48, -78, -48, -77, -48, -76, 32, -47, -115, -47, -114, -47, -113]
Picked up JAVA_TOOL_OPTIONS: -Dfile.encoding=utf8
line bytes: [-48, -80, -48, -79, -48, -78, -48, -77, -48, -76, 32, -47, -115, -47, -114, -47, -113]
--->абвгд эюя<----
--->абвгд эюя<----
It works in the Eclipse console because I have added a system-wide environment variable (JAVA_TOOL_OPTIONS), which I would like to avoid if possible.
Output in the Eclipse console (after removing JAVA_TOOL_OPTIONS):
Default charset: UTF-8
Enter 'абвгд эюя': абвгд эюя
strg bytes: [-48, -80, -48, -79, -48, -78, -48, -77, -48, -76, 32, -47, -115, -47, -114, -47, -113]
line bytes: [-61, -112, -62, -80, -61, -112, -62, -79, -61, -112, -62, -78, -61, -112, -62, -77, -61, -112, -62, -76, 32, -61, -111, -17, -65, -67, -61, -111, -59, -67, -61, -111, -17, -65, -67]
--->абвгд эюя<----
--->абвгд �ю�<----
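The garbled `line bytes` in this last run look like double encoding: the UTF-8 bytes of the input seem to have been decoded once as windows-1252 and the result encoded as UTF-8 again. That windows-1252 is the charset involved is my assumption, not something the output states. A minimal sketch to check the guess (the class and method names are hypothetical, and the string is written with Unicode escapes so the source file encoding does not matter):

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class MojibakeCheck {
    // Encode as UTF-8, misread those bytes as windows-1252 (assumed charset),
    // then encode the misread string as UTF-8 again.
    static byte[] doubleEncode(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        String misread = new String(utf8, Charset.forName("windows-1252"));
        return misread.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Same text as in the question: "абвгд эюя"
        String s = "\u0430\u0431\u0432\u0433\u0434 \u044D\u044E\u044F";
        System.out.println("strg bytes: "
                + Arrays.toString(s.getBytes(StandardCharsets.UTF_8)));
        System.out.println("line bytes: "
                + Arrays.toString(doubleEncode(s)));
    }
}
```

If the guess is right, the second line printed by this sketch should match the `line bytes` shown above, including the `-17, -65, -67` triples (U+FFFD) where a byte has no windows-1252 mapping.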
So my question is: what exactly is going on here? What code changes are needed to make sure this snippet works for all kinds of "Unicode" input?
Sorry in advance for the long question, and thanks,
Sasuke