How do I convert hex-encoded UTF-8 to its code point?

I have the string e2 80 99, which is the hexadecimal representation of a UTF-8 encoded character. It represents:

U+2019 ’ e2 80 99 RIGHT SINGLE QUOTATION MARK

I want to convert e2 80 99 to the corresponding Unicode code point, which is U+2019, or even to a plain ' (single quote).

How can I do this?

2015-11-03 shashank

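Basically, you need to decode the UTF-8 bytes into a String and then take the first character of the result (or the first and second characters, if the character is represented in UTF-16 as a surrogate pair). Here is a proof of concept: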

import java.nio.charset.StandardCharsets;

public class Utf8HexToCodePoint {
    public static void main(String[] args) {
        // Parse the space-separated hex values into raw bytes
        String utf8char = "e2 80 99";
        String[] strNumbers = utf8char.split(" ");
        byte[] rawChars = new byte[strNumbers.length];
        int index = 0;
        for (String strNumber : strNumbers) {
            rawChars[index++] = (byte) Integer.parseInt(strNumber, 16);
        }

        // Decode the bytes as UTF-8 into a String
        String utf16Char = new String(rawChars, StandardCharsets.UTF_8);

        // Java Strings are stored as UTF-16, so a code point above U+FFFF
        // occupies two chars (a surrogate pair)
        int codePoint = utf16Char.charAt(0);
        if (Character.isHighSurrogate(utf16Char.charAt(0))) {
            codePoint = Character.toCodePoint(utf16Char.charAt(0), utf16Char.charAt(1));
        }
        System.out.println("code point: " + Integer.toHexString(codePoint));
    }
}
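
The same idea can be sketched more compactly by letting `String.codePoints()` (Java 8 and later) combine surrogate pairs, which removes the manual surrogate check. This is only a minimal sketch: the class name `HexUtf8` and the method `toCodePoints` are made up for illustration, and the input is assumed to be non-empty, whitespace-separated, valid hex byte values.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class; the names are illustrative, not from the answer above.
class HexUtf8 {

    // Decodes whitespace-separated hex bytes (e.g. "e2 80 99") as UTF-8 and
    // returns every code point in the decoded text.
    static int[] toCodePoints(String hexBytes) {
        String[] parts = hexBytes.trim().split("\\s+");
        byte[] raw = new byte[parts.length];
        for (int i = 0; i < parts.length; i++) {
            // parseInt yields 0..255; the cast narrows it to a signed byte
            raw[i] = (byte) Integer.parseInt(parts[i], 16);
        }
        String decoded = new String(raw, StandardCharsets.UTF_8);
        // codePoints() merges each surrogate pair into a single code point
        return decoded.codePoints().toArray();
    }

    public static void main(String[] args) {
        for (int cp : toCodePoints("e2 80 99")) {
            // Print the code point in U+XXXX form, followed by the character itself
            System.out.printf("U+%04X %s%n", cp, new String(Character.toChars(cp)));
        }
    }
}
```

Note that `new String(bytes, charset)` does not throw on malformed UTF-8. It substitutes the replacement character U+FFFD instead, so a result of 0xFFFD may indicate bad input rather than a real character.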

2015-11-03 04:57:51 morgano

Thanks, I'll try it. – shashank
