The ™ character is not translated correctly by GetStringChars()

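I noticed that the trademark character ™ does not seem to be converted correctly by JNI's GetStringChars() function in Java 8, even though it is supposed to be the function that handles Unicode characters. I have the same problem with GetStringUTFChars().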

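It's not a big problem, since there is an easy workaround (removing the Unicode characters from the string before calling the JNI function).

However, since I couldn't find a similar question when searching Google, I'm asking here whether anyone has an explanation for this (or am I perhaps missing something in my code?).

I'm using Java 8 and g++ 4.8 under MinGW.

Here is my code snippet: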

JNIEXPORT void JNICALL Java_MyClass_JNI_myMethod (JNIEnv * env , jobject obj, jstring input_string) 
{ 
    const jchar *inp_string = (*env).GetStringChars(input_string, NULL); 
    const jchar *jch_inp_string = inp_string;   
    (*env).ReleaseStringChars(input_string, inp_string);  

    std::cout << jch_inp_string <<'\n'; 
}

As an example, if I pass this string to the function:

Random String™

it outputs this:

Random Stringâ„¢
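
A note on the output above: it is consistent with the three UTF-8 bytes of ™ (U+2122, encoded as E2 84 A2) being displayed by a console that assumes Windows-1252, where those bytes read as â, „ and ¢. The sketch below reproduces that effect without JNI. It assumes this is what happens in the setup described here; cp1252_view and append_utf8 are hypothetical helper names, and only the single 0x80-0x9F byte needed for this demo is mapped.

```cpp
#include <cstdint>
#include <string>

// Append one code point from the Basic Multilingual Plane as UTF-8.
static void append_utf8(std::string& out, std::uint32_t cp) {
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Hypothetical helper: read raw bytes the way a Windows-1252 console would
// and return the characters it would show, encoded as UTF-8.
// Simplification: in the 0x80-0x9F range only 0x84 is mapped; a complete
// version would need the full Windows-1252 table for that range.
std::string cp1252_view(const std::string& bytes) {
    std::string out;
    for (unsigned char b : bytes) {
        std::uint32_t cp = b;        // 0x00-0x7F and 0xA0-0xFF match Latin-1
        if (b == 0x84) cp = 0x201E;  // double low-9 quotation mark
        append_utf8(out, cp);
    }
    return out;
}
```

Passing "Random String" followed by the bytes E2 84 A2 to cp1252_view yields "Random String" followed by â, „ and ¢, which matches the garbled output. If this is the cause, the bytes leaving the JNI code are already valid UTF-8 and only the console's interpretation differs.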

2017-04-02 j.doe

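Do you have the same problem if you use 'std::wcout' instead of 'std::cout'? – Michael

std::wcout is unknown to my compiler; when I compile it I get this error: error: 'wout' is not a member of 'std' –

_"error: 'wout' is not a member of 'std'"_ If you really wrote 'wout' instead of 'wcout', then that error is to be expected. – Michael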

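I found a workaround after digging deeper into the docs. The cause is that Java only supports modified UTF-8, which means it is good enough to print XML documents but cannot print Latin-1 encoded characters without errors.

To get around this, I call back into Java from C++ and let it convert its modified UTF-8 characters into an encoding that fits my needs. I don't know whether there is a simpler way to do this, though, and I find it odd that the strings JNI itself hands out don't exactly match a very common standard.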

JNIEXPORT void JNICALL Java_MyClass_JNI_myMethod (JNIEnv * env , jobject obj, jstring input_string){ 

    //this calls back into Java to reformat the string from Java's modified UTF-8 encoding to something more common 
    const jclass stringClass = env->GetObjectClass(input_string); 
    const jmethodID getBytes = env->GetMethodID(stringClass, "getBytes", "(Ljava/lang/String;)[B"); 
    const jstring charsetName = env->NewStringUTF("windows-1252"); 
    const jbyteArray stringJbytes = (jbyteArray) env->CallObjectMethod(input_string, getBytes, charsetName); 
    env->DeleteLocalRef(charsetName); 
    const jsize length = env->GetArrayLength(stringJbytes); 
    const jbyte* strBytes = env->GetByteArrayElements(stringJbytes, NULL); 


    //this makes sure our string is C/C++ compliant with a terminating null character 
    //but it seems to work well without it too 
    char* my_string = (char*) malloc(length+1); 
    memcpy(my_string , strBytes, length); 
    my_string [length] = '\0'; 

    env->ReleaseByteArrayElements(stringJbytes, strBytes , JNI_ABORT); 
    env->DeleteLocalRef(stringJbytes); 

    std::cout << my_string << std::endl; 

}

2017-04-02 14:55:55

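This code is completely unnecessary. Not only is Windows-1252 unsuitable for handling most Unicode characters (you should use '"utf-8"' when calling 'String.getBytes()'), but this code also leaks the allocated memory because it never calls 'free(my_string)' (and you shouldn't even be using 'malloc()' in C++; use 'new[]' or, better, 'std::string'). If you feel the need to call 'String.getBytes()', you don't need to allocate a second copy of the bytes; you can pass the original bytes directly to 'std::cout', for example: 'std::cout.write((char*)strBytes, length);' –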
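
As a possible alternative to calling back into Java at all, the UTF-16 code units that GetStringChars() returns can be encoded to standard UTF-8 directly in C++ and kept in a std::string, which also avoids the manual malloc()/free() bookkeeping. The following is only a sketch: utf16_to_utf8 is a hypothetical helper, not part of JNI, and it assumes that replacing unpaired surrogates with U+FFFD is acceptable.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: encode UTF-16 code units as standard UTF-8.
// Valid surrogate pairs are combined; unpaired surrogates become U+FFFD.
std::string utf16_to_utf8(const char16_t* units, std::size_t count) {
    std::string out;
    for (std::size_t i = 0; i < count; ++i) {
        std::uint32_t cp = units[i];
        if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < count &&
            units[i + 1] >= 0xDC00 && units[i + 1] <= 0xDFFF) {
            // High surrogate followed by low surrogate: one code point.
            cp = 0x10000 + ((cp - 0xD800) << 10) + (units[i + 1] - 0xDC00);
            ++i;
        } else if (cp >= 0xD800 && cp <= 0xDFFF) {
            cp = 0xFFFD;  // unpaired surrogate
        }
        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            // ™ (U+2122) takes this branch and becomes E2 84 A2.
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

Inside a native method, the inputs would come from GetStringChars() and GetStringLength(), with a cast because jchar is a 16-bit unsigned type, and ReleaseStringChars() would be called only after the conversion has finished. Whether the resulting UTF-8 bytes display correctly still depends on the console; on Windows the output code page may need to be switched to UTF-8 first.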
