How do I read Little-endian UTF-16 Unicode text in Perl?

The file command tells me:

tmp.txt: Little-endian UTF-16 Unicode text, with CRLF line terminators

cat, head, and similar tools cannot display this file correctly.

But vim displays it correctly. vim tells me:

[~/tmp/tmp.txt] [utf-8,dos] 
"tmp.txt" [converted][dos]

And :set in vim reports fileencoding=ucs-2le

So in Perl:

open FH,'<:encoding(ucs-2le)',$file; 
while(<FH>){ 
    chomp; 
    # A start 
    print; 
    # Perl: Wide character in print at a.pl line 12, <FH> line 1 
    # And display incorrect 
    # A end 

    # B start 
    binmode STDOUT,":utf8"; 
    print; 
    # display incorrect too 
    # B end 

}

How can I read this file correctly in Perl?

Source

2012-03-15 everbox

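For what it's worth, your code works perfectly for me on a small little-endian UTF-16 file that I just created. (I had to strip the BOM manually, by writing 's/^\x{FEFF}//', to prevent the "Wide character in print" warning, because UCS-2 does not use a BOM.) – ruakh 2012-03-15 02:05:44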
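A minimal sketch of the approach described in the comment above, assuming the file really is UTF-16LE with a leading BOM and CRLF line endings (as the file output in the question suggests). The helper name read_utf16le_lines is hypothetical. The explicit little-endian encoding is not expected to remove the BOM on its own, so the sketch strips it from the first line, and it also removes the trailing "\r" that a plain chomp would leave behind on Unix.

```perl
use strict;
use warnings;

# Hypothetical helper: read a UTF-16LE file and return its lines as decoded
# Perl strings, without the BOM and without CRLF line terminators.
sub read_utf16le_lines {
    my ($path) = @_;

    # :raw first, so no platform CRLF layer touches the undecoded bytes.
    open my $fh, '<:raw:encoding(UTF-16LE)', $path
        or die "Cannot open $path: $!";

    my @lines;
    while (my $line = <$fh>) {
        $line =~ s/\r?\n\z//;                  # strip CRLF (or a bare LF)
        $line =~ s/^\x{FEFF}// if $. == 1;     # strip the BOM on the first line only
        push @lines, $line;
    }
    close $fh;
    return @lines;
}

# When run with a file name, print the lines as UTF-8.
if (@ARGV) {
    # Set the output encoding once, before the first print.
    binmode STDOUT, ':encoding(UTF-8)';
    print "$_\n" for read_utf16le_lines($ARGV[0]);
}
```

Saved as, say, a.pl, this could be run as 'perl a.pl tmp.txt'. Setting the layer on STDOUT before the first print is what avoids the "Wide character in print" warning; in the question's code, block A prints before block B has called binmode.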

Are you sure your terminal is expecting UTF-8? – cjm 2012-03-15 02:34:20

'locale' reports 'en_US.UTF-8', ':set' in vim shows 'termencoding=utf-8', and SecureCRT is set to UTF-8 as well – everbox 2012-03-15 03:08:34
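One way to separate a decoding problem from a terminal problem, offered only as a diagnostic sketch: print the decoded text as ASCII-only code point labels, so the result does not depend on what the terminal expects. The helper name codepoints is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: render a decoded string as ASCII-only "U+XXXX" tokens,
# one per character, separated by spaces.
sub codepoints {
    my ($text) = @_;
    return join ' ', map { sprintf('U+%04X', ord($_)) } split //, $text;
}

# When run with a file name, dump every line of the file as code points.
if (@ARGV) {
    open my $fh, '<:raw:encoding(UTF-16LE)', $ARGV[0]
        or die "Cannot open $ARGV[0]: $!";
    while (my $line = <$fh>) {
        print codepoints($line), "\n";
    }
    close $fh;
}
```

In the dump, a leading U+FEFF would indicate that the BOM is still in the data, and U+000D before U+000A would indicate a carriage return that chomp does not remove on Unix. If the code points match the expected text, the remaining suspect is the output side (the STDOUT layer or the terminal settings) rather than the read.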