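How do I get unicode month names in Python?

I want to get a unicode version of calendar.month_abbr[6]. If I don't specify an encoding for the locale, I don't know how to convert the string to unicode. The example code below shows my problem: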

>>> import locale 
>>> import calendar 
>>> locale.setlocale(locale.LC_ALL, ("ru_RU")) 
'ru_RU' 
>>> print repr(calendar.month_abbr[6]) 
'\xb8\xee\xdd' 
>>> print repr(calendar.month_abbr[6].decode("utf8")) 
Traceback (most recent call last): 
    File "<stdin>", line 1, in <module> 
    File "/usr/lib/python2.5/encodings/utf_8.py", line 16, in decode 
    return codecs.utf_8_decode(input, errors, True) 
UnicodeDecodeError: 'utf8' codec can't decode byte 0xb8 in position 0: unexpected code byte 
>>> locale.setlocale(locale.LC_ALL, ("ru_RU", "utf8")) 
'ru_RU.UTF8' 
>>> print repr(calendar.month_abbr[6]) 
'\xd0\x98\xd1\x8e\xd0\xbd' 
>>> print repr(calendar.month_abbr[6].decode("utf8")) 
u'\u0418\u044e\u043d'

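Any ideas on how to solve this? The solution doesn't have to look like the code above. Any solution that gives me the abbreviated month name in unicode is fine.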

Source

2009-11-30 Rickard Lindberg

Change the last line of your code to:

>>> print calendar.month_abbr[6].decode("utf8") 
Июн

The misplaced repr() hides the fact that you already have what you need.

getlocale() can also be used to get the encoding of the current locale:

>>> locale.setlocale(locale.LC_ALL, 'en_US') 
'en_US' 
>>> locale.getlocale() 
('en_US', 'ISO8859-1')
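
As a rough sketch of how the encoding reported by getlocale() could be used for decoding: the helper names `decode_with_locale` and `month_abbr_unicode` below are hypothetical and not part of any library. The sketch assumes the month names come back as byte strings in the LC_TIME locale's encoding, as in the Python 2 session from the question. On Python 3, calendar.month_abbr already yields text, so the helper returns it unchanged.

```python
import calendar
import locale


def decode_with_locale(raw, encoding=None):
    """Decode a locale-encoded byte string to unicode (hypothetical helper)."""
    if not isinstance(raw, bytes):
        # Already a text string (the Python 3 case): nothing to decode.
        return raw
    if encoding is None:
        # Assumption: the name is encoded in the LC_TIME locale's encoding.
        # Fall back to the preferred encoding if the locale reports none.
        encoding = locale.getlocale(locale.LC_TIME)[1] or locale.getpreferredencoding()
    return raw.decode(encoding)


def month_abbr_unicode(month):
    """Return the abbreviated month name for the current locale as unicode."""
    return decode_with_locale(calendar.month_abbr[month])
```

With the byte strings from the question, `decode_with_locale(b'\xb8\xee\xdd', 'ISO8859-5')` and `decode_with_locale(b'\xd0\x98\xd1\x8e\xd0\xbd', 'utf8')` both give `u'\u0418\u044e\u043d'`.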

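Other modules that might be useful to you:

PyICU - a better way to handle internationalization. locale generates either the initial (nominative) or the inflected form of a month name depending on the locale database of the operating system (so you cannot rely on it for languages like Russian!) and uses some byte encoding. PyICU has different format specifiers for the initial and the inflected form (so you can pick the one that fits your case) and uses unicode.
pytils - a set of tools for working with the Russian language, including dates. It has hard-coded month names as a workaround for the limitations of locale.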

Source

2009-11-30 17:48:29

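If the unicode conversion had succeeded, I should still be able to call repr() on the result, so that should not be the problem. Thanks for the links. I will check them out. – 2009-11-30 19:06:51

'locale.getlocale()' works. Thanks. – 2009-12-01 18:57:55

What you need is: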

… 
myencoding = locale.getpreferredencoding() 
print repr(calendar.month_abbr[6].decode(myencoding)) 
…

Source

2009-11-30 21:03:47 tzot

On my machine 'locale.getpreferredencoding()' returns utf8, so I still have the same problem. – 2009-12-01 09:12:00

It does not seem that 'locale.getpreferredencoding()' returns the encoding in which the 'month_abbr' names are encoded. – 2009-12-01 09:15:55
