Python: encoding ISO to UTF-8

I am trying to read my mail with a Python script (Python 2.5 and PyPy). Some of my results are not in ASCII, and the strings I get look like this:

=?ISO-8859-7?B?0OXm7/Dv8d/hIPP07+0gyuno4enx/u3h?=

Is there any way to decode this and convert it to UTF-8 so that I can process it? I tried .decode('ISO-8859-7'), but I got the same string back.

2010-04-27 PanosJee

import email.header as eh 

unicode_data= u''.join(
    str_data.decode(codec or 'ascii') 
    for str_data, codec 
    in eh.decode_header('=?ISO-8859-7?B?0OXm7/Dv8d/hIPP07+0gyuno4enx/u3h?=')) 
# unicode_data now is u'Πεζοπορία στον Κιθαιρώνα'

You should use unicode_data from here on. However, if you (think you) need a UTF-8 encoded string, you can do:

utf8data= unicode_data.encode('utf-8')

Update: I changed the .decode call to cope with cases where codec is None (e.g. eh.decode_header('plain text')).
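For anyone reading this on Python 3, here is a rough equivalent of the snippet above. It is only a sketch: the helper name decode_mime_header is made up for illustration, and it assumes the charset named in the header is one Python knows. On Python 3, decode_header returns bytes for encoded parts and may return a plain str (with codec None) for unencoded headers, so both cases are handled.

```python
from email.header import decode_header


def decode_mime_header(raw_header):
    """Decode a MIME-encoded header into a str (hypothetical helper)."""
    parts = []
    for data, charset in decode_header(raw_header):
        if isinstance(data, bytes):
            # Encoded parts come back as bytes; fall back to ASCII
            # when no charset is given.
            parts.append(data.decode(charset or 'ascii'))
        else:
            # Headers without encoded words come back as str already.
            parts.append(data)
    return ''.join(parts)


subject = decode_mime_header(
    '=?ISO-8859-7?B?0OXm7/Dv8d/hIPP07+0gyuno4enx/u3h?=')
# Only encode when bytes are actually needed (e.g. writing to a file).
utf8_bytes = subject.encode('utf-8')
```

As in the original answer, work with the decoded text and only encode to UTF-8 at the point where bytes are required.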


2010-05-27 03:23:37 tzot

@Tzotziou: +1, but please don't use 'unicodedata' as a variable name; it's a module. – 2010-05-27 03:50:45

@John: You are right. Thanks for your comment. – tzot 2010-05-27 10:20:50

Please read up on MIME encoding and Base64 encoding. The base64 module will be useful.
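To illustrate what that means in practice, here is a minimal sketch (written for Python 3) that takes the string from the question apart by hand. It assumes a single encoded word of the form =?charset?encoding?payload?= with Base64 ("B") encoding; real headers can hold several words or use "Q" encoding, which this does not handle. It also shows why calling .decode('ISO-8859-7') directly changes nothing: the string itself is plain ASCII, and the Greek text only appears after the Base64 payload has been decoded.

```python
import base64

encoded_word = '=?ISO-8859-7?B?0OXm7/Dv8d/hIPP07+0gyuno4enx/u3h?='

# Split on '?' to get: '=', charset, encoding, payload, '='.
_, charset, encoding, payload, _ = encoded_word.split('?')

if encoding.upper() != 'B':
    raise ValueError('this sketch only handles Base64 ("B") payloads')

# Step 1: Base64 text -> raw bytes in the declared charset.
raw_bytes = base64.b64decode(payload)
# Step 2: raw bytes -> text, using the charset named in the header.
text = raw_bytes.decode(charset)
# Step 3 (optional): text -> UTF-8 bytes.
utf8_bytes = text.encode('utf-8')
```

In real code email.header.decode_header does this parsing for you, so the manual version is mainly useful for understanding the format.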


2010-04-27 17:02:40
