How do I convert special characters to HTML entities?

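I want to convert special characters in Python, such as "%$!&@á é ©", and not just '<&">', which is all that the documentation and references I have found cover. cgi.escape does not solve the problem. How can I convert special characters to HTML entities?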

For example, the string "á ê ĩ &" should be converted to "&aacute; &ecirc; &itilde; &amp;".

Does anybody know how to solve this? I am using Python 2.6.

2012-03-08 Jayme Tosi Neto

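Note two things: (1) named entities can cause problems, so you should use numeric entities instead. (2) Why use entities at all? In most cases a better solution is to encode the document as UTF-8 so that it can contain the letters directly rather than as entities. – 2012-03-08 11:30:50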
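
As a rough illustration of the numeric-entity suggestion, the sketch below relies on the standard 'xmlcharrefreplace' error handler of str.encode. It assumes Python 3 (the same encode call also works on unicode objects in Python 2), and the helper name to_numeric_entities is made up for this example. It only touches non-ASCII characters, so "&", "<" and ">" would still need separate escaping.

```python
def to_numeric_entities(text):
    """Hypothetical helper: replace non-ASCII characters with decimal references."""
    # 'xmlcharrefreplace' turns every character that ASCII cannot encode
    # into a reference of the form &#NNN;. ASCII characters pass through.
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


print(to_numeric_entities("á ê ĩ ©"))  # prints: &#225; &#234; &#297; &#169;
```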

http://wiki.python.org/moin/EscapingHtml – Quentin 2012-03-08 11:32:05

I agree with you, @KonradRudolph. I don't like using entities either, but the system I am working with uses them, so I have no choice. =/ – 2012-03-08 11:35:12

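You can write your own loop using the dictionaries documented at http://docs.python.org/library/htmllib.html#module-htmlentitydefs

The one you are looking for is htmlentitydefs.codepoint2name, which maps Unicode code points to entity names.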
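
A minimal sketch of such a loop follows. It assumes Python 3, where the same table is available as html.entities.codepoint2name (in Python 2.6 the module is htmlentitydefs). The function name escape_named and the numeric fallback for characters that have no entry in the table are choices made for this example, not something stated in the answer.

```python
from html.entities import codepoint2name


def escape_named(text):
    """Hypothetical helper: use a named entity where the table has one."""
    parts = []
    for ch in text:
        code = ord(ch)
        if code in codepoint2name:
            # The table has a name for this code point, e.g. 225 -> aacute.
            parts.append("&%s;" % codepoint2name[code])
        elif code > 127:
            # Non-ASCII character without a name in the table: numeric fallback.
            parts.append("&#%d;" % code)
        else:
            # Plain ASCII with no entity is copied unchanged.
            parts.append(ch)
    return "".join(parts)


print(escape_named("á ê ĩ &"))  # prints: &aacute; &ecirc; &#297; &amp;
```

Because each input character is visited exactly once, the "&" that starts a generated entity is never escaped a second time. Note that "ĩ" has no entry in codepoint2name, which is why it falls back to a numeric reference here.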

2012-03-08 11:30:15

That's a good idea! ;D – 2012-03-08 11:35:47

The link no longer works. Use HTMLParser in Python 2 and the equivalent html.parser in Python 3. – oxidworks 2017-02-21 22:39:13

I found a ready-made solution by searching for the htmlentitydefs.codepoint2name that @Ruben Vermeersch mentioned in his answer. The solution comes from here: http://bytes.com/topic/python/answers/594350-convert-unicode-chars-html-entities

Here is the function:

from htmlentitydefs import codepoint2name  # Python 2 module

def htmlescape(text):
    # Decode UTF-8 byte strings; unicode objects are used as they are.
    if isinstance(text, str):
        text = text.decode('utf-8')
    # Escape "&" first, so the ampersands of the entities added below
    # are not escaped a second time.
    text = text.replace(u"&", u"&amp;")
    for code, name in codepoint2name.iteritems():
        if code != 38:  # 38 is "&", already handled above
            text = text.replace(unichr(code), u"&%s;" % name)
    return text
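
For readers on Python 3, the function above might be ported roughly as follows. This is a sketch under assumptions: it uses html.entities in place of htmlentitydefs, swaps the repeated replace calls for a str.translate table, and the names ENTITY_TABLE and htmlescape_py3 are made up for this example. As with the original, characters that have no name in codepoint2name are left unchanged.

```python
import html
from html.entities import codepoint2name

# Build the mapping once: code point -> "&name;". str.translate accepts a
# dict keyed by code point and rewrites the whole text in a single pass,
# so "&" needs no special ordering.
ENTITY_TABLE = {code: "&%s;" % name for code, name in codepoint2name.items()}


def htmlescape_py3(text):
    """Hypothetical Python 3 counterpart of htmlescape."""
    return text.translate(ENTITY_TABLE)


escaped = htmlescape_py3("%$!&@á é ©")
print(escaped)  # prints: %$!&amp;@&aacute; &eacute; &copy;

# Round trip: html.unescape should give back the original string.
assert html.unescape(escaped) == "%$!&@á é ©"
```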

Thanks for the help, everyone! ;)

2012-03-08 11:46:05
