Replacing words in an HTML file with Python

I am trying to read an HTML file and replace some of its words with the corresponding words from an Excel sheet. Here is my code.

import urllib 
import xlrd 


workbook = xlrd.open_workbook('polish.xlsx', encoding_override="cp1252") 
worksheet = workbook.sheet_by_index(0) 
page = urllib.urlopen("source.html").read() 

for x in range(0,96): 

    if not type(worksheet.cell(x, 2).value) is float: 
        print worksheet.cell(x, 2).value.encode("utf-8") 
        print worksheet.cell(x, 3).value.encode("utf-8") 

        page.replace(worksheet.cell(x, 2).value.encode("utf-8"), worksheet.cell(x, 3).value.encode("utf-8")) 
print page

However, the replace call does not work: the page variable shows no changes. How do I replace text in an HTML file?

Source

2015-09-04 Sooraj Chandran

Please show the full traceback. Where does the error occur? –

Traceback (most recent call last): File "langscript.py", line 16, in <module> page.replace(worksheet.cell(x, 2).value, worksheet.cell(x, 3).value) TypeError: expected a character buffer object –

Are you running this in a cmd.exe shell? If so, type 'cp 1252' at the prompt and press Enter, then rerun your code. – Hannu

Convert your variables to strings before passing them to the replace method.

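for x in range(0,90): 
    # Convert both cell values to plain byte strings, dropping non-ASCII characters
    first_var = worksheet.cell(x, 2).value.encode('ascii', 'ignore') 
    second_var = worksheet.cell(x, 3).value.encode('ascii', 'ignore') 
    # replace() returns a new string, so assign the result back to page
    page = page.encode('ascii', 'ignore').replace(first_var, second_var)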

This should work. Hope this helps.
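A likely reason the page variable in the question never changes is that Python strings are immutable: replace() returns a new string and leaves the original untouched, so the result has to be assigned back. Below is a minimal sketch of that point, written for Python 3 (the thread itself uses Python 2). The sample page, the replacements list, and the apply_replacements helper are hypothetical stand-ins for source.html and the spreadsheet columns, not part of the original code.

```python
# Hypothetical stand-ins for the HTML page and the two spreadsheet columns.
page = "<p>hello world</p>"
replacements = [("hello", "witaj"), ("world", "swiat")]


def apply_replacements(text, pairs):
    """Return a copy of text with every (old, new) pair applied in order."""
    for old, new in pairs:
        # str.replace returns a new string; rebinding the name keeps the result.
        text = text.replace(old, new)
    return text


# Calling replace without using its return value changes nothing.
unchanged = page
unchanged.replace("hello", "witaj")

# Assigning the result back is what makes the replacement visible.
updated = apply_replacements(page, replacements)
print(unchanged)
print(updated)
```

The same rule applies inside the loop over worksheet rows: each iteration needs page = page.replace(...) rather than a bare page.replace(...).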

Source

2015-09-04 09:28:16 navneet35371

I have already tried that. My content is not limited to ASCII values. UnicodeEncodeError: 'ascii' codec can't encode character u'\u2013' in position 29: ordinal not in range(128) –

Then, if possible, escape those values with value.encode('ascii', 'ignore') and try again. – navneet35371

Can you update your answer? –
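For content that is not pure ASCII, one option that avoids dropping characters is to decode the page to text once, do every replacement on text, and encode again only when writing the result out. The following is a rough sketch under stated assumptions: it targets Python 3, it assumes the page is UTF-8 encoded (the thread does not say which encoding source.html uses), and raw_page, rows, and translate are hypothetical names standing in for the downloaded bytes and the worksheet cells.

```python
# Hypothetical data: bytes as read from the HTML file, and (old, new) cell pairs.
raw_page = "<p>Price \u2013 total</p>".encode("utf-8")
rows = [
    ("Price", "Cena"),
    (12.0, 13.0),            # numeric cells arrive as floats and are skipped
    ("total", "\u0142\u0105cznie"),  # replacement text containing non-ASCII letters
]


def translate(raw, pairs, encoding="utf-8"):
    """Decode raw bytes, apply text replacements, and return encoded bytes."""
    text = raw.decode(encoding)  # assumption: the page really uses this encoding
    for old, new in pairs:
        # Only text cells can be used as search and replacement strings.
        if not isinstance(old, str) or not isinstance(new, str):
            continue
        text = text.replace(old, new)
    return text.encode(encoding)


result = translate(raw_page, rows)
print(result.decode("utf-8"))
```

Because both the page and the cell values are text while replacing, the en dash (u'\u2013') from the error message above passes through unchanged instead of triggering an ASCII encoding error. In Python 2 the equivalent would be to work with unicode objects rather than byte strings.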
