Ignoring Unicode Error

When I run a loop over a bunch of URLs to find all the links (in a specific div) on those pages, I get back this error:

Traceback (most recent call last):
  File "file_location", line 38, in <module>
    out.writerow(tag['href'])
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2026' in position 0: ordinal not in range(128)

The code I have written that relates to this error is:

out = csv.writer(open("file_location", "ab"), delimiter=";") 
for tag in soup_3.findAll('a', href=True): 
    out.writerow(tag['href'])

Is there a way around this, possibly an if statement that skips any URL that causes a Unicode error?

Thanks in advance for your help.

2011-09-28 Mark Collier

You can wrap the writerow method call in a try block and catch the exception to ignore it:

for tag in soup_3.findAll('a', href=True):
    try:
        out.writerow(tag['href'])
    except UnicodeEncodeError:
        pass

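But you almost certainly want to pick an encoding other than ASCII for your CSV file (UTF-8, unless you have a very good reason to use something else), and open it with codecs.open() instead of the built-in open.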
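As a rough illustration of the UTF-8 suggestion, here is a minimal sketch written for Python 3, where the built-in open accepts an encoding keyword. This is an assumption on my part: the traceback's u'\u2026' suggests the thread's code is Python 2, whose csv module does not accept Unicode input directly, so there the usual approach is to encode each value to UTF-8 before writing it. The function name write_links and its parameters are hypothetical, not from the thread.

```python
import csv


def write_links(path, hrefs):
    """Append each href as its own one-column row in a UTF-8 CSV file."""
    # newline="" lets the csv module control line endings itself.
    with open(path, "a", encoding="utf-8", newline="") as handle:
        out = csv.writer(handle, delimiter=";")
        for href in hrefs:
            # writerow() expects a sequence of fields; a bare string would
            # be split into one field per character, so wrap it in a list.
            out.writerow([href])
```

It could be called as write_links("links.csv", [tag['href'] for tag in soup_3.findAll('a', href=True)]), and no href has to be skipped because UTF-8 can represent every Unicode character.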

2011-09-28 19:38:58 geoffspear

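Thanks very much, I used the try: and it works great. How do you change the encoding, and why would you want to? Please forgive the basic question, but I'm very new to programming. –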

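Almost always, you don't want to throw away data just because it happens to use non-ASCII characters. If you open the file with open("file_location", "ab", "utf-8"), then instead of raising a UnicodeEncodeError, out.write will write out the actual data that was read from the website, which 99% of the time is what you really want. – geoffspear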
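To make the "throwing away data" point concrete, here is a small Python 3 sketch using the character from the traceback. The variable names are illustrative only.

```python
href = "\u2026"  # the character from the traceback (horizontal ellipsis)

# Strict ASCII encoding fails, which is the error in the question.
try:
    href.encode("ascii")
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True

# Ignoring the error silently drops the character.
dropped = href.encode("ascii", "ignore")

# UTF-8 keeps it, as a three-byte sequence.
preserved = href.encode("utf-8")
```

Here dropped ends up as an empty byte string, while preserved decodes back to the original character.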

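Ah, that would help. When I add "utf-8" to the end of my current open line, I get the following error: TypeError: an integer is required. Should I just be using open("file_location", "ab", "utf-8"), and if so, how can I bring in csv.writer so that it can be used in the "try:" part? Thanks again for your help –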
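The TypeError most likely comes from the built-in open, whose third positional argument is buffering (an integer), not an encoding. codecs.open, by contrast, does take the encoding as its third argument. The sketch below assumes Python 3, where the encoding can be passed to the built-in open by keyword. The scratch path and sample hrefs are placeholders, not values from the thread.

```python
import csv
import os
import tempfile

# Hypothetical scratch path standing in for "file_location".
path = os.path.join(tempfile.mkdtemp(), "links.csv")

# The built-in open() reads its third positional argument as `buffering`,
# which must be an integer, so passing a string there raises TypeError.
try:
    open(path, "ab", "utf-8")
    raised_type_error = False
except TypeError:
    raised_type_error = True

# Passing the encoding by keyword (in text mode, not "ab") avoids that,
# and the resulting file object can be handed straight to csv.writer.
hrefs = ["\u2026/truncated-link", "http://example.com/"]  # sample data
with open(path, "a", encoding="utf-8", newline="") as handle:
    out = csv.writer(handle, delimiter=";")
    for href in hrefs:
        out.writerow([href])
```

With the file opened this way the try/except around writerow should no longer be needed for this error, though keeping it does no harm.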
