2016-10-01
1

I am using BeautifulSoup4 and requests to scrape information from a website. I want Python to ignore any character it cannot encode and carry on with the next item from the list in the loop.

I then store the information I need in lists; there are two lists, one for each of the two different types of information scraped from the page.

try:
    for i in range(0,1000):
        location = dive_data1[((9*i)-7)].text
        locations.append(location)
        location = dive_data2[((9*i)-7)]
        locations.append(location)
        depth = dive_data1[((9*i)-6)].text
        depths.append(depth)
        depth = dive_data2[((9*i)-6)].text
        depths.append(depth)

except:
    pass

After that, I pass these lists to another loop that writes their contents to a CSV file.

try:
    writer = csv.writer(dive_log)
    writer.writerow(("Locations and depths"))
    writer.writerow(("Sourced from:", str(url_page)))
    writer.writerow(("Location", "Depth"))
    for i in range(len(locations)):
        writer.writerow((locations[i], depths[i]))

When I run the script I get this error:

writer.writerow((locations[i], depths[i])) 
UnicodeEncodeError: 'ascii' codec can't encode characters in position 65-66:  ordinal not in range(128) 
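
A minimal reproduction outside the scraper (a sketch that assumes Python 2, which the print statements in the full script below suggest; the sample strings here are made up):

# -*- coding: utf-8 -*-
import csv

row = (u"Þingvellir", u"18 m")  # made-up unicode values with non-ASCII characters
with open("repro.csv", "wb") as repro:
    writer = csv.writer(repro)
    try:
        # Python 2's csv module only handles byte strings, so it implicitly
        # encodes unicode values with the ASCII codec and fails here.
        writer.writerow(row)
    except UnicodeEncodeError as exc:
        print exc
    # Encoding to UTF-8 first gives the csv module byte strings it accepts.
    writer.writerow(tuple(value.encode("utf-8") for value in row))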

I tried to skip over the characters it cannot encode like this:

writer = csv.writer(dive_log)
writer.writerow(("Locations and depths"))
writer.writerow(("Sourced from:", str(url_page)))
writer.writerow(("Location", "Depth"))
for i in range(len(locations)):
    try:
        writer.writerow((locations[i], depths[i]))
    except:
        pass

When I run this, only the lines before the for loop are written to the file; every iteration of the for loop is skipped entirely.

The entire code of my script is copied below, in case the problem is related to something in the rest of it that I am not seeing.

import csv 
from bs4 import BeautifulSoup 
import requests 

dive_log = open("divelog.csv", "wt") 
url_page = "https://en.divelogs.de/log/Mark_Gosling" 
r = requests.get(url_page) 
soup = BeautifulSoup(r.content) 

dive_data1 = soup.find_all("tr", {"class": "td2"}) 
dive_data2 = soup.find_all("td", {"class": "td"}) 
locations = [] 
depths = [] 

try:
    for i in range(0,1000):
        location = dive_data1[((9*i)-7)].text
        locations.append(location)
        location = dive_data2[((9*i)-7)]
        locations.append(location)
        depth = dive_data1[((9*i)-6)].text
        depths.append(depth)
        depth = dive_data2[((9*i)-6)].text
        depths.append(depth)

except:
    pass

try:
    writer = csv.writer(dive_log)
    writer.writerow(("Locations and depths"))
    writer.writerow(("Sourced from:", str(url_page)))
    writer.writerow(("Location", "Depth"))
    for i in range(len(locations)):
        try:
            writer.writerow((locations[i], depths[i]))
        except:
            pass

finally:
    dive_log.close()

print open("divelog.csv", "rt").read() 
print "\n\n" 
print locations 
+1

This should skip the characters it cannot encode: soup = BeautifulSoup(response.content.decode('utf-8', 'ignore')) – yedpodtrzitko

+0

Don't ignore anything unless you can afford to lose data; find out the correct encoding and use that. The data is UTF-8 encoded anyway, so the problem lies elsewhere. Also, don't use a blanket except; catch what you expect and log/print the errors. –
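
A sketch of what that comment suggests, with names mirroring the question (only dive_data1 shown for brevity): catch the specific exception you expect and report it, rather than a bare except: pass.

try:
    for i in range(0, 1000):
        locations.append(dive_data1[(9 * i) - 7].text)
        depths.append(dive_data1[(9 * i) - 6].text)
except IndexError as exc:
    # The loop eventually runs past the end of the scraped data;
    # printing the index and the error makes that visible instead of hiding it.
    print "stopped at i=%d: %r" % (i, exc)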

Answers

-1

As @yedpodtriztko pointed out, you can skip the characters it cannot decode with the following:

Instead of doing:

soup = BeautifulSoup(r.content) 

you can do this:

soup = BeautifulSoup(r.content.decode('utf-8', 'ignore')) 
0

You need to encode to UTF-8 in the loop where you write:

for i in range(len(locations)):
    writer.writerow((locations[i].encode("utf-8"), depths[i].encode("utf-8")))
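
For reference, a self-contained version of that fix (a sketch assuming the same Python 2 setup as the question; the sample data is made up, and the file is opened in binary mode, which the Python 2 csv module expects):

# -*- coding: utf-8 -*-
import csv

locations = [u"Þingvellir", u"Silfra"]  # made-up sample values with non-ASCII text
depths = [u"18 m", u"12 m"]

dive_log = open("divelog.csv", "wb")    # binary mode for the Python 2 csv module
writer = csv.writer(dive_log)
writer.writerow(("Location", "Depth"))
for i in range(len(locations)):
    writer.writerow((locations[i].encode("utf-8"), depths[i].encode("utf-8")))
dive_log.close()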