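Encoding emojis with Beautiful Soup
Looking for some help. I'm working on a project that uses Beautiful Soup in Python to scrape specific Craigslist posts. I can successfully display emojis found in the post title, but not those in the post body. I've tried several variations, but nothing has worked so far. Any help would be greatly appreciated.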
Code:
import requests
from bs4 import BeautifulSoup

f = open("clcondensed.txt", "w")
html2 = requests.get("https://raleigh.craigslist.org/wan/6078682335.html")
soup = BeautifulSoup(html2.content, "html.parser")

# Post title: this part works
title = soup.find(id="titletextonly")
title1 = soup.title.string.encode("ascii", "xmlcharrefreplace")
f.write(title1)

# Post body: the encode() call below raises the error
body = soup.find(id="postingbody")
body = str(body)
body = body.encode("ascii", "xmlcharrefreplace")
f.write(body)
Error received from the body:
'ascii' codec can't decode byte 0xef in position 273: ordinal not in range(128)
Possibly similar to this: http://stackoverflow.com/questions/9644099/python-ascii-codec-cant-decode-byte – anonyXmous
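A "can't decode" error raised from an encode() call usually points to Python 2. There, str() on a Beautiful Soup tag returns UTF-8 bytes, and calling encode() on bytes first triggers an implicit ASCII decode, which fails on the emoji bytes. The title works because soup.title.string is already Unicode text. Below is a minimal sketch of one possible fix, written for Python 3 and assuming the body is UTF-8. The helper name to_ascii_charrefs and the sample markup are hypothetical stand-ins, not part of the original code.

```python
def to_ascii_charrefs(markup):
    """Return ASCII bytes with non-ASCII characters as numeric character references."""
    # If we were handed UTF-8 bytes (what str(tag) yields on Python 2),
    # decode them to text first. Encoding bytes directly is what triggers
    # the implicit ASCII decode behind the error in the question.
    if isinstance(markup, bytes):
        markup = markup.decode("utf-8")
    return markup.encode("ascii", "xmlcharrefreplace")


# Hypothetical stand-in for the serialized posting body. With Beautiful Soup
# this would be the text form of the tag: str(body) on Python 3, or
# unicode(body) on Python 2.
sample = '<section id="postingbody">Free couch \U0001F600</section>'
encoded = to_ascii_charrefs(sample)
print(encoded)
```

Because the result is bytes, the output file would need to be opened in binary mode ("wb") on Python 3 before calling f.write(encoded). On Python 2, replacing str(body) with unicode(body) in the original code should have the same effect.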