2017-07-31

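Webscrape with Python and BeautifulSoup - error when saving to a csv file

I am trying to write a script that scrapes the names, roles, and phone numbers of real estate agents from this website.

My code: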

containers = page_soup.findAll("div",{"class":"card horizontal-split vcard"}) 

filename = "agents.csv" 
f = open(filename, "w") 

headers = "name, role, number\n" 

f.write(headers) 

for container in containers: 
    agent_name = container.findAll("li", {"class":"agent-name"}) 
    if agent_name: 
        name = agent_name[0].text

    agent_role = container.findAll("li", {"class":"agent-role"}) 
    if agent_role: 
        role = agent_role[0].text

    filterfn = lambda x: 'href' in x.attrs and x['href'].startswith("tel") 
    phones = list(map(lambda x: x.text,filter(filterfn,container.findAll("a")))) 

    print("name: " + name) 
    print("role: " + role) 
    print("phones:" + repr(phones)) 

    f.write(name + "," +role + "," + phones.replace(",", "|") + "," + "\n") 

f.close() 

My code worked in the terminal before I tried to save the output to a CSV file that I can open in Excel. Now, however, I get two error messages:

TypeError: must be str, not list 
f.write(name + "," +role + "," + phones.replace(",", "|") + "," + "\n") 

f.write(name + "," +role + "," + phones.replace(",", "|") + "," + "\n") 
AttributeError: 'list' object has no attribute 'replace' 

*Note: I replace "," with "|" to avoid creating extra columns in the csv file.*

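The simplest fix for string problems like this is Python's super simple but genius `str(yourVariable)`. It is not the answer you are looking for, but it is a quick way around the problem ;) – hansTheFranz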

phones is a list, and lists do not have a replace method. – slackmart

Answer


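As the error says, phones is a list, and a list has no replace() method. You can use .join() instead, which joins the elements of the list with the separator you specify (| in this case):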

f.write(name + "," +role + "," + '|'.join(phones) + "," + "\n") 

For example:

>>> phones = ['123', '321', '123'] 
>>> '|'.join(phones) 
'123|321|123'
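
If the real goal is to keep commas inside a value from creating extra columns, the standard library's csv module may be worth a look, since csv.writer quotes such fields for you. The following is only a minimal sketch, not the asker's actual script: the write_agents helper and the sample rows are hypothetical stand-ins for the scraped values, and an in-memory buffer is used in place of agents.csv.

```python
import csv
import io


def write_agents(rows, out):
    """Write (name, role, phones) rows as CSV to the file-like object `out`.

    `write_agents` is a hypothetical helper, not part of the original script.
    """
    writer = csv.writer(out)
    writer.writerow(["name", "role", "number"])
    for name, role, phones in rows:
        # phones is a list of strings, so join it into a single cell
        writer.writerow([name, role, "|".join(phones)])


# Hypothetical sample data standing in for the scraped values
sample = [
    ("Jane Doe", "Sales Agent", ["123", "321"]),
    ("Smith, John", "Property Manager", []),
]

buffer = io.StringIO()
write_agents(sample, buffer)
print(buffer.getvalue())
```

The second sample name contains a comma, and csv.writer wraps it in quotes so it still lands in one column. To write to a real file instead of the buffer, the csv documentation recommends opening it with newline="", for example open("agents.csv", "w", newline="").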