2017-08-16 124 views

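Python: AttributeError and web scraping challenge

I am trying to scrape data from "http://www.landwatch.com/Philippines_land_for_sale/Land"; what I need is the address and price information. My approach is to use the Beautiful Soup module in Python. I got stuck when I inspected the HTML page, and I hope some of you can give me a hint so I can move forward. Basically, inspecting the page suggests that the information I need sits under the div with class="clear property left". Here is the code: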

from lxml import html 
import requests 
import bs4 as bs 
from urllib.request import urlopen as uReq 
from bs4 import BeautifulSoup as soup 

my_url = 'http://www.landwatch.com/Philippines_land_for_sale/Land' 

# Opening up connection, grabbing the page 
uClient = uReq(my_url) 
page_html = uClient.read() 
soup = bs.BeautifulSoup(page_html,'lxml') 
g_data = soup.find_all("div",{"class": "clear property left"}) 
for item in g_data: 
    print(item).contents[0]  # raises AttributeError: print() returns None, which has no .contents

Thanks,

Answer


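You are almost there. The address and price information is the text of the <a> element inside <div class="propName">, which is nested within each <div class="clear property left">. You can find it deeper inside g_data, like this: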

import requests
from bs4 import BeautifulSoup

my_url = 'http://www.landwatch.com/Philippines_land_for_sale/Land'
response = requests.get(my_url)
soup = BeautifulSoup(response.content, 'lxml')

# Each listing sits in a <div class="clear property left"> container
g_data = soup.find_all('div', class_='clear property left')
for item in g_data:
    # The address and price are the text of the <a> inside <div class="propName">
    address_price_info = item.find("div", {"class": "propName"}).find('a').text
    print(address_price_info)

The output will be:

Cebu City, Philippines 1185000, PHP 
    Tagaytay, Philippines $116,000 
    Quezon City, Philippines $2,837,000 
    Sta Rosa Laguna, Philippines 15500, PHP 
    Makati, Philippines $5,947,826 
    Puerto Princesa City, Philippines $358,813 
    Carcar, Philippines 35000000, PHP 
    Lipa City, Philippines $57,750 
    Makati, Philippines 6400000, PHP 
    Taytay, Philippines $2,300,000 
    Taguig, Philippines $504,208 
    Taguig City, Philippines $13,760 
    Quezon City, Philippines 58000000, PHP 
    Cebu City, Philippines 7799030, PHP 
    Las Pinas, Philippines $468,000 
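
As for the AttributeError in the question's code, it most likely comes from `print(item).contents[0]`: in Python 3, `print()` returns `None`, so `.contents` is looked up on `None` rather than on the tag. Below is a minimal sketch of the failing and working forms. It assumes a made-up inline HTML fragment (not the real page) and uses the built-in `html.parser` so that it runs without `lxml`.

```python
from bs4 import BeautifulSoup

# Hypothetical fragment standing in for one listing container on the page
sample_html = '<div class="clear property left"><span>first child</span></div>'
soup = BeautifulSoup(sample_html, "html.parser")
item = soup.find("div", class_="clear property left")

# Failing form: print() returns None, so .contents is looked up on None
try:
    print(item).contents[0]
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

# Working form: index into .contents first, then print the result
first_child = item.contents[0]
print(first_child)
```

Moving the closing parenthesis so that the indexing happens inside the call to `print()` is enough to make the original loop run.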

Update:

If you inspect the address and price information in Chrome, it shows you where they are located:

<div class="clear property left"> 

    <div class="margintop"> 

     ...    
     <div class="propName"> #Here is the address and price info 
      <a href="/Cebu-City-Philippines-Land-for-sale/pid/119211639" onclick="WC('119211639', '-1');"> &nbsp; Cebu City, Philippines <BR/> 1185000, PHP</a> 
     </div> 


      <div>PAYMENT SCHEMES:\r\rReservation Fee : P20,000 (non refundable)\r\r1. SCHEME 1\rCash - 100% with the following discounts\r* 10% for 7 days payment\r* 8%...&nbsp;</div> 

     ... 

    </div> 
    <div class="clear"></div> 
</div> 
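
Since the address and the price are separated by a <BR/> inside the <a> element, they can also be pulled apart into two fields. The following is only a sketch: it parses a trimmed copy of the snippet above rather than the live page, assumes every listing follows the same two-part layout, and uses a hypothetical helper named `split_listing`.

```python
from bs4 import BeautifulSoup

# Trimmed copy of the inspected markup (onclick attribute and other children omitted)
sample_html = """
<div class="clear property left">
  <div class="margintop">
    <div class="propName">
      <a href="/Cebu-City-Philippines-Land-for-sale/pid/119211639">&nbsp; Cebu City, Philippines <BR/> 1185000, PHP</a>
    </div>
  </div>
  <div class="clear"></div>
</div>
"""
soup = BeautifulSoup(sample_html, "html.parser")


def split_listing(link):
    """Hypothetical helper: return (address, price) from the <a> element."""
    # stripped_strings yields each text node with surrounding whitespace removed,
    # so the text before and after the <br> arrive as separate items
    parts = list(link.stripped_strings)
    address = parts[0] if parts else None
    price = parts[1] if len(parts) > 1 else None
    return address, price


listings = []
for item in soup.find_all("div", class_="clear property left"):
    name_div = item.find("div", class_="propName")
    link = name_div.find("a") if name_div else None
    if link is None:
        continue  # skip containers that lack the expected structure
    listings.append(split_listing(link))

print(listings)
```

The `None` checks are there because `find()` returns `None` when nothing matches, which would otherwise raise another AttributeError on the next attribute access.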

Thank you so much, Tiny.D! One more quick question: how did you find out that the price and address information is in propName? I could not even see it.


@M.C Check the update.