Looping with XPath in a Python script: why do I only get results from the last URL?


Why do I only get results from the last URL? The idea is to get a list of results covering both URLs.

Also, when writing to the CSV file, I get an empty line after every row. How can I get rid of it?

import csv 
import requests 
from lxml import html 
import urllib 

TV_category = ["_108-tot-127-cm-43-tot-50-,98952,501090","_128-tot-150-cm-51-tot-59-,98952,501091"] 
url_pattern = 'http://www.mediamarkt.be/mcs/productlist/{}.html?langId=-17' 

for item in TV_category: 
    url = url_pattern.format(item) 
    page = requests.get(url) 
    tree = html.fromstring(page.content) 

    outfile = open("./tv_test1.csv", "wb") 
    writer = csv.writer(outfile)  

    rows = tree.xpath('//*[@id="category"]/ul[2]/li') 


for row in rows: 
    price = row.xpath('normalize-space(div/aside[2]/div[1]/div[1]/div/text())') 
    product_ref = row.xpath('normalize-space(div/div/h2/a/text())') 
    writer.writerow([product_ref,price])


2016-04-24 Depekker

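Why are you opening the file inside the loop? – jonrsharpe

`rows` should be a list that you add to; otherwise it is replaced on every iteration. The other option is to move the last loop inside the first one. –

What do you mean, Andrés? Could you show me? I can't figure it out. Thanks – Depekker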

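As I said in my comment on the question, you need to move the second for loop inside the first one, at its end. Otherwise only the rows from the last URL are written to the CSV file.

You also don't need to open the file on every iteration; open it once in a with statement, which closes it automatically. Keep in mind that opening a file with the write flag overwrites it, so if the open call sits inside a loop, the file is overwritten on every pass.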
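A minimal sketch of that overwrite behaviour, written for Python 3 and using only the standard library. The file name `demo.csv` and the `batches` data are made up for illustration; they stand in for the rows scraped from each URL.

```python
import csv
import os
import tempfile

# Hypothetical stand-ins for the rows scraped from two URLs.
batches = [[("tv-a", "499")], [("tv-b", "799")]]


def write_reopening_each_time(path):
    # Opening with "w" inside the loop truncates the file on every pass,
    # so only the last batch survives.
    for batch in batches:
        with open(path, "w", newline="") as outfile:
            csv.writer(outfile).writerows(batch)


def write_opening_once(path):
    # Opening once, outside the loop, keeps every batch.
    with open(path, "w", newline="") as outfile:
        writer = csv.writer(outfile)
        for batch in batches:
            writer.writerows(batch)


def read_rows(path):
    with open(path, newline="") as infile:
        return list(csv.reader(infile))


path = os.path.join(tempfile.mkdtemp(), "demo.csv")

write_reopening_each_time(path)
rows_reopened = read_rows(path)  # contents after reopening per batch

write_opening_once(path)
rows_once = read_rows(path)  # contents after opening a single time

print(rows_reopened)
print(rows_once)
```

The first call leaves only the second batch in the file, while the second call keeps both, which mirrors what happens when the `open` call sits inside the URL loop.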

I refactored your code as follows:

import csv
import requests
from lxml import html

TV_category = ["_108-tot-127-cm-43-tot-50-,98952,501090","_128-tot-150-cm-51-tot-59-,98952,501091"]
url_pattern = 'http://www.mediamarkt.be/mcs/productlist/{}.html?langId=-17'

# Open the output file once, before the loop, so earlier results are not overwritten.
# "wb" is the Python 2 mode for csv; on Python 3 use open(path, "w", newline="").
with open("./tv_test1.csv", "wb") as outfile:
    writer = csv.writer(outfile)

    for item in TV_category:
        url = url_pattern.format(item)
        page = requests.get(url)
        tree = html.fromstring(page.content)
        rows = tree.xpath('//*[@id="category"]/ul[2]/li')

        # Write this page's rows before moving on to the next URL.
        for row in rows:
            price = row.xpath('normalize-space(div/aside[2]/div[1]/div[1]/div/text())')
            product_ref = row.xpath('normalize-space(div/div/h2/a/text())')
            writer.writerow([product_ref, price])
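The other approach mentioned in the comments, collecting the rows from every URL in one list before writing, could look like the sketch below. It is written for Python 3, where the `csv` documentation recommends opening the file in text mode with `newline=""`; without it, extra blank lines can appear between rows on Windows. The `fetch_rows` helper, the category names and the sample data are hypothetical placeholders for the `requests` and `lxml` calls above, and skipping records whose fields are all empty is only a guess at where the empty rows in the question come from.

```python
import csv
import os
import tempfile


def fetch_rows(category):
    # Hypothetical placeholder for the requests + lxml scraping above:
    # returns (product_ref, price) pairs for one category page.
    sample = {
        "cat-1": [("tv-a", "499"), ("", "")],  # ("", "") mimics an <li> with no product
        "cat-2": [("tv-b", "799")],
    }
    return sample[category]


all_rows = []
for category in ["cat-1", "cat-2"]:
    # extend() accumulates results instead of replacing them on each pass.
    all_rows.extend(fetch_rows(category))

# Drop records in which every field is empty.
records = [row for row in all_rows if any(field.strip() for field in row)]

out_path = os.path.join(tempfile.mkdtemp(), "tv_test1.csv")
# Python 3: text mode plus newline="" lets the csv module control line endings.
with open(out_path, "w", newline="") as outfile:
    csv.writer(outfile).writerows(records)

# Read the file back to check what was written.
with open(out_path, newline="") as infile:
    written = list(csv.reader(infile))
print(written)
```

Because the file is opened only after all pages have been collected, nothing is written if a request fails halfway through, which may or may not be what you want.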


2016-04-24 21:19:13
