2015-07-12 47 views
1

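PhantomJS behaves differently from the Firefox WebDriver

I am working on some code that uses the Selenium WebDriver with Firefox. Most things seem to work, but when I switch the browser to PhantomJS, it starts to behave differently.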

The page I am working with has to be scrolled slowly so that it loads more and more results, and that may be where the problem lies.

Here is the code that works with the Firefox WebDriver but does not work with PhantomJS:

def get_url(destination,start_date,end_date): #the date is like %Y-%m-%d 
    return "https://www.pelikan.sk/sk/flights/listdfc=%s&dtc=C%s&rfc=C%s&rtc=%s&dd=%s&rd=%s&px=1000&ns=0&prc=&rng=0&rbd=0&ct=0&view=list" % ('CVIE%20BUD%20BTS',destination, destination,'CVIE%20BUD%20BTS', start_date, end_date) 



def load_whole_page(self, destination, start_date, end_date):
    deb()

    url = get_url(destination, start_date, end_date)

    self.driver.maximize_window()
    self.driver.get(url)

    # wait for both loading indicators to disappear
    wait = WebDriverWait(self.driver, 60)
    wait.until(EC.invisibility_of_element_located((By.XPATH, '//img[contains(@src, "loading")]')))
    wait.until(EC.invisibility_of_element_located((By.XPATH,
        u'//div[. = "Poprosíme o trpezlivosť, hľadáme pre Vás ešte viac letov"]/preceding-sibling::img')))
    i = 0
    old_driver_html = ''
    while True:
        i += 1

        results = self.driver.find_elements_by_css_selector("div.flightbox")
        print len(results)
        if len(results) >= __THRESHOLD__:  # for testing purposes. Default value: 999
            break
        try:
            self.driver.execute_script("arguments[0].scrollIntoView();", results[0])
            self.driver.execute_script("arguments[0].scrollIntoView();", results[-1])
        except Exception:
            # number the screenshot with the iteration counter
            self.driver.save_screenshot('screen_before_' + str(i) + '.png')
            sleep(2)

            print 'EXCEPTION<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<'
            continue

        # stop when scrolling no longer changes the page source
        new_driver_html = self.driver.page_source
        if new_driver_html == old_driver_html:
            print 'END OF PAGE'
            break
        old_driver_html = new_driver_html

        wait.until(wait_for_more_than_n_elements((By.CSS_SELECTOR, 'div.flightbox'), len(results)))
    sleep(10)

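To detect when the page is fully loaded, I compare the old copy of the HTML with the new HTML. That is probably not what I should be doing, but with Firefox it is enough.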
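Comparing the whole page source can be fragile, because any unrelated change in the DOM makes the two copies differ. One possible alternative, sketched here under assumptions, is to compare the number of result elements between scroll attempts and stop after a few attempts without growth. The names `scroll_until_stalled`, `count_results`, `scroll_once`, `max_idle_rounds` and `FakePage` are hypothetical and not part of Selenium; in real use the two callables would wrap `find_elements_by_css_selector("div.flightbox")` and an `execute_script` scroll.

```python
import time


def scroll_until_stalled(count_results, scroll_once, max_idle_rounds=3, pause=0.0):
    """Scroll until the result count stops growing; return the final count.

    count_results and scroll_once are caller-supplied callables (hypothetical
    names); with Selenium they would wrap find_elements and execute_script.
    """
    last_count = count_results()
    idle_rounds = 0
    while idle_rounds < max_idle_rounds:
        scroll_once()
        time.sleep(pause)  # give lazy loading a chance to finish
        current = count_results()
        if current > last_count:
            last_count = current
            idle_rounds = 0  # progress was made, reset the stall counter
        else:
            idle_rounds += 1
    return last_count


# Stand-in for a page that loads 5 more results per scroll, up to 20.
class FakePage(object):
    def __init__(self):
        self.loaded = 10

    def count(self):
        return self.loaded

    def scroll(self):
        self.loaded = min(self.loaded + 5, 20)


page = FakePage()
final_count = scroll_until_stalled(page.count, page.scroll)
```

Counting elements ties the stop condition to the thing being collected, so it does not depend on how the rest of the page renders in a particular browser.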

Here is the PhantomJS screen at the point where loading stops: [image]

With Firefox it keeps loading more and more results, but with PhantomJS it gets stuck at, for example, 10 results.

Any ideas? What is the difference between these two drivers?

Answers

2

Two key things helped me solve this:

  • do not use the custom wait I helped you with before
  • set window.document.body.scrollTop first to 0 and then to document.body.scrollHeight, one right after the other

Working code:

results = [] 
while len(results) < 200: 
    results = driver.find_elements_by_css_selector("div.flightbox") 

    print len(results) 

    # scroll 
    driver.execute_script("arguments[0].scrollIntoView();", results[0]) 
    driver.execute_script("window.document.body.scrollTop = 0;") 
    driver.execute_script("window.document.body.scrollTop = document.body.scrollHeight;") 
    driver.execute_script("arguments[0].scrollIntoView();", results[-1]) 

Version 2 (infinite loop that stops if nothing more is loaded after scrolling):

from selenium.common.exceptions import StaleElementReferenceException, TimeoutException

results = []
while True:
    # stop once no additional results appear before the wait times out
    try:
        wait.until(wait_for_more_than_n_elements((By.CSS_SELECTOR, "div.flightbox"), len(results)))
    except TimeoutException:
        break

    results = self.driver.find_elements_by_css_selector("div.flightbox")
    print len(results)

    # scroll
    for _ in xrange(5):
        try:
            self.driver.execute_script("""
                arguments[0].scrollIntoView();
                window.document.body.scrollTop = 0;
                window.document.body.scrollTop = document.body.scrollHeight;
                arguments[1].scrollIntoView();
            """, results[0], results[-1])
        except StaleElementReferenceException:
            break  # here it means more results were loaded

print "DONE. Result count: %d" % len(results)

Note that I have changed the comparison in the wait_for_more_than_n_elements expected condition. Replace:

return count >= self.count 

with:

return count > self.count 
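
The body of `wait_for_more_than_n_elements` is not shown in this thread, so the following is only a guess at its shape, written as a callable class in the style of Selenium's expected conditions. Only the `count > self.count` comparison comes from the answer; the constructor, the `locator` attribute and the `FakeDriver` stand-in are assumptions made for illustration.

```python
class wait_for_more_than_n_elements(object):
    """Assumed shape of the custom condition: true once MORE than `count`
    elements match the locator."""

    def __init__(self, locator, count):
        self.locator = locator  # (by, value) tuple, e.g. (By.CSS_SELECTOR, "div.flightbox")
        self.count = count

    def __call__(self, driver):
        count = len(driver.find_elements(*self.locator))
        return count > self.count  # strict comparison, as in the change above


# Stand-in driver (hypothetical) so the comparison can be checked without a browser.
class FakeDriver(object):
    def __init__(self, n):
        self.n = n

    def find_elements(self, by, value):
        return [object()] * self.n


# "css selector" is the string value of By.CSS_SELECTOR.
condition = wait_for_more_than_n_elements(("css selector", "div.flightbox"), 10)
still_waiting = condition(FakeDriver(10))  # same count as before
has_more = condition(FakeDriver(11))       # one additional result
```

With the strict comparison, passing `len(results)` as the count means the wait only succeeds when new results have appeared, so a timeout can serve as the signal that the page has stopped loading.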

Version 3 (scroll from the header to the footer several times):

from time import sleep
from selenium.common.exceptions import TimeoutException

header = wait.until(EC.visibility_of_element_located((By.TAG_NAME, 'header')))
footer = wait.until(EC.visibility_of_element_located((By.TAG_NAME, 'footer')))

results = []
while True:
    # stop once no additional results appear before the wait times out
    try:
        wait.until(wait_for_more_than_n_elements((By.CSS_SELECTOR, "div.flightbox"), len(results)))
    except TimeoutException:
        break

    results = self.driver.find_elements_by_css_selector("div.flightbox")
    print len(results)

    # scroll
    for _ in xrange(5):
        self.driver.execute_script("""
            arguments[0].scrollIntoView();
            arguments[1].scrollIntoView();
        """, header, footer)
        sleep(1)
+0

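It does not work for me. I tried to use your code: http://pastebin.com/tHBQu67i Error: http://pastebin.com/W8ktFaUR It is something on the last line, but I think results[-1] must exist, because it says there are 10 or 15 results... –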

+1

@Milan OK, which PhantomJS version are you using? If it is not the latest, please try downgrading. Thanks. – alecxe

+0

I checked, and it is probably the latest version, 2.0.0 –