Looping over multiple tooltips

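I want to get the authors' names and affiliations from a series of articles on this page (you need access to ProQuest to view it). What I want to do is open all the tooltips at the top of the page and extract some HTML text from them. Here is my code: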

from selenium import webdriver 
from selenium.webdriver.common.action_chains import ActionChains 

browser = webdriver.Firefox() 

url = 'http://search.proquest.com/econlit/docview/56607849/citation/2876523144F544E0PQ/3?accountid=13042' 
browser.get(url) 

#insert your username and password here 

n_authors = browser.find_elements_by_class_name('zoom') #zoom is the class name of the three tooltips that I want to open in my loop 

author = [] 
institution = []  

for a in n_authors: 
    print(a) 
    ActionChains(browser).move_to_element(a).click().perform() 
    html_author = browser.find_element_by_xpath('//*[@id="authorResolveLinks"]/li/div/a').get_attribute('innerHTML') 
    html_institution = browser.find_element_by_xpath('//*[@id="authorResolveLinks"]/li/div/p').get_attribute('innerHTML') 
    author.append(html_author) 
    institution.append(html_institution)

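Although n_authors contains three items that are clearly different from one another, Selenium fails to get the information from all of the tooltips and instead returns this:

author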

#['Nuttall, William J.', 
#'Nuttall, William J.', 
#'Nuttall, William J.']

The same thing happens with institution. What am I doing wrong? Thanks a lot.

EDIT:

The array of tooltip elements:

n_authors

#[<selenium.webdriver.remote.webelement.WebElement (session="277c8abc-3883- 
#43a8-9e93-235a8ded80ff", element="{008a2ade-fc82-4114-b1bf-cc014d41c40f}")>, 
#<selenium.webdriver.remote.webelement.WebElement (session="277c8abc-3883-  
#43a8-9e93-235a8ded80ff", element="{c4c2d89f-3b8a-42cc-8570-735a4bd56c07}")>, 
#<selenium.webdriver.remote.webelement.WebElement (session="277c8abc-3883- 
#43a8-9e93-235a8ded80ff", element="{9d06cb60-df58-4f90-ad6a-43afeed49a87}")>]

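It has length 3 and the three elements are different, which is why I don't understand why Selenium does not distinguish between them.

EDIT 2: here is the relevant HTML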

<span class="titleAuthorETC small"> 
    <span style="display:none" class="title">false</span> 
    Jamasb, Tooraj 
    <a class="zoom" onclick="return false;" href="#"> 
    <img style="margin-left:4px; border:none" alt="Visualizza profilo" id="resolverCitation_previewTrigger_0" title="Visualizza profilo" src="/assets/r20161.1.0-4/ctx/images/scholarUniverse/ar_button.gif"> 
    </a><script type="text/javascript">Tips.images = '/assets/r20161.1.0-4/pqc/javascript/prototip/images/prototip/';</script>; Nuttall, William J 
    <a class="zoom" onclick="return false;" href="#"> 
    <img style="margin-left:4px; border:none" alt="Visualizza profilo" id="resolverCitation_previewTrigger_1" title="Visualizza profilo" src="/assets/r20161.1.0-4/ctx/images/scholarUniverse/ar_button.gif"> 
    </a>; Pollitt, Michael G 
    <a class="zoom" onclick="return false;" href="#"> 
    <img style="margin-left:4px; border:none" alt="Visualizza profilo" id="resolverCitation_previewTrigger_2" title="Visualizza profilo" src="/assets/r20161.1.0-4/ctx/images/scholarUniverse/ar_button.gif"> 
    </a>.

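UPDATE: For some reason @parishodak's answer does not work with Firefox unless I manually hover over the tooltips first. It works with chromedriver, but only if I hover over the tooltips first and only if I allow for a time.sleep(), as in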

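import itertools
import time

from selenium.common.exceptions import NoSuchElementException

# Hover over each tooltip trigger in turn; the trigger ids are numbered from 0.
for i in itertools.count():
    try:
        tooltip = browser.find_element_by_xpath('//*[@id="resolverCitation_previewTrigger_' + str(i) + '"]')
        print(tooltip)
        ActionChains(browser).move_to_element(tooltip).perform()
    except NoSuchElementException:
        # No trigger with this index, so every tooltip has been visited.
        break

time.sleep(2)

elements = browser.find_elements_by_xpath('//*[@id="authorResolveLinks"]/li/div/a')
author = []

for e in elements:
    print(e)
    attribute = e.get_attribute('innerHTML')
    author.append(attribute)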
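
A fixed time.sleep(2) can wait too long or not long enough. One alternative is to poll until the author links have actually appeared. The sketch below is a plain-Python stand-in for that idea (Selenium ships its own WebDriverWait for this purpose); wait_until and its parameters are hypothetical names, not part of the original code.

```python
import time


def wait_until(condition, timeout=10.0, poll_interval=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Returns that value, or raises TimeoutError once `timeout` seconds
    have elapsed without success.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError('condition not met within %.1f s' % timeout)
        time.sleep(poll_interval)
```

For example, passing a lambda that returns browser.find_elements_by_xpath(...) would hand back the list of matches as soon as it is non-empty, instead of always sleeping for two seconds.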

2016-01-06 simone

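Try 'author.append(copy.copy(html_author))', does that work? (Import 'copy' first.) – kfx

Thanks, but it still returns the same output. I will edit my question to make the problem clearer – simone

If possible, could you add the relevant HTML to your question? That would help. –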

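It returns the same element because the XPath does not change between loop iterations, so find_element_by_xpath keeps returning the first match.

There are two ways to handle this.

Use array notation in the XPath, as shown below: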

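# find_element (singular) returns one element; the parentheses make the index
# select the n-th match overall rather than the n-th <a> inside each <div>.
browser.find_element_by_xpath('(//*[@id="authorResolveLinks"]/li/div/a)[1]').get_attribute('innerHTML')
browser.find_element_by_xpath('(//*[@id="authorResolveLinks"]/li/div/a)[2]').get_attribute('innerHTML')
browser.find_element_by_xpath('(//*[@id="authorResolveLinks"]/li/div/a)[3]').get_attribute('innerHTML')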

Or

Use find_elements_by_xpath instead of find_element_by_xpath:

elements = browser.find_elements_by_xpath('//*[@id="authorResolveLinks"]/li/div/a')

Iterate over elements and call get_attribute('innerHTML') on each element inside the loop.
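
The second approach can be sketched as a small helper. This is only an illustration under assumptions: collect_inner_html and AUTHOR_XPATH are hypothetical names, the helper relies on the same find_elements_by_xpath call used above (Selenium 4 deprecates it in favour of find_elements(By.XPATH, ...)), and it assumes the tooltips have already been opened so that every author link is present in the DOM.

```python
AUTHOR_XPATH = '//*[@id="authorResolveLinks"]/li/div/a'


def collect_inner_html(browser, xpath=AUTHOR_XPATH):
    """Return the innerHTML of every element matching xpath, in document order."""
    # find_elements_* returns a list with all matches, whereas
    # find_element_* returns only the first one, which is why the
    # original loop kept appending the same author.
    elements = browser.find_elements_by_xpath(xpath)
    return [element.get_attribute('innerHTML') for element in elements]
```

With a live session this would be called as author = collect_inner_html(browser), and again with the .../li/div/p XPath to fill institution.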

2016-01-06 18:00:56 parishodak
