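Python won't accept a variable's value

This runs and scrapes links exactly the way I want, except that Python doesn't seem to recognize the value of "scraped_pages". When I run it in the terminal, scraped_pages increases by 1 on every loop, but the loop keeps going even when that integer is higher than "page_nums". When I set "page_nums" to an integer below 5, it runs and stops at 5, but then crashes again. I apologize if I haven't worded this question as well as I could; I have been working on this all night.

All of the code above this is working; this is the problem code. All modules are imported correctly as well. It uses Selenium, and I'm not sure whether the explicit wait is working, because it crashes before it reaches the "page_nums" value.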
page_nums = raw_input("how many pages to scrape?: ")
urls_list = []
scraped_pages = 0
scraped_links = 0
while scraped_pages <= page_nums:
    for li in list_items:
        for a in li.find_all('a', href=True):
            url = a['href']
            if slicer(url,'http'):
                url1 = slicer(url,'http')
                urls_list.append(url1)
                scraped_links += 1
            elif slicer(url,'www'):
                url1 = slicer(url,'www')
                urls_list.append(url1)
                scraped_links += 1
            else:
                pass
    scraped_pages += 1
    WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.XPATH, "/html/body/div[5]/div[4]/div[9]/div[1]/div[3]/div/div[5]/div/span[1]/div/table/tbody/tr/td[12]")))
    driver.find_element_by_xpath("/html/body/div[5]/div[4]/div[9]/div[1]/div[3]/div/div[5]/div/span[1]/div/table/tbody/tr/td[12]").click()
print scraped_links
print urls_list
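One likely reason the loop never stops: in Python 2, raw_input returns a string, and CPython 2 orders an int below any str, so `scraped_pages <= page_nums` stays true no matter how large the counter gets. Converting the input to an int makes the comparison numeric. The sketch below is only an illustration of that idea; the helper names `parse_page_count` and `pages_visited` are hypothetical and not part of the original script. It also uses `<` rather than `<=`, on the assumption that exactly page_nums pages are wanted.

```python
def parse_page_count(text):
    """Convert the raw prompt text to a non-negative int (hypothetical helper)."""
    count = int(text.strip())  # raises ValueError for non-numeric input
    if count < 0:
        raise ValueError("page count must be non-negative")
    return count


def pages_visited(page_nums):
    """Count loop iterations the same way the scraper counts pages."""
    scraped_pages = 0
    # int compared with int, so the condition eventually becomes False
    while scraped_pages < page_nums:
        scraped_pages += 1
    return scraped_pages
```

In the script itself this would amount to something like `page_nums = parse_page_count(raw_input("how many pages to scrape?: "))` before the while loop.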
This is part of the error that is returned.
1
2
Traceback (most recent call last):
  File "google page click 2.py", line 51, in <module>
    driver.find_element_by_xpath("/html/body/div[5]/div[4]/div[9]/div[1]/div[3]/div/div[5]/div/span[1]/div/table/tbody/tr/td[12]").click()
  File "/usr/local/lib/python2.7/dist-packages/selenium/webdriver/remote/webelement.py", line 75, in click
    self._execute(Command.CLICK_ELEMENT)
  File "/usr/local/lib/python2.7/dist-packages/selenium/webdriver/remote/webelement.py", line 454, in _execute
    return self._parent.execute(command, params)
  File "/usr/local/lib/python2.7/dist-packages/selenium/webdriver/remote/webdriver.py", line 201, in execute
    self.error_handler.check_response(response)
  File "/usr/local/lib/python2.7/dist-packages/selenium/webdriver/remote/errorhandler.py", line 181, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.ElementNotVisibleException: Message: Element is not currently visible and so may not be interacted with
Stacktrace:
    at fxdriver.preconditions.visible (file:///tmp/tmpzSHEeb/extensions/[email protected]/components/command-processor.js:9981)
    at DelayedCommand.prototype.checkPreconditions_ (file:///tmp/tmpzSHEeb/extensions/[email protected]/components/command-processor.js:12517)
    at DelayedCommand.prototype.executeInternal_/h (file:///tmp/tmpzSHEeb/extensions/[email protected]/components/command-processor.js:12534)
    at DelayedCommand.prototype.executeInternal_ (file:///tmp/tmpzSHEeb/extensions/[email protected]/components/command-processor.js:12539)
    at DelayedCommand.prototype.execute/< (file:///tmp/tmpzSHEeb/extensions/[email protected]/components/command-processor.js:12481)
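The ElementNotVisibleException suggests the element was found in the DOM but was not displayed when click() ran. The wait in the script uses presence_of_element_located, which only checks that the element exists, not that it is visible. Selenium's expected_conditions module also provides visibility_of_element_located and element_to_be_clickable, which check display state as well and could be swapped into the existing WebDriverWait call. The library-free sketch below only illustrates the polling idea behind such a wait; `wait_until_clickable` is a hypothetical helper, and it assumes the element exposes is_displayed() and is_enabled() as Selenium's WebElement does. It is also possible that the absolute XPath matches a different table cell on later result pages, which this sketch would not fix.

```python
import time


def wait_until_clickable(find_element, timeout=10.0, poll=0.5):
    """Poll until find_element() returns a displayed, enabled element.

    find_element is any zero-argument callable returning an object with
    is_displayed() and is_enabled() methods (hypothetical helper).
    Returns the element, or None if the timeout expires first.
    """
    deadline = time.time() + timeout
    while True:
        try:
            element = find_element()
            if element.is_displayed() and element.is_enabled():
                return element
        except Exception:
            # Lookup failed or the element changed underneath us;
            # keep retrying until the deadline passes.
            pass
        if time.time() >= deadline:
            return None
        time.sleep(poll)
```

With the driver from the question, a call might look like `button = wait_until_clickable(lambda: driver.find_element_by_xpath(next_xpath))`, where `next_xpath` stands for the XPath string already used in the script; the loop could then click the button when one is returned and break out when the result is None.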
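It actually says in the question that I used implicit and explicit waits.
Agreed, I've removed my potential answer. Could you give me some more information so I can reproduce your problem? – gtlambert
http://pastebin.com/F7DQKStc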