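DuckDuckGo API not returning results

Edit: I now realize that the API is simply insufficient, and is not even working. I would like to redirect my question: I want to be able to auto-search DuckDuckGo using their "I'm feeling ducky" feature, so that I can search for "stackoverflow", for example, and get the main page ("https://stackoverflow.com/") as my result.

I am using the duckduckgo API.

And I found that when using: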

r = duckduckgo.query("example")

the results do not reflect a manual search. Namely:

for result in r.results: 
    print result

Result:

>>> 
>>>

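Nothing.

And looking up an index in results raises an out-of-bounds error, since it is empty.

How am I supposed to get results for my search?

It seems (according to its documented examples) that the API is supposed to answer questions and give a sort of "I'm feeling ducky" in the form of r.answer.text, but the website is built in such a way that I cannot search it and parse the results using normal methods.

I would like to know how I am supposed to parse search results with this API, or with any other method, from this site.

Thank you.

Source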

2012-07-30 Inbar Rose

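If you visit the DuckDuckGo API page, you will find some notes about using the API. The first note says clearly:

As this is a Zero-click Info API, most deep queries (non topic names) will be blank.

And here is the list of those fields: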

Abstract: "" 
AbstractText: "" 
AbstractSource: "" 
AbstractURL: "" 
Image: "" 
Heading: "" 
Answer: "" 
Redirect: "" 
AnswerType: "" 
Definition: "" 
DefinitionSource: "" 
DefinitionURL: "" 
RelatedTopics: [ ] 
Results: [ ] 
Type: ""
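To see which of these fields a given query actually fills in, one option is to fetch the JSON yourself and list the non-empty keys. The following is only a Python 3 sketch: it assumes the public endpoint https://api.duckduckgo.com/ with format=json, the helper names non_empty_fields and fetch_instant_answer are made up for this illustration, and the sample payload is hand-written rather than real API output.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://api.duckduckgo.com/"  # assumed endpoint of the Zero-click Info API


def non_empty_fields(payload):
    """Return the names of the fields that carry a value (non-empty string or list)."""
    return sorted(key for key, value in payload.items() if value)


def fetch_instant_answer(query):
    """Fetch the JSON answer for `query` (performs a network request)."""
    params = urllib.parse.urlencode({"q": query, "format": "json"})
    with urllib.request.urlopen(API_URL + "?" + params) as response:
        return json.loads(response.read().decode("utf-8"))


# Hand-written sample shaped like the field list above, not real API output.
sample = {
    "Abstract": "",
    "Heading": "Example",
    "Results": [],
    "RelatedTopics": [{"Text": "x"}],
    "Type": "D",
}
print(non_empty_fields(sample))
```

Calling non_empty_fields(fetch_instant_answer("example")) on a live response should show quickly whether Results is populated for your query or whether only fields such as RelatedTopics come back.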

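So it may be a pity, but their API simply truncates a bunch of results and does not give them to you, possibly in order to work faster, and it seems nothing can be done about it except using DuckDuckGo.com itself.

So, obviously, in this case the API is not the way to go.

As for me, I see only one way out: retrieving the raw html from duckduckgo.com and parsing it with, for example, html5lib (it is worth mentioning that their html is well structured).

It is also worth mentioning that parsing html pages is not the most reliable way to scrape data, because the html structure can change, while an API usually stays stable until changes are publicly announced.

Here is an example of how such parsing can be done with BeautifulSoup: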

from BeautifulSoup import BeautifulSoup 
import urllib 
import re 

site = urllib.urlopen('http://duckduckgo.com/?q=example') 
data = site.read() 

parsed = BeautifulSoup(data) 
topics = parsed.findAll('div', {'id': 'zero_click_topics'})[0] 
results = topics.findAll('div', {'class': re.compile('results_*')}) 

print results[0].text

This script prints:

u'Eixample, an inner suburb of Barcelona with distinctive architecture'

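The problem with querying the main page directly is that it uses JavaScript to produce the required results (as opposed to related topics), so you can use the HTML version to get the results only. The HTML version has a different link:

http://duckduckgo.com/?q=example # JavaScript version
http://duckduckgo.com/html/?q=example # HTML-only version

Let's see what we can get: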

site = urllib.urlopen('http://duckduckgo.com/html/?q=example') 
data = site.read() 
parsed = BeautifulSoup(data) 

first_link = parsed.findAll('div', {'class': re.compile('links_main*')})[0].a['href']

The result stored in the first_link variable is the link to the first result (not a related search) that the search engine outputs:

http://www.iana.org/domains/example

To get all the links, you can iterate over the tags that were found (data other than links can be retrieved in a similar way):

for i in parsed.findAll('div', {'class': re.compile('links_main*')}): 
    print i.a['href'] 

http://www.iana.org/domains/example 
https://twitter.com/example 
https://www.facebook.com/leadingbyexample 
http://www.trythisforexample.com/ 
http://www.myspace.com/leadingbyexample?_escaped_fragment_= 
https://www.youtube.com/watch?v=CLXt3yh2g0s 
https://en.wikipedia.org/wiki/Example_(musician) 
http://www.merriam-webster.com/dictionary/example 
...

Note that the HTML-only version contains only results; for related searches you must use the JavaScript version (the url without the html part).
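For readers without BeautifulSoup, the same "first link of each result block" idea can be sketched with Python 3's standard html.parser module. This is a swapped-in technique, not the code used in this answer: the class name ResultLinkParser is made up, the links_main class prefix is taken from the snippets above and may no longer match the live page, and the sample markup is hand-written for illustration.

```python
from html.parser import HTMLParser


class ResultLinkParser(HTMLParser):
    """Collect the href of the first <a> inside each <div class="links_main...">."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._depth = 0          # <div> nesting depth inside a result block (0 = outside)
        self._took_link = False  # True once the current block has yielded its link

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "div":
            classes = (attrs.get("class") or "").split()
            if self._depth:
                self._depth += 1
            elif any(name.startswith("links_main") for name in classes):
                self._depth = 1
                self._took_link = False
        elif tag == "a" and self._depth and not self._took_link and attrs.get("href"):
            self.links.append(attrs["href"])
            self._took_link = True

    def handle_endtag(self, tag):
        if tag == "div" and self._depth:
            self._depth -= 1


# Hand-written markup that imitates the structure assumed above.
sample_html = """
<div class="links_main links_deep"><a href="http://www.iana.org/domains/example">Example</a>
  <div class="snippet"><a href="http://ignored.example/">more</a></div>
</div>
<div class="related"><a href="http://also-ignored.example/">related</a></div>
<div class="links_main"><a href="https://en.wikipedia.org/wiki/Example">Wiki</a></div>
"""
parser = ResultLinkParser()
parser.feed(sample_html)
print(parser.links)
```

Tracking the div depth keeps links from nested blocks and from unrelated sections out of the result, which mirrors what .a['href'] does on each matched div in the BeautifulSoup version.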

Source

2012-08-12 16:27:57

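Thank you. This helps me understand what the problem is. Where did you find this out? :P I tried to write a parser for the regular duckduckgo html page, but I ran into problems because it uses java or something, and the results did not come out in proper html format... – 2012-08-13 07:25:05

It works for me with BeautifulSoup. Will update the answer – 2012-08-13 09:53:53

OK, that was wrong: the result you got is from the related searches. – 2012-08-13 10:01:46

Try: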

for result in r.results: 
    print result.text

Source

2012-07-30 14:35:33 couchemar

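Same result, nothing. The problem is that r.results is an empty array; the API does not return any results at all. – 2012-07-30 14:40:13

Yes, I see that now. r.related[0].text, for example, works fine – couchemar 2012-07-30 14:45:20

r.related returns related searches/queries, which is not what I am after, though it might be useful in some cases. It is obviously a kind of "duct tape solution" – 2012-07-30 14:47:31

If it suits your application, you can also try the related searches: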

r = duckduckgo.query("example")
for i in r.related_searches:
    if i.text:
        print i.text

This yields:

Eixample, an inner suburb of Barcelona with distinctive architecture 
Example (musician), a British musician 
example.com, example.net, example.org, example.edu and .example, domain names reserved for use in documentation as examples 
HMS Example (P165), an Archer-class patrol and training vessel of the British Royal Navy 
The Example, a 1634 play by James Shirley 
The Example (comics), a 2009 graphic novel by Tom Taylor and Colin Wilson

Source

2012-08-12 18:07:51

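After already getting an answer to my question, which I accepted and awarded the bounty to, I found a different solution that I would like to add here for completeness. Many thanks to everyone who helped me arrive at it. Even though this is not the solution I asked for, it may help someone in the future.

Found after a long and hard conversation on this site and some support mails: https://duck.co/topic/strange-problem-when-searching-intel-with-my-script

Here is the solution code (from an answer posted in the thread linked above):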

>>> import duckduckgo 
>>> print duckduckgo.query('! Example').redirect.url 
http://www.iana.org/domains/example
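If the duckduckgo wrapper module is not available, the same idea can be sketched against the JSON API directly in Python 3. This is a sketch under assumptions, not the wrapper's implementation: it assumes the endpoint https://api.duckduckgo.com/, the documented no_redirect=1 parameter (which asks the API to report the target instead of issuing an HTTP redirect), and the Redirect field from the field list earlier in this thread. The helper names ducky_url and redirect_target are made up, and the sample response is hand-written.

```python
import json
import urllib.parse

API_URL = "https://api.duckduckgo.com/"  # assumed endpoint


def ducky_url(query):
    """Build an API URL for a '! query' search that reports the redirect as JSON."""
    params = {"q": "! " + query, "format": "json", "no_redirect": "1"}
    return API_URL + "?" + urllib.parse.urlencode(params)


def redirect_target(raw_json):
    """Return the Redirect field of a response, or None when it is empty or missing."""
    return json.loads(raw_json).get("Redirect") or None


print(ducky_url("Example"))

# Hand-written sample response, not real API output.
sample = '{"Redirect": "http://www.iana.org/domains/example", "Results": []}'
print(redirect_target(sample))
```

Fetching ducky_url(...) with urllib.request.urlopen and passing the decoded body to redirect_target would complete the round trip; returning None for an empty Redirect lets the caller tell "no first result" apart from a real URL.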

Source

2012-08-19 13:54:08

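The link seems to be dead – 2013-11-10 16:34:30

Yes, it seems so. Sorry. I posted the main point of that thread here; most of the rest was just back-and-forth discussion of the problem. – 2013-11-10 16:43:47

For python 3 users, a transcription of @Rostyslav Dzinko's code: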

import re
import urllib.parse
import urllib.request  # "import urllib" alone does not load urllib.request
from bs4 import BeautifulSoup

query = "your query"
# URL-encode the query so that spaces and special characters are safe
site = urllib.request.urlopen("http://duckduckgo.com/html/?q=" + urllib.parse.quote_plus(query))
data = site.read()
soup = BeautifulSoup(data, "html.parser")

# Keep at most the first 15 result blocks
my_list = soup.find("div", {"id": "links"}).find_all("div", {"class": re.compile(".*web-result.*")})[0:15]

result__snippet, result_url = [], []

for i in my_list:
    # find() returns None when the tag is missing, so get_text() raises AttributeError
    try:
        result__snippet.append(i.find("a", {"class": "result__snippet"}).get_text().strip("\n").strip())
    except AttributeError:
        result__snippet.append(None)
    try:
        result_url.append(i.find("a", {"class": "result__url"}).get_text().strip("\n").strip())
    except AttributeError:
        result_url.append(None)

Source

2017-08-16 14:39:15
