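This is sloppy and I'd welcome feedback, but the idea behind the code here is that the highest PubMed ID belongs to the most recent paper (I don't know whether that is actually true). Basically, it binary-searches for the latest PMID and then returns a list of the n most recent ones.
It doesn't look at dates and only returns PMIDs, so I'm not sure it is a proper answer, but maybe the idea can be adapted.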
CODE:
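# Python 2 (uses urllib2 and the print statement)
import urllib2

def pmid_exists(pmid):
    # A PMID counts as existing if its PubMed page loads without an HTTP error.
    url_stem = 'https://www.ncbi.nlm.nih.gov/pubmed/'
    query = url_stem+str(pmid)
    try:
        request = urllib2.urlopen(query)
        return True
    except urllib2.HTTPError:
        return False

def get_latest_pmid(max_exists = 27239557, min_missing = -1):
    # max_exists: largest PMID known to exist.
    # min_missing: smallest PMID known to be missing (-1 means none found yet).
    #print max_exists,'-->',min_missing
    if abs(min_missing-max_exists) <= 1:
        return max_exists
    guess = (max_exists+min_missing)/2
    if min_missing == -1:
        # No upper bound yet: keep doubling until a missing PMID is hit.
        guess = 2*max_exists
    if pmid_exists(guess):
        return get_latest_pmid(guess, min_missing)
    else:
        return get_latest_pmid(max_exists, guess)

#Start of program
if __name__ == '__main__':
    n = 5
    latest_pmid = get_latest_pmid()
    # Note: range() excludes its end, so latest_pmid itself is not in the list.
    most_recent_n_pmids = range(latest_pmid-n, latest_pmid)
    print most_recent_n_pmids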
OUTPUT:
[28245638, 28245639, 28245640, 28245641, 28245642]
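For anyone who wants to try the search logic without hitting PubMed, here is a minimal Python 3 sketch of the same doubling-then-bisecting idea. It is not the code above, just an illustration: `find_latest_id`, `most_recent_ids`, and the `exists` callback are hypothetical names, and it assumes the existence check is monotone (every ID up to the latest exists, none after it), which may not hold for real PMIDs. Unlike the `range(latest_pmid-n, latest_pmid)` call above, the helper here includes the latest ID in the returned list.

```python
def find_latest_id(exists, known_good):
    """Return the largest ID for which exists(ID) is True.

    Assumes exists() is monotone: True up to some ID, False after it,
    and that known_good is a non-negative ID that exists.
    """
    if not exists(known_good):
        raise ValueError("known_good must be an existing ID")

    # Phase 1: double the upper bound until a missing ID is found.
    low, high = known_good, max(2 * known_good, known_good + 1)
    while exists(high):
        low, high = high, 2 * high

    # Phase 2: bisect. Invariant: exists(low) is True, exists(high) is False.
    while high - low > 1:
        mid = (low + high) // 2
        if exists(mid):
            low = mid
        else:
            high = mid
    return low


def most_recent_ids(latest, n):
    """Return the n IDs ending at (and including) latest, oldest first."""
    return list(range(latest - n + 1, latest + 1))


if __name__ == "__main__":
    # Stand-in for a real lookup: pretend IDs 1..1000 exist.
    fake_latest = 1000
    latest = find_latest_id(lambda i: 1 <= i <= fake_latest, known_good=10)
    print(latest, most_recent_ids(latest, 5))
```

To use it against PubMed you would pass in a function like `pmid_exists` as the `exists` argument. The loop form also avoids recursion, and the number of lookups grows only logarithmically with the distance between the starting ID and the latest one.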
I see... thanks a lot for the reference. – carl