2011-08-21 84 views

Answers

1

Retrieve the revisions and implement a method to count them (the response is just XML).

MediaWiki Revisions: Example

api.php?action=query&prop=revisions&titles=API|Main%20Page&rvprop=timestamp|user|comment|content

<api>
  <query>
    <pages>
      <page pageid="1191" ns="0" title="API">
        <revisions>
          <rev user="Harryboyles" timestamp="2006-10-31T05:39:01Z" comment="revert unexplained change: see talk ...">
          ...content...
          </rev>
        </revisions>
      </page>
      <page pageid="11105676" ns="0" title="Main Page">
        <revisions>
          <rev user="Ryan Postlethwaite" timestamp="2007-06-26T19:05:06Z" comment="rv - what was that for?">
          ...content...
          </rev>
        </revisions>
      </page>
    </pages>
  </query>
</api>
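Counting the `<rev>` elements in a response like the one above could look roughly like this. This is only a minimal sketch using Python's standard `xml.etree.ElementTree`; the function name `count_revisions` and the trimmed `sample` string are made up for illustration, and it counts only the revisions actually present in the response it is given.

```python
import xml.etree.ElementTree as ET


def count_revisions(xml_text):
    """Return {page title: number of <rev> elements} for one API XML response."""
    root = ET.fromstring(xml_text)
    return {
        page.get("title"): len(page.findall("./revisions/rev"))
        for page in root.iter("page")
    }


# Trimmed-down copy of the example response (comments shortened, content elided)
sample = """<api><query><pages>
  <page pageid="1191" ns="0" title="API">
    <revisions>
      <rev user="Harryboyles" timestamp="2006-10-31T05:39:01Z" comment="revert">...content...</rev>
    </revisions>
  </page>
  <page pageid="11105676" ns="0" title="Main Page">
    <revisions>
      <rev user="Ryan Postlethwaite" timestamp="2007-06-26T19:05:06Z" comment="rv">...content...</rev>
    </revisions>
  </page>
</pages></query></api>"""

print(count_revisions(sample))  # {'API': 1, 'Main Page': 1}
```

To count a page's full history this way, the query would also need `rvlimit` and continuation handling, since a single response carries only part of the history.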

+0

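This is a very expensive way to get the number of revisions. You are requesting a lot of data (page content, edit summaries) that you just throw away. – Mark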

1

Here is code to get the number of revisions of a page (in this case, the JSON wiki page):

import requests

BASE_URL = "http://en.wikipedia.org/w/api.php"
TITLE = 'JSON'

# Ask only for revision and user IDs so each response stays small
parameters = {'action': 'query',
              'format': 'json',
              'continue': '',
              'titles': TITLE,
              'prop': 'revisions',
              'rvprop': 'ids|userid',
              'rvlimit': 'max'}

total_revisions = 0

while True:
    wp_call = requests.get(BASE_URL, params=parameters)
    response = wp_call.json()

    # Add up the revisions returned in this batch
    for page_id in response['query']['pages']:
        total_revisions += len(response['query']['pages'][page_id]['revisions'])

    if 'continue' in response:
        # More batches remain: carry the continuation tokens into the next request
        parameters['continue'] = response['continue']['continue']
        parameters['rvcontinue'] = response['continue']['rvcontinue']
    else:
        break

print(parameters['titles'], total_revisions)

You can check the result here: https://en.wikipedia.org/w/index.php?title=JSON&action=info#Edit_history

(This is accessible from the sidebar of the corresponding Wikipedia page: Tools - Page information.)
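To try the continuation logic without calling Wikipedia, the loop can be wrapped in a function that takes the request step as an argument. The following is a rough sketch, not part of the original answer: `count_all_revisions`, `fetch`, and `fake_fetch` are hypothetical names, and the canned responses only imitate the shape of the JSON that the code above reads.

```python
def count_all_revisions(fetch, params):
    """Sum the revisions over every batch of a continued query.

    fetch is a hypothetical callable: it takes a dict of query
    parameters and returns the decoded JSON response as a dict.
    """
    params = dict(params)  # work on a copy so the caller's dict is untouched
    total = 0
    while True:
        response = fetch(params)
        for page in response["query"]["pages"].values():
            total += len(page.get("revisions", []))
        if "continue" not in response:
            return total
        # Copy every continuation token into the next request
        params.update(response["continue"])


# Canned two-batch reply standing in for the live API (made-up data)
batches = iter([
    {"continue": {"rvcontinue": "token", "continue": "||"},
     "query": {"pages": {"1": {"revisions": [{"revid": 3}, {"revid": 2}]}}}},
    {"query": {"pages": {"1": {"revisions": [{"revid": 1}]}}}},
])


def fake_fetch(params):
    return next(batches)


print(count_all_revisions(fake_fetch, {"titles": "JSON"}))  # 3
```

Against the real API, `fetch` could be something like `lambda p: requests.get(BASE_URL, params=p).json()`, passed together with the `parameters` dict from the answer.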