2013-05-01

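How do I generate a network matrix from data I collected from the Twitter API?

Sort of a Python noob here. I took Python code from Matthew Russell's books "21 Recipes for Mining Twitter" and "Mining the Social Web" for a project that collects various data from the Twitter API. See his GitHub page: https://github.com/ptwobrussell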

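One thing I can't figure out is how to generate a network matrix/graph based on the relationships between a user and his/her followers/friends. For example, here is his Python code for collecting a Twitter user's friends (also here: https://github.com/ptwobrussell/Recipes-for-Mining-Twitter/blob/master/recipe__get_friends_followers.py):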

# -*- coding: utf-8 -*- 

import sys 
import twitter 
from recipe__make_twitter_request import make_twitter_request 
import functools 

SCREEN_NAME = sys.argv[1] 
MAX_IDS = int(sys.argv[2]) 

if __name__ == '__main__': 

    # Not authenticating lowers your rate limit to 150 requests per hr. 
    # Authenticate to get 350 requests per hour. 

    t = twitter.Twitter(domain='api.twitter.com', api_version='1') 

    # You could call make_twitter_request(t, t.friends.ids, *args, **kw) or 
    # use functools to "partially bind" a new callable with these parameters 

    get_friends_ids = functools.partial(make_twitter_request, t, t.friends.ids) 

    # Ditto if you want to do the same thing to get followers... 

    # getFollowerIds = functools.partial(make_twitter_request, t, t.followers.ids) 

    cursor = -1 
    ids = [] 
    while cursor != 0:

        # Use make_twitter_request via the partially bound callable...

        response = get_friends_ids(screen_name=SCREEN_NAME, cursor=cursor)
        ids += response['ids']
        cursor = response['next_cursor']

        print >> sys.stderr, 'Fetched %i total ids for %s' % (len(ids), SCREEN_NAME)

        # Consider storing the ids to disk during each iteration to provide
        # an additional layer of protection from exceptional circumstances

        if len(ids) >= MAX_IDS:
            break

    # Do something useful with the ids like store them to disk... 

    print ids 

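So I have managed to run this code successfully with a given user as the first command-line argument. But how can I get this data into a matrix that I can then analyze, run formulas on (such as centrality), and so on? So far I think I probably need some combination of packages that may include NetworkX, Redis, and Matplotlib, but the actual steps for generating the matrix elude me.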

Take a look at my various huge Twitter visualizations: www.twittercensus.se/graph2013 www.finnishtwitter.com and www.twittercensus.dk - feel free to ask if you have any questions! – 2013-05-01 19:34:50

Great visualizations. Do you have a publicly available script for collecting the data? What programming language and packages did you use? – TJE 2013-05-01 21:45:26

Answer

You can store the data in a database or in a file. Choose whichever is better supported by the software you will use to analyze the data.

Here is an example of a file in .gdf format, which lets you store both node and edge data:

nodedef> id VARCHAR, label VARCHAR, followerCount VARCHAR 
1623,jchris,5610 
13348,Scobleizer,319673 
21213,tlg,1141 
... 
edgedef> user VARCHAR,friend VARCHAR 
1623,13348 
1623,621713 
... 
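
A file in this layout could be written with a small helper along the following lines. This is only a sketch in Python 3: the function name `write_gdf` and the shapes of `nodes` and `edges` are assumptions made for illustration, and the sample values are the ones shown in the example file. It does no escaping, so a label containing a comma would need extra handling.

```python
import io


def write_gdf(nodes, edges, out):
    """Write nodes and edges in the .gdf layout shown above.

    nodes: dict mapping user id -> (label, follower_count)   (assumed shape)
    edges: iterable of (user_id, friend_id) pairs            (assumed shape)
    out:   any writable text file object
    """
    out.write("nodedef> id VARCHAR, label VARCHAR, followerCount VARCHAR\n")
    for user_id, (label, follower_count) in sorted(nodes.items()):
        out.write("%s,%s,%s\n" % (user_id, label, follower_count))
    out.write("edgedef> user VARCHAR,friend VARCHAR\n")
    for user_id, friend_id in edges:
        out.write("%s,%s\n" % (user_id, friend_id))


# Sample values copied from the example file above.
nodes = {
    1623: ("jchris", 5610),
    13348: ("Scobleizer", 319673),
    21213: ("tlg", 1141),
}
edges = [(1623, 13348), (1623, 621713)]

# Write to an in-memory buffer; pass an open file instead to store on disk.
buf = io.StringIO()
write_gdf(nodes, edges, buf)
gdf_text = buf.getvalue()
print(gdf_text)
```

To write to disk, pass a file opened with `open(path, "w")` in place of the `io.StringIO` buffer.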

In your case, the code you quoted extracts only part of the edges; you will also need a separate extraction step to get the nodes.
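
Once the friend ids for several users have been collected, the matrix asked about in the question can be built directly from them. The following is a minimal sketch in Python 3 using only the standard library. The names `friends_of`, `build_adjacency`, and `in_degree_centrality` are hypothetical, and the sample data is made up for illustration. It assumes you have run the recipe once per user and kept each resulting `ids` list under that user's id.

```python
def build_adjacency(friends_of):
    """Return (sorted node ids, adjacency matrix as a list of lists).

    friends_of maps a user id to the list of ids that user follows
    (assumed to be gathered by running the recipe once per user).
    matrix[i][j] == 1 means node i follows node j (a directed edge).
    """
    node_ids = sorted(
        set(friends_of) | {f for fs in friends_of.values() for f in fs}
    )
    index = {node_id: i for i, node_id in enumerate(node_ids)}
    n = len(node_ids)
    matrix = [[0] * n for _ in range(n)]
    for user_id, friend_ids in friends_of.items():
        for friend_id in friend_ids:
            matrix[index[user_id]][index[friend_id]] = 1
    return node_ids, matrix


def in_degree_centrality(node_ids, matrix):
    """Fraction of the other nodes that follow each node."""
    n = len(node_ids)
    if n < 2:
        return {node_id: 0.0 for node_id in node_ids}
    return {
        node_id: sum(row[j] for row in matrix) / (n - 1)
        for j, node_id in enumerate(node_ids)
    }


# Made-up sample: 1623 follows two accounts, 21213 follows one.
friends_of = {
    1623: [13348, 621713],
    21213: [13348],
}
node_ids, matrix = build_adjacency(friends_of)
centrality = in_degree_centrality(node_ids, matrix)

for node_id, row in zip(node_ids, matrix):
    print(node_id, row, round(centrality[node_id], 3))
```

A dense list-of-lists matrix is fine for a few thousand nodes but grows quadratically. For larger graphs, NetworkX (mentioned in the question) is probably the better fit: it provides `DiGraph` and `in_degree_centrality`, which uses the same in-degree divided by n - 1 normalization.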