Node count of the graph does not match
I have an MDB database of forum posts with the following attributes:
thread
author (posted in the thread)
children (a list of authors who replied to the post)
child_count (number of children in the list)
I am trying to build a graph with the following nodes:
thread
author
child authors
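The total number of distinct authors in my database is over 30,000, but the generated graph contains only about 3,000 author nodes. Put another way, out of roughly 33,000 nodes in total, the following code produces only about 5,000. What is happening here?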
    for doc in coll.find():
        thread = doc['thread'].encode('utf-8')
        author_parent = doc['author'].encode('utf-8')
        children = doc['children']
        children_count = len(children)
        #print G.nodes()
        #print post_parent, author, doc['thread']
        try:
            if thread in G:
                continue
            else:
                G.add_node(thread, color='red')
                thread_count += 1
            if author_parent in G:
                G.add_edge(author_parent, thread)
            else:
                G.add_node(author_parent, color='green')
                G.add_edge(author_parent, thread, weight=0)
                author_count += 1
            if doc['child_count'] != 0:
                for doc in children:
                    if doc['author'].encode("utf-8") in G:
                        print doc['author'].encode("utf-8"), 'in G'
                        G.add_edge(doc['author'].encode("utf-8"), author_parent)
                    else:
                        G.add_node(doc['author'].encode("utf-8"), color='green')
                        G.add_edge(doc['author'].encode("utf-8"), author_parent, weight=0)
                        author_count += 1
        except:
            print "failed"
    nx.write_dot(G, PATH)
    print thread_count, author_count, children_count
Are you sure `coll.find()` returns 30,000 results? – brice 2012-04-23 09:55:25
@brice coll.find() returns over 500,000 results. There are about 30,000 distinct authors. – codious 2012-04-23 11:22:00
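One possible cause, assuming the indentation of the loop is as it appears to be: `if thread in G: continue` skips the whole document whenever its thread is already in the graph, so only the first post of each thread ever contributes its author and children. With over 500,000 posts spread over far fewer threads, that alone could account for most of the missing authors. The sketch below is a minimal reproduction of that effect, not the original program. It is written for Python 3, uses plain sets in place of the networkx graph so that it runs without dependencies, and the sample documents and the `collect_nodes` helper are made up for illustration.

```python
# Hypothetical sample documents, shaped like the fields described in the question.
docs = [
    {"thread": "t1", "author": "alice", "children": [{"author": "bob"}], "child_count": 1},
    {"thread": "t1", "author": "carol", "children": [{"author": "dave"}], "child_count": 1},
    {"thread": "t2", "author": "erin", "children": [], "child_count": 0},
    {"thread": "t2", "author": "alice", "children": [{"author": "frank"}], "child_count": 1},
]


def collect_nodes(docs, skip_seen_threads):
    """Return the (threads, authors) sets reached by the loop.

    The sets stand in for the graph's nodes: `x in threads` plays the
    role of `x in G` in the question's code.
    """
    threads, authors = set(), set()
    for doc in docs:
        thread = doc["thread"]
        if thread in threads:
            if skip_seen_threads:
                # Mirrors the `continue` in the question: the author and
                # children of this document are never processed.
                continue
        else:
            threads.add(thread)
        # Adding to a set is idempotent, so no membership check is needed.
        authors.add(doc["author"])
        for child in doc["children"]:
            authors.add(child["author"])
    return threads, authors


threads_skip, authors_skip = collect_nodes(docs, skip_seen_threads=True)
threads_all, authors_all = collect_nodes(docs, skip_seen_threads=False)

# Same threads either way, but fewer authors when seen threads are skipped.
print(len(threads_skip), len(authors_skip))
print(len(threads_all), len(authors_all))
```

If this is indeed the cause, dropping the `continue` and only guarding the `add_node` call for the thread should bring the author count closer to the expected figure. Two other things may be worth checking: thread titles and author names share one node namespace in `G`, so an author whose name equals a thread title would be merged with it, and the bare `except:` hides whatever error triggers the "failed" message.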