
Node count mismatch in graph

I have an MDB database of forum posts with the following attributes:

thread 
author (posted in the thread) 
children (a list of authors who replied to the post) 
child_count (number of children in the list) 

I am trying to build a graph with the following nodes:

thread 
author 
child authors 

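The total number of distinct authors in my database is over 30,000, but the generated graph contains only about 3,000 author nodes. Put another way, out of roughly 33,000 expected nodes in total, the following code produces only about 5,000. What is happening here?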

for doc in coll.find():

    thread = doc['thread'].encode('utf-8')
    author_parent = doc['author'].encode('utf-8')
    children = doc['children']
    children_count = len(children)
    #print G.nodes()

    #print post_parent, author, doc['thread']
    try:
        if thread in G:
            continue
        else:
            G.add_node(thread, color='red')
            thread_count += 1

        if author_parent in G:
            G.add_edge(author_parent, thread)
        else:
            G.add_node(author_parent, color='green')
            G.add_edge(author_parent, thread, weight=0)
            author_count += 1

        if doc['child_count'] != 0:
            for child in children:
                child_author = child['author'].encode('utf-8')
                if child_author in G:
                    print child_author, 'in G'
                    G.add_edge(child_author, author_parent)
                else:
                    G.add_node(child_author, color='green')
                    G.add_edge(child_author, author_parent, weight=0)
                    author_count += 1

    except:
        print "failed"
        nx.write_dot(G, PATH)

    print thread_count, author_count, children_count

Are you sure `coll.find()` returns 30,000 results? – brice 2012-04-23 09:55:25


@brice coll.find() returns over 500,000 results. There are about 30,000 distinct authors. – codious 2012-04-23 11:22:00

Answer


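I found the answer. The continue statement jumps straight to the next iteration, so whenever a thread was already in the graph, the rest of the loop body was skipped. The author and children of every later post in that thread were never added, and I lost many nodes this way.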
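To illustrate the fix, here is a minimal sketch of the same loop without the early continue. It is not the original code: it is written for Python 3, uses a plain dict and set as a stand-in for the graph object so it runs on its own, and the function name `build_graph` and the sample documents are hypothetical. It assumes, as in the question, that each entry in `children` is a document with an `author` field.

```python
def build_graph(docs):
    """Build a simple graph from forum post documents.

    Returns (nodes, edges): nodes maps node name -> color,
    edges is a set of (source, target) pairs.
    """
    nodes = {}
    edges = set()
    for doc in docs:
        thread = doc["thread"]
        author = doc["author"]

        # Add the thread only once, but do NOT skip the rest of the body
        # when it already exists: later posts in the same thread still
        # bring new authors and children.
        nodes.setdefault(thread, "red")

        nodes.setdefault(author, "green")
        edges.add((author, thread))

        for child in doc.get("children", []):
            child_author = child["author"]
            nodes.setdefault(child_author, "green")
            edges.add((child_author, author))
    return nodes, edges


# Hypothetical sample data: two posts that belong to the same thread.
sample_docs = [
    {"thread": "t1", "author": "alice", "children": [{"author": "bob"}]},
    {"thread": "t1", "author": "carol", "children": [{"author": "dave"}]},
]

nodes, edges = build_graph(sample_docs)
authors = sorted(name for name, color in nodes.items() if color == "green")
print(authors)  # ['alice', 'bob', 'carol', 'dave']
```

With the original continue in place, the second post would have been skipped entirely because `t1` was already present, and carol and dave would never have appeared in the graph.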