Reading a bipartite graph

I am trying to read a graph from a .txt file with the following structure:

115564928125997351000, ah.creativecodeapps.tiempo,1 
117923818995099650007, air.com.agg.popcornmakermarket,-1 
104000841215686444437, air.com.zahdoo.cadie,1 
. 
. 
.

I have been using the following command:

g=nx.read_weighted_edgelist('siatoy.txt', delimiter=',',nodetype=str,encoding='utf-8')

However, when I call g.edges(data=True), I get this:

[('106784557494786869271', ' com.map2app.U5635321228165120A5661458385862656'), 
('106784557494786869271',' com.jb.gokeyboard.theme.mzdevelopment.americankeyboard'), 
('106784557494786869271', ' com.benbasha.whoopeecushion'), 
(' com.airplaneflighttakeoff', '115981152169430603941'),...]

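But I want the numeric ID to always be the first element of each tuple. Note that this is not the case for the last element of the list shown in the example.

How can I do this? I need to iterate over the edges later, and the order within each edge matters, which means the first element of the tuple must always be the numeric ID.

Can this be achieved while reading the graph, or does it have to be done afterwards?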

2015-11-04 Nestorghh

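One idea is to use str.isdigit to test whether a node is numeric (see, for example, this SO answer). You can then build a list of edges, ordering each edge so that the numeric node comes first: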

edges = []
for u, v, d in G.edges(data=True):  # d holds each edge's data dict
    if u.isdigit():  # u is numeric, so keep it first
        edges.append((u, v, d))
    else:  # otherwise swap so that v comes first
        edges.append((v, u, d))

Or as a one-liner:

edges = [(u, v, d) if u.isdigit() else (v, u, d) for u, v, d in G.edges(data=True)]
print(edges)

This outputs:

[('117923818995099650007', ' air.com.agg.popcornmakermarket', {'weight': -1.0}), 
('104000841215686444437', ' air.com.zahdoo.cadie', {'weight': 1.0}), 
('115564928125997351000', ' ah.creativecodeapps.tiempo', {'weight': 1.0})]
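If you would rather fix the orientation while reading, one option is to parse the file yourself instead of going through the graph. The following is only a sketch under a few assumptions: the helper name read_oriented_edges is made up for illustration, the sample rows are copied from the question and stand in for the real file, and the fields are stripped of surrounding whitespace. Stripping means the app names lose the leading space seen in the output above, and it also keeps str.isdigit from failing on an ID that happens to be padded with spaces.

```python
import csv
import io

# Sample rows copied from the question; a real script would pass an open file instead.
SAMPLE = """115564928125997351000, ah.creativecodeapps.tiempo,1
117923818995099650007, air.com.agg.popcornmakermarket,-1
104000841215686444437, air.com.zahdoo.cadie,1
"""

def read_oriented_edges(lines):
    """Return (numeric_id, app, weight) tuples in file order (hypothetical helper)."""
    edges = []
    for row in csv.reader(lines):
        if len(row) != 3:
            continue  # skip blank or malformed lines
        u, v, w = (field.strip() for field in row)
        if not u.isdigit():
            u, v = v, u  # put the numeric ID first
        edges.append((u, v, float(w)))
    return edges

edges = read_oriented_edges(io.StringIO(SAMPLE))
for u, v, w in edges:
    print(u, v, w)
```

Iterating over this list keeps both the file order and the ID-first orientation. If a graph is still needed, the same list should be usable with G.add_weighted_edges_from(edges), but an undirected graph may again report an edge in either orientation, so keep the list around for the ordered iteration.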

2015-11-04 19:25:26 mdml
