2017-08-24 32 views
2

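Merge lists inside a list based on a condition

I have been trying to merge/parse this list, which contains multiple lists, into a single list.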

The list I want to parse/merge has the following format:

list_one = [ [['id1'],['value']], 
      [['id1'],['value1'],['value2'],['value3'],['value4'],['value5']], 
      [['id1'],['value6']], 
      [['id1'],['value7'],['value8']], 
      [['id2'],['value']], 
      [['id2'],['value1'],['value2'],['value3'],['value4'],['value5']], 
      [['id2'],['value6']], 
      [['id2'],['value7'],['value8']] 
] 

After some googling I came up with this code:

import itertools

# Take the single string out of every inner list and de-duplicate them
pre_info = list(set(i[0] for i in itertools.chain.from_iterable(list_one)))
final_info = list(map(lambda x: [x], sorted(pre_info, key=len)))
print(final_info)

But it only printed my IDs.

The desired output is:

final_list = [ 
       [['id1'],['value'],['value1'],['value2'],['value3'],['value4'],['value5'],['value6'],['value7'],['value8']], 
       [['id2'],['value'],['value1'],['value2'],['value3'],['value4'],['value5'],['value6'],['value7'],['value8']] 
] 

The condition for each row is obviously the 'id', which is always in the first position of each list.

+1

Why still stick with nested lists that each hold a single element? Why not ['id', 'value', 'value1', 'value2', 'value3', 'value4', 'value5', 'value6', 'value7', 'value8']?

+0

Are these 'id1' and 'id2' rows always grouped together (so consecutive lists have the same id value, with no mixing of ids)?

Answers

0

You can try this:

import collections 

list_one = [ [['id1'],['value']], 
     [['id1'],['value1'],['value2'],['value3'],['value4'],['value5']], 
     [['id1'],['value6']], 
     [['id1'],['value7'],['value8']], 
     [['id2'],['value']], 
     [['id2'],['value1'],['value2'],['value3'],['value4'],['value5']], 
     [['id2'],['value6']], 
     [['id2'],['value7'],['value8']] 
] 

d = collections.defaultdict(list) 
for row in list_one: 
    d[row[0][0]].extend(row[1:]) 

final_output = sorted([[[a]]+b for a, b in d.items()], key = lambda x: int(x[0][0][-1])) 

Final output:

[[['id1'], ['value'], ['value1'], ['value2'], ['value3'], ['value4'], ['value5'], ['value6'], ['value7'], ['value8']], [['id2'], ['value'], ['value1'], ['value2'], ['value3'], ['value4'], ['value5'], ['value6'], ['value7'], ['value8']]] 
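
Note that the sort key above only looks at the last character of each id, so it assumes ids end in a single digit; an id such as 'id10' would be sorted as 0. A hedged sketch of a key that uses the whole trailing number instead (the helper name id_number and the sample rows are made up for this example, and ids are assumed to end in digits):

```python
import re

def id_number(row):
    """Return the trailing integer of the id in row[0][0], e.g. 'id10' -> 10."""
    match = re.search(r'(\d+)$', row[0][0])
    # Assumption: ids without trailing digits sort before all numbered ids
    return int(match.group(1)) if match else -1

# Hypothetical rows in the same [[id], [value], ...] layout as the question
rows = [[['id10'], ['x']], [['id2'], ['y']], [['id1'], ['z']]]
ordered = sorted(rows, key=id_number)
print(ordered)
```
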
+0

Well, it passed in my tests. I'll try it with the real data! Thanks a lot!

3

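You need to group your values by each unique id; you can't just flatten everything. You have to use a dictionary to group the lists by id or, if the lists for each unique id are consecutive, use itertools.groupby().

Using a dictionary: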

by_id = {} 
for id, *values in list_one: 
    # unwrap values as we add them to the id group 
    by_id.setdefault(id[0], []).extend(v[0] for v in values) 

# extract all IDs and value lists into a new list 
final_list = [[id] + values for id, values in sorted(by_id.items())] 

Or, for Python 2:

by_id = {} 
for row in list_one: 
    # unwrap values as we add them to the id group 
    id, values = row[0][0], row[1:] 
    by_id.setdefault(id, []).extend(v[0] for v in values) 

# extract all IDs and value lists into a new list 
final_list = [[id] + values for id, values in sorted(by_id.items())] 

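I sorted the output list by ID, because dictionaries have no inherent order (insertion order is only guaranteed from Python 3.7 onwards). Note that I removed the wrapping singleton list objects; these take up memory you don't need to use, and they complicate the algorithm.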

If you need the lists to appear in order of first appearance, you can use a collections.OrderedDict() object for by_id and skip the sorting.
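
A minimal sketch of the first-appearance variant, assuming the same layout as list_one in the question (the shortened sample data is made up for illustration, with 'id2' deliberately placed first):

```python
from collections import OrderedDict

# Hypothetical sample data in the question's format; 'id2' appears first on purpose
list_one = [
    [['id2'], ['value']],
    [['id1'], ['value1'], ['value2']],
    [['id2'], ['value6']],
    [['id1'], ['value7']],
]

by_id = OrderedDict()
for row in list_one:
    # row[0] is the singleton id list, the rest are singleton value lists
    key, values = row[0][0], row[1:]
    by_id.setdefault(key, []).extend(v[0] for v in values)

# No sorted() here: iteration follows the order in which ids were first seen
final_list = [[key] + values for key, values in by_id.items()]
print(final_list)
```

On Python 3.7 and later a plain dict keeps insertion order as well, so by_id = {} should behave the same way there.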

As mentioned, if the lists for each id are already consecutive, you can use itertools.groupby() to do the grouping in one step:

from itertools import groupby 

[[id] + [value[0] for sublist in group for value in sublist[1:]] 
for id, group in groupby(list_one, lambda s: s[0][0])] 
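
To illustrate why the consecutive requirement matters, here is a small sketch with made-up data in which the ids are interleaved. groupby() starts a new group every time the key changes, so the same id shows up more than once unless the rows are sorted by id first (the names mixed_rows, get_id and merge are only for this example):

```python
from itertools import groupby

# Hypothetical data: the ids are NOT consecutive
mixed_rows = [
    [['id1'], ['a']],
    [['id2'], ['b']],
    [['id1'], ['c'], ['d']],
]

def get_id(row):
    """The id string sits inside the first singleton list of each row."""
    return row[0][0]

def merge(rows):
    """Merge runs of adjacent rows that share an id into one flat row."""
    return [[key] + [v[0] for sub in group for v in sub[1:]]
            for key, group in groupby(rows, get_id)]

# Without sorting, 'id1' ends up in two separate groups
unsorted_result = merge(mixed_rows)
# sorted() is stable, so the values keep their original relative order per id
sorted_result = merge(sorted(mixed_rows, key=get_id))

print(unsorted_result)
print(sorted_result)
```
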

Demo:

>>> by_id = {} 
>>> for id, *values in list_one: 
...  # unwrap values as we add them to the id group 
...  by_id.setdefault(id[0], []).extend(v[0] for v in values) 
... 
>>> [[id] + values for id, values in sorted(by_id.items())] 
[['id1', 'value', 'value1', 'value2', 'value3', 'value4', 'value5', 'value6', 'value7', 'value8'], ['id2', 'value', 'value1', 'value2', 'value3', 'value4', 'value5', 'value6', 'value7', 'value8']] 
>>> 
>>> from itertools import groupby 
>>> [[id] + [value[0] for sublist in group for value in sublist[1:]] 
... for id, group in groupby(list_one, lambda s: s[0][0])] 
[['id1', 'value', 'value1', 'value2', 'value3', 'value4', 'value5', 'value6', 'value7', 'value8'], ['id2', 'value', 'value1', 'value2', 'value3', 'value4', 'value5', 'value6', 'value7', 'value8']] 

If you feel you must have those singleton lists in your output, feel free to add them back.
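
For instance, a sketch of wrapping every element again, assuming final_list holds flat rows like the ones produced above (shortened here for illustration):

```python
# Flat rows as produced by the grouping code (shortened sample)
final_list = [
    ['id1', 'value', 'value1'],
    ['id2', 'value', 'value1'],
]

# Wrap every element in its own one-element list
wrapped = [[[item] for item in row] for row in final_list]
print(wrapped)
```
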

+0

@MartijnPieters It gives me a SyntaxError on the * in "for id, *values in list_one:". Thanks a lot for your help.

+0

@RicardoRibeiro: Are you using Python 2? I added a version for older Pythons.

+0

@MartijnPieters, yes, I'm using version 2.0. Great answer, with lots of detail and great information. Thank you very much!

0

The answers above provide good solutions; here is another way to do it, although I agree with @Martijn Pieters ♦ and his solution in terms of readability.

import itertools 

chained = itertools.chain.from_iterable(list_one) 

schain = set([tuple(c) for c in chained]) 

{('id',), 
('value',), 
('value1',), 
('value2',), 
('value3',), 
('value4',), 
('value5',), 
('value6',), 
('value7',), 
('value8',)} 


list(sorted([list(v) for v in schain])) 

[['id'], 
['value'], 
['value1'], 
['value2'], 
['value3'], 
['value4'], 
['value5'], 
['value6'], 
['value7'], 
['value8']] 

Edit, based on other values:

temp = [list(v) for v in schain] 

temp.pop(temp.index(['id'])) 

temp.sort() 

temp.insert(0, ['id']) 

[['id'], 
['abc'], 
['value'], 
['value1'], 
['value2'], 
['value3'], 
['value4'], 
['value5'], 
['value6'], 
['value7'], 
['value8']] 
0

I have this solution, but it only works if the id is a string or an int, and the id must be at the head of each list:

l=[ [['id1'],['value']], 
      [['id1'],['value1'],['value2'],['value3'],['value4'],['value5']], 
      [['id1'],['value6']], 
      [['id1'],['value7'],['value8']], 
      [['id2'],['value']], 
      [['id2'],['value1'],['value2'],['value3'],['value4'],['value5']], 
      [['id2'],['value6']], 
      [['id2'],['value7'],['value8']] 
] 
d={} 

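# Python 2 code (dict.iteritems and the print statement)
# Create an empty value list for every id
for ll in l: 
    d[ll[0][0]]=[] 
# Append each row's values to the list of its id
for ll in l: 
    for lll in ll[1:]: 
        d[ll[0][0]].append(lll) 
result=[] 
# Rebuild each row as [[id], values...]; the dict order is arbitrary here
for key,items in d.iteritems(): 
    result.append([[key]]+items) 

print result 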

Result:

[[['id2'], ['value'], ['value1'], ['value2'], ['value3'], ['value4'], ['value5'], ['value6'], ['value7'], ['value8']], [['id1'], ['value'], ['value1'], ['value2'], ['value3'], ['value4'], ['value5'], ['value6'], ['value7'], ['value8']]]