2011-06-13 179 views
11

How to compare two lists of dicts in Python?

How can I compare two lists of dicts? The result should be the odd ones out from the list of dicts B.

Example:

ldA = [{'user':"nameA", 'a':7.6, 'b':100.0, 'c':45.5, 'd':48.9}, 
     {'user':"nameB", 'a':46.7, 'b':67.3, 'c':0.0, 'd':5.5}] 


ldB =[{'user':"nameA", 'a':7.6, 'b':99.9, 'c':45.5, 'd':43.7}, 
     {'user':"nameB", 'a':67.7, 'b':67.3, 'c':1.1, 'd':5.5}, 
     {'user':"nameC", 'a':89.9, 'b':77.3, 'c':2.2, 'd':6.5}] 

Here I want to compare ldA with ldB. It should print the output below.

ldB -> {user:"nameA", b:99.9, d:43.7} 
ldB -> {user:"nameB", a:67.7, c:1.1 } 
ldB -> {user:"nameC", a:89.9, b:77.3, c:2.2, d:6.5} 

I have gone through the link below, but it returns only the names, whereas I want the names and the values as shown above.

List of Dicts comparision to match between lists and detect value changes in Python

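There is no hierarchical diff for arbitrary structures here, so you will need to write a more complex algorithm based on what you know about the data. Is 'user' a special key? Is it used to establish the correspondence between items in the lists (suppose 'ldB' were out of order, what should the result be)? – 2011-06-13 16:56:23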

Yes, 'user' is the special key here – newbe 2011-06-13 16:59:27

For the rest of the program as well as here, it probably makes more sense to structure the data more like 'ldA = {'userA': {'a': 1, 'b': 2, ...}, ...}'. – 2011-06-13 17:27:32
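
The restructuring suggested in the comment above can be sketched as follows. This is only a minimal illustration, not code from the thread: the helper names `index_by` and `changed_values` are hypothetical, and the sketch assumes every dict carries a unique 'user' value.

```python
def index_by(records, key):
    """Turn a list of dicts into {record[key]: record-without-key}."""
    return {rec[key]: {k: v for k, v in rec.items() if k != key}
            for rec in records}


def changed_values(old, new):
    """For each entry in `new`, keep only the values that differ from `old`."""
    result = {}
    for name, fields in new.items():
        previous = old.get(name, {})  # unknown users compare against nothing
        diff = {k: v for k, v in fields.items()
                if k not in previous or previous[k] != v}
        if diff:
            result[name] = diff
    return result


ldA = [{'user': "nameA", 'a': 7.6, 'b': 100.0, 'c': 45.5, 'd': 48.9},
       {'user': "nameB", 'a': 46.7, 'b': 67.3, 'c': 0.0, 'd': 5.5}]
ldB = [{'user': "nameA", 'a': 7.6, 'b': 99.9, 'c': 45.5, 'd': 43.7},
       {'user': "nameB", 'a': 67.7, 'b': 67.3, 'c': 1.1, 'd': 5.5},
       {'user': "nameC", 'a': 89.9, 'b': 77.3, 'c': 2.2, 'd': 6.5}]

by_user = changed_values(index_by(ldA, 'user'), index_by(ldB, 'user'))
for name in sorted(by_user):
    print("ldB -> user: %s, %s" % (name, by_user[name]))
```

With the data keyed by user, the order of the original lists no longer matters, which is the point the comment raises.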

Answers

7

For a general solution, consider the following. It diffs correctly even if the users appear out of order in the lists.

def dict_diff(merge, lhs, rhs):
    """Generic dictionary difference."""
    diff = {}
    for key in lhs:
        # Auto-merge for a key missing on the right-hand side.
        if key not in rhs:
            diff[key] = lhs[key]
        # On collision, invoke the custom merge function.
        elif lhs[key] != rhs[key]:
            diff[key] = merge(lhs[key], rhs[key])
    for key in rhs:
        # Auto-merge for a key missing on the left-hand side.
        if key not in lhs:
            diff[key] = rhs[key]
    return diff

def user_diff(lhs, rhs):
    """Merge dictionaries using value from right-hand-side on conflict."""
    merge = lambda l, r: r
    return dict_diff(merge, lhs, rhs)

import copy

def push(x, k, v):
    """Returns copy of dict `x` with key `k` set to `v`."""
    x = copy.copy(x)
    x[k] = v
    return x

def pop(x, k):
    """Returns copy of dict `x` without key `k`."""
    x = copy.copy(x)
    del x[k]
    return x

def special_diff(lhs, rhs, k):
    # Transform each list of dicts into two levels of dicts, the first level indexed by k.
    lhs = dict((D[k], pop(D, k)) for D in lhs)
    rhs = dict((D[k], pop(D, k)) for D in rhs)
    # Diff at the first level.
    c = dict_diff(user_diff, lhs, rhs)
    # Transform back to the initial format.
    return [push(D, k, K) for (K, D) in c.items()]

You can then check the solution:

ldA = [{'user':"nameA", 'a':7.6, 'b':100.0, 'c':45.5, 'd':48.9}, 
     {'user':"nameB", 'a':46.7, 'b':67.3, 'c':0.0, 'd':5.5}] 
ldB =[{'user':"nameA", 'a':7.6, 'b':99.9, 'c':45.5, 'd':43.7}, 
     {'user':"nameB", 'a':67.7, 'b':67.3, 'c':1.1, 'd':5.5}, 
     {'user':"nameC", 'a':89.9, 'b':77.3, 'c':2.2, 'd':6.5}] 
import pprint 
if __name__ == '__main__': 
    pprint.pprint(special_diff(ldA, ldB, 'user')) 

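As Karl points out in his answer, you will need to replace the '!=' comparison in the 'dict_diff' function with a custom comparison operator, because you are comparing floating-point values. Alternatively, in this case, you can replace 'lambda l,r: r' with 'min' or 'max' (or whatever suits your needs). – 2011-06-13 17:39:01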

Now that's industrial strength! I think the call to 'merge' in 'dict_diff' should be 'user_diff', though. – 2011-06-13 17:49:57

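@Karl: It works as advertised; I actually tested it. The merge function is 'user_diff', as passed to 'dict_diff' in 'special_diff'. This indirection allows the same algorithm to be used both for diffing the lists and for diffing individual users. – 2011-06-13 17:51:55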

2

I'll assume that the corresponding dicts are in the same order in both lists.

Under that assumption, you can use the following code:

def diffs(L1, L2):
    answer = []
    for i, d1 in enumerate(L1):
        d = {}
        d2 = L2[i]
        for key in d1:
            if key not in d2:
                print("%s is in d1 but not in d2" % key)
            elif d1[key] != d2[key]:
                d[key] = d2[key]
        answer.append(d)
    return answer

Untested. Please comment if there are errors and I'll fix them.
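
Because this approach pairs the dicts by position, any trailing dicts in the longer list are never visited. A hedged variant that reports them whole, using `itertools.zip_longest` from the Python 3 standard library, might look like this; the name `diffs_padded` is hypothetical, and the sketch still relies on the same-order assumption stated in this answer.

```python
from itertools import zip_longest


def diffs_padded(L1, L2):
    """Positionally diff two lists of dicts; unmatched dicts in L2 are kept whole."""
    answer = []
    for d1, d2 in zip_longest(L1, L2):
        if d2 is None:
            # L1 is longer: there is nothing in L2 to report for this slot.
            continue
        if d1 is None:
            # L2 is longer: the whole dict is new.
            answer.append(dict(d2))
            continue
        # Keep the values from d2 that are new or differ from d1.
        answer.append({k: v for k, v in d2.items()
                       if k not in d1 or d1[k] != v})
    return answer


print(diffs_padded([{'a': 1, 'b': 2}], [{'a': 1, 'b': 3}, {'a': 9}]))
```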

First of all, thank you for your reply. This returns only the differing values, but I need the user-specific differing values from ldB – newbe 2011-06-13 17:10:05

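What do you mean by "user-specific"? Do you mean that you want to compare whether the value of "User" is the same in the dicts, or do you mean that you want to compare only certain keys that will be given as input to the function? – inspectorG4dget 2011-06-13 18:28:29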

3

My approach: build a lookup from ldA of the values to exclude, then compute the result by excluding the matching values from each item in ldB.

lookup = dict((x['user'], dict(x)) for x in ldA) 
# 'dict(x)' is used here to make a copy 
for v in lookup.values(): del v['user'] 

result = [
    dict(
        (k, v)
        for (k, v) in item.items()
        # keep the 'user' key, every value of an unknown user, and changed values
        if k == 'user'
        or item['user'] not in lookup
        or lookup[item['user']].get(k) != v
    )
    for item in ldB
]

You should, however, be aware that comparing floating-point values for exact equality like that can't be relied upon.
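
One way to act on this caveat is to compare with a tolerance instead of exact equality. The sketch below uses `math.isclose` from the standard library (Python 3.5+); the helper name `values_differ` and the tolerance values are assumptions chosen for illustration, not something given in the answer.

```python
import math

NUMERIC = (int, float)


def values_differ(old, new, rel_tol=1e-9, abs_tol=1e-9):
    """Return True if two values differ, treating nearly equal numbers as equal."""
    if isinstance(old, NUMERIC) and isinstance(new, NUMERIC):
        return not math.isclose(old, new, rel_tol=rel_tol, abs_tol=abs_tol)
    # Non-numeric values (such as the user name) fall back to plain inequality.
    return old != new


print(0.1 + 0.2 == 0.3)                 # exact comparison, affected by rounding
print(values_differ(0.1 + 0.2, 0.3))    # tolerant comparison
print(values_differ(100.0, 99.9))       # a genuine change
print(values_differ("nameA", "nameB"))  # non-numeric values
```

A function like this could stand in for the plain comparison in any of the answers here; how tight the tolerance should be depends on where the numbers come from.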

Thanks for the reply, +1 – newbe 2011-06-14 01:37:23

0

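This definitely requires making some assumptions based on your sample data, mainly that there won't be users in ldA that are not in ldB; if that is an invalid assumption, let me know. You would call it as dict_diff(ldA, ldB, 'user')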

def dict_diff(ldA, ldB, key):
    for i, dA in enumerate(ldA):
        d = {key: dA[key]}
        d.update(dict((k, v) for k, v in ldB[i].items() if v != dA[k]))
        print("ldB -> " + str(d))
    # Dicts in ldB with no counterpart in ldA are printed whole.
    for dB in ldB[len(ldA):]:
        print("ldB -> " + str(dB))

+1, thank you very much for your reply – newbe 2011-06-14 01:35:08

1

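Here is one more solution. It is a bit weird (sorry if I missed something), but it lets you configure your own equality check (you only need to modify the isEqual lambda for this), and it gives you two different options for handling the case when the keys differ: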

ldA = [{'user':"nameA", 'a':7.6, 'b':100.0, 'c':45.5, 'd':48.9}, 
     {'user':"nameB", 'a':46.7, 'b':67.3, 'c':0.0, 'd':5.5}] 


ldB =[{'user':"nameA", 'a':7.6, 'b':99.9, 'c':45.5, 'd':43.7}, 
     {'user':"nameB", 'a':67.7, 'b':67.3, 'c':1.1, 'd':5.5}, 
     {'user':"nameC", 'a':89.9, 'b':77.3, 'c':2.2, 'd':6.5}] 

ldA.extend(ldB.pop() for i in range(len(ldB)))  # merge both lists into one (this empties ldB)

output = []

# Add your custom equality check here, for example rounding values before comparison.
# Note that, despite its name, this returns True when the two values differ.
isEqual = lambda x, y: x != y

while len(ldA) > 0:  # iterate through the list
    row = ldA.pop(0)  # get the first element in the list and remove it from the list
    for i, srow in enumerate(ldA):
        if row['user'] != srow['user']:
            continue
        res = {'user': srow['user']}
        # the next line ignores all keys of srow which are not in row
        res.update(dict((key, val) for key, val in ldA.pop(i).items() if key in row and isEqual(val, row[key])))
        # the next line would include srow's key and value in the results even when there is no such pair in row
        #res.update(dict(filter(lambda d: isEqual(d[1], row[d[0]]) if d[0] in row else True, ldA.pop(i).items())))
        output.append(res)
        break
    else:
        # no other dict shares this user, so keep the row whole
        output.append(row)

print(output)

@andrew +1, thank you very much for the reply – newbe 2011-06-14 01:35:51

0

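I wrote this tool a while back; it can currently cope with nested lists, dicts and sets. It gives you more concise output (in `. > i:1 > 'c'`, the `.` refers to the top level and `i:1` refers to list index 1 being compared):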

compare(ldA, ldB) 
. > i:0 > 'b' dict value is different: 
100.0 
99.9 

. > i:0 > 'd' dict value is different: 
48.9 
43.7 

. > i:1 > 'a' dict value is different: 
46.7 
67.7 

. > i:1 > 'c' dict value is different: 
0.0 
1.1 

. lists differed at positions: 2 
['<not present>'] 
[{'c': 2.2, 'd': 6.5, 'a': 89.9, 'user': 'nameC', 'b': 77.3}]