Since you say the costs are integers, you can take advantage of that:
def neardup(items):
    forbidden = set()
    for elem in items:
        key = elem['name'], elem['code'], int(elem['cost'])
        if key not in forbidden:
            yield elem
            for diff in (-1, 0, 1):  # add all keys invalidated by this
                key = elem['name'], elem['code'], int(elem['cost']) - diff
                forbidden.add(key)
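To see why a plain set lookup is enough here, it can help to look at which keys a single kept item blocks. The helper `blocked_keys` below is a hypothetical name introduced only for illustration, and the sample record is made up; treat it as a sketch of the idea, assuming integer costs and a tolerance of plus or minus 1.

```python
def blocked_keys(elem):
    """Return the keys that a kept item rules out for later items.

    Hypothetical helper for illustration: it mirrors the inner loop of
    neardup, assuming integer costs and a tolerance of +/- 1.
    """
    cost = int(elem['cost'])
    return {(elem['name'], elem['code'], cost + diff) for diff in (-1, 0, 1)}


# Made-up sample record.
item = {'name': 'apple', 'code': 'A1', 'cost': '10'}
keys = blocked_keys(item)

# A later item with the same name and code is dropped when its cost is
# 9, 10 or 11, and kept when its cost is 8 or 12.
print(sorted(keys))
```

Because each kept item adds only three keys, every later membership test is a single hash lookup, whatever the length of the list.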
Here is a less tricky way that really computes the difference:
from collections import defaultdict

def neardup2(items):
    # this is a mapping `(name, code) -> [cost1, cost2, ... ]`
    forbidden = defaultdict(list)
    for elem in items:
        key = elem['name'], elem['code']
        curcost = float(elem['cost'])
        # an item is new if we never saw the key before
        if (key not in forbidden or
                # or if all the known costs differ by at least 2
                all(abs(cost - curcost) >= 2 for cost in forbidden[key])):
            yield elem
            forbidden[key].append(curcost)
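Both solutions avoid rescanning the whole list for every item. After all, the cost only matters when (name, code) is equal, so you can use a dictionary to look up all the candidates quickly.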
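As a rough illustration of that point, the sketch below puts a naive rescan next to a dictionary-grouped version and checks that they agree. Both function names (`neardup_rescan`, `neardup_grouped`) and the sample data are made up for this example, and the rule assumed is the same as above: two items are near-duplicates when name and code match and the costs differ by less than 2.

```python
from collections import defaultdict


def neardup_rescan(items):
    """Naive baseline (hypothetical): compare each item with every kept item."""
    kept = []
    for elem in items:
        if not any(k['name'] == elem['name'] and k['code'] == elem['code']
                   and abs(float(k['cost']) - float(elem['cost'])) < 2
                   for k in kept):
            kept.append(elem)
    return kept


def neardup_grouped(items):
    """Same rule, but only costs stored under the same (name, code) are checked."""
    seen = defaultdict(list)
    kept = []
    for elem in items:
        costs = seen[elem['name'], elem['code']]
        cost = float(elem['cost'])
        if all(abs(c - cost) >= 2 for c in costs):
            kept.append(elem)
            costs.append(cost)
    return kept


# Made-up sample data.
items = [
    {'name': 'apple', 'code': 'A1', 'cost': '10'},
    {'name': 'apple', 'code': 'A1', 'cost': '11'},  # within 1 of 10: dropped
    {'name': 'apple', 'code': 'A1', 'cost': '12'},  # 2 away from 10: kept
    {'name': 'pear', 'code': 'A1', 'cost': '10'},   # different name: kept
]

assert neardup_rescan(items) == neardup_grouped(items)
print([(e['name'], e['cost']) for e in neardup_grouped(items)])
```

The grouped version only ever compares an item against costs already stored under its own (name, code) key, so unrelated items never enter the comparison.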
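While there are technically better approaches (notably Jochen's answer, which uses 'yield' to reduce memory use on large lists), I prefer the readability of your approach. – Pranab 2011-04-17 20:59:23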