這裏有一個基於NumPy的方法 -
def all_combs(a): # Parte-1
num_combs = np.prod(list(map(len,a)))
return np.array(np.meshgrid(*a)).reshape(-1,num_combs).T
def get_minrep_combs(a): # Parte-2
out = all_combs(a)
counts = (np.diff(np.sort(out,axis=1),axis=1)==0).sum(1)
return out[counts == counts.min()]
採樣運行 -
In [161]: a = [np.array([1]),np.array([2]),np.array([2,3]),np.array([3]),\
...: np.array([4]),np.array([3,4,5])]
In [162]: all_combs(a) # Part-1 results
Out[162]:
array([[1, 2, 2, 3, 4, 3],
[1, 2, 2, 3, 4, 4],
[1, 2, 2, 3, 4, 5],
[1, 2, 3, 3, 4, 3],
[1, 2, 3, 3, 4, 4],
[1, 2, 3, 3, 4, 5]])
In [163]: get_minrep_combs(a) # Part-2 results
Out[163]:
array([[1, 2, 2, 3, 4, 5],
[1, 2, 3, 3, 4, 5]])
只給你們的all_combs
感,這裏有一個更有點 「正常」 的樣品的具體運行情況 -
In [166]: a = [np.array([2,3]), np.array([5,6,7])]
In [167]: all_combs(a)
Out[167]:
array([[2, 5],
[3, 5],
[2, 6],
[3, 6],
[2, 7],
[3, 7]])
In [164]: a = [np.array([2,3,4]), np.array([5,6,7,9])]
In [165]: all_combs(a)
Out[165]:
array([[2, 5],
[3, 5],
[4, 5],
[2, 6],
[3, 6],
[4, 6],
[2, 7],
[3, 7],
[4, 7],
[2, 9],
[3, 9],
[4, 9]])
出於性能
出於性能考慮,我們才能避免在part-1
轉置和part-2
沿着列(axis=0)
執行的操作,還可以使用切片以避免np.diff
,因此有一個優化的版本,像這樣 -
def get_minrep_combs_optimized(a): # Parte-1,2
num_combs = np.prod(list(map(len,a)))
out = np.array(np.meshgrid(*a)).reshape(-1,num_combs)
sorted_out = np.sort(out,axis=0)
counts = (sorted_out[1:] == sorted_out[:-1]).sum(0)
return out[:,counts == counts.min()].T
採樣運行 -
In [188]: a = [np.array([1]),np.array([2]),np.array([2,3]),np.array([3]),\
...: np.array([4]),np.array([3,4,5])]
In [189]: get_minrep_combs_optimized(a)
Out[189]:
array([[1, 2, 2, 3, 4, 5],
[1, 2, 3, 3, 4, 5]])
運行測試
這裏有一種方法來創建一個示例輸入數據,其中有高達3
elems的每個子列表具有跨越其他子列表元素一些比賽 -
In [42]: lens = np.random.randint(1,4,(20))
In [43]: a = [np.random.randint(1,10,L) for L in lens]
In [44]: lens
Out[44]: array([1, 1, 2, 2, 2, 2, 1, 2, 3, 2, 1, 2, 2, 3, 1, 1, 3, 2, 2, 3])
In [45]: a
Out[45]:
[array([8]),
array([8]),
array([7, 9]),
array([5, 5]),
array([6, 4]),
array([3, 1]),
array([8]),
array([1, 9]),
array([9, 5, 7]),
array([1, 1]),
array([3]),
array([1, 5]),
array([5, 5]),
array([7, 9, 2]),
array([5]),
array([1]),
array([3, 2, 9]),
array([3, 7]),
array([5, 3]),
array([2, 7, 3])]
計時 -
In [46]: %timeit leastReps(combinations(a)) #@Daniel Forsman's soln
1 loops, best of 3: 330 ms per loop
In [47]: %timeit get_minrep_combs_optimized(a)
10 loops, best of 3: 28.7 ms per loop
讓我們有更多的比賽 -
In [50]: a = [np.random.randint(1,4,L) for L in lens]
In [51]: %timeit leastReps(combinations(a)) #@Daniel Forsman's soln
1 loops, best of 3: 328 ms per loop
In [52]: %timeit get_minrep_combs_optimized(a)
10 loops, best of 3: 29.5 ms per loop
性能差異不會有太大變化。礦山和Divakar的之間
爲什麼問題得到了downvoted?請詳細說明,所以我可以改進它。 – Make42
@Divakar:我用過你的。 – Make42