快速（ER）的方法來確定列表中的所有非支配項

我有Ñ陣列與每個4個元素，即（Ñ = 2）的列表：快速（ER）的方法來確定列表中的所有非支配項

l = [[1, 2, 3, 4], [5, 6, 7, 8]]

和很試圖找到列表中「非支配」的所有元素 - 也就是說，它們不受列表中的任何其他元素支配。如果數組中的每個項目小於或等於另一個數組中的相應項目，則數組將支配另一個數組。所以

dominates([1, 2, 3, 4], [5, 6, 7, 8]) == True

as 1 <= 5 and 2 <= 6 and 3 <= 7 and 4 <= 8。但是

dominates([1, 2, 3, 9], [5, 6, 7, 8]) == False

as 9 > 8。這個功能是比較容易編寫，例如：

def dominates(a, b): 
    return all(i <= j for i, j in zip(a, b))

更簡潔，給l = [a1, a2, a3, .., an]其中a是長度爲4列，我想找找不受任何其他a in l主宰一切a。

我有以下解決方案：

def get_non_dominated(l): 
    to_remove = set() 
    for ind, item_1 in enumerate(l): 
     if item_2 in to_remove: 
      continue 
     for item_2 in l[ind + 1:]: 
      if dominates(item_2, item_1): 
       to_remove.add(item_1) 
       break 
      elif dominates(item_1, item_2): 
       to_remove.add(item_2) 
    return [i for i in l if i not in to_remove]

所以get_non_dominated([[1, 2, 3, 4], [5, 6, 7, 8]])應該返回[[1, 2, 3, 4]]。同樣get_non_dominated([[1, 2, 3, 9], [5, 6, 7, 8]])應該返回上述邏輯不變的列表（沒有任何東西支配其他）。

但是這項檢查發生了很多，l可能相當大。我想知道是否有人有想法來加速這個速度？我的第一個想法是嘗試使用numpy向量化這些代碼，但我對它的經驗相對較少，並且掙扎了一下。你可以假設l有所有獨特的數組。任何想法都非常感謝！

來源

2017-08-04 user8415803

這是我不清楚你要實現的目標。請更新您的問題，以包含具有示例輸入和輸出的工作代碼。 – kazemakase

我添加了一些額外的細節，希望能稍微澄清一點。 – user8415803

更新了答案以反映您想要的過濾（偶然是我展示的示例）。 –

如何：

import numpy as np 
np.all((np.asarry(l[1])-np.asarry(l[0]))>=0)

你可以去的情況下，你可以創建你的列表，numpy的陣列馬上一個simliar方式，即type(l) == np.ndarray。那麼語法是：

np.all(p[1])-p[0])>=0)

來源

2017-08-04 08:06:48 Nyps

'np.all'應該更快。 –

@ Imanol Luengo：感謝您的評論 – Nyps

@Nyps答案的另一個版本：

def dominates(a, b): 
    return (np.asarray(a) <= b).all()

它是使用numpy的代碼的量化方法。

如果您必須循環遍歷所有行，這可能仍然很慢。如果你有一個包含所有行的列表，並且你想要兩兩比較，你可以使用scipy創建一個N x N數組（其中N是行數）。

import numpy as np 
a = np.random.randint(0, 10, size=(1000, 10))

a這裏是一個1000 x 10陣列，模擬1000行10元素它的：

from scipy.spatial.distance import cdist 
X = cdist(a, a, metric=dominates).astype(np.bool)

X現在是包含的所有條目之間的成對比較的1000 x 1000矩陣。這是，X[i, j]包含True，如果樣本i支配樣本j或False否則。

您現在可以提取X花哨的結果，如支配他們所有的樣品：

>>> a[50] = 0 # set a row to all 0s to fake a dominant row 
>>> X = cdist(a, a, metric=dominates).astype(np.bool) 
>>> non_dominated = np.where(X.all(axis=1))[0] 
>>> non_dominated 
array([50])

樣品在50位置是統治者，如果你的人口，你應該密切關注它。

現在，如果你想保留只有主宰你可以這樣做：

if non_dominated.size > 0: 
    return [a[i] for i in non_dominated] 
else: # no one dominates every other 
    return a

這樣概括：

import numpy as np 
from scipy.spatial.distance import cdist 

def get_ruler(a): 
    X = cdist(a, a, metric=dominates).astype(np.bool) 
    rulers = np.where(X.all(axis=1))[0] 
    if rulers.size > 0: 
     return [a[i] for i in rulers] 
    else: # no one dominates every other 
     return a

來源

2017-08-04 08:19:11

感謝您的迴應！使用scipy的第二部分處於我正在尋找的領域，但不是查找支配其他行的行，而是需要所有不受**任何其他行支配的行。我認爲這應該是相當直接的從您擁有的東西中提取 - 非常感謝！ – user8415803

@ user8415803 Ops true！例如，我確實將名稱更改爲'get_ruler'。在你的情況下，如果我理解正確，你想檢查列'j'而不是'i'行（改爲'axis = 0'）。雖然，它不等同？我的意思是，如果一行是**沒有被任何其他行支配**，是否並不意味着它支配每隔一行？（只是好奇，我的邏輯可能會和我一起玩）。 –

快速（ER）的方法來確定列表中的所有非支配項

回答

相關問題