Multiple equality condition statements for a numpy array

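Say I have a large 2D numpy array (call it A) that contains integers from 0 to 9.

I want to write a function that returns a binary numpy array (call it B) with the same shape as A and the following property.

An entry in B is 1 if the corresponding element of A appears in a given list L; otherwise the entry is 0.

The code below works, but it may not be the most efficient.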

import numpy as np

A = np.random.randint(0, 10, (5, 5))
L = [3, 4, 5]

B = np.zeros(A.shape)
for e in L:
    B[A == e] = 1  # mark every position where A equals e

Is there a faster way?

Thanks!

Source

2017-02-17 Curious

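Here are two numpy options using np.in1d, which is a vectorized version of the `in` operator from base Python. When the array is large, the first option shows some speedup:

Option one (the fast one):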

np.in1d(A, L).reshape(A.shape).astype(int)
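
A side note on option one: newer NumPy releases provide np.isin, which does the same membership test but keeps the shape of A, so the reshape step is not needed. As far as I know, np.in1d is deprecated in NumPy 2.0 in favor of it. A minimal sketch, with a small hand-written A chosen only for illustration:

```python
import numpy as np

# Small example array (illustrative values, not from the original post)
A = np.array([[0, 3, 7],
              [4, 9, 5]])
L = [3, 4, 5]

# np.isin tests each element of A for membership in L and preserves A's shape
B = np.isin(A, L).astype(int)
print(B)
# [[0 1 0]
#  [1 0 1]]
```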

Option two (the slow one):

np.apply_along_axis(np.in1d, 0, A, L).astype(int)

Timing:

A = np.random.randint(0, 10, (1000, 1000))
L = [3, 4, 5]

def loop():
    B = np.zeros(A.shape)
    for e in L:
        B[A == e] = 1
    return B

%timeit np.in1d(A, L).reshape(A.shape).astype(int) 
# 100 loops, best of 3: 6.4 ms per loop 

%timeit loop() 
# 100 loops, best of 3: 16.8 ms per loop 

%timeit np.apply_along_axis(np.in1d, 1, A, L).astype(int) 
# 10 loops, best of 3: 21.5 ms per loop 

%timeit np.apply_along_axis(np.in1d, 0, A, L).astype(int) 
# 10 loops, best of 3: 35.1 ms per loop
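
The %timeit lines above only work in IPython. For anyone who wants to repeat the comparison in a plain Python script, here is a rough sketch using the standard timeit module. The fixed seed, the repeat counts, and the use of np.isin in place of np.in1d(...).reshape(...) are my own assumptions, and the numbers will differ from machine to machine.

```python
import timeit
import numpy as np

rng = np.random.default_rng(0)  # fixed seed: an assumption, only for repeatability
A = rng.integers(0, 10, (1000, 1000))
L = [3, 4, 5]

def loop():
    B = np.zeros(A.shape)
    for e in L:
        B[A == e] = 1
    return B

def isin_version():
    # np.isin stands in for np.in1d(A, L).reshape(A.shape) here
    return np.isin(A, L).astype(int)

candidates = {"loop": loop, "isin": isin_version}
timings = {}
for name, fn in candidates.items():
    # best of 3 repeats, 10 calls each, converted to seconds per call
    timings[name] = min(timeit.repeat(fn, number=10, repeat=3)) / 10
    print(f"{name}: {timings[name] * 1e3:.2f} ms per call")
```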

Result check:

B1 = loop()
B2 = np.apply_along_axis(np.in1d, 0, A, L).astype(int)
B3 = np.apply_along_axis(np.in1d, 1, A, L).astype(int)
B4 = np.in1d(A, L).reshape(A.shape).astype(int)

(B1 == B2).all() 
# True 

(B1 == B3).all() 
# True 

(B1 == B4).all() 
# True

Source

2017-02-17 00:30:02 Psidom

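My timing results tell a different story. I am using Python 3.5 and numpy 1.10.4 in Anaconda, and the original loop is about 3 times faster than using in1d. Edit: I now see that you used a 1000x1000 random matrix. I will retest with a matrix of the same size. –

@HAL9001 I just tested with Python 3 and got fairly consistent results. The first np.in1d option is faster. How large is the array you are testing? – Psidom

Psidom's results are confirmed when using 1000x1000. –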

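Using @Psidom's 1000x1000 matrix, I tried two other methods and included the np.in1d method provided by @Psidom for comparison.

One uses iterated summation, the other uses iterated bitwise OR.

The iterated bitwise OR, trial2(), proves itself below: it is about 4 times faster than the original and about 2 times faster than numpy's in1d, but note that it returns a boolean matrix.

When the bitwise method is modified to return an integer result, trial2_int(), its speed is essentially equal to that of numpy's in1d.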

A = np.random.randint(0, 10, (1000, 1000))
L = [3, 4, 5]

def original():
    B = np.zeros(A.shape)
    for e in L:
        B[A == e] = 1
    return B

def trial1():
    # Iterated summation. Start from zeros: += on np.empty would add to
    # uninitialized memory.
    B = np.zeros(A.shape)
    for e in L:
        B += A == e
    return B

def trial2():
    # Iterated bitwise OR; returns a boolean array.
    B = A == L[0]
    for e in L[1:]:
        B |= A == e
    return B

def trial2_int():
    B = trial2()
    return B.astype(int)

def trial_Psidom():
    B = np.in1d(A, L).reshape(A.shape).astype(int)
    return B

Results:

%timeit original() 
# 100 loops, best of 3: 10.5 ms per loop 
%timeit trial1() 
# 100 loops, best of 3: 9.43 ms per loop 
%timeit trial2() 
# 100 loops, best of 3: 2.37 ms per loop 
%timeit trial2_int() 
# 100 loops, best of 3: 5.31 ms per loop 
%timeit trial_Psidom() 
# 100 loops, best of 3: 5.37 ms per loop
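
The gap between trial2() and trial2_int() above plausibly comes from astype(int) allocating a new, wider array. If a 0/1 integer array is needed but a one-byte integer type is acceptable, one option is to reinterpret the boolean buffer with view(). This is a sketch under that assumption, not something I timed against the figures above; the small A is only for illustration.

```python
import numpy as np

A = np.array([[0, 3, 7],
              [4, 9, 5]])
L = [3, 4, 5]

# Same iterated bitwise OR as trial2(), starting from an all-False mask
mask = np.zeros(A.shape, dtype=bool)
for e in L:
    mask |= (A == e)

# bool and uint8 both use one byte per element, so view() reinterprets the
# same buffer as 0/1 integers without allocating a new array
B_view = mask.view(np.uint8)

# astype(int) allocates a new integer array
B_copy = mask.astype(int)

print(mask.sum())                      # 3 (booleans count as 0/1 in sums)
print(np.shares_memory(mask, B_view))  # True
print(np.shares_memory(mask, B_copy))  # False
```

Note that B_view shares memory with mask, so writing to one changes the other.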

Source

2017-02-17 00:57:21

If 'L' is small relative to 'A', 'np.in1d' does an iterative version of 'trial2': it iterates over 'L' and uses 'B |= ...'. – hpaulj
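
To make the comment above concrete, here is roughly what such a loop-over-L membership test looks like. This is an illustrative sketch, not NumPy's actual source; isin_small_list is a hypothetical name of my own.

```python
import numpy as np

def isin_small_list(A, L):
    """Hypothetical helper: membership test done by looping over L."""
    A = np.asarray(A)
    mask = np.zeros(A.shape, dtype=bool)
    for value in L:
        # OR in the positions that match this value
        mask |= (A == value)
    return mask

A = np.random.randint(0, 10, (50, 50))
L = [3, 4, 5]

# Agrees with NumPy's own membership test on this input
print(np.array_equal(isin_small_list(A, L), np.isin(A, L)))
```

The cost grows with len(L), which is why this only pays off when L is short.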

Looks like it falls to me to point out the obvious:

def AinL(A, L): 
    B = np.zeros((10,), int) 
    B[L] = 1 
    return B[A]
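
The function above works because the values in A (0 to 9) can be used directly as indices into a 10-entry lookup table. A small worked example follows, with the table size pulled out as a parameter. The name lookup_membership and the n_values parameter are my own additions for illustration, and the sketch assumes every value in A and L is an integer in the range 0 to n_values - 1.

```python
import numpy as np

def lookup_membership(A, L, n_values=10):
    """Hypothetical generalization of AinL; assumes 0 <= values < n_values."""
    table = np.zeros(n_values, dtype=int)
    table[L] = 1     # table[v] is 1 exactly when v is in L
    return table[A]  # fancy indexing: replace each element v of A by table[v]

A = np.array([[0, 3, 7],
              [4, 9, 5]])
L = [3, 4, 5]

print(lookup_membership(A, L))
# [[0 1 0]
#  [1 0 1]]
```

The work here is a single indexing pass over A regardless of how long L is, which fits the benchmark pattern below.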

Benchmarks:

10x10 #L=3 
orig  0.6665631101932377 
HAL  0.4370500799268484 
Psidom 1.13961720908992 
PP  0.23527960386127234 

100x100 #L=3 
orig  0.3015591569710523 
HAL  0.29902734607458115 
Psidom 0.4470538650639355 
PP  0.18963343487121165 

1000x1000 #L=4 
orig  0.5516874771565199 
HAL  0.5967503408901393 
Psidom 0.6331975681241602 
PP  0.23225238709710538 

10000x1000 #L=2 
orig  0.8539429588709027 
HAL  0.9840140701271594 
Psidom 1.0392512339167297 
PP  0.7203555379528552

Source

2017-02-17 01:50:20
