2016-06-09 43 views
2

Find the unique values across each row

I have a 2D numpy array, for example:

[["cat","dog","dog","mouse","man"], 
["rhino","rhino","bat","rhino","dino","dino"], 
["zebra","alien","alien","alien","alien"]] 

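I would like to run numpy.unique along each row to count the occurrences of each label. Unfortunately, I don't think this is possible, because numpy.unique would return vectors of different lengths: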

[["cat","dog","mouse","man"] 
["rhino","bat","dino"] 
["zebra","alien"]] 
(and similarly for the counts) 

So this obviously won't work.

Does anyone know how I can get around this?

+0

Use the array_unique function. doc: http://php.net/manual/en/function.array-unique.php If this is not what you are looking for, please add the expected result to your question. Thanks. –

+0

@NaveedAhmed 'numpy' is a *python* library. –

Answers

1

Try this:

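import pandas as pd

# Rows of unequal length are padded with missing values by the DataFrame constructor.
a = pd.DataFrame([["cat","dog","dog","mouse","man"], 
        ["rhino","rhino","bat","rhino","dino","dino"], 
        ["zebra","alien","alien","alien","alien"]]) 

# Unique labels per row; dropna() keeps the padding out of the result.
a.apply(lambda x: pd.Series(x.dropna().unique()), axis=1) 

+0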

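Thanks, this works for getting the unique values, but getting the counts turns out to be tricky on the dataset I'm actually using. If I use pd.value_counts in a way similar to what you've described, it returns a matrix with a huge number of columns, because I have a large number of potential values. Not sure how to proceed. – Colin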
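
One possible way around the wide-matrix problem raised in the comment above is to avoid a rectangular result altogether and keep one small mapping per row. The sketch below is not from the original answer: it assumes the rows fit in memory as plain Python lists (or as rows of a numpy array), and the helper name `row_counts` is made up for illustration. It calls numpy.unique with `return_counts=True` on each row separately, so every row only carries the labels it actually contains.

```python
import numpy as np

# Example data from the question; note that the rows have different lengths.
rows = [
    ["cat", "dog", "dog", "mouse", "man"],
    ["rhino", "rhino", "bat", "rhino", "dino", "dino"],
    ["zebra", "alien", "alien", "alien", "alien"],
]

def row_counts(rows):
    """Return one {label: count} dict per row (hypothetical helper)."""
    result = []
    for row in rows:
        # np.unique sorts the labels and, with return_counts=True,
        # also returns how often each label occurs in this row.
        labels, counts = np.unique(row, return_counts=True)
        result.append(dict(zip(labels.tolist(), counts.tolist())))
    return result

per_row = row_counts(rows)
for counts in per_row:
    print(counts)
```

Because the result is a list of dicts rather than a matrix, its size grows with the number of distinct labels per row, not with the total number of potential labels across the dataset. Whether that shape suits the downstream analysis depends on what is done with the counts next.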