2016-07-26

How to extract index/column/data from a Pandas DataFrame based on a logical operation?

I have the following DataFrame:

import numpy as np 
import pandas as pd 
data = np.random.rand(5,5) 
df = pd.DataFrame(data, index = list('abcde'), columns = list('ABCDE')) 
df = df[df>0] 
df 
          A         B         C         D   E
a       NaN  2.038740  1.371158       NaN NaN
b  0.575567       NaN  0.462007       NaN NaN
c  0.984802  0.049818  0.129836       NaN NaN
d       NaN       NaN       NaN       NaN NaN
e  0.789563  1.846402       NaN  0.340902 NaN

I want every (index, col_name, value) triple for the non-NaN data. How can I do that?

My expected result is:

[('b','A', 0.575567), ('c', 'A', 0.984802), ('e', 'A', 0.789563),...] 

I think 'data' should use 'np.random.randn' rather than 'np.random.rand'. The latter never produces negative values. – ayhan
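The comment's point can be checked directly. This is a minimal sketch, assuming a fixed seed and the illustrative names `uniform` and `normal` (neither appears in the question): `np.random.rand` samples from [0, 1), so the mask `df > 0` has nothing negative to blank out, while `np.random.randn` samples from a standard normal, whose negative entries become NaN.

```python
import numpy as np
import pandas as pd

np.random.seed(0)  # assumed seed, only to make the run repeatable

labels = dict(index=list('abcde'), columns=list('ABCDE'))
uniform = pd.DataFrame(np.random.rand(5, 5), **labels)   # samples in [0, 1)
normal = pd.DataFrame(np.random.randn(5, 5), **labels)   # standard normal samples

# rand never returns a negative number, so masking with > 0 changes nothing
# unless a sample happens to be exactly 0.0.
uniform_nans = int(uniform[uniform > 0].isna().sum().sum())

# Every negative randn sample is replaced by NaN by the same mask.
normal_nans = int(normal[normal > 0].isna().sum().sum())
negatives = int((normal < 0).sum().sum())

print(uniform_nans, normal_nans, negatives)
```

The printed frame in the question contains values above 1, which is consistent with `randn` having been used to produce it.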

Answer

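You can stack the DataFrame, which drops the NaN values automatically. Resetting the index then turns the index levels into columns, after which it is easy to convert the rows into a list of tuples: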

[tuple(r) for r in df.stack().reset_index().values] 

# [('a', 'B', 2.03874), 
# ('a', 'C', 1.371158), 
# ('b', 'A', 0.575567), 
# ('b', 'C', 0.46200699999999995), 
# ('c', 'A', 0.9848020000000001), 
# ('c', 'B', 0.049818), 
# ('c', 'C', 0.12983599999999998), 
# ('e', 'A', 0.789563), 
# ('e', 'B', 1.846402), 
# ('e', 'D', 0.340902)] 
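
For a reproducible run, the same steps can be sketched against a fixed frame. This assumes the values printed in the question are typed in by hand, and it adds an explicit `dropna()` as a guard, since whether `stack()` discards NaN on its own may depend on the pandas version in use. The name `triples` is illustrative.

```python
import numpy as np
import pandas as pd

nan = np.nan
# Values copied from the DataFrame printed in the question.
df = pd.DataFrame(
    [[nan,      2.038740, 1.371158, nan,      nan],
     [0.575567, nan,      0.462007, nan,      nan],
     [0.984802, 0.049818, 0.129836, nan,      nan],
     [nan,      nan,      nan,      nan,      nan],
     [0.789563, 1.846402, nan,      0.340902, nan]],
    index=list('abcde'), columns=list('ABCDE'))

# stack() moves the column labels into an inner index level, giving a Series
# keyed by (row label, column label). The explicit dropna() keeps the result
# independent of whether stack() itself removes NaN entries.
stacked = df.stack().dropna()

# reset_index() turns both index levels into ordinary columns;
# itertuples(index=False, name=None) then yields plain tuples.
triples = list(stacked.reset_index().itertuples(index=False, name=None))

for t in triples:
    print(t)
```

Using `itertuples` instead of `.values` avoids building an intermediate object array, though for a frame this small either form should behave the same.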

Alternatively, use the DataFrame's to_records() method:

list(df.stack().reset_index().to_records(index = False))
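
The expected result in the question lists column A's entries first, whereas `stack()` walks the frame row by row. One way to get column-major order is sketched below, assuming the same hand-typed values as the question's printed frame. `unstack()` on a frame with a single-level index returns a Series keyed by (column label, row label) and keeps NaN, hence the explicit `dropna()`. The names `by_column` and `triples` are illustrative.

```python
import numpy as np
import pandas as pd

nan = np.nan
# Values copied from the DataFrame printed in the question.
df = pd.DataFrame(
    [[nan,      2.038740, 1.371158, nan,      nan],
     [0.575567, nan,      0.462007, nan,      nan],
     [0.984802, 0.049818, 0.129836, nan,      nan],
     [nan,      nan,      nan,      nan,      nan],
     [0.789563, 1.846402, nan,      0.340902, nan]],
    index=list('abcde'), columns=list('ABCDE'))

# unstack() yields a Series indexed by (column, row), ordered column by column.
by_column = df.unstack().dropna()

# Swap the key order so each tuple reads (row label, column label, value).
triples = [(row, col, float(val)) for (col, row), val in by_column.items()]

for t in triples:
    print(t)
```

With this ordering the list starts with the three column-A entries shown in the question's expected output.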