2016-11-19 94 views

Slicing specific records from a pandas Panel

I have a collection of pandas Panels (data_r3000) that holds stock data for several industry sectors:

{'capital_goods': <class 'pandas.core.panel.Panel'> 
Dimensions: 6 (items) x 13820 (major_axis) x 423 (minor_axis) 
Items axis: OPEN to ADJ_CLOSE 
Major_axis axis: 1962-01-02 00:00:00 to 2016-11-18 00:00:00 
Minor_axis axis: A to ZEUS, 'consumer': <class 'pandas.core.panel.Panel'> 
Dimensions: 6 (items) x 11832 (major_axis) x 94 (minor_axis) 
Items axis: OPEN to ADJ_CLOSE 
Major_axis axis: 1970-01-02 00:00:00 to 2016-11-18 00:00:00 
Minor_axis axis: ABG to WSO, 'consumer_non_durables': <class 'pandas.core.panel.Panel'> 
Dimensions: 6 (items) x 13819 (major_axis) x 138 (minor_axis) 

...and so on. Once I isolate one of the sectors, I want to modify some of the values in the DataFrame.

x = data_r3000['capital_goods'].to_frame().unstack(level=1) 

This produces the following DataFrame:

[image: the resulting DataFrame]

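I have very little experience working with pandas MultiIndexes, and I am having trouble isolating the 'CLOSE' and 'ADJ_CLOSE' records for 'AA'. How can I isolate those records so that I end up with an AA_df containing only the time series for OPEN and ADJ_CLOSE?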

I tried x.xs(['CLOSE','ADJ_CLOSE'], axis=1), which correctly isolates the two fields I am looking for, but I don't know how to isolate 'AA' as well. Thanks.

Answer


I think you can use slicers:

idx = pd.IndexSlice 
print (df.loc[:, idx[['CLOSE','ADJ_CLOSE'], 'AA']]) 

Or:

print (df.loc[:, (['CLOSE','ADJ_CLOSE'],'AA')]) 

Sample:

import numpy as np
import pandas as pd

cols = pd.MultiIndex.from_product((['ADJ', 'ADJ_CLOSE', 'CLOSE'],
                                   ['A', 'AA', 'AEPI']))
df = pd.DataFrame(np.arange(27).reshape(3, 9), columns=cols)

print (df) 
  ADJ          ADJ_CLOSE          CLOSE
    A  AA AEPI         A  AA AEPI     A  AA AEPI
0   0   1    2         3   4    5     6   7    8
1   9  10   11        12  13   14    15  16   17
2  18  19   20        21  22   23    24  25   26

idx = pd.IndexSlice 
print (df.loc[:, idx[['CLOSE','ADJ_CLOSE'], 'AA']]) 
    ADJ_CLOSE CLOSE 
     AA AA 
0   4  7 
1  13 16 
2  22 25 

print (df.loc[:, (['CLOSE','ADJ_CLOSE'],'AA')]) 
    ADJ_CLOSE CLOSE 
     AA AA 
0   4  7 
1  13 16 
2  22 25 
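
Both selections above keep the two-level column header. If a flat AA_df is what is wanted, one possible approach (a minimal sketch, not part of the original answer) is to take a cross-section on the ticker level with `DataFrame.xs` and then pick the fields. The names `aa_all` and `AA_df` are illustrative, and the sample frame is the same toy data as above.

```python
import numpy as np
import pandas as pd

# Same toy frame as in the sample: field on level 0, ticker on level 1.
cols = pd.MultiIndex.from_product((['ADJ', 'ADJ_CLOSE', 'CLOSE'],
                                   ['A', 'AA', 'AEPI']))
df = pd.DataFrame(np.arange(27).reshape(3, 9), columns=cols)

# Cross-section on the second column level: keeps every field for 'AA'
# and drops the ticker level, so the columns become plain field names.
aa_all = df.xs('AA', axis=1, level=1)

# Keep only the two fields of interest, in the order requested.
AA_df = aa_all[['CLOSE', 'ADJ_CLOSE']]
print(AA_df)
```

Because the ticker level is dropped, the columns of AA_df are just the field names, which may be more convenient for plotting or further arithmetic.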

Solution with Panel:

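import numpy as np
import pandas as pd

# pd.Panel requires pandas < 0.25 (it was removed in 0.25)
np.random.seed(1234)
rng = pd.date_range('1/1/2013', periods=10, freq='D')

data = np.random.randn(10, 4)

cols = ['A', 'AA', 'AAON', 'ABAX']

# Parentheses are needed so the tuple can span several lines
df1, df2, df3 = (pd.DataFrame(data, rng, cols),
                 pd.DataFrame(data, rng, cols),
                 pd.DataFrame(data, rng, cols))

pf = pd.Panel({'OPEN': df1, 'ADJ': df2, 'ADJ_CLOSE': df3})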
print (pf) 
<class 'pandas.core.panel.Panel'> 
Dimensions: 3 (items) x 10 (major_axis) x 4 (minor_axis) 
Items axis: ADJ to OPEN 
Major_axis axis: 2013-01-01 00:00:00 to 2013-01-10 00:00:00 
Minor_axis axis: A to ABAX 

print (pf.loc[['OPEN', 'ADJ_CLOSE'], :,'AA']) 
       OPEN ADJ_CLOSE 
2013-01-01 -1.190976 -1.190976 
2013-01-02 0.887163 0.887163 
2013-01-03 -2.242685 -2.242685 
2013-01-04 -2.021255 -2.021255 
2013-01-05 0.289092 0.289092 
2013-01-06 -0.655969 -0.655969 
2013-01-07 -0.469305 -0.469305 
2013-01-08 1.058969 1.058969 
2013-01-09 1.045938 1.045938 
2013-01-10 -0.322795 -0.322795
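
On pandas versions where `Panel` is no longer available, a similar result can probably be reached by keeping the per-field frames in a dict and concatenating them into a DataFrame with MultiIndex columns. The following is a sketch under that assumption, not the original answer's method. The names `frames`, `wide` and `AA_df` are illustrative, and the data is the same random sample as above.

```python
import numpy as np
import pandas as pd

np.random.seed(1234)
rng = pd.date_range('2013-01-01', periods=10, freq='D')
data = np.random.randn(10, 4)
tickers = ['A', 'AA', 'AAON', 'ABAX']

# One DataFrame per price field, keyed by field name (stand-in for Panel items).
frames = {name: pd.DataFrame(data, index=rng, columns=tickers)
          for name in ['OPEN', 'ADJ', 'ADJ_CLOSE']}

# Concatenating a dict along axis=1 yields two-level columns: (field, ticker).
wide = pd.concat(frames, axis=1)

# Cross-section for ticker 'AA', then keep the two fields of interest.
AA_df = wide.xs('AA', axis=1, level=1)[['OPEN', 'ADJ_CLOSE']]
print(AA_df)
```

As in the Panel example, both columns hold the same numbers here only because every field was built from the same random array.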