2017-08-25 74 views
0

I am trying to compute the cumulative maximum of a column with the code below:

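import numpy as np
import pandas as pd
df = pd.DataFrame([[23, 52], [36, 49], [52, 61], [75, 82], [97, 12]], columns=['A', 'B'])
# The next line reads df['C'] before column 'C' has been created
df['C'] = np.where(df['A'] > df['C'].shift(), df['A'], df['C'].shift())
print(df)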

The assumption is that the first df['C'].shift() operation should be treated as 0 (since df['C'] does not exist yet).

Expected output:

    A   B   C
0  23  52  23
1  36  49  36
2  12  61  36
3  75  82  75
4  70  12  75

But I get a KeyError exception:

Traceback (most recent call last): 
    File "C:\Program Files\Python36\lib\site-packages\pandas\core\indexes\base.py", line 2442, in get_loc 
    return self._engine.get_loc(key) 
    File "pandas\_libs\index.pyx", line 132, in pandas._libs.index.IndexEngine.get_loc (pandas\_libs\index.c:5280) 
    File "pandas\_libs\index.pyx", line 154, in pandas._libs.index.IndexEngine.get_loc (pandas\_libs\index.c:5126) 
    File "pandas\_libs\hashtable_class_helper.pxi", line 1210, in pandas._libs.hashtable.PyObjectHashTable.get_item (pandas\_libs\hashtable.c:20523) 
    File "pandas\_libs\hashtable_class_helper.pxi", line 1218, in pandas._libs.hashtable.PyObjectHashTable.get_item (pandas\_libs\hashtable.c:20477) 
KeyError: 'C' 

During handling of the above exception, another exception occurred: 

Traceback (most recent call last): 
    File "C:\Users\Development\workspace\TestPython\TestPython.py", line 6, in <module> 
    df['C'] = np.where(df['A'] > df['C'].shift(), df['B'].shift(), df['A']) 
    File "C:\Program Files\Python36\lib\site-packages\pandas\core\frame.py", line 1964, in __getitem__ 
    return self._getitem_column(key) 
    File "C:\Program Files\Python36\lib\site-packages\pandas\core\frame.py", line 1971, in _getitem_column 
    return self._get_item_cache(key) 
    File "C:\Program Files\Python36\lib\site-packages\pandas\core\generic.py", line 1645, in _get_item_cache 
    values = self._data.get(item) 
    File "C:\Program Files\Python36\lib\site-packages\pandas\core\internals.py", line 3590, in get 
    loc = self.items.get_loc(item) 
    File "C:\Program Files\Python36\lib\site-packages\pandas\core\indexes\base.py", line 2444, in get_loc 
    return self._engine.get_loc(self._maybe_cast_indexer(key)) 
    File "pandas\_libs\index.pyx", line 132, in pandas._libs.index.IndexEngine.get_loc (pandas\_libs\index.c:5280) 
    File "pandas\_libs\index.pyx", line 154, in pandas._libs.index.IndexEngine.get_loc (pandas\_libs\index.c:5126) 
    File "pandas\_libs\hashtable_class_helper.pxi", line 1210, in pandas._libs.hashtable.PyObjectHashTable.get_item (pandas\_libs\hashtable.c:20523) 
    File "pandas\_libs\hashtable_class_helper.pxi", line 1218, in pandas._libs.hashtable.PyObjectHashTable.get_item (pandas\_libs\hashtable.c:20477) 
KeyError: 'C' 

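As I understand it, this happens because column C does not exist the first time the expression runs, so shifting that column throws the exception.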
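To make that explanation concrete, here is a minimal sketch. It assumes the values from the expected-output table rather than the ones in the snippet, and the "seed C with 0" step is only a hypothetical workaround attempt, not something from the original post. It shows that the right-hand side is evaluated before the assignment, and that seeding the column does not help, because `np.where` works on the whole column at once and never sees its own earlier results.

```python
import numpy as np
import pandas as pd

# Values taken from the expected-output table in the question (an assumption).
df = pd.DataFrame({'A': [23, 36, 12, 75, 70], 'B': [52, 49, 61, 82, 12]})

# The right-hand side runs first, so df['C'] is looked up before it exists.
try:
    df['C'] = np.where(df['A'] > df['C'].shift(), df['A'], df['C'].shift())
    raised = False
except KeyError:
    raised = True

# Hypothetical workaround attempt: seed C with 0 and reuse the same expression.
df['C'] = 0
prev = df['C'].shift()  # shift of the seed column, not of the values being computed
attempt = np.where(df['A'] > prev, df['A'], prev)

print(raised)   # whether the KeyError was raised
print(attempt)  # not a running maximum: each row is compared with the seed only
```

The first row of `attempt` is NaN (the comparison with the shifted-in NaN is False) and the remaining rows are simply the values of A, so a vectorized expression of this shape cannot carry a running value from one row to the next.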

My question: is there an alternative way to solve this?

+1

Er.. you created a DF without a 'C' column, so getting a `KeyError` is expected. What is the expected output? Shouldn't you be using the 'B' column? – EdChum

+0

Did you mean `np.where(df['A'] > df['B'].shift(), df['B'].shift(), df['A'])`? –

+0

I need it this way. So `df.loc[0:'C']` can be `0` or `df['A']`, and then from `df.loc[1:'C']` onward `df['C'] = np.where(df['A'] > df['C'].shift(), df['C'].shift(), df['A'])` takes over. – arkochhar
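The row-by-row recurrence described in this comment can be sketched as an explicit loop. This is only an illustration under stated assumptions: it uses the values from the expected-output table, it follows the question's version of the rule (keep the larger of A and the previous C), and the seed of 0 for the first row is an assumption that only works when A has no negative values.

```python
import pandas as pd

# Values taken from the expected-output table in the question (an assumption).
df = pd.DataFrame({'A': [23, 36, 12, 75, 70], 'B': [52, 49, 61, 82, 12]})

running = []  # values of C, built one row at a time
prev = 0      # assumed seed standing in for the missing "previous C" of row 0
for a in df['A']:
    # Keep whichever is larger: the current A or the previous C.
    prev = a if a > prev else prev
    running.append(prev)

df['C'] = running
print(df)
```

Each step depends on the value produced by the previous step, which is why a single vectorized `np.where` over the shifted column cannot express it.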

Answers

5

You need cummax:

df['C'] = df.A.cummax() 
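
As a quick check of this answer, the sketch below applies `cummax` to the values from the expected-output table (an assumption, since the question's snippet uses different numbers). The `np.maximum.accumulate` line is an equivalent NumPy call added here for comparison; it is not part of the original answer.

```python
import numpy as np
import pandas as pd

# Values taken from the expected-output table in the question (an assumption).
df = pd.DataFrame({'A': [23, 36, 12, 75, 70], 'B': [52, 49, 61, 82, 12]})

# Running maximum of column A, as suggested in the answer.
df['C'] = df['A'].cummax()

# Equivalent NumPy formulation on the underlying array (added for comparison).
alt = np.maximum.accumulate(df['A'].to_numpy())

print(df)
print(alt)
```

Column C comes out as 23, 36, 36, 75, 75, matching the expected output in the question, and no seed value for the first row is needed.
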
+0

Nice! Didn't know about this. (+1) –

+0

@cᴏʟᴅsᴘᴇᴇᴅ and acushner, thanks for your input. Could I draw your attention to another related topic? [link](https://stackoverflow.com/questions/44935269/supertrend-code-using-pandas-python) – arkochhar