Numbering subsequences in a Pandas DataFrame

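I have a readings DataFrame with two columns, experiment and value. experiment is a key into a separate experiments DataFrame. Within one experiment, 500 consecutive rows share the same experiment and have different value entries, and the row order in the DataFrame is the order in which the data was taken. The next 500 rows belong to the next experiment, and so on.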

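I want to look for time-based trends within experiments, so I figure I should label each point with a pos from 0 to 499 and then groupby('pos'). How do I create the pos column, an incrementing value that resets to 0 every time experiment changes? That is, I guess, the same as the number of rows for which experiment has stayed the same.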

Source

2017-10-09 Rgaddi

Your question does not explain your problem very well. Please see http://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples and learn how to write a good pandas question. Thanks.

If I understand correctly...

>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame({'Experiment': [1, 1, 1, 2, 2, 2, 2, 3, 3, 3],
...                    'Value': np.random.randn(10)})
>>> df
   Experiment     Value
0           1 -0.924851
1           1 -0.599875
2           1  0.069982
3           2 -1.106909
4           2  0.463922
5           2  0.210568
6           2 -0.171456
7           3 -0.768618
8           3 -0.269928
9           3  0.055613

You would use groupby followed by cumcount() to get the desired effect:

>>> df['Position'] = df.groupby('Experiment').cumcount()
>>> df
   Experiment     Value  Position
0           1 -0.924851         0
1           1 -0.599875         1
2           1  0.069982         2
3           2 -1.106909         0
4           2  0.463922         1
5           2  0.210568         2
6           2 -0.171456         3
7           3 -0.768618         0
8           3 -0.269928         1
9           3  0.055613         2
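
As a follow-up sketch: once the position column exists, grouping by it aggregates the readings across experiments at each position, which is the per-position trend the question describes. The column names follow the example above. The fixed seed and the choice of mean() as the aggregate are assumptions made only for illustration.

```python
import numpy as np
import pandas as pd

# Small stand-in for the readings frame; the seed is only for repeatability.
rng = np.random.default_rng(0)
df = pd.DataFrame({'Experiment': [1, 1, 1, 2, 2, 2, 2, 3, 3, 3],
                   'Value': rng.standard_normal(10)})

# Number the rows within each experiment, starting at 0.
df['Position'] = df.groupby('Experiment').cumcount()

# Average the readings at each position across all experiments.
trend = df.groupby('Position')['Value'].mean()
print(trend)
```

With real data where every experiment has 500 rows, trend would have 500 entries, one per position.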

Source

2017-10-09 17:33:09 kev8484

That is exactly what I was asking for, and the solution is exactly what I needed. Thanks for your help. – Rgaddi
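
One caveat, sketched under an assumption the question does not state: groupby('Experiment').cumcount() keeps counting if the same experiment ID reappears later in the frame. If the counter should restart whenever the value changes from one row to the next, a run label can be built first and used as the grouping key. The name run and the sample data below are hypothetical.

```python
import pandas as pd

# Hypothetical data where experiment 1 shows up in two separate runs.
df = pd.DataFrame({'Experiment': [1, 1, 2, 2, 2, 1, 1]})

# A new run starts wherever the value differs from the previous row.
changed = df['Experiment'] != df['Experiment'].shift()
run = changed.cumsum().rename('run')

# Count rows within each run, so the counter resets at every change.
df['Position'] = df.groupby(run).cumcount()
print(df['Position'].tolist())  # [0, 1, 0, 1, 2, 0, 1]
```

When each experiment occupies a single contiguous block, as in the question, both approaches give the same numbering.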
