Filtering a DataFrame

The title may be a bit confusing, so here is an example.

Source:

id |  timestamp 
1 | 2015-12-02 00:00:00 
1 | 2015-12-03 00:00:00 <--- latest for id 1 
2 | 2015-12-02 00:00:00 
2 | 2015-12-04 00:00:00 
2 | 2015-12-06 00:00:00 <--- latest for id 2

Desired result:

id |  timestamp 
1 | 2015-12-03 00:00:00 
2 | 2015-12-06 00:00:00

2017-10-10 mlwh

df.groupby('id').tail(1)? – jezrael

Use nth:

In [599]: df.groupby('id', as_index=False).nth(-1) 
Out[599]: 
    id   timestamp 
1 1 2015-12-03 00:00:00 
4 2 2015-12-06 00:00:00

Ideally, use max, since you need the latest date:

In [601]: df.groupby('id', as_index=False).max() 
Out[601]: 
    id   timestamp 
0 1 2015-12-03 00:00:00 
1 2 2015-12-06 00:00:00

Also, tail, as mentioned in the comments:

In [602]: df.groupby('id').tail(1) 
Out[602]: 
    id   timestamp 
1 1 2015-12-03 00:00:00 
4 2 2015-12-06 00:00:00
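
One caveat worth sketching: nth(-1) and tail(1) pick the last row of each group by position, so they only return the latest timestamp when the rows are already in timestamp order within each id. The snippet below is a minimal illustration, not part of the original answer. It rebuilds the sample data, shuffles it (the shuffled order is an assumption made up for the demonstration), and sorts before calling tail(1).

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "id": [1, 1, 2, 2, 2],
    "timestamp": pd.to_datetime([
        "2015-12-02", "2015-12-03",
        "2015-12-02", "2015-12-04", "2015-12-06",
    ]),
})

# Hypothetical unsorted input: the same rows in a scrambled order.
shuffled = df.iloc[[4, 0, 2, 1, 3]]

# tail(1) keeps the last row of each group by position,
# so sort by timestamp first to make "last" mean "latest".
latest = (
    shuffled.sort_values("timestamp")
    .groupby("id")
    .tail(1)
    .sort_values("id")
    .reset_index(drop=True)
)
print(latest)
```

If the data is known to be sorted already, as in the question, the sort_values step can be skipped.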

2017-10-10 11:33:33 Zero

max() takes the maximum of each column, is that correct? – kbball
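
To illustrate the point raised in this comment: groupby(...).max() does compute the maximum of each column independently, so with more than one non-key column the result can combine values from different rows. The following sketch assumes an extra column named value, which is hypothetical and not in the question's data, and uses idxmax on the timestamp column as one possible way to keep whole rows instead.

```python
import pandas as pd

df = pd.DataFrame({
    "id": [1, 1, 2, 2, 2],
    "timestamp": pd.to_datetime([
        "2015-12-02", "2015-12-03",
        "2015-12-02", "2015-12-04", "2015-12-06",
    ]),
    # Hypothetical extra column, added only for this illustration.
    "value": [50, 10, 30, 90, 20],
})

# Column-wise maximum: timestamp and value may come from different rows.
col_max = df.groupby("id", as_index=False).max()

# Row-wise selection: find the index label of the latest timestamp per id,
# then pull those complete rows from the original frame.
row_max = df.loc[df.groupby("id")["timestamp"].idxmax()].reset_index(drop=True)

print(col_max)
print(row_max)
```

With only id and timestamp, as in the question, both approaches give the same result; the difference appears only once other columns are present.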
