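Keep the latest values and drop older rows (pandas)

I have a DataFrame that contains both new and old values. I want to drop all the old values while keeping the new ones.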

ID  Name  Time      Comment
0   Foo   12:17:37  Rand
1   Foo   12:17:37  Rand1
2   Foo   08:20:00  Rand2
3   Foo   08:20:00  Rand3
4   Bar   09:01:00  Rand4
5   Bar   09:01:00  Rand5
6   Bar   08:50:50  Rand6
7   Bar   08:50:00  Rand7

So the result should look like this:

ID  Name  Time      Comment
0   Foo   12:17:37  Rand
1   Foo   12:17:37  Rand1
4   Bar   09:01:00  Rand4
5   Bar   09:01:00  Rand5

I tried the code below, but it removes one new value and one old value.

df[~df[['Time', 'Comment']].duplicated(keep='first')]

Can anyone provide a correct solution?
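For reference, here is a minimal sketch that rebuilds the sample table from the question and runs the attempted expression on it. The variable names `mask` and `attempt` are introduced here for illustration only. On this sample every (Time, Comment) pair is unique, so `duplicated` flags nothing and no row is dropped; the asker's real data presumably differs, but either way the check never compares times within a `Name`.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    "ID": range(8),
    "Name": ["Foo"] * 4 + ["Bar"] * 4,
    "Time": ["12:17:37", "12:17:37", "08:20:00", "08:20:00",
             "09:01:00", "09:01:00", "08:50:50", "08:50:00"],
    "Comment": ["Rand", "Rand1", "Rand2", "Rand3",
                "Rand4", "Rand5", "Rand6", "Rand7"],
})

# duplicated() marks rows whose (Time, Comment) pair already appeared earlier.
mask = df[["Time", "Comment"]].duplicated(keep="first")
attempt = df[~mask]

# Every pair is unique in this sample, so nothing is flagged or removed.
print(mask.any())
print(len(attempt), len(df))
```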


2017-01-10 germanfox

I think you can use this solution with to_timedelta, if you need to filter by the maximum value of the Time column:

df.Time = pd.to_timedelta(df.Time) 
df = df[df.Time == df.Time.max()] 
print (df) 
   ID Name     Time Comment
0   0  Foo 12:17:37    Rand
1   1  Foo 12:17:37   Rand1

EDIT: The solution is similar, it only adds groupby:

# The parentheses let the method chain span several lines.
df = (df.groupby('Name', sort=False)
        .apply(lambda x: x[x.Time == x.Time.max()])
        .reset_index(drop=True))
print (df)
   ID Name     Time Comment
0   0  Foo 12:17:37    Rand
1   1  Foo 12:17:37   Rand1
2   4  Bar 09:01:00   Rand4
3   5  Bar 09:01:00   Rand5
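As a possible variation (not part of the answer above), the same per-group filter can be sketched with `groupby(...).transform('max')`, which returns the group maximum aligned to every original row, so no `apply` is needed. The names `latest` and `result` are illustrative, and the sample frame is rebuilt from the question's table.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    "ID": range(8),
    "Name": ["Foo"] * 4 + ["Bar"] * 4,
    "Time": ["12:17:37", "12:17:37", "08:20:00", "08:20:00",
             "09:01:00", "09:01:00", "08:50:50", "08:50:00"],
    "Comment": ["Rand", "Rand1", "Rand2", "Rand3",
                "Rand4", "Rand5", "Rand6", "Rand7"],
})

# Convert the HH:MM:SS strings to timedeltas so they compare as durations.
df["Time"] = pd.to_timedelta(df["Time"])

# transform('max') broadcasts each group's latest Time back onto its rows.
latest = df.groupby("Name")["Time"].transform("max")

# Keep only the rows whose Time equals the latest Time of their Name.
result = df[df["Time"] == latest].reset_index(drop=True)
print(result)
```

Because the boolean mask is applied to the original frame, the rows keep their original order (Foo before Bar here).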


2017-01-10 08:42:39 jezrael

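Could you edit the question instead? The formatting in comments is not suitable. – jezrael

If the solution does not work, please try to create a [minimal, complete, and verifiable example](http://stackoverflow.com/help/mcve) with the desired output. – jezrael

Will do. By the way, this works, but it is not what I am looking for. Let me update the question. – germanfox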

You can merge the per-group maximum back into the original DF:

df['Time'] = pd.to_timedelta(df['Time']) 

In [35]: pd.merge(df, df.groupby('Name', as_index=False)['Time'].max(), on=['Name','Time']) 
Out[35]: 
   ID Name     Time Comment
0   0  Foo 12:17:37    Rand
1   1  Foo 12:17:37   Rand1
2   4  Bar 09:01:00   Rand4
3   5  Bar 09:01:00   Rand5

Explanation:

In [36]: df.groupby('Name', as_index=False)['Time'].max() 
Out[36]: 
  Name     Time
0  Bar 09:01:00
1  Foo 12:17:37
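The merge approach can be put together as a self-contained script along these lines. This is only a sketch: the sample frame is rebuilt from the question's table, and `group_max` and `merged` are names chosen here for illustration. `pd.merge` defaults to an inner join, so only rows whose (Name, Time) pair matches a group maximum survive.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    "ID": range(8),
    "Name": ["Foo"] * 4 + ["Bar"] * 4,
    "Time": ["12:17:37", "12:17:37", "08:20:00", "08:20:00",
             "09:01:00", "09:01:00", "08:50:50", "08:50:00"],
    "Comment": ["Rand", "Rand1", "Rand2", "Rand3",
                "Rand4", "Rand5", "Rand6", "Rand7"],
})
df["Time"] = pd.to_timedelta(df["Time"])

# One row per Name holding that group's latest Time.
group_max = df.groupby("Name", as_index=False)["Time"].max()

# Inner join on both keys keeps only rows that carry their group's maximum.
merged = pd.merge(df, group_max, on=["Name", "Time"])
print(merged)
```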


2017-01-10 08:59:51 MaxU
