Pandas manipulation: non-distinct groups

I have a DataFrame:

Time c_1  c_2 
t1  x  1 
t2  x  2 
t3  y  1 
t4  y  2 
t5  1  x 
t6  2  x 
t7  1  y 
t8  2  y

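I need to create 2 new columns, without looping, such that:

new_1: the earliest later time at which the c_1 value appears in c_2 (e.g. for t1, new_1 = t5, because the c_1 value is 'x' and the next time 'x' appears in c_2 is at t5)
new_2: the earliest later time at which the c_2 value appears in c_1 (e.g. for t1, new_2 = t5, because the c_2 value is '1' and the next time '1' appears in c_1 is at t5)

So for the input above, the output should be: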

Time c_1  c_2 new_1  new_2 
t1  x  1  t5  t5 
t2  x  2  t5  t6 
t3  y  1  t7  t5    
t4  y  2  t7  t6  
t5  1  x  NaT  NaT 
t6  2  x  NaT  NaT 
t7  1  y  NaT  NaT 
t8  2  y  NaT  NaT

How would you approach this?

Asked 2017-04-01 by Yeile

Please show the exact `DataFrame` output you expect. – splinter

Here is a solution that uses apply() and a lambda function to select the right data for each row of the original DataFrame.

import pandas as pd 

data = {'Time': pd.date_range('1/1/2000', periods=16, freq='D'), 
     'c_1': ['x', 'x', 'y', 'y', '1', '2', '1', '2']*2, 
     'c_2': ['1', '2', '1', '2', 'x', 'x', 'y', 'y']*2 } 

df = pd.DataFrame(data)  
df['new_1'] = df.apply(lambda r: (df.Time[(df.Time>r.Time) & (df.c_2 == r.c_1)].head(1).reset_index(drop=True)), axis=1) 
df['new_2'] = df.apply(lambda r: (df.Time[(df.Time>r.Time) & (df.c_1 == r.c_2)].head(1).reset_index(drop=True)), axis=1) 
print(df)

The output is:

  Time c_1 c_2  new_1  new_2 
0 2000-01-01 x 1 2000-01-05 2000-01-05 
1 2000-01-02 x 2 2000-01-05 2000-01-06 
2 2000-01-03 y 1 2000-01-07 2000-01-05 
3 2000-01-04 y 2 2000-01-07 2000-01-06 
4 2000-01-05 1 x 2000-01-09 2000-01-09 
5 2000-01-06 2 x 2000-01-10 2000-01-09 
6 2000-01-07 1 y 2000-01-09 2000-01-11 
7 2000-01-08 2 y 2000-01-10 2000-01-11 
8 2000-01-09 x 1 2000-01-13 2000-01-13 
9 2000-01-10 x 2 2000-01-13 2000-01-14 
10 2000-01-11 y 1 2000-01-15 2000-01-13 
11 2000-01-12 y 2 2000-01-15 2000-01-14 
12 2000-01-13 1 x  NaT  NaT 
13 2000-01-14 2 x  NaT  NaT 
14 2000-01-15 1 y  NaT  NaT 
15 2000-01-16 2 y  NaT  NaT

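Calling apply with axis=1 iterates over the DataFrame one row at a time. The lambda function selects only the rows that occur after the current row and have the right value in the other column. More than one row may match these conditions. head(1) picks the first match, and reset_index(drop=True) makes sure every returned Series has the same index (0), so that apply() collects all of the returned values into a single column.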
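Since the question asks for something without looping, here is one possible vectorized alternative to the row-wise apply above. It is not part of the original answer, just a sketch built on pd.merge_asof with direction='forward'. The helper name next_time and the temporary column names key and match_time are made up for illustration, and the sketch assumes the frame is already sorted by Time.

```python
import pandas as pd

# Same 16-row sample frame as in the answer above.
df = pd.DataFrame({
    "Time": pd.date_range("2000-01-01", periods=16, freq="D"),
    "c_1": ["x", "x", "y", "y", "1", "2", "1", "2"] * 2,
    "c_2": ["1", "2", "1", "2", "x", "x", "y", "y"] * 2,
})


def next_time(frame, value_col, lookup_col):
    """Hypothetical helper: for each row, find the earliest strictly later
    Time at which the row's `value_col` value shows up in `lookup_col`.
    Assumes `frame` is sorted by Time in ascending order."""
    # Left side: the value we are searching for, under a shared name "key".
    left = frame[["Time", value_col]].rename(columns={value_col: "key"})
    # Right side: where that value may show up, with its own time column.
    right = frame[["Time", lookup_col]].rename(
        columns={"Time": "match_time", lookup_col: "key"}
    )
    merged = pd.merge_asof(
        left,
        right,
        left_on="Time",
        right_on="match_time",
        by="key",                   # only pair rows whose values are equal
        direction="forward",        # look at later rows only
        allow_exact_matches=False,  # strictly later, like df.Time > r.Time
    )
    return merged["match_time"]


# to_numpy() sidesteps index alignment when assigning back.
df["new_1"] = next_time(df, "c_1", "c_2").to_numpy()
df["new_2"] = next_time(df, "c_2", "c_1").to_numpy()
print(df)
```

On the sample data this should give the same new_1 and new_2 columns as the apply version, with NaT wherever no later match exists. If the real data is not sorted by Time, it would need a sort_values('Time') first, because merge_asof requires sorted keys.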

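Answered 2017-04-01 16:36:07 by Craig

Is there a way to generalize this? If we extend the solution to 16 periods and the values repeat, only the first 4 dates get populated with your solution. – Yeile

@Yeile I couldn't generalize the `groupby` version, but I came up with a version that uses `apply` and handles repeated values in general. – Craig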
