Create new columns based on existing values in Python

I want to create new columns based on a date range, so that I can see how much EMI each entry pays in each month. Please advise how this can be done in Python.

Input file

Start Date End Date EMI 
01/12/16 01/12/17 4800 
09/01/16 09/01/17 3000 
01/07/15 01/05/16 2300

and I want the output file to look like this

Start Date End Date  EMI 06/16 07/16 08/16 09/16 10/16 11/16 12/16 01/17 02/17 
01/12/16 01/12/17 4800 4800 4800 4800 4800 4800 4800 4800 4800 0 
09/01/16 09/01/17 3000 0  0  0  3000 3000 3000 3000 3000 3000 
01/07/15 01/05/16 2300 0  0  0  0  0  0  0  0  0

Please suggest how I could implement this in Python.

Source

2016-11-09 yasin mohammed

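I'm completely confused! How did you arrive at the columns in the output? What determines the values? – piRSquared

I have edited the sample file. The basic rule: if a month falls within the entry's date range, the field for that month must be filled with the EMI value. –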
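To make the rule in the comment above concrete, here is a minimal sketch in plain Python. The helper name `emi_for_month` is hypothetical, and the sketch assumes the dates are MM/DD/YY (which is what the expected output in the question implies) and that a month counts as "in range" when it lies between the start month and the end month, inclusive.

```python
from datetime import datetime


def emi_for_month(start, end, emi, month, fmt="%m/%d/%y"):
    """Return emi if `month` ('MM/YY') lies between the start and end
    months (inclusive), otherwise 0. Hypothetical helper for illustration."""
    s = datetime.strptime(start, fmt)
    e = datetime.strptime(end, fmt)
    m = datetime.strptime(month, "%m/%y")
    # Compare (year, month) pairs so the day of the month is ignored.
    in_range = (s.year, s.month) <= (m.year, m.month) <= (e.year, e.month)
    return emi if in_range else 0


# Second sample row from the question: 09/01/16 to 09/01/17, EMI 3000.
months = ["06/16", "07/16", "08/16", "09/16", "10/16"]
row = [emi_for_month("09/01/16", "09/01/17", 3000, m) for m in months]
print(row)  # [0, 0, 0, 3000, 3000]
```

This is only meant to pin down the rule; looping row by row like this would likely be slow on a large file.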

IIUC, you need:

import pandas as pd

# reshape the two date columns into one and use it as the index
df1 = (pd.melt(df.reset_index(), id_vars=['EMI', 'index'], value_name='date')
         .set_index('date'))
# parse the index as dates, then convert it to a monthly PeriodIndex
df1.index = (pd.to_datetime(df1.index, format='%d/%m/%y', errors='coerce')
               .to_period('M'))
# group by the original row index and resample by month
df1 = (df1.groupby('index')
          .resample('M')
          .ffill()
          .drop(['variable', 'index'], axis=1)
          .reset_index())
# pivot, fill NaN with 0, cast floats to int
df1 = (df1.pivot(index='index', columns='date', values='EMI')
          .fillna(0)
          .astype(int))
# change the format of the column labels
df1.columns = df1.columns.strftime('%m/%y')
# concatenate with the original dataframe
df = pd.concat([df, df1], axis=1)

print (df) 
    Start Date End Date EMI 07/15 08/15 09/15 10/15 11/15 12/15 01/16 \ 
0 01/12/16 01/12/17 4800  0  0  0  0  0  0  0 
1 09/01/16 09/01/17 3000  0  0  0  0  0  0 3000 
2 01/07/15 01/05/16 2300 2300 2300 2300 2300 2300 2300 2300 

    03/17 04/17 05/17 06/17 07/17 08/17 09/17 10/17 11/17 12/17 
0 ...  4800 4800 4800 4800 4800 4800 4800 4800 4800 4800 
1 ...  0  0  0  0  0  0  0  0  0  0 
2 ...  0  0  0  0  0  0  0  0  0  0 

[3 rows x 33 columns]

Source

2016-11-09 18:20:10 jezrael

Can you check my solution? Is '01/12/16' 'DDMMYY' or 'MMDDYY'? – jezrael

The date format is MMDDYY. I made this change to the statement 'df1.index = pd.to_datetime(df1.index, format='%d/%m/%y', errors='coerce')' –

Also, when I ran this code it executed for 3 hours, although my total file size is only 180 MB: 'df1 = df1.groupby('index').resample('M').ffill().drop(['variable', 'index'], axis=1).reset_index()' –
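The format question matters because the same string parses to different months under the two readings. A small check, using only `pd.to_datetime` with an explicit `format`:

```python
import pandas as pd

raw = "01/12/16"

# Day first: 1 December 2016
as_day_first = pd.to_datetime(raw, format="%d/%m/%y")
# Month first: 12 January 2016
as_month_first = pd.to_datetime(raw, format="%m/%d/%y")

print(as_day_first.date())    # 2016-12-01
print(as_month_first.date())  # 2016-01-12
```

For MMDDYY data the format string would therefore be '%m/%d/%y'.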
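If the per-group resample is the slow step, one possible alternative is to skip reshaping and compare every row against every month column at once with NumPy broadcasting. This is a different technique from the answer above, not a tuned version of it, and it has not been timed on the 180 MB file. It assumes MM/DD/YY dates and a fixed month window from 06/16 to 02/17, taken from the expected output in the question; the names `active` and `wide` are just illustrative.

```python
import numpy as np
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "Start Date": ["01/12/16", "09/01/16", "01/07/15"],
    "End Date": ["01/12/17", "09/01/17", "01/05/16"],
    "EMI": [4800, 3000, 2300],
})

# Assumption: dates are MM/DD/YY, as stated in the comments.
start = pd.to_datetime(df["Start Date"], format="%m/%d/%y")
end = pd.to_datetime(df["End Date"], format="%m/%d/%y")

# Assumption: the month columns wanted are 06/16 through 02/17.
months = pd.period_range("2016-06", "2017-02", freq="M")

# Encode each month as a single integer (year * 12 + month) so that
# whole months can be compared without looking at the day.
start_m = (start.dt.year * 12 + start.dt.month).to_numpy()[:, None]
end_m = (end.dt.year * 12 + end.dt.month).to_numpy()[:, None]
month_m = (months.year * 12 + months.month).to_numpy()[None, :]

# Boolean matrix (rows x months): True where the month is inside the range.
active = (start_m <= month_m) & (month_m <= end_m)

# Put the EMI where the month is active and 0 elsewhere.
emi = df["EMI"].to_numpy()[:, None]
wide = pd.DataFrame(np.where(active, emi, 0),
                    index=df.index,
                    columns=months.strftime("%m/%y"))

result = pd.concat([df, wide], axis=1)
print(result)
```

On the sample rows this gives the values shown in the question's desired output. Memory use grows with rows times months, so for a very wide window it may be worth processing the file in chunks.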
