2017-07-14 34 views

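Fill values in the rows of a DataFrame column if a condition is met based on the row values of two other columns in the same DataFrame

I am new to Python and pandas, and I have a CSV file that I read into a pandas DataFrame. You can find it below.

I want to populate the columns OND_ORIGIN and OND_DEST based on the row values in PLDATE.

The logic is that, for every flight flown on the same day, OND_ORIGIN and OND_DEST should be the same, taken from the departure-from and arrival-to columns (DEP_FROM and ARR_TO in the code).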

import pandas as pd
import numpy as np
import csv


location = r'C:\Users\bi.reports\Desktop\output.csv'
# Raw string for the regex separator: a comma with optional surrounding whitespace
df = pd.read_csv(location, sep=r'\s*,\s*', engine='python')
for i, row in df.iterrows():
    # Note: each assignment below overwrites the whole column, not just row i
    if row['COUPON_NUMBER'] == 1:
        df.OND_ORIGIN = df.DEP_FROM
        #df.OND_DEST = df.DEP_FROM
    elif row['COUPON_NUMBER'] == 2:
        #df.OND_ORIGIN = df.DEP_FROM
        df.OND_DEST = df.ARR_TO
    elif row['COUPON_NUMBER'] == 3:
        #df.OND_ORIGIN = df.DEP_FROM
        df.OND_DEST = df.ARR_TO
    else:
        df.OND_ORIGIN = df.DEP_FROM
        #df.OND_DEST = df.ARR_TO

# Write the result once, after the loop has finished
df.to_csv('out.csv', sep=',', index=False)

csv file in use

Answers


Try this:

df.loc[df['COUPON_NUMBER'] == 1, 'OND_ORIGIN'] = df.DEP_FROM 
df.loc[df['COUPON_NUMBER'].isin([2,3]), 'OND_DEST'] = df.ARR_TO 
df.loc[~df['COUPON_NUMBER'].isin([1,2,3]), 'OND_ORIGIN'] = df.DEP_FROM 

Or, slightly optimized:

df.loc[df['COUPON_NUMBER'].isin([2,3]), 'OND_DEST'] = df.ARR_TO 
df.loc[~df['COUPON_NUMBER'].isin([2,3]), 'OND_ORIGIN'] = df.DEP_FROM 
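
To see what these two lines do on their own, here is a minimal self-contained sketch. The sample rows are hypothetical (the real CSV is not reproduced in this thread), and the name `is_dest_coupon` is just an illustrative helper. It shows that each row gets exactly one of the two columns filled, which mirrors the logic of the original loop.

```python
import pandas as pd

# Hypothetical sample data standing in for the CSV from the question
df = pd.DataFrame({
    "COUPON_NUMBER": [1, 2, 3, 4],
    "DEP_FROM": ["HRE", "LUN", "KGL", "LUN"],
    "ARR_TO": ["LUN", "KGL", "LUN", "HRE"],
    "OND_ORIGIN": [None] * 4,
    "OND_DEST": [None] * 4,
})

# Boolean mask: True for coupons 2 and 3, False for everything else
is_dest_coupon = df["COUPON_NUMBER"].isin([2, 3])

# Masked rows get OND_DEST from ARR_TO; the right-hand side aligns on the index
df.loc[is_dest_coupon, "OND_DEST"] = df["ARR_TO"]
# All remaining rows get OND_ORIGIN from DEP_FROM
df.loc[~is_dest_coupon, "OND_ORIGIN"] = df["DEP_FROM"]

print(df)
```

No loop is needed: the mask selects the rows, and pandas applies the assignment to all of them at once.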

Thanks for the quick reply, but when I run it, only one column is populated for each row (i.e. if OND_DEST is populated, OND_ORIGIN is blank, and vice versa). – MTK


@MTK, you shouldn't run it for each row - this is a vectorized solution; just replace your `for ...` loop with these two lines... – MaxU


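If you look at the CSV, I want to derive the OND_ columns from the PLDATE column. For example, since coupons 1 and 2 are flown on the same day, OND_ORIGIN should be HRE and OND_DEST should be KGL for both coupons; likewise, for coupons 3 and 4, OND_ORIGIN should be KGL and OND_DEST should be HRE. – MTK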
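
One possible reading of this last comment is: within each PLDATE, take the DEP_FROM of the first coupon as OND_ORIGIN and the ARR_TO of the last coupon as OND_DEST. The sketch below is only an illustration of that reading, not a confirmed solution from the thread. The dates and the intermediate stop LUN are made up, and grouping by PLDATE alone is an assumption (real data would probably also need a ticket or passenger key).

```python
import pandas as pd

# Hypothetical data: HRE -> LUN -> KGL on one day, KGL -> LUN -> HRE on another
df = pd.DataFrame({
    "PLDATE": ["2017-07-01", "2017-07-01", "2017-07-05", "2017-07-05"],
    "COUPON_NUMBER": [1, 2, 3, 4],
    "DEP_FROM": ["HRE", "LUN", "KGL", "LUN"],
    "ARR_TO": ["LUN", "KGL", "LUN", "HRE"],
})

# Order the coupons within each day so "first" and "last" are meaningful
df = df.sort_values(["PLDATE", "COUPON_NUMBER"])

# Assumption: one journey per PLDATE
by_day = df.groupby("PLDATE")

# Broadcast the first departure and the last arrival of each day to every row
df["OND_ORIGIN"] = by_day["DEP_FROM"].transform("first")
df["OND_DEST"] = by_day["ARR_TO"].transform("last")

print(df[["PLDATE", "COUPON_NUMBER", "OND_ORIGIN", "OND_DEST"]])
```

With this sample, coupons 1 and 2 both get HRE/KGL and coupons 3 and 4 both get KGL/HRE, matching the example in the comment.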
