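Fill missing data rows using the pandas reindex function

I want to use the pandas reindex function to fill in the missing rows in my time series data. My data looks like this: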
100,2007,239,4,29.588,-30.851,-999.0,-999.0,-999.0,-999.00,13.125,-999.00
100,2007,239,5,29.573,-30.843,-999.0,-999.0,-999.0,-999.00,13.126,-999.00
100,2007,239,14,29.389,-30.880,-999.0,-999.0,-999.0,-999.00,13.131,-999.00
100,2007,239,15,29.367,-30.901,-999.0,-999.0,-999.0,-999.00,13.131,-999.00
100,2007,239,24,29.374,-30.920,-999.0,-999.0,-999.0,-999.00,13.135,-999.00
.
.
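The fourth column gives the time of day for this minute-resolution time series. Unlike a normal time series index, the time index of this data runs 0 to 59, 100 to 159, ..., 2300 to 2359, because a day has 24 hours and an hour has 60 minutes. So, to fill the gaps with 'nan' values, I came up with the code below: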
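import numpy as np
import pandas as pd

# Build the reference index: 0-59, 100-159, ..., 2300-2359 (one label per minute of the day)
S = []
for i in range(0, 24):
    s = np.arange(i * 100, i * 100 + 60)
    s = list(s)
    S = S + s

pd.set_option('display.max_rows', 10)
# FileList is the list of input file paths (defined elsewhere)
for INPUT in FileList:
    output = INPUT + "result"  # set the output file
    data = pd.read_csv(INPUT, sep=',', index_col=[3], parse_dates=[3])
    index = S  # the reference index to fill
    df = data
    sk_f = df.reindex(index)
    sk_f.to_csv(output, na_rep='nan')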
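With this code, I intended to fill the gaps with 'nan' rows, using the fourth column as the index and S as the reference index. But the result contains only 'nan' rows instead of the original data with the gaps filled, as follows: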
,100,2007,241,22.471,-31.002,-999.0,-999.0.1,-999.0.2,-999.00,13.294,-999.00.1
0,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan
1,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan
2,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan
3,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan
4,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan
5,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan
6,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan
7,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan
8,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan
9,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan
10,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan
11,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan
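What I expect is to fill the gaps where rows are missing from the original data. For example, in the original data there are no rows for index 0 to 3, so I want those rows to be added as 'nan' rows while the existing rows keep their original values. I may be missing something. I would really appreciate any ideas or help.
Thanks, Isaac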
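
One possible direction, offered as a minimal sketch rather than a confirmed diagnosis: the header line of the output above is itself a data row, which suggests read_csv consumed the first row as column names, and parse_dates on the index column may have turned the index labels into something other than the integers in S, so no label matched during reindex. The sketch below assumes the files have no header row and that the fourth column holds plain integers. It reads a few sample rows from the question through io.StringIO instead of a real file, and the names raw, full_index, filled and csv_text are illustrative only.

```python
import io

import pandas as pd

# Sample rows copied from the question; a real run would read each file in FileList.
raw = """100,2007,239,4,29.588,-30.851,-999.0,-999.0,-999.0,-999.00,13.125,-999.00
100,2007,239,5,29.573,-30.843,-999.0,-999.0,-999.0,-999.00,13.126,-999.00
100,2007,239,14,29.389,-30.880,-999.0,-999.0,-999.0,-999.00,13.131,-999.00
100,2007,239,15,29.367,-30.901,-999.0,-999.0,-999.0,-999.00,13.131,-999.00
100,2007,239,24,29.374,-30.920,-999.0,-999.0,-999.0,-999.00,13.135,-999.00
"""

# Reference index: 0-59, 100-159, ..., 2300-2359 (one label per minute of the day).
full_index = [hour * 100 + minute for hour in range(24) for minute in range(60)]

# Assumption: no header row. header=None stops the first data row from being
# used as column names; omitting parse_dates keeps the index as plain integers.
data = pd.read_csv(io.StringIO(raw), header=None, index_col=3)

# Labels present in the data keep their values; absent labels become NaN rows.
filled = data.reindex(full_index)

# Write without a header line, spelling missing values as 'nan'.
csv_text = filled.to_csv(header=False, na_rep="nan")
print(filled.loc[[3, 4, 5, 6]])
```

Note that columns holding integers (such as 100, 2007, 239) are converted to floats by reindex, because NaN cannot be stored in an integer column. If the original formatting matters, those columns would need separate handling before writing.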