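I have a data frame with a time series (column 1) and a column of values (column 2) that are features of each subseries in the time series. How can I remove the subseries (that is, the rows of the data frame) that meet a condition?
I tried a loop that creates an additional column flagging the rows to remove, but this solution is computationally very expensive (the column has 10 million records). Code (slow solution):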
import numpy as np
import pandas as pd
# sample data (smaller than actual df)
# length of df = 100; should be 10000000 in the actual data frame
time_ser = 100*[25]
max_num = 20
distance = np.random.uniform(0,max_num,100)
to_remove= 100*[np.nan]
data_dict = {'time_ser': time_ser,
             'distance': distance,
             'to_remove': to_remove
             }
df = pd.DataFrame(data_dict)
subser_size = 3
maxdist = 18
# loop which creates an additional column which indicates which indexes should be removed.
# Takes first value in a subseries and checks if it meets the condition.
# If it does, all values in subseries (i.e. rows) should be removed ('wrong').
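df['to_remove'] = df['to_remove'].astype(object)  # the column will hold strings
col = df.columns.get_loc('to_remove')
for i, d in enumerate(df['distance']):
    if d >= maxdist:
        # single positional assignment instead of chained df.to_remove.iloc[...]
        df.iloc[i:i + subser_size, col] = 'wrong'
    elif df.iat[i, col] != 'wrong':
        # do not overwrite a 'wrong' set by an earlier subseries
        df.iat[i, col] = 'good'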
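For comparison, here is a minimal vectorized sketch of the same idea that avoids the Python-level loop. It rests on an assumption taken from the comments in the code above: a row should be removed whenever it or any of the `subser_size - 1` rows before it has `distance >= maxdist`, so subseries may overlap. The function name `drop_flagged_subseries` and its `col` parameter are hypothetical, introduced only for illustration.

```python
import numpy as np
import pandas as pd


def drop_flagged_subseries(df, maxdist, subser_size, col="distance"):
    """Return df without rows covered by a subseries whose first value is >= maxdist.

    Assumption: subseries may overlap, so a row is dropped if any of the
    subser_size rows ending at it starts a flagged subseries.
    """
    # True where a subseries starting at this row meets the removal condition
    starts = df[col].ge(maxdist)
    # A trailing rolling maximum spreads each True over the next subser_size - 1 rows
    flagged = (
        starts.astype("int8")
        .rolling(subser_size, min_periods=1)
        .max()
        .astype(bool)
    )
    return df.loc[~flagged]


# Sample data shaped like the question's frame (100 rows instead of 10 million)
rng = np.random.default_rng(0)
df = pd.DataFrame({"time_ser": 100 * [25], "distance": rng.uniform(0, 20, 100)})
cleaned = drop_flagged_subseries(df, maxdist=18, subser_size=3)
```

The rolling window runs in compiled code, so this should scale much better than assigning row by row, though I have not timed it on the full 10 million rows. If the subseries are meant to be fixed, non-overlapping blocks instead, the flag would need to be computed per block rather than with a rolling window.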
Thanks for accepting. You can also upvote: click the small triangle above the '0' above the accept mark. Thanks. – jezrael