Python: Selecting a specific time from a DataFrame using Pandas

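The short script below uses findatapy to collect data from the Dukascopy website. Note that this package uses Pandas internally (the script still needs `pd` in scope for the Pandas calls it makes directly).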

import pandas as pd  # needed for the pd.* calls further down
from findatapy.market import Market, MarketDataRequest, MarketDataGenerator

market = Market(market_data_generator=MarketDataGenerator()) 
md_request = MarketDataRequest(start_date='08 Feb 2017', finish_date='09 Feb 2017', category='fx', fields=['bid', 'ask'], freq='tick', data_source='dukascopy', tickers=['EURUSD']) 

df = market.fetch_market(md_request) 

# Group everything by an hourly frequency and keep the first tick of each hour.
# pd.Grouper replaces pd.TimeGrouper, which was removed from later Pandas releases.
df = df.groupby(pd.Grouper(freq='1H')).head(1)

# Delete the milliseconds from the DataFrame index.
df.index = df.index.map(lambda t: t.strftime('%Y-%m-%d %H:%M:%S'))

# Compute the average of the bid and ask columns and store it in a new column.
df['Avg'] = (df['EURUSD.bid'] + df['EURUSD.ask']) / 2

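The result looks like this:

Up to this point everything runs fine, but I need to extract a specific hour from this DataFrame. I want to select all the values (bid, ask, average, or just one of them) at a given time of day, say 10:00:00 AM.

From looking at other posts, I thought I could do something like this: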

match_timestamp = "10:00:00" 
df.loc[(df.index.strftime("%H:%M:%S") == match_timestamp)]

But the result is an error message saying:

AttributeError: 'Index' object has no attribute 'strftime'

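I can't even run df.index.hour, which used to work before the line that deletes the milliseconds (the dtype was datetime64[ns] up to that point); after that line the dtype is 'object'. It looks like I need to reverse this formatting before I can use strftime.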
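The dtype change described here can be reproduced without the Dukascopy download. The following is a minimal sketch that assumes a small, made-up index in place of the real tick data: mapping strftime over a DatetimeIndex returns a plain Index of strings, which no longer exposes datetime attributes such as hour or strftime.

```python
import pandas as pd

# Made-up timestamps with milliseconds (illustrative only, not real Dukascopy data)
idx = pd.DatetimeIndex(["2017-02-08 09:00:00.123", "2017-02-08 10:00:00.456"])

# Same mapping as in the script: every Timestamp is turned into a string
str_idx = idx.map(lambda t: t.strftime('%Y-%m-%d %H:%M:%S'))

print(type(idx).__name__, idx.dtype)          # DatetimeIndex with a datetime64 dtype
print(type(str_idx).__name__, str_idx.dtype)  # plain Index holding strings

# The datetime attributes are gone after the mapping
print(hasattr(idx, "hour"), hasattr(str_idx, "hour"))  # True False
```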

Can you help me?

Source

2017-10-10 Aquiles Páez

You should take a look at resample:

df = df.resample('H').first() # resample for each hour and use first value of hour

Then:

df.loc[df.index.hour == 10] # index is still a date object, play with it
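A self-contained sketch of these two steps follows. It assumes synthetic ticks with made-up prices instead of the findatapy download, and it uses the '60min' alias because newer Pandas releases deprecate the uppercase 'H' spelling.

```python
import pandas as pd

# Synthetic ticks: one every 20 minutes, each offset by 250 ms (made-up prices)
idx = pd.date_range("2017-02-08 08:00:00.250", "2017-02-09 12:00:00.250", freq="20min")
ticks = pd.DataFrame({"EURUSD.bid": 1.0650, "EURUSD.ask": 1.0652}, index=idx)

# First tick of every hour. resample labels each bin with the start of the hour,
# so the index stays a DatetimeIndex and carries no sub-second part.
hourly = ticks.resample("60min").first()

# A boolean mask on the hour attribute keeps only the 10:00 rows
ten_am = hourly.loc[hourly.index.hour == 10]
print(ten_am)
```

Because the index is never converted to strings, attributes such as hour and methods such as strftime remain available afterwards.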

If you'd rather not do that, you can simply turn your index back into datetime objects, like this:

df.index = pd.to_datetime(df.index)

Then your code should work as is.
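As a rough illustration of this second option, the sketch below assumes a tiny DataFrame whose index already holds formatted strings (the values are made up). After pd.to_datetime, the strftime comparison from the question runs without the AttributeError.

```python
import pandas as pd

# Index of formatted strings, as left behind by the strftime mapping (made-up values)
df = pd.DataFrame(
    {"Avg": [1.0651, 1.0660, 1.0672]},
    index=["2017-02-08 09:00:00", "2017-02-08 10:00:00", "2017-02-09 10:00:00"],
)

# Parse the strings back into timestamps; the index becomes a DatetimeIndex again
df.index = pd.to_datetime(df.index)

# The selection from the question now works
match_timestamp = "10:00:00"
selected = df.loc[df.index.strftime("%H:%M:%S") == match_timestamp]
print(selected)
```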

Source

2017-10-10 20:03:39

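I like this, because in a single line I get rid of the milliseconds and keep the same dtype throughout the DataFrame operations. I can also use df.loc[(df.index.strftime("%H:%M:%S") == "10:00:00")], which suits what I'm trying to do better. Thanks! :)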

@AquilesPáez No problem. You will also get roughly a 10% speed increase using resample instead of groupby on large datasets.
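Once the index is a DatetimeIndex, Pandas also offers DataFrame.at_time for picking out an exact time of day. Nobody in this thread suggested it, so treat the following as an alternative sketch on made-up data, not as the approach used above.

```python
import pandas as pd

# 30 hourly rows starting at 08:00 on 2017-02-08 (made-up values)
idx = pd.date_range("2017-02-08 08:00", periods=30, freq="60min")
hourly = pd.DataFrame({"Avg": range(30)}, index=idx)

# Rows whose time of day is exactly 10:00:00, on any date
ten_am = hourly.at_time("10:00")
print(ten_am)
```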

Try resetting the index:

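match_timestamp = "10:00:00"
df = df.reset_index()  # the index becomes a regular column
df = df.assign(Date=pd.to_datetime(df.Date))  # convert the column to datetime first
# A Series needs the .dt accessor to reach strftime
df.loc[(df.Date.dt.strftime("%H:%M:%S") == match_timestamp)]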

Source

2017-10-10 19:58:21 galaxyan

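AttributeError: 'RangeIndex' object has no attribute 'strftime'. I guess the reset_index() function replaced the data type I wanted the DataFrame index to have, right? Anyway, if I change my first line to df = df.resample('H').first() (as suggested by @steven-g) and delete my own line from the script, then the third line of your solution works fine. The milliseconds are already gone, because that command takes care of them.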

@AquilesPáez You need to convert it to datetime first. – galaxyan
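The exchange above comes down to where the datetime values live after reset_index. The sketch below assumes a string index named "Date" (the name and values are illustrative): the index becomes a RangeIndex, the timestamps move into a column, and on a column the datetime methods are reached through the .dt accessor.

```python
import pandas as pd

# String index named "Date" (hypothetical name, made-up values)
idx = pd.Index(["2017-02-08 09:00:00", "2017-02-08 10:00:00"], name="Date")
df = pd.DataFrame({"Avg": [1.0651, 1.0660]}, index=idx)

# reset_index moves "Date" into a column; the new index is a RangeIndex
flat = df.reset_index()

# Convert the string column to datetime before using datetime methods
flat["Date"] = pd.to_datetime(flat["Date"])

# On a Series, strftime is available through the .dt accessor
selected = flat.loc[flat["Date"].dt.strftime("%H:%M:%S") == "10:00:00"]
print(selected)
```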
