Group by multiple time units in pandas DataFrame

I have a data frame that consists of time series data with 15-second intervals:

date_time    value  
2012-12-28 11:11:00 103.2 
2012-12-28 11:11:15 103.1 
2012-12-28 11:11:30 103.4 
2012-12-28 11:11:45 103.5 
2012-12-28 11:12:00 103.3

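The data spans many years. I want to group by both year and time of day, to look at how the time-of-day effect is distributed over many years. For example, I may want to compute the mean and standard deviation of every 15-second interval across days, and see how the mean and standard deviation change from 2010 to 2011, 2012, and so on. I naively tried data.groupby(lambda x: [x.year, x.time]), but it didn't work. How can I do such a grouping?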

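Source

2013-01-13 ezbentley

If date_time is not your index, a date_time-indexed DataFrame can be created with:

dfts = df.set_index('date_time')

From there you can group by intervals using

dfts.groupby(lambda x : x.month).mean()

to see the mean values for each month. Similarly, you can do

dfts.groupby(lambda x : x.year).std()

to see the standard deviation across the years.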
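As a minimal, self-contained sketch of the steps above: the four readings below are made up for illustration, and only the column names date_time and value come from the question.

```python
import pandas as pd

# Hypothetical sample data: two readings in Dec 2012, two in Jan 2013.
df = pd.DataFrame({
    "date_time": pd.to_datetime([
        "2012-12-28 11:11:00",
        "2012-12-28 11:11:15",
        "2013-01-05 11:11:00",
        "2013-01-05 11:11:15",
    ]),
    "value": [103.0, 105.0, 100.0, 104.0],
})

# Index by timestamp so the grouping function receives Timestamp objects.
dfts = df.set_index("date_time")

# Mean per calendar month (the same month is pooled across years).
monthly_mean = dfts.groupby(lambda x: x.month).mean()

# Standard deviation per year.
yearly_std = dfts.groupby(lambda x: x.year).std()

print(monthly_mean)
print(yearly_std)
```

Note that grouping on x.month alone merges, say, every January in the data into one group, which is why the split by year below is needed if the years should stay separate.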

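If I understood the example task you would like to achieve, you could simply split the data into years using xs, group each year by month, concatenate the results, and store them in a new DataFrame.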

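import pandas as pd
# Select each year's rows, then average them by calendar month.
years = range(2012, 2015)
yearly_month_stats = [dfts.xs(str(year)).groupby(lambda x : x.month).mean() for year in years]
# Place the per-year results side by side, keyed by year.
df2 = pd.concat(yearly_month_stats, axis=1, keys=years)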

From which you get something like:

         2012       2013       2014
        value      value      value
1         NaN   5.324165  15.747767
2         NaN -23.193429   9.193217
3         NaN -14.144287  23.896030
4         NaN -21.877975  16.310195
5         NaN  -3.079910  -6.093905
6         NaN  -2.106847 -23.253183
7         NaN  10.644636   6.542562
8         NaN  -9.763087  14.335956
9         NaN  -3.529646   2.607973
10        NaN -18.633832   0.083575
11        NaN  10.297902  14.059286
12   33.95442  13.692435  22.293245
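A table of the same shape can also be sketched without splitting by year first, by grouping on month and year together and then unstacking. This is an alternative to the xs approach above, not what the answer itself used; the daily random data and the names idx, rng and by_month_year are assumptions made only for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one random reading per day over two years.
idx = pd.date_range("2013-01-01", "2014-12-31", freq="D")
rng = np.random.default_rng(0)
dfts = pd.DataFrame({"value": rng.normal(size=len(idx))}, index=idx)

# Group by (month, year) in a single pass.
by_month_year = dfts.groupby([dfts.index.month, dfts.index.year])["value"].mean()

# Move the year level into the columns: rows are months, columns are years.
df2 = by_month_year.unstack()
df2.index.name = "month"
df2.columns.name = "year"
print(df2)
```

Here the columns are plain year labels rather than (year, value) pairs, because a single column was selected before aggregating.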

Source

2013-01-13 17:52:51 metakermit

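You were close:

data.groupby([lambda x: x.year, lambda x: x.time()])

Note that time is a method of each timestamp, so it has to be called. Also, be sure to set date_time as the index, as in kermit666's answer.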
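For the task in the question (mean and standard deviation of each 15-second slot, per year), a rough sketch along the same lines might look like the following. It groups on the index attributes year and time instead of lambdas, which should give the same groups; the eight readings and the names stamps, values and stats are invented for the example.

```python
import pandas as pd

# Hypothetical data: the same two 15-second slots on two days in each of two years.
stamps = pd.to_datetime([
    "2011-03-01 11:11:00", "2011-03-01 11:11:15",
    "2011-03-02 11:11:00", "2011-03-02 11:11:15",
    "2012-03-01 11:11:00", "2012-03-01 11:11:15",
    "2012-03-02 11:11:00", "2012-03-02 11:11:15",
])
values = [1.0, 10.0, 3.0, 14.0, 5.0, 20.0, 9.0, 26.0]
data = pd.DataFrame({"value": values}, index=stamps)

# One group per (year, time-of-day) pair; days within a year are pooled.
stats = data.groupby([data.index.year, data.index.time])["value"].agg(["mean", "std"])
stats.index.names = ["year", "time"]
print(stats)
```

Calling stats.unstack("year") afterwards puts the years side by side, which makes it easier to compare a given time slot from one year to the next.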

Source

2013-06-14 14:31:24 joeb1415
