Sum of values in the same month

import pandas as pd

data = {'dates': ['2010-01-29', '2011-06-14', '2012-01-18'], 'values': [4, 3, 8]}
df = pd.DataFrame(data)
df = df.set_index('dates')  # set_index returns a new frame, so reassign it
df.index = df.index.astype('datetime64[ns]')

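I have a DataFrame whose index is a date. How would I go about adding a new column called 'Month' that is the sum of all values in that month, but without "going into the future", as in it only adds up the values from days up to its own date?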

This is what the column would look like.

'Month': [4, 3, 12]

Source

2016-06-28 user6162407

These are the values corresponding to each day: 4 is for '2010-01-29', 8 is for '2012-01-18'. – user6162407

apply is your friend here.

import pandas as pd

def sum_from_months_prior(row, df):
    '''Return the sum of values that fall in the same month as row,
    over all dates in df on or before the row date.'''

    month = pd.to_datetime(row).month

    # Keep only rows dated on or before this row, then only the matching month.
    all_dates_prior = df[df.index <= row]
    same_month = all_dates_prior[all_dates_prior.index.month == month]

    return same_month["values"].sum()

data = {'dates': ['2010-01-29', '2011-06-14', '2012-01-18'], 'values': [4, 3, 8]}
df = pd.DataFrame(data)
df.set_index('dates', inplace=True)
df.index = pd.to_datetime(df.index)
df["dates"] = df.index  # temporary column so apply can see each row's date
df.sort_index(inplace=True)

df["Month"] = df["dates"].apply(lambda row: sum_from_months_prior(row, df))
df.drop("dates", axis=1, inplace=True)

Desired DataFrame:

            values  Month
dates
2010-01-29       4      4
2011-06-14       3      3
2012-01-18       8     12
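For comparison, here is a minimal sketch of a vectorized alternative that is not part of the answer above: a grouped cumulative sum. It assumes the index is sorted and that each date appears only once; with duplicate dates it would not match the `<=` comparison used in the apply version.

```python
import pandas as pd

data = {'dates': ['2010-01-29', '2011-06-14', '2012-01-18'], 'values': [4, 3, 8]}
df = pd.DataFrame(data)
df['dates'] = pd.to_datetime(df['dates'])
df = df.set_index('dates').sort_index()

# Group rows by month number (1-12, ignoring the year), then take a running
# total inside each group. Because the index is sorted, each row only sees
# itself and earlier rows from the same month.
df['Month'] = df.groupby(df.index.month)['values'].cumsum()

print(df)
```

On the sample data this should give the same 'Month' column as the apply version, without calling a Python function once per row.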

Source

2016-06-28 16:54:27

There are a couple of ways to do this. The first is to resample to monthly using df.resample(...).sum().
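A rough sketch of what the resample route might look like on the question's data. The `'MS'` (month start) frequency is my own choice, since the answer leaves the argument elided; other monthly aliases would bin the same way and differ only in how each bin is labelled.

```python
import pandas as pd

df = pd.DataFrame(
    {'values': [4, 3, 8]},
    index=pd.to_datetime(['2010-01-29', '2011-06-14', '2012-01-18']),
)

# One row per calendar month between the first and last date,
# labelled by the first day of that month. Months with no data sum to 0.
monthly = df.resample('MS').sum()

# Keep only the months that actually contain data.
print(monthly[monthly['values'] > 0])
```

Note that this keeps January 2010 and January 2012 as separate bins, so it produces per-calendar-month totals rather than the running same-month total asked for in the question.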

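You can also create a month column from the index with df['month'] = df.index.month and then do a groupby operation, df.groupby('month').sum(). Which approach is best depends on what you want to do with the data.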
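The groupby variant, sketched on the question's sample data (the frame is rebuilt here only so the snippet runs on its own):

```python
import pandas as pd

df = pd.DataFrame(
    {'values': [4, 3, 8]},
    index=pd.to_datetime(['2010-01-29', '2011-06-14', '2012-01-18']),
)

# Month number taken from the DatetimeIndex; the year is ignored.
df['month'] = df.index.month

# One total per month number, pooled across all years.
by_month = df.groupby('month').sum()

print(by_month)
```

Here both January rows land in the same group, so month 1 totals 12 and month 6 totals 3. This is a single total per month rather than a per-row running sum.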

Source

2016-06-28 15:23:41 Jeff

You can use pandas' TimeGrouper.

df.groupby(pd.TimeGrouper('M')).sum()
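On pandas 1.0 and later, `pd.TimeGrouper` is no longer available and `pd.Grouper(freq=...)` takes its place. A minimal sketch under that assumption, using `'MS'` (month start) as the frequency instead of the `'M'` alias from the line above:

```python
import pandas as pd

df = pd.DataFrame(
    {'values': [4, 3, 8]},
    index=pd.to_datetime(['2010-01-29', '2011-06-14', '2012-01-18']),
)

# pd.Grouper with a freq groups the DatetimeIndex into calendar-month bins,
# labelled here by the first day of each month.
monthly = df.groupby(pd.Grouper(freq='MS')).sum()

# Show only the months that contain data.
print(monthly[monthly['values'] > 0])
```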

Source

2016-06-28 15:24:20 piRSquared

I forgot about 'TimeGrouper'; this is the way to do it. – Jeff
