Add a column of the .75 quantile based on a groupby

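I have a df whose index is the date, plus a column called score. I want to keep the original df values but add a column that gives the 0.7 quantile of score for that day. The quantile method needs to be midpoint, and the result can also be rounded to the nearest integer.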

2017-02-24 MysterioProgrammer91

I outline one approach below.

Note that to round a value to the nearest integer, you should use Python's built-in round() function. For details, see round() in the Python documentation.
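As a small illustration of the rounding step, the sketch below applies round() to a few of the quantile values from this answer. The `scores` Series is made up for the example, and using `Series.round()` for a whole column is my own suggestion rather than something the answer prescribes. Note that Python rounds exact ties to the nearest even integer.

```python
import pandas as pd

# Built-in round() on a single float returns an int when ndigits is omitted.
print(round(0.585087))  # 1
print(round(0.426252))  # 0

# Exact ties go to the nearest even integer ("banker's rounding").
print(round(0.5), round(1.5), round(2.5))  # 0 2 2

# For a whole column, Series.round() does the same job element-wise.
# `scores` is a hypothetical Series used only for illustration.
scores = pd.Series([0.585087, 0.426252, 0.927199])
rounded = scores.round().astype(int)
print(rounded.tolist())  # [1, 0, 1]
```

With scores drawn from [0, 1), as in the example data, rounding to an integer can only give 0 or 1, so it is worth checking that this is really the precision wanted.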

import pandas as pd
import numpy as np

# set random seed for reproducibility
np.random.seed(748)

# initialize base example dataframe
df = pd.DataFrame({"date": np.arange(10),
                   "score": np.random.uniform(size=10)})

# not used below; kept so the random draws that follow are unchanged
duplicate_dates = np.random.choice(df.index, 5)

# extra rows so that some dates occur more than once
df_dup = pd.DataFrame({"date": np.random.choice(df.index, 5),
                       "score": np.random.uniform(size=5)})

# finish compiling example data
# (DataFrame.append was removed in pandas 2.0; pd.concat replaces it)
df = pd.concat([df, df_dup], ignore_index=True)

# calculate 0.7 quantile result with specified parameters
# (no `axis` argument: current pandas GroupBy.quantile does not accept one)
result = df.groupby("date").quantile(q=0.7, interpolation='midpoint')

# print resulting dataframe 
# contains one unique 0.7 quantile value per date 
print(result) 

"""
0.7      score
date
0     0.585087
1     0.476404
2     0.426252
3     0.363376
4     0.165013
5     0.927199
6     0.575510
7     0.576636
8     0.831572
9     0.932183
"""

# to apply the resulting quantile information to 
# a new column in our original dataframe `df` 
# we can apply a dictionary to our "date" column 

# create dictionary 
mapping = result.to_dict()["score"] 

# apply to `df` to produce desired new column 
df["quantile_0.7"] = [mapping[x] for x in df["date"]] 

print(df) 

"""
    date     score  quantile_0.7
0      0  0.920895      0.585087
1      1  0.476404      0.476404
2      2  0.380771      0.426252
3      3  0.363376      0.363376
4      4  0.165013      0.165013
5      5  0.927199      0.927199
6      6  0.340008      0.575510
7      7  0.695818      0.576636
8      8  0.831572      0.831572
9      9  0.932183      0.932183
10     7  0.457455      0.576636
11     6  0.650666      0.575510
12     6  0.500353      0.575510
13     0  0.249280      0.585087
14     2  0.471733      0.426252
"""
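An alternative to the dictionary lookup above is `groupby(...).transform(...)`, which returns one value per original row, already aligned to the original index. The sketch below is not part of the answer; it uses a small hand-made frame (hypothetical values chosen so the midpoints are easy to check by hand) rather than the random example data.

```python
import pandas as pd

# Hypothetical data: date 0 has two scores, date 1 has one, date 2 has three.
df = pd.DataFrame({
    "date":  [0, 0, 1, 2, 2, 2],
    "score": [0.25, 0.75, 0.5, 0.125, 0.375, 0.875],
})

# transform() broadcasts each group's scalar quantile back to that
# group's rows, so no intermediate dictionary is needed.
df["quantile_0.7"] = df.groupby("date")["score"].transform(
    lambda s: s.quantile(0.7, interpolation="midpoint")
)

# date 0 -> midpoint of 0.25 and 0.75   = 0.5
# date 1 -> single value                = 0.5
# date 2 -> midpoint of 0.375 and 0.875 = 0.625
print(df)
```

If the frame is indexed by date, as in the question, grouping with `df.groupby(level=0)` instead of a column name should work the same way.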


2017-02-24 17:19:26 Brian
