Use the normalize parameter of value_counts:
df.groupby('school')['Race/Ethnicity'].value_counts(normalize=True)
school   Race/Ethnicity
school1  Latino/a                         0.697368
         African American/Black           0.197368
         Bi-racial/Multi-racial           0.052632
         White                            0.026316
         American Indian/Alaska Native    0.013158
         Other - Write In (Required)      0.013158
school2  Latino/a                         0.764706
         American Indian/Alaska Native    0.147059
         African American/Black           0.029412
         Asian                            0.029412
         Bi-Racial/Multi-Racial           0.029412
Name: Race/Ethnicity, dtype: float64
You can also skip the sorting:
df.groupby('school')['Race/Ethnicity'].value_counts(normalize=True, sort=False)
school   Race/Ethnicity
school1  African American/Black           0.197368
         American Indian/Alaska Native    0.013158
         Bi-racial/Multi-racial           0.052632
         Latino/a                         0.697368
         Other - Write In (Required)      0.013158
         White                            0.026316
school2  African American/Black           0.029412
         American Indian/Alaska Native    0.147059
         Asian                            0.029412
         Bi-Racial/Multi-Racial           0.029412
         Latino/a                         0.764706
Name: Race/Ethnicity, dtype: float64
Setup
import pandas as pd

# One row per student: the school and the reported race/ethnicity.
df = pd.DataFrame(
    [['school1', 'African American/Black']] * 15 +
    [['school1', 'American Indian/Alaska Native']] * 1 +
    [['school1', 'Bi-racial/Multi-racial']] * 4 +
    [['school1', 'Latino/a']] * 53 +
    [['school1', 'Other - Write In (Required)']] * 1 +
    [['school1', 'White']] * 2 +
    [['school2', 'African American/Black']] * 1 +
    [['school2', 'American Indian/Alaska Native']] * 5 +
    [['school2', 'Asian']] * 1 +
    [['school2', 'Bi-Racial/Multi-Racial']] * 1 +
    [['school2', 'Latino/a']] * 26,
    columns=['school', 'Race/Ethnicity']
)
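
The normalized values can also be cross-checked by hand: count each label per school, then divide by the number of rows per school. The following is only a minimal sketch of that check, not a replacement for `normalize=True`. The names `groups`, `rows`, `proportions`, `counts`, `sizes`, and `manual` are illustrative and not part of the original answer, and the data is the same as in the setup, just written more compactly.

```python
import pandas as pd

# Same data as the setup, written as (school, label, number of rows) triples.
groups = [
    ('school1', 'African American/Black', 15),
    ('school1', 'American Indian/Alaska Native', 1),
    ('school1', 'Bi-racial/Multi-racial', 4),
    ('school1', 'Latino/a', 53),
    ('school1', 'Other - Write In (Required)', 1),
    ('school1', 'White', 2),
    ('school2', 'African American/Black', 1),
    ('school2', 'American Indian/Alaska Native', 5),
    ('school2', 'Asian', 1),
    ('school2', 'Bi-Racial/Multi-Racial', 1),
    ('school2', 'Latino/a', 26),
]
rows = [[school, label] for school, label, n in groups for _ in range(n)]
df = pd.DataFrame(rows, columns=['school', 'Race/Ethnicity'])

# Proportions within each school, as in the answer.
proportions = df.groupby('school')['Race/Ethnicity'].value_counts(normalize=True)

# Manual equivalent: raw counts divided by each school's row count.
counts = df.groupby('school')['Race/Ethnicity'].value_counts()
sizes = df.groupby('school').size()
manual = counts.div(sizes, level='school')  # broadcast sizes over the 'school' level

# Largest absolute difference between the two approaches (expected to be ~0).
max_diff = (manual.sort_index() - proportions.sort_index()).abs().max()
print(max_diff)
```

Because the proportions are computed within each school, they should sum to 1 per school, which `proportions.groupby(level='school').sum()` can confirm.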
It would help to see some sample data, but it sounds like you could divide `df2` by `df.groupby('school').size()`. –
@AndrewL Thank you, that's exactly what I needed! I knew I was making things harder than they needed to be. – Cameron