Group by and find top n value_counts in pandas
I have a dataframe of taxi data with two columns that looks like this:
Neighborhood    Borough        Time
Midtown         Manhattan      X
Melrose         Bronx          Y
Grant City      Staten Island  Z
Midtown         Manhattan      A
Lincoln Square  Manhattan      B
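Basically, each row represents a taxi pickup in that neighborhood of that borough. Now I want to find the top 5 neighborhoods in each borough by number of pickups. I tried this: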
df['Neighborhood'].groupby(df['Borough']).value_counts()
which gives me something like this:
borough
Bronx          High Bridge          3424
               Mott Haven           2515
               Concourse Village    1443
               Port Morris          1153
               Melrose               492
               North Riverdale       463
               Eastchester           434
               Concourse             395
               Fordham               252
               Wakefield             214
               Kingsbridge           212
               Mount Hope            200
               Parkchester           191
......
Staten Island  Castleton Corners       4
               Dongan Hills            4
               Eltingville             4
               Graniteville            4
               Great Kills             4
               Castleton               3
               Woodrow                 1
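How do I filter this so that I only get the top 5 from each borough? I know there are a few questions with similar titles, but they did not help with my case.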
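One possible approach, shown here only as a minimal sketch: the grouped value_counts() result is already sorted in descending order within each borough, so taking the first n rows of each group keeps that borough's n busiest neighborhoods. The sample rows and the variable names `counts`, `top_n` and `n` are assumptions made for illustration, not part of the original data.

```python
import pandas as pd

# Hypothetical sample data; the real frame has one row per pickup.
df = pd.DataFrame({
    "Neighborhood": ["Midtown", "Melrose", "Grant City", "Midtown",
                     "Lincoln Square", "Melrose", "Fordham", "Midtown"],
    "Borough": ["Manhattan", "Bronx", "Staten Island", "Manhattan",
                "Manhattan", "Bronx", "Bronx", "Manhattan"],
})

n = 1  # use 5 for the top five asked about in the question

# Count pickups per neighborhood within each borough.
# Within each borough the counts come back sorted in descending order.
counts = df.groupby("Borough")["Neighborhood"].value_counts()

# Keep the first n rows of each borough (level 0 of the index).
top_n = counts.groupby(level=0).head(n)
print(top_n)
```

Because head() keeps the existing two-level index, the result can still be indexed by (borough, neighborhood) pairs.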
It is creating an extra level at l = 0; just add s.index.droplevel(level=0) –
@Nemish Kanwar - Thank you for the good idea. Or 'print(s.groupby(level=0).nlargest(1).reset_index(level=0, drop=True))' – jezrael
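The comments refer to a Series `s` from an answer that is not shown here. Assuming `s` is a count Series indexed by (Borough, Neighborhood), the following sketch illustrates the extra index level and the two suggested ways of removing it. The counts below are made up, and `top`, `fixed_a` and `fixed_b` are hypothetical names.

```python
import pandas as pd

# Hypothetical counts standing in for `s`; the values are invented.
idx = pd.MultiIndex.from_tuples(
    [("Bronx", "High Bridge"), ("Bronx", "Mott Haven"),
     ("Manhattan", "Midtown"), ("Manhattan", "Lincoln Square")],
    names=["Borough", "Neighborhood"],
)
s = pd.Series([30, 20, 50, 10], index=idx)

# nlargest on a level-based groupby prepends the group key,
# so Borough appears twice in the resulting index.
top = s.groupby(level=0).nlargest(1)

# Option 1: droplevel returns a new index, so assign it back.
fixed_a = top.copy()
fixed_a.index = top.index.droplevel(level=0)

# Option 2: drop the duplicated outer level via reset_index.
fixed_b = top.reset_index(level=0, drop=True)

print(fixed_b)
```

Both options leave a two-level (Borough, Neighborhood) index; the reset_index form avoids the separate assignment step.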