2017-02-15 101 views

Answers

0

Assuming the name column consists of unique values:

print(df)

    name   address number 
0 Bob    bob No.56 
1 NaN  @gmail.com  NaN 
2 Carly [email protected] No.90 
3 Gorge  [email protected]  NaN 
4 NaN    .com  NaN 
5 NaN    NaN No.100 

df['name'] = df['name'].ffill() 
print(df.fillna('').groupby(['name'], as_index=False).sum())

    name   address number 
0 Bob [email protected] No.56 
1 Carly [email protected] No.90 
2 Gorge [email protected] No.100 

To extend the code above to more complex data, you may need things like ffill(), bfill(), [::-1], .groupby('name').apply(lambda x: ' '.join(x['address'])), strip(), lstrip(), rstrip(), and replace().
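As a minimal sketch of how some of those pieces might fit together, the example below strips stray whitespace from each fragment and joins the fragments per name with `"".join`, used here as an explicit alternative to `sum()`. The data and the `example.com` / `example.org` addresses are invented for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical data: each record is split across several rows,
# and some fragments carry stray whitespace.
df = pd.DataFrame({
    "name": ["Bob", np.nan, "Carly", np.nan],
    "address": [" bob", "@example.com ", "carly", "@example.org"],
})

df["name"] = df["name"].ffill()             # carry each name down to its continuation rows
df["address"] = df["address"].str.strip()   # remove leading/trailing whitespace

# Join the fragments of each group in row order.
joined = df.groupby("name")["address"].agg("".join).reset_index()
print(joined)
```

Swapping `"".join` for `" ".join` would put a space between fragments, which may suit free-text columns better than address pieces.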

0

If you want to merge rows of a DataFrame (where each column may contain NaN entries), there may be no direct pandas method for it.

You need some code that assigns values in the name column, so that pandas knows the separate rows bob and @gmail.com belong to the same user, Bob.

You can fill each empty entry in the name column using fillna or the ffill method; see "pandas dataframe missing data".

df['name'] = df['name'].ffill()

# gives 
    name address number 
0 Bob bob No.56 
1 Bob @gmail.com 
2 Carly [email protected] No.90 
3 Gorge [email protected] 
4 Gorge .com  
5 Gorge  No.100 
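The blank cells in the output above appear once the remaining NaN values are replaced with empty strings. A small sketch of that intermediate state, assuming made-up placeholder addresses (the `example.com` values are not from the original data):

```python
import numpy as np
import pandas as pd

# Placeholder data shaped like the question's frame.
df = pd.DataFrame({
    "name": ["Bob", np.nan, "Carly", "Gorge", np.nan, np.nan],
    "address": ["bob", "@example.com", "carly@example.com",
                "gorge@example", ".com", np.nan],
    "number": ["No.56", np.nan, "No.90", np.nan, np.nan, "No.100"],
})

df["name"] = df["name"].ffill()   # carry each name down to its continuation rows
df = df.fillna("")                # remaining NaN cells become empty strings
print(df)
```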

Then you can use groupby with sum as the aggregation function.

df.groupby(['name']).sum().reset_index() 

# gives 
    name address number 
0 Bob [email protected] No.56 
1 Carly [email protected] No.90 
2 Gorge [email protected] No.100 
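Putting the steps together, a self-contained sketch might look like the following. The addresses are placeholders invented for illustration, `dtype=object` is an assumption made here so that `sum()` concatenates plain Python strings, and `fillna("")` is applied before grouping so that no NaN reaches the concatenation.

```python
import numpy as np
import pandas as pd

# Placeholder data; dtype=object keeps the cells as plain Python strings.
df = pd.DataFrame({
    "name": ["Bob", np.nan, "Carly", "Gorge", np.nan, np.nan],
    "address": ["bob", "@example.com", "carly@example.com",
                "gorge@example", ".com", np.nan],
    "number": ["No.56", np.nan, "No.90", np.nan, np.nan, "No.100"],
}, dtype=object)

df["name"] = df["name"].ffill()   # label continuation rows with their owner

# Summing strings within each group concatenates them in row order.
result = df.fillna("").groupby("name", as_index=False).sum()
print(result)
```

Passing `as_index=False` keeps `name` as a regular column, which has the same effect here as calling `reset_index()` afterwards.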

You may find it useful to convert between NaN and blank values; see "Replacing blank values (white space) with NaN in pandas" and pandas.DataFrame.fillna.
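One possible way to convert in both directions is sketched below, using a regular expression with `replace` for blank-to-NaN and `fillna` for the reverse. The sample values are invented for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical column containing empty and whitespace-only cells.
df = pd.DataFrame({"name": ["Bob", " ", "Carly", ""]})

# Empty or whitespace-only strings -> NaN
with_nan = df.replace(r"^\s*$", np.nan, regex=True)

# NaN -> empty string
with_blank = with_nan.fillna("")

print(with_nan)
print(with_blank)
```

Converting blanks to NaN first lets `ffill()` treat them as missing; converting back with `fillna("")` makes the cells safe to concatenate.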