2016-05-11 30 views
2

Python: separating out rows that have duplicates in a pandas DataFrame

Suppose the DataFrame df has three columns c1, c2, c3:

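import pandas as pd

df = pd.DataFrame()
df['c1'] = [1, 2, 3, 3, 4]
df['c2'] = ["a1", "a2", "a2", "a2", "a1"]
df['c3'] = [1, 2, 3, 3, 5]
print(df)
df1 = df[df.duplicated()]  # by default only the second and later occurrences are flagged
print(df1)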

df1 has only one row, which is

   c1  c2  c3
3   3  a2   3

but I want to have

   c1  c2  c3
2   3  a2   3
3   3  a2   3

How can I get this? One more thing: if I try to use the keep argument, as in df1 = df[df.duplicated(keep=False)], it gives me this error:

Traceback (most recent call last): 

File "<ipython-input-572-188a22102b3e>", line 1, in <module> 
df1 = df[df.duplicated(keep=False)] 

File "C:\Users\Kanika\Anaconda\lib\site-packages\pandas\util\decorators.py", line 88, in wrapper 
    return func(*args, **kwargs) 

TypeError: duplicated() got an unexpected keyword argument 'keep' 

Answers

2

What value are you specifying for keep? I think that in your case passing False as the keep value may solve the problem. See the pandas duplicated docs. Hope this helps.

df1 = df[df.duplicated(keep=False)] 
+0

I also passed False for 'keep'.. it gives me the error

0
df1=df[df.duplicated(keep=False)] 

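With keep=False, every occurrence of a duplicated row is flagged, so this selects all of the duplicated rows. By default, pandas treats the first occurrence as the original and flags only the later ones.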
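
The TypeError in the question suggests the installed pandas predates the keep argument, which as far as I know was added in pandas 0.17.0, so upgrading is probably the simplest fix. If upgrading is not an option, here is a minimal sketch of a fallback. The helper name all_duplicated_rows is made up for illustration and is not part of pandas; the fallback branch also assumes the index labels are unique (true for the default index used in the question).

```python
import pandas as pd

df = pd.DataFrame({
    "c1": [1, 2, 3, 3, 4],
    "c2": ["a1", "a2", "a2", "a2", "a1"],
    "c3": [1, 2, 3, 3, 5],
})


def all_duplicated_rows(frame):
    """Return every row of `frame` that has at least one identical copy.

    Hypothetical helper (not a pandas API): tries keep=False first and falls
    back to a forward + backward scan if `keep` is not accepted.
    """
    try:
        mask = frame.duplicated(keep=False)
    except TypeError:
        # Older pandas without `keep`: the forward scan flags later
        # occurrences, the scan over the reversed frame flags earlier ones.
        # reindex() restores the original row order (assumes unique labels).
        later = frame.duplicated()
        earlier = frame.iloc[::-1].duplicated().reindex(frame.index)
        mask = later | earlier
    return frame[mask]


df1 = all_duplicated_rows(df)
print(pd.__version__)  # check which pandas version is actually installed
print(df1)
```

On the example data this should select the rows with index 2 and 3, which is the output the question asks for.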