2015-01-09 159 views

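Python pandas groupby: KeyError in pandas.hashtable.PyObjectHashTable.get_item

I am doing what looks like a very simple groupby in Pandas. The column is a string column with no NaNs or odd strings. However, I keep getting the error below. Does anyone know why this happens? I feel like it may have something to do with my data, but it all seems to be fine...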

I am running by_user = df.groupby('User')

and the stack trace is:

by_user = df.groupby('User') 
File "c:\Anaconda\lib\site-packages\pandas\core\generic.py", line 2773, in groupby 
sort=sort, group_keys=group_keys, squeeze=squeeze) 
File "c:\Anaconda\lib\site-packages\pandas\core\groupby.py", line 1142, in groupby 
return klass(obj, by, **kwds) 
File "c:\Anaconda\lib\site-packages\pandas\core\groupby.py", line 388, in __init__
level=level, sort=sort)
File "c:\Anaconda\lib\site-packages\pandas\core\groupby.py", line 2041, in _get_grouper 
gpr = obj[gpr] 
File "c:\Anaconda\lib\site-packages\pandas\core\frame.py", line 1678, in __getitem__ 
return self._getitem_column(key) 
File "c:\Anaconda\lib\site-packages\pandas\core\frame.py", line 1685, in _getitem_column
return self._get_item_cache(key)
File "c:\Anaconda\lib\site-packages\pandas\core\generic.py", line 1052, in _get_item_cache
values = self._data.get(item) 
File "c:\Anaconda\lib\site-packages\pandas\core\internals.py", line 2565, in get 
loc = self.items.get_loc(item) 
File "c:\Anaconda\lib\site-packages\pandas\core\index.py", line 1181, in get_loc 
return self._engine.get_loc(_values_from_object(key)) 
File "index.pyx", line 129, in pandas.index.IndexEngine.get_loc (pandas\index.c:3656)
File "index.pyx", line 149, in pandas.index.IndexEngine.get_loc (pandas\index.c:3534)
File "hashtable.pyx", line 696, in pandas.hashtable.PyObjectHashTable.get_item (pandas\hashtable.c:11911)
File "hashtable.pyx", line 704, in pandas.hashtable.PyObjectHashTable.get_item (pandas\hashtable.c:11864)
KeyError: 'User' 

df.info():

User Code  175167 non-null object 
Version   175167 non-null object 
Date Accessed 175167 non-null datetime64[ns] 
Series   175167 non-null object 
Software   175167 non-null object 
User    175167 non-null object 

Can you post the output of 'df.info', and is 'User' one of the columns? – EdChum 2015-01-09 22:48:24


@EdChum Strange (?) Even if the column were not found, this should not raise. – 2015-01-10 00:00:19


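@EdChum I added 'df.info'. 'User' is there, with no nulls; it is a simple set of names, and none of the names contain odd characters. This df was created by running 'concat' over a bunch of *.xlsx files. – RedRaven 2015-01-10 05:07:27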

Answer


[Moved from the comments] It's easy to miss trailing whitespace in a column name, but you can check df.columns manually:

>>> df = pd.DataFrame({"User": [1,2]}) 
>>> df2 = pd.DataFrame({"User ": [1,2]}) 
>>> df 
    User 
0  1 
1  2 
>>> df2 
    User 
0  1 
1  2 
>>> df.columns 
Index([u'User'], dtype='object') 
>>> df2.columns 
Index([u'User '], dtype='object') 

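(To pull back the curtain a bit: I suspected something like this might be going on because when I mocked up my own DataFrame and looked at df.info(), I didn't see as many spaces between the column names and the counts as your output shows.)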
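If a stray space in a column name does turn out to be the cause, one possible fix is to strip whitespace from all column names before grouping. The following is only a minimal sketch: the data is made up for illustration, and it assumes a pandas version in which the column Index exposes the .str accessor.

```python
import pandas as pd

# Hypothetical data: the first column name carries a trailing space,
# as can happen when headers come from spreadsheet files.
df = pd.DataFrame({"User ": ["ann", "bob", "ann"],
                   "Version": ["1", "2", "1"]})

# repr() makes otherwise invisible whitespace visible in each name.
print([repr(c) for c in df.columns])

# Strip leading and trailing whitespace from every column name.
df.columns = df.columns.str.strip()

# The lookup by the clean name now succeeds.
by_user = df.groupby("User")
sizes = by_user.size()
print(sizes)
```

When the frame is built by concatenating several files, applying the same stripping step to each file's columns before the concat should also keep 'User' and 'User ' from ending up as two separate columns.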