2015-03-02

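Error when creating a pivot table in pandas

I have searched and cannot find anyone else who has run into this problem. I am trying to create a pivot table that summarizes a CSV file and then send that pivot table to myself. I have written code that performs this process, but it does not work universally. I keep getting a KeyError on my column names, yet if I delete all the columns and rows that are not part of the table, it magically works.

Here is my code: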

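import pandas

# Load the CSV file to summarize
df = pandas.read_csv('/path/to/file', encoding='utf-8')
# Count email addresses per client and branch, with a grand-total row
pivot = pandas.pivot_table(df, index=['ClientID', 'ClientName', 'Branch'],
                           values=['EmailAddress'], aggfunc='count', margins=True)
pivotlocation = '/path/to/save'
pivot.to_csv(pivotlocation)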

For the life of me, I cannot figure out what is wrong, or why this works on some files and not on others.

Here is the error that is thrown:

Traceback (most recent call last): 
File "C:\Users\rfulton\Desktop\Automation\Reports\UniversalUpload.py", line 86, in create_pivot 
    pivot = pandas.pivot_table(df,index=columns,values=aggvalue,aggfunc='count',margins=True) 
File "C:\Python34\lib\site-packages\pandas\util\decorators.py", line 88, in wrapper 
    return func(*args, **kwargs) 
File "C:\Python34\lib\site-packages\pandas\util\decorators.py", line 88, in wrapper 
    return func(*args, **kwargs) 
File "C:\Python34\lib\site-packages\pandas\tools\pivot.py", line 114, in pivot_table 
    grouped = data.groupby(keys) 
File "C:\Python34\lib\site-packages\pandas\core\generic.py", line 2898, in groupby 
    sort=sort, group_keys=group_keys, squeeze=squeeze) 
File "C:\Python34\lib\site-packages\pandas\core\groupby.py", line 1193, in groupby 
    return klass(obj, by, **kwds) 
File "C:\Python34\lib\site-packages\pandas\core\groupby.py", line 383, in __init__ 
    level=level, sort=sort) 
File "C:\Python34\lib\site-packages\pandas\core\groupby.py", line 2131, in _get_grouper 
    in_axis, name, gpr = True, gpr, obj[gpr] 
File "C:\Python34\lib\site-packages\pandas\core\frame.py", line 1780, in __getitem__ 
    return self._getitem_column(key) 
File "C:\Python34\lib\site-packages\pandas\core\frame.py", line 1787, in _getitem_column 
    return self._get_item_cache(key) 
File "C:\Python34\lib\site-packages\pandas\core\generic.py", line 1068, in _get_item_cache 
    values = self._data.get(item) 
File "C:\Python34\lib\site-packages\pandas\core\internals.py", line 2849, in get 
    loc = self.items.get_loc(item) 
File "C:\Python34\lib\site-packages\pandas\core\index.py", line 1402, in get_loc 
    return self._engine.get_loc(_values_from_object(key)) 
File "pandas\index.pyx", line 134, in pandas.index.IndexEngine.get_loc (pandas\index.c:3807) 
File "pandas\index.pyx", line 154, in pandas.index.IndexEngine.get_loc (pandas\index.c:3687) 
File "pandas\hashtable.pyx", line 696, in pandas.hashtable.PyObjectHashTable.get_item (pandas\hashtable.c:12310) 
File "pandas\hashtable.pyx", line 704, in pandas.hashtable.PyObjectHashTable.get_item (pandas\hashtable.c:12261) 
KeyError: 'ClientID' 

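As I said above, if I delete all the cells outside the range of the table, the error is no longer thrown. However, I am not sure how to do that using the csv or pandas modules.

Answer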
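One way to narrow down a KeyError like this is to compare the column names the code expects against the header row the parser actually sees, printing each header with repr() so that hidden characters become visible. The following is only a rough sketch: the helper name missing_columns and the sample data are hypothetical and are not taken from the original script.

```python
import csv
import io


def missing_columns(csv_text, required):
    """Return (missing, header): the required names absent from the header row,
    and the header row exactly as the csv module parsed it."""
    header = next(csv.reader(io.StringIO(csv_text)))
    missing = [name for name in required if name not in header]
    return missing, header


# Hypothetical sample: the first header carries an invisible leading character.
sample = '\ufeffClientID,ClientName,Branch,EmailAddress\n1,Acme,North,a@example.com\n'

missing, header = missing_columns(sample, ['ClientID', 'ClientName', 'Branch'])
print(missing)                          # names the lookup would fail on
print([repr(name) for name in header])  # repr() exposes hidden characters
```

With a pandas DataFrame, the same kind of check could be done on list(df.columns) before calling pivot_table.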

It turned out that the problem was the file's encoding.
Setting the encoding to utf-8-sig fixed it.
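A likely explanation, offered here as an assumption rather than something the post confirms: a file saved with a UTF-8 byte order mark begins with the bytes EF BB BF. Decoded as plain utf-8, that mark survives as the character U+FEFF attached to the front of the first header, so the column is really named '\ufeffClientID' and a lookup for 'ClientID' fails. The utf-8-sig codec strips the mark. The sketch below shows the difference using only the standard library. The sample bytes and the helper first_header are made up for illustration.

```python
import csv
import io

# Made-up file contents that start with a UTF-8 byte order mark (EF BB BF).
raw = b'\xef\xbb\xbfClientID,ClientName,Branch,EmailAddress\n1,Acme,North,a@example.com\n'


def first_header(raw_bytes, encoding):
    """Decode the bytes with the given codec and return the first column name."""
    text = raw_bytes.decode(encoding)
    return next(csv.reader(io.StringIO(text)))[0]


with_bom = first_header(raw, 'utf-8')         # mark is kept as U+FEFF
without_bom = first_header(raw, 'utf-8-sig')  # mark is stripped

print(repr(with_bom))
print(repr(without_bom))
```

With pandas, the corresponding change is to pass encoding='utf-8-sig' to pandas.read_csv, as the answer describes.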