2016-07-07

Initialize a pandas DataFrame with defined dtypes

The pd.DataFrame docstring specifies a single scalar dtype argument for the whole DataFrame:

dtype : dtype, default None
    Data type to force, otherwise infer

It really does seem to be meant as a scalar, because both of the following raise an error:

import numpy as np
import pandas as pd

# Both calls fail: dtype expects a single dtype, not a list of per-column dtypes.
dfbinseq = pd.DataFrame([],
                        columns=["chr", "centre", "seq_binary"],
                        dtype=["O", np.int64, "O"])

dfbinseq = pd.DataFrame([],
                        columns=["chr", "centre", "seq_binary"],
                        dtype=[object, np.int64, object])

For me, the only workaround for creating an empty DataFrame (which I need so that I can later append to it in an HDF5 store) is

dfbinseq.centre.dtype = np.int64 

Is there a way to set the dtypes for all columns in one go?

Answer


You can set the dtype on each Series:

import pandas as pd

df = pd.DataFrame({'A': pd.Series([], dtype='str'),
                   'B': pd.Series([], dtype='int'),
                   'C': pd.Series([], dtype='float')})

print(df)
Empty DataFrame
Columns: [A, B, C]
Index: []

print(df.dtypes)
A     object
B      int32
C    float64
dtype: object

With data:

df = pd.DataFrame({'A': pd.Series([1, 2, 3], dtype='str'),
                   'B': pd.Series([4, 5, 6], dtype='int'),
                   'C': pd.Series([7, 8, 9], dtype='float')})

print(df)
   A  B    C
0  1  4  7.0
1  2  5  8.0
2  3  6  9.0

print(df.dtypes)
A     object
B      int32
C    float64
dtype: object
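
Applied to the columns from the question, the same idea can be written with a single column-to-dtype mapping. This is only a sketch: the `schema` dict is a hypothetical name introduced here, and the `astype` variant is an alternative that is not part of the original answer. Using an explicit `np.int64` instead of `'int'` also avoids the integer width depending on the platform, as the `int32` in the output above suggests it can.

```python
import numpy as np
import pandas as pd

# Hypothetical schema reusing the column names from the question.
schema = {"chr": object, "centre": np.int64, "seq_binary": object}

# Option 1: build one empty, typed Series per column.
dfbinseq = pd.DataFrame({col: pd.Series(dtype=dt) for col, dt in schema.items()})

# Option 2: create the empty frame first, then cast every column in one call.
dfbinseq2 = pd.DataFrame(columns=list(schema)).astype(schema)

print(dfbinseq.dtypes)
```

Both variants should give an empty frame whose `centre` column is already `int64`, so no per-column assignment is needed afterwards.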