Trouble importing a .dat file

I am having some trouble reading this file into Python with the pandas read_table function: http://www.ssc.wisc.edu/~bhansen/econometrics/invest.dat

Here is my code:

pd.read_table(f,skiprows=[0], sep="")

which produces this error:

TypeError: ord() expected a character, but string of length 0 found


2014-12-11 zsljulius

I don't know about read_table, but you can read the file directly like this:

import pandas as pd  

with open('/tmp/invest.dat','r') as f: 
    next(f) # skip first row 
    df = pd.DataFrame(l.rstrip().split() for l in f) 

print(df)

This prints:

           0           1           2           3
0  17.749000  0.66007000  0.15122000  0.33150000
1  3.9480000  0.52889000  0.11523000  0.56233000
2  14.810000   3.7480300  0.57099000  0.12111000
... 
...
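One thing to watch with the manual approach: split() returns strings, so every cell in that DataFrame is text (which is why the trailing zeros survive in the printout). Below is a minimal sketch of converting to numbers afterwards. The in-memory sample stands in for /tmp/invest.dat; its first line is a made-up placeholder, and only the three data rows come from the output above.

```python
import io

import pandas as pd

# Hypothetical stand-in for invest.dat: one line to skip, then the
# three data rows shown in the printed output above.
sample = io.StringIO(
    "placeholder first line\n"
    "17.749000 0.66007000 0.15122000 0.33150000\n"
    "3.9480000 0.52889000 0.11523000 0.56233000\n"
    "14.810000 3.7480300 0.57099000 0.12111000\n"
)

next(sample)  # skip the first row, as in the answer
df = pd.DataFrame([line.split() for line in sample])

# Every cell is still a string at this point; convert all columns to float.
df_numeric = df.astype(float)
print(df_numeric.dtypes)
```

After the conversion the columns can be used in arithmetic like any other numeric data.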

You can get the same thing with:

df = pd.read_csv('/tmp/invest.dat', sep=r'\s+', header=None, skiprows=1)


2014-12-11 01:30:54 Marcin

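I just thought a built-in function would probably be more efficient for this. Do you know how to do it with a built-in function? – zsljulius 2014-12-11 01:35:08

You can simply use skiprows=0 – Jeff 2014-12-11 01:35:35

Thanks. I think the trick is to use a regular expression in the sep argument, because when I use "\s+" it works even with read_table. – zsljulius 2014-12-11 01:42:47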
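For anyone landing here later, this is roughly what that last comment describes: read_table with a whitespace regex as the separator instead of the empty string from the question. It is only a sketch. The in-memory sample is a stand-in for the real file, its first line is a made-up placeholder, and the three data rows are the ones printed in the answer.

```python
import io

import pandas as pd

# Hypothetical stand-in for invest.dat: one line to skip, then the
# three data rows shown in the answer's output.
text = (
    "placeholder first line\n"
    "17.749000 0.66007000 0.15122000 0.33150000\n"
    "3.9480000 0.52889000 0.11523000 0.56233000\n"
    "14.810000 3.7480300 0.57099000 0.12111000\n"
)

# A raw string keeps the backslash in the regex intact.
df_table = pd.read_table(io.StringIO(text), sep=r"\s+", header=None, skiprows=1)

# read_csv with the same arguments should give an identical frame.
df_csv = pd.read_csv(io.StringIO(text), sep=r"\s+", header=None, skiprows=1)

print(df_table)
print(df_table.equals(df_csv))
```

Unlike the manual split, both calls parse the values as floats directly.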
