Loading a CSV into Postgres on AWS using odo

I'm trying to do something fairly simple, but either odo is broken or I don't understand how datashapes work in the context of this package.

The CSV file:

email,dob 
[email protected],1982-07-13 
[email protected],1997-01-01 
...

The code:

from odo import odo 
import pandas as pd 

df = pd.read_csv("...") 
connection_str = "postgresql+psycopg2:// ... " 

t = odo('path/to/data.csv', connection_str, dshape='var * {email: string, dob: datetime}')

The error:

AssertionError: datashape must be Record type, got 0 * {email: string, dob: datetime}

I get the same error if I try to go directly from the DataFrame -> Postgres as well:

t = odo(df, connection_str, dshape='var * {email: string, dob: datetime}')

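A few other things I tried that did not fix the problem: 1) removing the header row from the CSV file, 2) changing var to the actual number of rows in the DataFrame.

What am I doing wrong here?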

Source

2017-09-18 lollercoaster

Have you tried pd.to_sql? It seems you just want to save the CSV into a Postgres table? https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.to_sql.html – wkzhu

Yes, it's really slow. 'odo' is supposed to use Postgres's COPY internally, which is much faster: http://odo.pydata.org/en/latest/perf.html – lollercoaster

I'm not familiar with 'odo', but you can do the fast load yourself: https://stackoverflow.com/questions/41875817/write-fast-pandas-dataframe-to-postgres/ – Michael
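For readers who want to see what "doing the fast load yourself" might look like, here is a minimal sketch. It assumes the cursor passed in is a psycopg2 cursor (which provides copy_expert); the helper name copy_rows and the table and column names in the usage note are hypothetical, and this is not necessarily the approach taken in the linked question.

```python
import csv
import io


def copy_rows(cursor, table, columns, rows):
    """Stream rows into `table` using COPY ... FROM STDIN.

    Assumptions: `cursor` is a psycopg2 cursor (it must provide
    copy_expert), and `table` / `columns` come from trusted code,
    because they are interpolated into the SQL without escaping.
    """
    # Serialise the rows to an in-memory CSV buffer.
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(rows)
    buf.seek(0)

    # Hand the whole buffer to Postgres in a single COPY statement.
    sql = "COPY {} ({}) FROM STDIN WITH (FORMAT csv)".format(
        table, ", ".join(columns)
    )
    cursor.copy_expert(sql, buf)
```

With a DataFrame, something like copy_rows(cur, "data", ["email", "dob"], df.itertuples(index=False)) should work, followed by a commit on the connection. The target table has to exist already, since COPY does not create it.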

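Does connection_str include a table name? I ran into a similar problem with a SQLite database, and adding the table name fixed it for me.

It should look something like this: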

connection_str = "postgresql+psycopg2://your_database_name::data" 
t = odo(df, connection_str, dshape='var * {email: string, dob: datetime}')

where 'data' at the end of connection_str is the name of your new table.
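To make the URI format concrete, here is a small sketch of building and checking the '::table' suffix that odo expects on a database URI. The helper names odo_table_uri and names_table, along with the credentials and host, are made up for illustration; only the 'database_uri::table_name' convention comes from odo.

```python
def odo_table_uri(database_uri, table_name):
    """Append odo's '::table' suffix to a SQLAlchemy-style database URI."""
    return "{}::{}".format(database_uri, table_name)


def names_table(uri):
    """Return True if the last path segment of the URI carries a '::table' suffix."""
    # Only look after the final '/', so '::' inside a host is not a false positive.
    return "::" in uri.rsplit("/", 1)[-1]


# Hypothetical credentials, host, and database name; replace with your own.
base = "postgresql+psycopg2://user:secret@localhost:5432/mydb"
uri = odo_table_uri(base, "data")

print(names_table(base))  # False: this URI points at a database, not a table
print(uri)                # postgresql+psycopg2://user:secret@localhost:5432/mydb::data
```

A check like names_table before calling odo would catch the case in the question, where the connection string points at the database as a whole rather than at a specific table.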

See also:

python odo sql AssertionError: datashape must be Record type, got 0 * {...}

https://github.com/blaze/odo/issues/580

Source

2017-10-03 23:36:52 mbyim
