How can I access a Postgres table from PySpark on IBM's Data Science Experience?

Here is my code:

uname = "xxxxx" 
pword = "xxxxx" 
dbUrl = "jdbc:postgresql:dbserver" 
table = "xxxxx" 
jdbcDF = spark.read.format("jdbc").option("url", dbUrl).option("dbtable",table).option("user", uname).option("password", pword).load()

After adding the Postgres driver jar (%Addjar -f https://jdbc.postgresql.org/download/postgresql-9.4.1207.jre7.jar), I get a "No suitable driver" error. Is there a working example of loading data from Postgres in PySpark 2.0 on DSX?


2017-02-14 Ross Lewis

Please use the PixieDust package manager to install the Postgres driver at the Spark service level.

http://datascience.ibm.com/docs/content/analyze-data/Package-Manager.html

Since PixieDust is only supported on Spark 1.6, run

import pixiedust
pixiedust.installPackage("https://jdbc.postgresql.org/download/postgresql-9.4.1207.jre7.jar")

Once you have installed this, restart the kernel, then switch to Spark 2.0 and run your Postgres connection to get a Spark DataFrame using the SparkSession.

uname = "username"
pword = "xxxxxx"
dbUrl = "jdbc:postgresql://hostname:10635/compose?user=" + uname + "&password=" + pword
table = "tablename"
# Load the table into a Spark DataFrame through the JDBC data source
houseDf = spark.read.format('jdbc').options(url=dbUrl, database='compose', dbtable=table).load()
houseDf.take(1)
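The connection URL above embeds the user name and password as query parameters, so a password containing characters such as `@`, `&` or a space could break the URL. Here is a minimal sketch of building the URL with percent-encoded credentials. The helper name `build_postgres_url` and the sample password are hypothetical, and the sketch assumes the PostgreSQL JDBC driver URL-decodes parameter values, which is worth checking against the driver version you install.

```python
from urllib.parse import urlencode


def build_postgres_url(host, port, database, user, password):
    """Return a PostgreSQL JDBC URL with percent-encoded credentials.

    Hypothetical helper for illustration; not part of Spark or PixieDust.
    """
    # urlencode escapes characters such as '@', '&' and spaces in the values
    query = urlencode({"user": user, "password": password})
    return "jdbc:postgresql://{}:{}/{}?{}".format(host, port, database, query)


# Placeholder host, port, database and credentials
dbUrl = build_postgres_url("hostname", 10635, "compose", "username", "p@ss word")
print(dbUrl)
```

The resulting `dbUrl` can then be passed as the `url` option in the same way as the hand-built string.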

Working notebook:

https://apsportal.ibm.com/analytics/notebooks/8b220408-6fc7-48a9-8350-246fbbf10ac8/view?access_token=7297af80b2e4109087a78365e7df3205f6ed9d0840c0c46d2208bc00ed0b0274

Thanks, Charles.


2017-02-14 01:07:36

This works! Thanks.

Just supply the driver option:

option("driver", "org.postgresql.Driver")
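For reference, here is a rough sketch of how the driver option might fit together with the read from the question. The helper names `build_jdbc_options` and `read_postgres_table` are hypothetical, the host, port and credentials are placeholders, and it assumes the Postgres driver jar is already on the Spark classpath and that `spark` is the SparkSession provided by the notebook.

```python
def build_jdbc_options(url, table, user, password,
                       driver="org.postgresql.Driver"):
    """Collect the options for Spark's JDBC data source in one dict.

    Hypothetical helper for illustration; the keys are standard
    Spark JDBC option names.
    """
    return {
        "url": url,
        "dbtable": table,
        "user": user,
        "password": password,
        # Naming the driver class explicitly is the fix suggested above
        "driver": driver,
    }


def read_postgres_table(spark, options):
    """Load a table into a DataFrame using an existing SparkSession."""
    return spark.read.format("jdbc").options(**options).load()


# Usage inside a notebook where `spark` already exists (placeholders only):
# opts = build_jdbc_options("jdbc:postgresql://hostname:10635/compose",
#                           "tablename", "username", "xxxxxx")
# jdbcDF = read_postgres_table(spark, opts)
# jdbcDF.take(1)
```

Keeping the options in a dict makes it easy to print them (minus the password) when debugging a "No suitable driver" error.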


2017-02-14 01:04:31 21d12d29d0

Do you have an example of doing this in a Data Science Experience notebook?
