I get an "unbound method createDataFrame()" error when I try to create a DataFrame from an RDD.
My code:
from pyspark import SparkConf, SparkContext
from pyspark import sql
conf = SparkConf()
conf.setMaster('local')
conf.setAppName('Test')
sc = SparkContext(conf = conf)
print sc.version
rdd = sc.parallelize([(0,1), (0,1), (0,2), (1,2), (1,10), (1,20), (3,18), (3,18), (3,18)])
df = sql.SQLContext.createDataFrame(rdd, ["id", "score"]).collect()
print df
Error:
df = sql.SQLContext.createDataFrame(rdd, ["id", "score"]).collect()
TypeError: unbound method createDataFrame() must be called with SQLContext
instance as first argument (got RDD instance instead)
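I did the same task in the Spark shell, where entering the last three lines of code directly prints the values. I mainly suspect the import statements, because that is the difference between the IDE and the shell.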
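To narrow this down, the TypeError text suggests that createDataFrame is an instance method being called on the SQLContext class itself, so the RDD ends up in the position of `self`. The following is a minimal pure-Python sketch of that behavior, not PySpark code: `FakeSQLContext` is a hypothetical stand-in invented for illustration, and its required `schema` argument is an assumption made so that the failure reproduces on both Python 2 and Python 3 (the exact message differs between the two).

```python
# Hypothetical stand-in for pyspark.sql.SQLContext; no Spark installation needed.
class FakeSQLContext(object):
    def __init__(self, sc):
        self.sc = sc  # the context this instance is bound to

    def createDataFrame(self, data, schema):
        # Pair each row with the column names, loosely mimicking a DataFrame.
        return [dict(zip(schema, row)) for row in data]


rows = [(0, 1), (0, 2), (1, 10)]

# Calling through the class: `rows` is taken as `self`, so the call fails.
try:
    FakeSQLContext.createDataFrame(rows, ["id", "score"])
    class_call_error = None
except TypeError as exc:
    class_call_error = exc

# Calling through an instance: `self` is filled in automatically.
ctx = FakeSQLContext(sc=None)
df = ctx.createDataFrame(rows, ["id", "score"])
```

If the same reasoning applies here, the corresponding change in my script would be to build an instance first, for example sql.SQLContext(sc).createDataFrame(rdd, ["id", "score"]), instead of calling the method on the class. That would also be consistent with the shell working, assuming the shell already provides a ready-made context object.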