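When I use the pySpark ML API in version 2.0.0 for a simple linear regression example, I get an error from the new ML library. How do I convert VectorUDT features from the .mllib type to the .ml type?
The code is: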
from pyspark.sql import SQLContext
from pyspark.mllib.linalg import Vectors
from pyspark.ml.regression import LinearRegression

sqlContext = SQLContext(sc)

# Each record is [label, feature]
data = sc.parallelize(([1, 2], [2, 4], [3, 6], [4, 8]))

def f2Lp(inStr):
    # Build a (label, features) pair; features is a pyspark.mllib vector
    return (float(inStr[0]), Vectors.dense(inStr[1]))

Lp = data.map(f2Lp)
testDF = sqlContext.createDataFrame(Lp, ["label", "features"])
(trainingData, testData) = testDF.randomSplit([0.8, 0.2])

lr = LinearRegression()
model = lr.fit(trainingData)  # raises the error below
And the error:
IllegalArgumentException: u'requirement failed: Column features must be of type [email protected] but was actually [email protected]'
How should I convert the vector features from the .mllib type to the .ml type?