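I am very new to Spark machine learning (two days in). I am running the code below in the Spark shell, trying to predict a value. I saw this error discussed in post #1, but I was not able to fix my code with the solutions given there, so I am posting a question about the same error: java.lang.IllegalArgumentException: requirement failed: Column features must be of type org.apache.spark.ml.linalg.VectorUDT
Apologies for the repeat. Input data: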
1.00,1.00,9.00
1.00,2.00,10.00
1.00,3.00,9.00
1.00,4.00,9.00
1.00,5.00,9.00
1.00,6.00,9.45
1.00,7.00,9.45
1.00,8.00,9.45
1.00,9.00,9.45
Code:
val df = spark.read.csv("/root/Predictiondata.csv").toDF("Userid", "Date", "Intime")
import org.apache.spark.sql.types.DoubleType
val featureDf = df.select(df("Userid").cast(DoubleType).as("Userid"),df("Date").cast(DoubleType).as("Date"),df("Intime").cast(DoubleType).as("Intime")).toDF()
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.regression.LabeledPoint
val data = featureDf.select("Userid","Date","Intime").map(r => LabeledPoint(r(0).toString.toDouble,Vectors.dense(r(1).toString.toDouble,r(2).toString.toDouble))).toDF()
import org.apache.spark.ml.regression.LinearRegression
val lr = new LinearRegression()
val lrModel = lr.fit(data)
Error:
scala> val lrModel = lr.fit(data)
java.lang.IllegalArgumentException: requirement failed: Column features must be of type org.apache.spark.ml.linalg.VectorUDT@… but was actually org.apache.spark.mllib.linalg.VectorUDT@…
at scala.Predef$.require(Predef.scala:224)
at org.apache.spark.ml.util.SchemaUtils$.checkColumnType(SchemaUtils.scala:42)
at org.apache.spark.ml.PredictorParams$class.validateAndTransformSchema(Predictor.scala:51)
at org.apache.spark.ml.Predictor.validateAndTransformSchema(Predictor.scala:72)
at org.apache.spark.ml.Predictor.transformSchema(Predictor.scala:122)
at org.apache.spark.ml.PipelineStage.transformSchema(Pipeline.scala:74)
at org.apache.spark.ml.Predictor.fit(Predictor.scala:90)
... 48 elided
Any help or suggestions are highly appreciated.
Thanks in advance.
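The error message suggests a package mismatch: the code builds its features with org.apache.spark.mllib.linalg.Vectors and org.apache.spark.mllib.regression.LabeledPoint, while org.apache.spark.ml.regression.LinearRegression expects the vector type from org.apache.spark.ml.linalg. A minimal sketch of one possible fix follows, assuming a Spark 2.x spark-shell session (so `spark` and its implicits are already in scope) and keeping the question's own layout of Userid as the label with Date and Intime as features. If Intime is the value to predict, the column indices would need to change.

```scala
// Sketch only: assumes spark-shell on Spark 2.x, where `spark` and spark.implicits._ are preloaded.
import org.apache.spark.sql.types.DoubleType
import org.apache.spark.ml.linalg.Vectors          // ml package, not mllib
import org.apache.spark.ml.feature.LabeledPoint    // ml package, not mllib
import org.apache.spark.ml.regression.LinearRegression

val df = spark.read.csv("/root/Predictiondata.csv").toDF("Userid", "Date", "Intime")

// CSV columns are read as strings, so cast each one to Double.
val featureDf = df.select(
  df("Userid").cast(DoubleType).as("Userid"),
  df("Date").cast(DoubleType).as("Date"),
  df("Intime").cast(DoubleType).as("Intime"))

// Same layout as the question: column 0 is the label, columns 1 and 2 are the features.
// toDF() yields the default "label" and "features" columns that LinearRegression looks for.
val data = featureDf
  .map(r => LabeledPoint(r.getDouble(0), Vectors.dense(r.getDouble(1), r.getDouble(2))))
  .toDF()

val lr = new LinearRegression()
val lrModel = lr.fit(data)
```

The only substantive change from the original code is the pair of imports. The rest is unchanged apart from replacing `r(i).toString.toDouble` with `r.getDouble(i)`, which is safe here because the columns were already cast to DoubleType.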
Thank you very much for your help! ... It worked after making some modifications. ... Thanks again for your help!!! – Bhavesh
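One way to sidestep the VectorUDT mismatch reported in the error is to let org.apache.spark.ml.feature.VectorAssembler build the features column, since it produces the org.apache.spark.ml.linalg vector type directly. The sketch below again assumes a Spark 2.x spark-shell session. It also assumes, unlike the question's code, that Intime is the value to predict and that Userid and Date are the inputs. That choice of label is a guess about the intent, not something the question states.

```scala
// Sketch only: assumes spark-shell on Spark 2.x, where `spark` is already defined.
import org.apache.spark.ml.feature.VectorAssembler
import org.apache.spark.ml.regression.LinearRegression
import org.apache.spark.sql.functions.col
import org.apache.spark.sql.types.DoubleType

val raw = spark.read.csv("/root/Predictiondata.csv").toDF("Userid", "Date", "Intime")

// Cast every string column to Double, keeping the column names.
val typed = raw.select(raw.columns.map(c => col(c).cast(DoubleType).as(c)): _*)

// Assemble the input columns into a single ml.linalg vector column named "features".
val assembler = new VectorAssembler()
  .setInputCols(Array("Userid", "Date"))
  .setOutputCol("features")

val training = assembler.transform(typed)

// Assumption: Intime is the target, so point the estimator at it instead of renaming columns.
val assembledModel = new LinearRegression()
  .setLabelCol("Intime")
  .setFeaturesCol("features")
  .fit(training)
```

This keeps the whole flow in the DataFrame-based org.apache.spark.ml API, so no LabeledPoint or manual row mapping is needed.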