2017-10-09

How to convert org.apache.spark.mllib.linalg.SparseVector to org.apache.spark.ml.linalg.SparseVector?

I am converting my code from the mllib API to the ml API.

import org.apache.spark.mllib.linalg.{DenseVector, Vector} 
import org.apache.spark.ml.linalg.{DenseVector => NewDenseVector, Vector => NewVector} 
import org.apache.spark.mllib.regression.LabeledPoint 
import org.apache.spark.ml.feature.{LabeledPoint => NewLabeledPoint} 

val labelPointData = limitedTable.rdd.map { row => 
    new NewLabeledPoint(convertToDouble(row.head), row(1).asInstanceOf[org.apache.spark.ml.linalg.SparseVector]) 
} 

The expression row(1).asInstanceOf[org.apache.spark.ml.linalg.SparseVector] does not work; it fails with the following exception:

org.apache.spark.mllib.linalg.SparseVector cannot be cast to org.apache.spark.ml.linalg.SparseVector

How can I get around this?

I found code for converting from mllib to ml, but not the other way around.

Answer

The conversion is possible in both directions. First, let's create an mllib SparseVector:

import org.apache.spark.mllib.linalg.Vectors 
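// The size must be larger than the highest index, so indices 1, 2, 3 need a size of at least 4.
val mllibVec: org.apache.spark.mllib.linalg.Vector = Vectors.sparse(4, Array(1, 2, 3), Array(1.0, 2.0, 3.0))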

To convert it to an ML SparseVector, just use asML:

val mlVec: org.apache.spark.ml.linalg.Vector = mllibVec.asML 

To convert back again, the easiest way is to use Vectors.fromML():

val mllibVec2: org.apache.spark.mllib.linalg.Vector = Vectors.fromML(mlVec) 
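
As a quick sanity check, the round trip can be sketched as a small standalone program. The object name `VectorRoundTrip` and its helpers `toNew` and `roundTrip` are made up for illustration and are not part of Spark. The sketch assumes Spark 2.x or later with spark-mllib on the classpath:

```scala
import org.apache.spark.ml.linalg.{Vector => NewVector}
import org.apache.spark.mllib.linalg.{Vector => OldVector, Vectors => OldVectors}

// Hypothetical helper object, not part of Spark.
object VectorRoundTrip {
  // mllib -> ml
  def toNew(v: OldVector): NewVector = v.asML

  // mllib -> ml -> mllib
  def roundTrip(v: OldVector): OldVector = OldVectors.fromML(toNew(v))

  def main(args: Array[String]): Unit = {
    // Size 4, so indices 1, 2 and 3 are all valid positions.
    val oldVec = OldVectors.sparse(4, Array(1, 2, 3), Array(1.0, 2.0, 3.0))
    val newVec = toNew(oldVec)

    // The two vectors hold the same data but live in different packages,
    // which is why a plain asInstanceOf cast between them fails.
    println(oldVec.getClass.getName)
    println(newVec.getClass.getName)
    println(roundTrip(oldVec) == oldVec)
  }
}
```

Both conversions keep the sparse representation, so nothing is densified along the way.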

Also, in your code, instead of row(1).asInstanceOf[SparseVector] you could try row.getAs[SparseVector](1). Try reading the vector as an mllib vector, then convert it with asML and pass it to the ML-based LabeledPoint, i.e.:

val labelPointData = limitedTable.rdd.map { row => 
    NewLabeledPoint(convertToDouble(row.head), row.getAs[org.apache.spark.mllib.linalg.SparseVector](1).asML) 
}
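
If the goal is to move the whole table over to the ml API, converting the vector column on the DataFrame itself may be simpler than converting row by row. Spark provides `MLUtils.convertVectorColumnsToML` for this. Below is a minimal sketch, assuming Spark 2.x or later. The `ConvertColumns` object, the column names `label` and `features`, and the sample rows are all invented here, since the schema of `limitedTable` isn't shown in the question:

```scala
import org.apache.spark.mllib.linalg.{Vector => OldVector, Vectors => OldVectors}
import org.apache.spark.mllib.util.MLUtils
import org.apache.spark.sql.{DataFrame, SparkSession}

// Hypothetical example object; the names and data are assumptions.
object ConvertColumns {
  // Stand-in for limitedTable: a label column plus an mllib vector column.
  def sample(spark: SparkSession): DataFrame = {
    val rows: Seq[(Double, OldVector)] = Seq(
      (1.0, OldVectors.sparse(4, Array(1, 3), Array(1.0, 3.0))),
      (0.0, OldVectors.dense(0.5, 0.0, 0.0, 2.0))
    )
    spark.createDataFrame(rows).toDF("label", "features")
  }

  // Rewrites the named mllib vector column as an ml vector column.
  def convert(df: DataFrame): DataFrame =
    MLUtils.convertVectorColumnsToML(df, "features")

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("convert-vector-columns")
      .getOrCreate()

    val newDf = convert(sample(spark))
    newDf.printSchema()

    // After the conversion, rows can be read as ml vectors directly.
    val first = newDf.first().getAs[org.apache.spark.ml.linalg.Vector]("features")
    println(first.getClass.getName)

    spark.stop()
  }
}
```

With the column converted up front, the map over `rdd` no longer needs an `asML` call on each row.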