2016-07-08 311 views

How do I iterate over a Scala WrappedArray? (Spark)

I am doing the following:

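// A multi-line SQL string needs triple quotes in Scala
val tempDict = sqlContext.sql("""select words.pName_token, collect_set(words.pID) as docids
           from words
           group by words.pName_token""").toDF()

// Keep only the row for the word of interest
val wordDocs = tempDict.filter(tempDict("pName_token") === word)

// Column 1 is docids, the collected set of document IDs
val listDocs = wordDocs.map(t => t(1)).collect()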

listDocs: Array[Any] = Array(WrappedArray(123, 234, 205876618, 456))

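My question is: how do I iterate over this WrappedArray, or convert it to a list? The only members offered on listDocs are: apply, asInstanceOf, clone, isInstanceOf, length, toString, update. How should I proceed?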

Answer


Here is one way to solve this problem.

import org.apache.spark.sql.Row
import org.apache.spark.sql.functions._
import scala.collection.mutable.WrappedArray

// One row with three array columns
val data = Seq((Seq(1,2,3),Seq(4,5,6),Seq(7,8,9)))
val df = sqlContext.createDataFrame(data)
val first = df.first

// use getAs to recover the static type of the first column
val mapped = first.getAs[WrappedArray[Int]](0)

// now we can use it like a normal collection
mapped.mkString("\n")

// alternatively, pattern match on Row to extract every array column
val rows = df.collect.map {
    case Row(a: Seq[Any], b: Seq[Any], c: Seq[Any]) =>
     (a, b, c)
}
rows.mkString("\n")
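
The same idea can be applied directly to the `listDocs` value from the question. The following is only a minimal sketch that runs without Spark: the `Array[Any]` is a hand-built stand-in for the collected result, the values are copied from the question's output, and the `Long` element type is an assumption (use `Int` if `pID` is an integer column). The name `docIds` is hypothetical.

```scala
// Stand-in for the collected result from the question (no Spark required).
// The Long element type is an assumption about the pID column.
val listDocs: Array[Any] = Array[Any](Array(123L, 234L, 205876618L, 456L).toSeq)

// A WrappedArray is a Seq, so match on Seq and keep only the Long elements.
val docIds: List[Long] = listDocs.headOption match {
  case Some(ids: Seq[_]) => ids.collect { case id: Long => id }.toList
  case _                 => List.empty[Long]
}

// docIds is now an ordinary List and can be traversed as usual.
docIds.foreach(println)
```

Matching on `Seq` rather than on `WrappedArray` keeps the code independent of the concrete collection class that Spark returns, and elements of an unexpected type are skipped instead of causing a `ClassCastException`.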

Actually, I did the following, which seems to solve my case: val arrDocs = listDocs(0) val temp = arrDocs.asInstanceOf[mutable.WrappedArray[Long]] temp now basically gives me something I can iterate over. – boY


Thanks @boY, I have updated the answer. The previous version was a bit verbose. –


I ran into problems with WrappedArray in my code and was able to replace it with Seq[Int]. – jspooner