2017-04-04 100 views

Spark Scala - Join two arrays by VertexId

I have two arrays in the following format:

scala> cPV.take(5) 
res18: Array[(org.apache.spark.graphx.VertexId, String)] = Array((-496366541,7804412), (183389035,11517829), (1300761459,36164965), (978932066,32135154), (370291237,40355685)) 

scala> fC.take(5) 
res19: Array[(org.apache.spark.graphx.VertexId, Int)] = Array((386253628,1), (-1141923433,1), (1871855296,7), (1938255756,1), (-749015657,5)) 

I need to join them into the format Array[(org.apache.spark.graphx.VertexId, Int, String)].

I tried .join(), but it throws the following error:

val mVP = fC.join(cPV) 
<console>:64: error: value join is not a member of Array[(org.apache.spark.graphx.VertexId, Int)] 
     val mVP = fC.join(cPV) 

I also tried this, but it did not work.

Answer


I tried the following and it worked. Converting the array fC to an RDD first makes join available:

scala> val fCRDD = sc.parallelize(fC) 
scala> val mVP = fCRDD.join(cPV) 
mVP: org.apache.spark.rdd.RDD[(org.apache.spark.graphx.VertexId, (Int, String))] = MapPartitionsRDD[106] at join at <console>:67 

scala> mVP.take(5) 
res21: Array[(org.apache.spark.graphx.VertexId, (Int, String))] = Array((-891966589,(4,D)), (166544732,(74,V)), (1871855296,(7,LG)), (1416009424,(6,Dck)), (-241988197,(4,L))) 
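
Note that the join yields nested tuples, (VertexId, (Int, String)), while the question asked for Array[(VertexId, Int, String)]. A minimal sketch of flattening the result follows. It assumes a local SparkContext and uses a couple of sample rows modeled on the output in this post; the object name FlattenJoin and the names count and label are illustrative, not from the original code.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.VertexId
import org.apache.spark.rdd.RDD

object FlattenJoin {
  def main(args: Array[String]): Unit = {
    // Assumption: a local master, for illustration only.
    val conf = new SparkConf().setAppName("FlattenJoin").setMaster("local[*]")
    val sc = new SparkContext(conf)

    // Sample rows modeled on the post: fC is a local Array, cPV is an RDD.
    val fC: Array[(VertexId, Int)] = Array((1871855296L, 7), (386253628L, 1))
    val cPV: RDD[(VertexId, String)] =
      sc.parallelize(Seq((1871855296L, "LG"), (978932066L, "32135154")))

    // Turn the local array into a pair RDD, then inner-join on the key.
    val joined: RDD[(VertexId, (Int, String))] = sc.parallelize(fC).join(cPV)

    // Flatten (id, (count, label)) into (id, count, label) and bring it to the driver.
    val flat: Array[(VertexId, Int, String)] =
      joined.map { case (id, (count, label)) => (id, count, label) }.collect()

    flat.foreach(println)
    sc.stop()
  }
}
```

collect() pulls every joined row to the driver, so this is only reasonable when the result is small enough to fit in driver memory.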

Sorry, newbie here. I should have tried this before posting the question.
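
As background on the original error: join is provided for key-value RDDs (through PairRDDFunctions), not for plain Scala arrays, which is why calling it on fC failed. If both sides are small enough to hold as local arrays, the same inner join can be sketched without Spark at all. The object LocalJoin and the helper joinByKey below are hypothetical names, and VertexId is declared locally as a stand-in for the GraphX alias, which is a Long.

```scala
object LocalJoin {
  // Stand-in for org.apache.spark.graphx.VertexId, which is an alias for Long.
  type VertexId = Long

  // Inner join of two local arrays on their first element.
  // Assumption: keys in `right` are unique; toMap keeps only the last value
  // for a duplicated key, unlike an RDD join, which emits every pairing.
  def joinByKey(
      left: Array[(VertexId, Int)],
      right: Array[(VertexId, String)]
  ): Array[(VertexId, Int, String)] = {
    val lookup: Map[VertexId, String] = right.toMap
    left.collect {
      case (id, count) if lookup.contains(id) => (id, count, lookup(id))
    }
  }

  def main(args: Array[String]): Unit = {
    val fC: Array[(VertexId, Int)] = Array((1871855296L, 7), (386253628L, 1))
    val cPV: Array[(VertexId, String)] =
      Array((1871855296L, "LG"), (978932066L, "32135154"))

    // Only the key present on both sides survives the join.
    joinByKey(fC, cPV).foreach(println)
  }
}
```

For data that already lives in RDDs, the parallelize-then-join approach from the answer is the more natural fit; this variant only applies once everything has been collected to the driver.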