2016-01-17

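How to rename new columns created by GroupedDataset operations in Apache Spark?

How can I rename the columns produced by a count operation without converting the result to a DataFrame?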

case class LogRow(id: String, location: String, time: Long) 
case class KeyValue(key: (String, String), value: Long) 

val log = LogRow("1", "a", 1) :: LogRow("1", "a", 2) :: LogRow("1", "b", 3) :: LogRow("1", "a", 4) :: LogRow("1", "b", 5) :: LogRow("1", "b", 6) :: LogRow("1", "c", 7) :: LogRow("2", "a", 1) :: LogRow("2", "b", 2) :: LogRow("2", "b", 3) :: LogRow("2", "a", 4) :: LogRow("2", "a", 5) :: LogRow("2", "a", 6) :: LogRow("2", "c", 7) :: Nil 
log.toDS().groupBy(l => { 
    (l.id, l.location) 
}).count().toDF().toDF("key", "value").as[KeyValue].show 

+-----+-----+
|  key|value|
+-----+-----+
|[1,a]|    3|
|[1,b]|    3|
|[1,c]|    1|
|[2,a]|    4|
|[2,b]|    2|
|[2,c]|    1|
+-----+-----+
What do you mean by changing the column? Renaming it?

Sorry, yes, renaming.

Answer


Just map it directly to the required type:

log.toDS.groupBy(l => { 
    (l.id, l.location) 
}).count.as[KeyValue]
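
The snippet above uses the Spark 1.6 `groupBy` on a Dataset. As a hedged sketch for newer versions, assuming Spark 2.x or later (where the typed grouping method is `groupByKey` and the entry point is `SparkSession`), an explicit `map` into the case class avoids relying on how `as` matches column names. The object name `RenameCountColumns` and the helper `countByKey` are made up for this illustration and are not part of the original answer.

```scala
import org.apache.spark.sql.{Dataset, SparkSession}

// Same case classes as in the question.
case class LogRow(id: String, location: String, time: Long)
case class KeyValue(key: (String, String), value: Long)

// Hypothetical wrapper object, used only for this illustration.
object RenameCountColumns {

  // Sample rows copied from the question.
  val sampleLog: Seq[LogRow] = Seq(
    LogRow("1", "a", 1), LogRow("1", "a", 2), LogRow("1", "b", 3),
    LogRow("1", "a", 4), LogRow("1", "b", 5), LogRow("1", "b", 6),
    LogRow("1", "c", 7), LogRow("2", "a", 1), LogRow("2", "b", 2),
    LogRow("2", "b", 3), LogRow("2", "a", 4), LogRow("2", "a", 5),
    LogRow("2", "a", 6), LogRow("2", "c", 7)
  )

  // Counts rows per (id, location) and returns a typed Dataset whose
  // columns take their names from the fields of KeyValue.
  def countByKey(spark: SparkSession, log: Seq[LogRow]): Dataset[KeyValue] = {
    import spark.implicits._
    log.toDS()
      .groupByKey(l => (l.id, l.location)) // typed grouping, no DataFrame involved
      .count()                             // Dataset[((String, String), Long)]
      .map(kv => KeyValue(kv._1, kv._2))   // wrap each (key, count) pair in the case class
  }

  def main(args: Array[String]): Unit = {
    // Assumption: a local session is enough for this demonstration.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("rename-count-columns")
      .getOrCreate()

    val counts = countByKey(spark, sampleLog)
    counts.printSchema()
    counts.show()

    spark.stop()
  }
}
```

Because the rows are built by the `KeyValue` constructor, the result stays a `Dataset[KeyValue]` throughout, with no intermediate `toDF` call.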