2017-12-03 256 views

PySpark: DataFrame - converting a struct to an array

I have a DataFrame with the following schema:

root 
|-- index: long (nullable = true) 
|-- text: string (nullable = true) 
|-- topicDistribution: struct (nullable = true) 
| |-- type: long (nullable = true) 
| |-- values: array (nullable = true) 
| | |-- element: double (containsNull = true) 
|-- wiki_index: string (nullable = true) 

I need to change it to:

root 
|-- index: long (nullable = true) 
|-- text: string (nullable = true) 
|-- topicDistribution: array (nullable = true) 
| |-- element: double (containsNull = true) 
|-- wiki_index: string (nullable = true) 

How can I do this?

Thanks a lot.

Answer


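I think you are looking for getField, which pulls the nested values array out of the struct so it can overwrite the original column: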

from pyspark.sql.functions import col
df = df.withColumn("topicDistribution", col("topicDistribution").getField("values"))