2017-10-10

How do I flatten a nested struct in Spark? I have the following schema in Spark and want to flatten it.

root 
|-- binlog_read_timestamp: string (nullable = true) 
|-- row: struct (nullable = true) 
| |-- after_values: struct (nullable = true) 
| | |-- id: long (nullable = true) 
| |-- before_values: struct (nullable = true) 
| | |-- id: long (nullable = true) 
| |-- values: struct (nullable = true) 
| | |-- id: long (nullable = true) 
|-- schema: string (nullable = true) 
|-- table: string (nullable = true) 
|-- type: string (nullable = true) 

Depending on the value of type, I want to do the following:

IF type == A THEN add new column with after_values.id 
IF type == B THEN add new column with before_values.id 
IF type == C THEN add new column with values.id 

Any suggestions on how to do this? Thanks!

Any comment explaining the downvote would be appreciated. – Chengzhi

Answers

Try

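from pyspark.sql.functions import col, when

# The id fields are nested under the top-level `row` struct, so use the full path.
# Rows whose type matches none of the branches get null in new_column.
df = df.withColumn("new_column",
    when(col("type") == "A", col("row.after_values.id"))
    .when(col("type") == "B", col("row.before_values.id"))
    .when(col("type") == "C", col("row.values.id")))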
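
To sanity-check the result without a Spark session, the same per-row rule can be written in plain Python. This is only a sketch: `pick_id` and `SOURCE_BY_TYPE` are hypothetical names, and each record is assumed to be a nested dict that mirrors the schema in the question. It returns None for an unmatched type, which is what a `when` chain without `otherwise` produces (null).

```python
# Hypothetical mapping: which struct under `row` holds the id for each type.
SOURCE_BY_TYPE = {"A": "after_values", "B": "before_values", "C": "values"}


def pick_id(record):
    """Return the id selected by record["type"], or None if nothing matches."""
    source = SOURCE_BY_TYPE.get(record.get("type"))
    if source is None:
        return None
    # `row` or the chosen struct may be missing or null, as the schema is nullable.
    struct = (record.get("row") or {}).get(source) or {}
    return struct.get("id")


sample = {
    "type": "B",
    "row": {"after_values": {"id": 2}, "before_values": {"id": 1}, "values": None},
}
print(pick_id(sample))  # 1
```

Comparing a few collected rows of new_column against `pick_id` is a quick way to confirm that the nested paths in the Spark expression are correct.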