我需要實施星火下面的SQL邏輯DataFrame
星火據幀巢式病例在聲明
SELECT KEY,
CASE WHEN tc in ('a','b') THEN 'Y'
WHEN tc in ('a') AND amt > 0 THEN 'N'
ELSE NULL END REASON,
FROM dataset1;
我輸入DataFrame
是如下:
val dataset1 = Seq((66, "a", "4"), (67, "a", "0"), (70, "b", "4"), (71, "d", "4")).toDF("KEY", "tc", "amt")
dataset1.show()
+---+---+---+
|KEY| tc|amt|
+---+---+---+
| 66| a| 4|
| 67| a| 0|
| 70| b| 4|
| 71| d| 4|
+---+---+---+
我有落實巢式病例當聲明爲:
dataset1.withColumn("REASON", when(col("tc").isin("a", "b"), "Y")
.otherwise(when(col("tc").equalTo("a") && col("amt").geq(0), "N")
.otherwise(null))).show()
+---+---+---+------+
|KEY| tc|amt|REASON|
+---+---+---+------+
| 66| a| 4| Y|
| 67| a| 0| Y|
| 70| b| 4| Y|
| 71| d| 4| null|
+---+---+---+------+
如果嵌套的when語句更進一步,則使用「otherwise」語句執行上述邏輯的可讀性是很麻煩的。
有沒有更好的方法來實現嵌套情況下的語句在Spark DataFrames
?