Spark column string replace when present in other column (row)
I would like to remove from col1 the strings that are present in col2:
val df = spark.createDataFrame(Seq(
("Hi I heard about Spark", "Spark"),
("I wish Java could use case classes", "Java"),
("Logistic regression models are neat", "models")
)).toDF("sentence", "label")
using regexp_replace or translate (ref: Spark functions API):
val res = df.withColumn("sentence_without_label",
  regexp_replace(col("sentence"), "(?????)", ""))
so that res looks as follows:
No need for a UDF here. – mtoto
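One possible way to do this without a UDF is a minimal sketch along the following lines. It assumes Spark 2.1 or later, where `regexp_replace` has an overload that takes the pattern and the replacement as `Column` values, so the `label` column can stand in for the `"(?????)"` placeholder. The local `SparkSession` setup and the app name are assumptions added only to make the snippet self-contained; in `spark-shell` the `spark` value already exists.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, lit, regexp_replace}

// Assumption: a local session for illustration; skip this in spark-shell.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("remove-label-sketch") // hypothetical app name
  .getOrCreate()

val df = spark.createDataFrame(Seq(
  ("Hi I heard about Spark", "Spark"),
  ("I wish Java could use case classes", "Java"),
  ("Logistic regression models are neat", "models")
)).toDF("sentence", "label")

// For each row, the value of `label` is used as the regex pattern
// and every match in `sentence` is replaced with the empty string.
val res = df.withColumn(
  "sentence_without_label",
  regexp_replace(col("sentence"), col("label"), lit(""))
)

res.show(false)
```

Two caveats with this sketch: the label is interpreted as a regular expression, so labels containing characters such as `.` or `(` would need escaping, and the whitespace around the removed word is left in place (for example a trailing space after "about" in the first row).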