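How do I sum multiple columns in Spark? For example, in SparkR the following code returns the sum of a single column, but if I try to get the combined sum of two columns in df, I get an error.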
# Create SparkDataFrame
df <- createDataFrame(faithful)
# Use agg to sum total waiting times
head(agg(df, totalWaiting = sum(df$waiting)))
##This works
# Use agg to sum total of waiting and eruptions
head(agg(df, total = sum(df$waiting, df$eruptions)))
##This doesn't work
An answer in either SparkR or PySpark would be fine.
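
One possible approach, offered as a sketch rather than a definitive answer: the failing call passes two Columns to `sum`, whereas the SparkR aggregate appears to expect a single Column. Adding the columns first produces one Column expression that `sum` can aggregate; alternatively, `agg` accepts several named aggregates at once. The call to `sparkR.session()` assumes a working local Spark installation, and the names `combined` and `separate` are illustrative only.

```r
library(SparkR)
sparkR.session()  # assumption: a local Spark installation is available

df <- createDataFrame(faithful)

# Option 1: add the two columns row by row, then sum the resulting single Column
combined <- collect(agg(df, total = sum(df$waiting + df$eruptions)))

# Option 2: one named aggregate per column, returned side by side in one row
separate <- collect(agg(df,
                        totalWaiting = sum(df$waiting),
                        totalEruptions = sum(df$eruptions)))
```

Note that the two options can differ when the data contains nulls: in Option 1 a row with a null in either column contributes nothing to the total, while Option 2 skips nulls per column. The `faithful` dataset has no missing values, so both agree here.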