I have two DataFrames and need to run a groupBy on one of them, driven by the other.
Dataframe1 contains key/value pairs:
+------+-----------------+
| Key | Value |
+------+-----------------+
| key1 | Column1 |
+------+-----------------+
| key2 | Column2 |
+------+-----------------+
| key3 | Column1,Column3 |
+------+-----------------+
Second DataFrame:
This is the actual DataFrame that I need to apply the groupBy operation to.
+---------+---------+---------+--------+
| Column1 | Column2 | Column3 | Amount |
+---------+---------+---------+--------+
| A | A1 | XYZ | 100 |
+---------+---------+---------+--------+
| A | A1 | XYZ | 100 |
+---------+---------+---------+--------+
| A | A2 | XYZ | 10 |
+---------+---------+---------+--------+
| A | A3 | PQR | 100 |
+---------+---------+---------+--------+
| B | B1 | XYZ | 200 |
+---------+---------+---------+--------+
| B | B2 | PQR | 280 |
+---------+---------+---------+--------+
| B | B3 | XYZ | 20 |
+---------+---------+---------+--------+
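Dataframe1 holds the Key and Value columns. The code should take a key from Dataframe1, look up its corresponding value, and run a groupBy on Dataframe2 using the column(s) named in that value: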
Dframe= df.groupBy($"key").sum("amount").show()
Expected output: three DataFrames, one for each key in Dataframe1.
d1= df.groupBy($"key1").sum("amount").show()
It must resolve to: df.groupBy($"column1").sum("amount").show()
+---+-----+
| A | 310 |
+---+-----+
| B | 500 |
+---+-----+
Code:
d2=df.groupBy($"key2").sum("amount").show()
It must resolve to: df.groupBy($"column2").sum("amount").show()
DataFrame:
+----+-----+
| A1 | 200 |
+----+-----+
| A2 | 10 |
+----+-----+
Code:
d3=df.groupBy($"key3").sum("amount").show()
DataFrame:
+---+-----+-----+
| A | XYZ | 210 |
+---+-----+-----+
| A | PQR | 100 |
+---+-----+-----+
| B | XYZ | 220 |
+---+-----+-----+
| B | PQR | 280 |
+---+-----+-----+
In the future, if I add more keys, it has to produce a DataFrame for each of them as well. Can anyone help me?
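One possible way to do this, as a minimal sketch rather than a definitive answer: collect the small key/value DataFrame to the driver, split each Value on commas to get the grouping columns, and run one groupBy per key. The names `GroupByKeys` and `groupByKeys` are made up for illustration, and the sketch assumes the key/value DataFrame is small enough to `collect()` and that its columns are named `Key` and `Value` as in the table above.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col

object GroupByKeys {
  // Hypothetical helper: for every (Key, Value) row in `keys`, group `data`
  // by the comma-separated columns named in Value and sum the Amount column.
  // Returns one aggregated DataFrame per key.
  def groupByKeys(keys: DataFrame, data: DataFrame): Map[String, DataFrame] =
    keys.collect().map { row =>
      val key = row.getAs[String]("Key")
      // "Column1,Column3" -> Array(col("Column1"), col("Column3"))
      val groupCols = row.getAs[String]("Value").split(",").map(_.trim).map(col)
      key -> data.groupBy(groupCols: _*).sum("Amount")
    }.toMap

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]") // assumption: local run for demonstration
      .appName("groupByKeys")
      .getOrCreate()
    import spark.implicits._

    // Sample data copied from the tables in the question
    val keys = Seq(
      ("key1", "Column1"),
      ("key2", "Column2"),
      ("key3", "Column1,Column3")
    ).toDF("Key", "Value")

    val data = Seq(
      ("A", "A1", "XYZ", 100),
      ("A", "A1", "XYZ", 100),
      ("A", "A2", "XYZ", 10),
      ("A", "A3", "PQR", 100),
      ("B", "B1", "XYZ", 200),
      ("B", "B2", "PQR", 280),
      ("B", "B3", "XYZ", 20)
    ).toDF("Column1", "Column2", "Column3", "Amount")

    // One result per key; adding a row to `keys` adds a result here
    groupByKeys(keys, data).foreach { case (key, result) =>
      println(key)
      result.show()
    }

    spark.stop()
  }
}
```

With this shape, adding a new key only means adding a row to the key/value DataFrame; no extra groupBy code is needed. For key1 the result should match the A = 310, B = 500 totals given above.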
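Thanks for the reply, this is what I was looking for. Can I union all the DataFrames? – prapthi
What would the columns of the unioned DataFrame be? For a union, all DataFrames must have the same number of columns. –
Is it possible to perform unionAll for just column1 and column2? – prapthi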
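A rough sketch of how the Column1 and Column2 results could be combined, assuming it is acceptable to reshape each result to a common three-column layout first. The names `UnionGrouped`, `groupedAs`, `Group`, and `Total` are made up for illustration. `union` matches columns by position, so both sides are given the same column order before they are combined.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, lit, sum}

object UnionGrouped {
  // Hypothetical helper: group `data` by a single column and reshape the
  // result to (Key, Group, Total) so that different groupings line up.
  def groupedAs(data: DataFrame, key: String, column: String): DataFrame =
    data.groupBy(col(column))
      .agg(sum("Amount").as("Total"))
      .select(lit(key).as("Key"), col(column).as("Group"), col("Total"))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]") // assumption: local run for demonstration
      .appName("unionGrouped")
      .getOrCreate()
    import spark.implicits._

    // Sample data copied from the table in the question
    val data = Seq(
      ("A", "A1", "XYZ", 100),
      ("A", "A1", "XYZ", 100),
      ("A", "A2", "XYZ", 10),
      ("A", "A3", "PQR", 100),
      ("B", "B1", "XYZ", 200),
      ("B", "B2", "PQR", 280),
      ("B", "B3", "XYZ", 20)
    ).toDF("Column1", "Column2", "Column3", "Amount")

    // Both sides have the same three columns, so a positional union is safe
    val combined = groupedAs(data, "key1", "Column1")
      .union(groupedAs(data, "key2", "Column2"))

    combined.show()
    spark.stop()
  }
}
```

The key3 result groups by two columns, so it would not fit this layout without an extra step, such as concatenating the two grouping columns into one.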