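Looping within Pig group results

Suppose I have a game with player IDs. Each ID can have multiple character names (playerNames), and each name has a score. I want to total the scores of all playerNames under each ID, and then compute each playerName's share of its ID's total as a percentage score.
So, for example: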
id  playerName  playerScore
01  Test        45
01  Test2       15
02  Joe         100
would output:
id  {(playerName, playerScore, percentScore)}
01  {(Test, 45, .75), (Test2, 15, .25)}
02  {(Joe, 100, 1.0)}
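To pin down the arithmetic behind the expected output, here is a minimal plain-Python sketch of the target computation. It is not Pig. The function name `percent_scores` and the in-memory `rows` list are assumptions made purely for illustration.

```python
from collections import defaultdict

def percent_scores(rows):
    """rows: a list of (id, playerName, playerScore) tuples (hypothetical input shape)."""
    # First pass: total score per id.
    totals = defaultdict(int)
    for pid, _name, score in rows:
        totals[pid] += score
    # Second pass: attach each name's share of its id's total.
    result = defaultdict(list)
    for pid, name, score in rows:
        result[pid].append((name, score, score / totals[pid]))
    return dict(result)

rows = [("01", "Test", 45), ("01", "Test2", 15), ("02", "Joe", 100)]
result = percent_scores(rows)
```

The two passes mirror the question: the per-ID total has to exist before any single percentage can be computed.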
This is my current Pig script:
data = LOAD 'someData.data' AS (id:int, playerName:chararray, playerScore:int);
grouped = GROUP data BY id;
withSummedScore = FOREACH grouped GENERATE SUM(data.playerScore) AS summedPlayerScore, FLATTEN(data);
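-- Cast to double so the division is not truncated by integer arithmetic; playerScore is kept for the output.
withPercentScore = FOREACH withSummedScore GENERATE data::id AS id, data::playerName AS playerName, data::playerScore AS playerScore, ((double)data::playerScore / (double)summedPlayerScore) AS percentScore;
percentScoreIdGroup = GROUP withPercentScore BY id;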
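Currently I do this with two GROUP BY statements, and I am curious whether both are necessary or whether there is a more efficient way to do this. Can I reduce it to a single GROUP BY? Or is there a way to iterate over a bag of tuples and add percentScore to every tuple without flattening the data?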
That makes a lot of sense, thanks TC1. – Newtang