2017-04-17 90 views

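I am running a Hive query with a GROUP BY on the ordinary non-aggregated table columns key1 and key2. I am also adding a constant column, type, whose value depends on which branch of the union a row comes from. Is it possible to group by a constant column in Hive?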

CREATE TABLE IF NOT EXISTS T_FINAL AS SELECT DISTINCT union_tbles.key1 AS key1, union_tbles.key2 AS key2, union_tbles.cnt AS cnt, union_tbles.type AS type FROM (
SELECT key1 AS key1, key2 AS key2, COUNT(val) AS cnt, 'x1' AS type FROM T_SUB1 WHERE key1 IN ('X1') GROUP BY key1, key2 
UNION ALL 
SELECT key1 AS key1, key2 AS key2, COUNT(val) AS cnt, 'x2' AS type FROM T_SUB1 WHERE key1 IN ('X2') GROUP BY key1, key2 
) union_tbles 

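Is it possible to add the constant column type as a grouping column, as shown below? When I try to add the constant column type to the GROUP BY, I get an "Invalid column alias" error in Hive. Any suggestions on how to achieve this in Hive?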

CREATE TABLE IF NOT EXISTS T_FINAL AS SELECT DISTINCT union_tbles.key1 AS key1, union_tbles.key2 AS key2, union_tbles.cnt AS cnt, union_tbles.type AS type FROM (
SELECT key1 AS key1, key2 AS key2, COUNT(val) AS cnt, 'x1' AS type FROM T_SUB1 WHERE key1 IN ('X1') GROUP BY key1, key2, type 
UNION ALL 
SELECT key1 AS key1, key2 AS key2, COUNT(val) AS cnt, 'x2' AS type FROM T_SUB1 WHERE key1 IN ('X2') GROUP BY key1, key2, type 
) union_tbles 

"SELECT DISTINCT"? –

Why "UNION ALL" in the first place? –

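I am running a JOIN on a series of UNIONs. I ran it two ways: SELECT DISTINCT (... UNION ALL ... UNION ALL ... etc.) and SELECT FROM (... UNION DISTINCT ... UNION DISTINCT). With the first approach the number of jobs is 15, while with the second it is 25. Just adding UNION DISTINCT greatly increases the number of jobs. Each job uses 9068 mappers and 1009 reducers, so the time taken is far too high, and I want to reduce the number of jobs. – somnathchakrabarti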

Answer

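Hive does not recognize column aliases in the GROUP BY clause.
In any case, there is absolutely no need to group by a constant.
A constant does not have to appear in the GROUP BY clause in order to be selected.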
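The example query for the result below does not appear here. A minimal sketch of a query of that shape, assuming a hypothetical table t with a single INT column x holding the value 1 (the table name and its contents are assumptions, not from the original post), selects several constants next to the grouped column without listing them in GROUP BY:

```sql
-- Hypothetical table for illustration only; not part of the original post.
-- INSERT ... VALUES assumes Hive 0.14 or later.
CREATE TABLE IF NOT EXISTS t (x INT);
INSERT INTO TABLE t VALUES (1);

-- Only x is a grouping column; the constants are simply selected.
SELECT x, 1, 2, 'Hello', DATE '2017-04-17'
FROM t
GROUP BY x;
```

Hive names unaliased expressions by position (_c1, _c2, ...), which is consistent with the column headers in the result below.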

​​
+---+-----+-----+-------+------------+
| x | _c1 | _c2 | _c3   | _c4        |
+---+-----+-----+-------+------------+
| 1 | 1   | 2   | Hello | 2017-04-17 |
+---+-----+-----+-------+------------+
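
Applied to the query in the question, a constant such as 'x1' AS type can stay in the SELECT list while GROUP BY lists only key1 and key2. Since the two branches shown read the same table and differ only in the key1 filter, one possible single-pass variant derives type from key1 with a CASE expression, which may also reduce the number of jobs by avoiding the UNION ALL and the outer DISTINCT. This is a sketch that assumes the real query has the same shape as the one shown; it is not from the original answer:

```sql
-- Sketch only: assumes every branch reads T_SUB1 and differs only in the key1 filter.
CREATE TABLE IF NOT EXISTS T_FINAL AS
SELECT key1,
       key2,
       COUNT(val) AS cnt,
       -- type depends only on key1, which is already a grouping column
       CASE key1 WHEN 'X1' THEN 'x1' WHEN 'X2' THEN 'x2' END AS type
FROM T_SUB1
WHERE key1 IN ('X1', 'X2')
GROUP BY key1, key2;
```

If the branches actually read different tables, this rewrite does not apply; in that case each branch keeps its own constant in the SELECT list and groups by key1 and key2 only.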