2017-02-21 63 views
0

Hive: get the top N rows per day - rank()

I have this table, in which each row represents a sale:

sale_date salesman sale_item_id 
20170102 JohnSmith  309 
20170102 JohnSmith  292 
20170103 AlexHam   93 

I am trying to get the top 20 salesmen per day, and I came up with this:

SELECT sale_date, salesman, sale_count, row_num 
FROM (
    SELECT sale_date, salesman, 
     count(*) as sale_count, 
     rank() over (partition by sale_date order by sale_count desc) as row_num 
    from salesforce.sales_data 
) T 
WHERE sale_date between '20170101' and '20170110' 
and row_num <= 20 

but I get:

FAILED: SemanticException Failed to breakup Windowing invocations into Groups. At least 1 group must only depend on input columns. Also check for circular dependencies. 
Underlying error: org.apache.hadoop.hive.ql.parse.SemanticException: Line 5:35 Expression not in GROUP BY key 'sale_date' 

I'm not sure at what point the GROUP BY would take effect, though. Can anyone help? Thanks!

Answers

2

You are missing the GROUP BY in the subquery:

SELECT sale_date, salesman, sale_count, row_num 
FROM (SELECT sale_date, salesman, 
      count(*) as sale_count, 
      rank() over (partition by sale_date order by count(*) desc) as row_num 
     FROM salesforce.sales_data 
     GROUP BY sale_date, salesman 
    ) T 
WHERE sale_date between '20170101' and '20170110' and row_num <= 20; 

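I think Hive might also accept the column alias in the window's ORDER BY, i.e. order by sale_count desc. If it does not, repeat the aggregate as order by count(*) desc, as in the query above.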

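Also note that if there are ties, rank() can return more or fewer than 20 rows per day. If you need exactly 20 rows, you probably want row_number() instead.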
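The tie behaviour is easy to check outside Hive. The sketch below is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module (SQLite 3.25 or newer, which supports window functions) as a stand-in for Hive, and the daily_counts table and its four rows are made up for the example.

```python
import sqlite3

# Assumption: SQLite stands in for Hive here; daily_counts and its rows
# are hypothetical, already-aggregated per-day counts.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE daily_counts (sale_date TEXT, salesman TEXT, sale_count INTEGER)"
)
conn.executemany(
    "INSERT INTO daily_counts VALUES (?, ?, ?)",
    [
        ("20170102", "A", 5),
        ("20170102", "B", 5),  # ties with A
        ("20170102", "C", 3),
        ("20170102", "D", 3),  # ties with C
    ],
)

# rank() gives tied rows the same number and then skips ahead;
# row_number() always numbers rows 1, 2, 3, ... (salesman breaks ties here).
rows = conn.execute(
    """
    SELECT salesman,
           sale_count,
           rank() OVER (PARTITION BY sale_date
                        ORDER BY sale_count DESC) AS rnk,
           row_number() OVER (PARTITION BY sale_date
                              ORDER BY sale_count DESC, salesman) AS rn
    FROM daily_counts
    ORDER BY rn
    """
).fetchall()

top3_by_rank = [r for r in rows if r[2] <= 3]        # includes both rows tied at rank 3
top3_by_row_number = [r for r in rows if r[3] <= 3]  # exactly three rows

for row in rows:
    print(row)
print(len(top3_by_rank), len(top3_by_row_number))
```

With these rows, filtering on the rank() column with a limit of 3 keeps all four salesmen, while filtering on the row_number() column keeps exactly three. Which salesman is dropped among the tied ones depends on the tie-breaker added to the ORDER BY.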

+0

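Thanks @Gordon - I now get the same error, but with "Expression not in GROUP BY key 'sale_count'". AFAIK aliases can't be used in the GROUP BY clause, but for the sake of it I added it to the GROUP BY clause and got "Invalid table alias or column reference 'sale_count'". – Craig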

+0

You don't need a GROUP BY to use window functions. – hlagos

+0

@lake . . . You do if the ranking is on an aggregate. –

0

Try this:

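SELECT sale_date, salesman, sale_count, row_num 
FROM (
    -- rank the salesmen within each day by the precomputed count 
    SELECT sale_date, salesman, sale_count, 
     rank() over (partition by sale_date order by sale_count desc) as row_num 
    FROM (
        -- compute the count first so the outer rank() can order by it 
        -- note: partitioning by salesman alone counts that salesman's sales 
        -- across all dates and keeps one row per sale; 
        -- partition by sale_date, salesman for a per-day count 
        SELECT sale_date, salesman, 
         count(*) over (partition by salesman) as sale_count 
        FROM employee 
    ) t1 
) t2 
WHERE sale_date between '20170101' and '20170110' 
and row_num <= 20; 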

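Edited and tested. Your problem is basically that you are trying to use the count in your OVER clause before it has been computed. If you compute the count in a subquery, partitioned by salesman, that solves the problem. You can't group by in the sales query; if you do, you lose access to sale_date.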
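The "compute the count first, then rank it" idea can be sketched end to end as follows. This is a minimal sketch under assumptions, not Hive output: Python's built-in sqlite3 module (SQLite 3.25 or newer) stands in for Hive, the table is named sales_data without the salesforce schema, the helper top_salesmen_per_day is hypothetical, and the fourth sample row is made up so that one day has two salesmen. Here the inner query aggregates with GROUP BY sale_date, salesman, so sale_date stays available to the outer rank().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales_data (sale_date TEXT, salesman TEXT, sale_item_id INTEGER)"
)
conn.executemany(
    "INSERT INTO sales_data VALUES (?, ?, ?)",
    [
        ("20170102", "JohnSmith", 309),
        ("20170102", "JohnSmith", 292),
        ("20170103", "AlexHam", 93),
        ("20170102", "AlexHam", 17),  # made-up row, not from the question
    ],
)


def top_salesmen_per_day(conn, n, first_day, last_day):
    """Hypothetical helper: top-n salesmen per day by number of sales."""
    query = """
        SELECT sale_date, salesman, sale_count, row_num
        FROM (
            -- step 2: rank the per-day counts
            SELECT sale_date, salesman, sale_count,
                   rank() OVER (PARTITION BY sale_date
                                ORDER BY sale_count DESC) AS row_num
            FROM (
                -- step 1: one row per (day, salesman) with its count
                SELECT sale_date, salesman, count(*) AS sale_count
                FROM sales_data
                WHERE sale_date BETWEEN ? AND ?
                GROUP BY sale_date, salesman
            ) AS counted
        ) AS ranked
        -- step 3: the window alias is only visible one level further out
        WHERE row_num <= ?
        ORDER BY sale_date, row_num, salesman
    """
    return conn.execute(query, (first_day, last_day, n)).fetchall()


top = top_salesmen_per_day(conn, 1, "20170101", "20170110")
for row in top:
    print(row)
```

Each nesting level only refers to columns produced by the level below it, which is what the error message in the question is asking for: at least one group that depends only on input columns.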