2016-09-22

Hive: Finding items bought together by customers from transactions

I am working with MySQL, but I need to replicate some queries in Hive.

I have a table of this form:

transaction table

I want to retrieve the following information:

Resultant table

In MySQL, the following query works:

SELECT c.original_item_id, c.bought_with_item_id, count(*) as times_bought_together 
FROM (
    SELECT a.item_id as original_item_id, b.item_id as bought_with_item_id 
    FROM items a 
    INNER join items b 
    ON a.transaction_id = b.transaction_id AND a.item_id != b.item_id 
    WHERE a.item_id IN ('B','C')) c 
GROUP BY c.original_item_id, c.bought_with_item_id; 

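But I have not been able to convert this into a Hive query. I have tried rearranging the joins many times and replacing the ON condition with a WHERE clause, but I did not get the required result. Any help would be appreciated.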

Answer


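Hive does not support non-equality join conditions in the ON clause (at least in versions before 2.2.0). However, because this is an inner join, you can move the condition a.item_id != b.item_id to the WHERE clause and get the same result: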

create table items(transaction_id smallint, item_id string); 

insert overwrite table items 
select 1 , 'A' from default.dual union all 
select 1 , 'B' from default.dual union all 
select 1 , 'C' from default.dual union all 
select 2 , 'B' from default.dual union all 
select 2 , 'A' from default.dual union all 
select 3 , 'A' from default.dual union all 
select 4 , 'B' from default.dual union all 
select 4 , 'C' from default.dual; 

SELECT c.original_item_id, c.bought_with_item_id, count(*) as times_bought_together 
FROM (
     SELECT a.item_id as original_item_id, b.item_id as bought_with_item_id 
     FROM items a 
     INNER join items b ON a.transaction_id = b.transaction_id 
     WHERE 
      a.item_id in ('B','C') --original_item_id 
     and a.item_id != b.item_id 
    ) c 
GROUP BY c.original_item_id, c.bought_with_item_id; 
--- 
OK 
original_item_id  bought_with_item_id  times_bought_together 
B  A  2 
B  C  2 
C  A  1 
C  B  2 

Time taken: 24.164 seconds, Fetched: 4 row(s)
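
If no Hive cluster is at hand, the rewrite can be sanity-checked locally. The following is only a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in for Hive (SQLite is not Hive, so it checks the query logic only, not Hive syntax or behavior). The names ROWS, QUERY_WHERE, QUERY_ON and bought_together are made up for this illustration. It loads the same eight sample rows and runs the query with the inequality in the WHERE clause and in the ON clause, so the two results can be compared.

```python
import sqlite3

# Same sample rows as in the answer above: (transaction_id, item_id).
ROWS = [
    (1, "A"), (1, "B"), (1, "C"),
    (2, "B"), (2, "A"),
    (3, "A"),
    (4, "B"), (4, "C"),
]

# Hive-friendly form: only the equality stays in ON, the rest goes to WHERE.
QUERY_WHERE = """
SELECT a.item_id AS original_item_id,
       b.item_id AS bought_with_item_id,
       COUNT(*)  AS times_bought_together
FROM items a
INNER JOIN items b ON a.transaction_id = b.transaction_id
WHERE a.item_id IN ('B', 'C')
  AND a.item_id != b.item_id
GROUP BY a.item_id, b.item_id
ORDER BY a.item_id, b.item_id
"""

# MySQL-style form: the inequality is part of the join condition.
QUERY_ON = """
SELECT a.item_id AS original_item_id,
       b.item_id AS bought_with_item_id,
       COUNT(*)  AS times_bought_together
FROM items a
INNER JOIN items b
  ON a.transaction_id = b.transaction_id AND a.item_id != b.item_id
WHERE a.item_id IN ('B', 'C')
GROUP BY a.item_id, b.item_id
ORDER BY a.item_id, b.item_id
"""


def bought_together(query):
    """Run `query` against a fresh in-memory copy of the sample table."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE items (transaction_id INTEGER, item_id TEXT)")
        conn.executemany("INSERT INTO items VALUES (?, ?)", ROWS)
        return conn.execute(query).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in bought_together(QUERY_WHERE):
        print(*row, sep="\t")
```

For an inner join, a filter in ON and the same filter in WHERE select the same rows, which is why the move is safe here. With an outer join the two placements would generally differ.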
