Check word counts in a string and remove words with a lower count - Hive
Suppose I have the following table:
date_part string_word id
2017-08-08 India America Advance Apartments 1
2017-08-08 Apartments Planner Headlines 1
2017-08-08 India America Headlines Gucci 1
2017-08-08 Images Same Thing Africa 2
2017-08-08 Images 2
2017-08-07 India America Advance Apartments 2
2017-08-07 Apartments Planner Headlines 3
2017-08-07 India America Headlines Gucci 3
2017-08-07 Images Same Thing Africa 3
2017-08-07 Images 4
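Now I want to find the count of each word per day and remove the words with a low count. To find the word counts, I wrote the following query: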
SELECT date_part, word, COUNT(*) as total_word_count
FROM table_name LATERAL VIEW explode(split(string_word, ' ')) lTable as word
where date_part > '2017-08-05'
GROUP BY date_part, word
This gives the following:
date_part word total_word_count
2017-08-08 India 2
2017-08-08 America 2
2017-08-08 Advance 1
2017-08-08 Apartments 2
2017-08-08 Planner 1
2017-08-08 Headlines 2
2017-08-08 Gucci 1
2017-08-08 Images 2
2017-08-08 Same 1
2017-08-08 Thing 1
2017-08-08 Africa 1
2017-08-07 India 2
2017-08-07 America 2
2017-08-07 Advance 1
2017-08-07 Apartments 2
2017-08-07 Planner 1
2017-08-07 Headlines 2
2017-08-07 Gucci 1
2017-08-07 Images 2
2017-08-07 Same 1
2017-08-07 Thing 1
2017-08-07 Africa 1
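Now I want to remove the words whose count is less than 2, i.e. words with a count of 1 should be removed for each date. The output should be as follows: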
date_part string_word id
2017-08-08 India America Apartments 1
2017-08-08 Apartments Headlines 1
2017-08-08 India America Headlines 1
2017-08-08 Images 2
2017-08-08 Images 2
2017-08-07 India America Apartments 2
2017-08-07 Apartments Headlines 3
2017-08-07 India America Headlines 3
2017-08-07 Images 3
2017-08-07 Images 4
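Here the words with a count of 1 have been removed. This is the output I expect, and it has to be produced for every day.
Can someone help me do this?
Thanks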
Add 'HAVING total_word_count > 1' to the query... –
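A sketch of what that suggestion might look like, reusing the question's query unchanged apart from the added clause. It writes `COUNT(*)` in the `HAVING` clause instead of the alias, on the assumption that this form is accepted regardless of Hive version. It only filters the aggregated counts and does not change `string_word` in the original rows.

```sql
-- Per-day word counts, keeping only words that occur more than once on that day.
SELECT date_part, word, COUNT(*) AS total_word_count
FROM table_name
LATERAL VIEW explode(split(string_word, ' ')) lTable AS word
WHERE date_part > '2017-08-05'
GROUP BY date_part, word
HAVING COUNT(*) > 1;
```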
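@usagi The filtering is fine, but I want to remove the words from the original table. Only words with a count greater than one should remain; the rest should be removed. That is the problem I am trying to solve. – haimen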
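One possible way to rebuild `string_word` from only the frequent words is sketched below. It is untested and makes several assumptions: the Hive version supports CTEs and `collect_list` (0.13 or later), the combination of `date_part`, `id` and the original `string_word` identifies a row (true for the sample data, but identical rows would be merged), and the names `exploded`, `frequent` and `original_words` are made up for illustration.

```sql
WITH exploded AS (
  -- One output row per word of each original row.
  SELECT date_part, id, string_word AS original_words, word
  FROM table_name
  LATERAL VIEW explode(split(string_word, ' ')) lTable AS word
  WHERE date_part > '2017-08-05'
),
frequent AS (
  -- Words that occur more than once on a given day.
  SELECT date_part, word
  FROM exploded
  GROUP BY date_part, word
  HAVING COUNT(*) > 1
)
-- Keep only the frequent words and glue them back together per original row.
SELECT e.date_part,
       concat_ws(' ', collect_list(e.word)) AS string_word,
       e.id
FROM exploded e
JOIN frequent f
  ON e.date_part = f.date_part
 AND e.word = f.word
GROUP BY e.date_part, e.id, e.original_words;
```

Two caveats with this approach: `collect_list` does not guarantee that the words come back in their original order, and because of the inner join a row in which no word survives is dropped from the result instead of being returned with an empty string.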