2017-06-13 191 views

Find all matches in a string - Apache Presto

I have the following rows in Hive (HDFS) and am using Presto as the query engine.

1,@markbutcher72 @charlottegloyn Not what Belinda Carlisle thought. And yes, she was singing about Edgbaston. 
2,@tomkingham @markbutcher72 @charlottegloyn It's true the garden of Eden is currently very green... 
3,@MrRhysBenjamin @gasuperspark1 @markbutcher72 Actually it's Springfield Park, the (occasional) home of the might 

The requirement is to get the following output with a Presto query. How can we achieve this?

1,markbutcher72 
1,charlottegloyn 
2,tomkingham 
2,markbutcher72 
2,charlottegloyn 
3,MrRhysBenjamin 
3,gasuperspark1 
3,markbutcher72 

It's not clear. Is it a Hive table with a single column? 2 columns? More?...


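@DuduMarkovitz - Thanks for your reply. The Hive table has 2 columns, ID and TEXT. Ideally, I would like to tokenize the string iteratively, starting at each @ and running up to the next SPACE. I was looking at strpos(text, '@'), but that only gives the first occurrence of '@' rather than iterating over all of them.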

Answer

-- (?<=@) is a lookbehind: match runs of non-space characters that follow an '@'
select t.id
     , u.token
from mytable as t
     cross join unnest(regexp_extract_all(t.text, '(?<=@)\S+')) as u (token)
;

+----+----------------+
| id | token          |
+----+----------------+
| 1  | markbutcher72  |
| 1  | charlottegloyn |
| 2  | tomkingham     |
| 2  | markbutcher72  |
| 2  | charlottegloyn |
| 3  | MrRhysBenjamin |
| 3  | gasuperspark1  |
| 3  | markbutcher72  |
+----+----------------+
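
For trying this without the Hive table, here is a minimal self-contained sketch of the same idea. It assumes a Presto version that supports the three-argument form regexp_extract_all(string, pattern, group); the name sample and the shortened tweet texts are illustrative only and not part of the original table. It uses a capturing group instead of the lookbehind, which should return the same tokens.

```sql
-- Inline rows stand in for the two-column Hive table (id, text).
-- "sample" and the shortened texts are hypothetical test data.
with sample (id, text) as (
    values
        (1, '@markbutcher72 @charlottegloyn Not what Belinda Carlisle thought.'),
        (2, '@tomkingham @markbutcher72 It''s true the garden of Eden is currently very green...')
)
select s.id
     , u.token
from sample as s
     -- group 1 is the part after '@', so the '@' itself is not returned
     cross join unnest(regexp_extract_all(s.text, '@(\S+)', 1)) as u (token)
;
```

Note that \S+ keeps any punctuation attached to a handle (for example a trailing comma). If that matters for your data, a pattern such as '@(\w+)' may be a closer fit.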

Brilliant.. thanks a ton.
