2016-09-07 60 views
0

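Split a CSV field into separate rows in SQL

While working on a COBOL program, a colleague of mine ran into this problem and eventually solved it at the application level. I am still curious whether it can be solved at the data access level, in SQL. This is somewhat related to this other question, but I would like to use only ANSI SQL.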

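I am looking for a single SQL SELECT query that operates on a VARCHAR field containing CSV lines of variable length. The query should put each CSV field on its own row of the result set.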

Here is an example schema with data (and here is a fiddle):

CREATE TABLE table1 (`field` varchar(100)); 

INSERT INTO table1 (`field`) 
     VALUES 
      ('Hello,world,!') , 
      ('Haloa,!')   , 
      ('Have,a,nice,day,!'); 

Here is the output I would like the query to produce:

Hello 
world 
! 
Haloa 
! 
Have 
a 
nice 
day 
! 

The CSV delimiter is a comma, and for now I am not worried about escaping.

+0

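It depends on your DBMS. Some provide a split function (much like the one found in many programming languages) that you would call for each table record (field) with ',' as the delimiter. If your DBMS does not have one, you can write a simple function that returns a simple array/cursor/result set. – FDavidov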

+0

Tag the DBMS you are using. – jarlh

+1

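Not storing comma-separated values in a SQL table in the first place prevents many problems. You seem to have control over the database: design it properly instead of wasting time on workarounds for an entirely avoidable problem. – Tomalak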
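To make the design suggestion in the comment above concrete, here is a minimal sketch of a normalized layout, run through Python's sqlite3 module so it can be tried without a server. The table name table1_words, its columns field_id, pos and word, and the helper normalized_words are hypothetical names chosen for illustration; nothing in the question prescribes them.

```python
import sqlite3


def normalized_words(lines):
    """Store one word per row, then read the words back in order.

    Assumption: table1_words, field_id, pos and word are illustrative
    names, not part of the original schema.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE table1_words ("
        " field_id INTEGER NOT NULL,"    # which original CSV line
        " pos      INTEGER NOT NULL,"    # position of the word in that line
        " word     VARCHAR(100) NOT NULL,"
        " PRIMARY KEY (field_id, pos))"
    )
    # The split happens once, at write time, in the application.
    for field_id, line in enumerate(lines, start=1):
        conn.executemany(
            "INSERT INTO table1_words (field_id, pos, word) VALUES (?, ?, ?)",
            [(field_id, pos, word)
             for pos, word in enumerate(line.split(","), start=1)],
        )
    # Reading one word per row is now a plain ordered SELECT.
    result = [row[0] for row in conn.execute(
        "SELECT word FROM table1_words ORDER BY field_id, pos")]
    conn.close()
    return result
```

With this layout the query side needs no string functions at all, at the cost of doing the split when the data is written.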

Answers

2

As far as I know, this is ANSI SQL:

with recursive word_list (field, word, rest, field_id, level) as (    
    select field, 
     substring(field from 1 for position(',' in field) - 1) as word, 
     substring(field from position(',' in field) + 1) as rest, 
     row_number() over() as field_id, 
     1 
    from table1 
    union all 
    select c.field, 
     case 
      when position(',' in p.rest) = 0 then p.rest 
      else substring(p.rest from 1 for position(',' in p.rest) - 1) 
     end as word, 
     case 
      when position(',' in p.rest) = 0 then null 
      else substring(p.rest from position(',' in p.rest) + 1) 
     end as rest, 
     p.field_id, 
     p.level + 1 
    from table1 as c 
    join word_list p on c.field = p.field and position(',' in p.rest) >= 0 
) 
select word 
from word_list 
order by field_id, level; 

This assumes that the values in field are unique.

Here is a running example: http://rextester.com/NARS7464
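For readers who want to try the recursive approach locally, here is a rough sketch of the same idea run against SQLite through Python's sqlite3 module. It is a variation, not the query above: SQLite spells the string functions instr() and substr() instead of POSITION and SUBSTRING, and the sketch keys each line on SQLite's implicit rowid, so it does not rely on the field values being unique. The helper name split_rows is hypothetical.

```python
import sqlite3

# Assumption: SQLite stands in for the DBMS; instr()/substr() replace the
# POSITION/SUBSTRING syntax, and rowid is SQLite-specific.
SPLIT_SQL = """
WITH RECURSIVE word_list (field_id, word, rest, level) AS (
    -- Anchor: first word of every line; rest is NULL once no comma remains.
    SELECT rowid,
           CASE WHEN instr(field, ',') = 0 THEN field
                ELSE substr(field, 1, instr(field, ',') - 1) END,
           CASE WHEN instr(field, ',') = 0 THEN NULL
                ELSE substr(field, instr(field, ',') + 1) END,
           1
    FROM table1
    UNION ALL
    -- Recursive step: peel the next word off the remainder.
    SELECT field_id,
           CASE WHEN instr(rest, ',') = 0 THEN rest
                ELSE substr(rest, 1, instr(rest, ',') - 1) END,
           CASE WHEN instr(rest, ',') = 0 THEN NULL
                ELSE substr(rest, instr(rest, ',') + 1) END,
           level + 1
    FROM word_list
    WHERE rest IS NOT NULL
)
SELECT word FROM word_list ORDER BY field_id, level
"""


def split_rows(values):
    """Load the CSV lines into table1 and return one word per result row."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE table1 (field VARCHAR(100))")
    conn.executemany("INSERT INTO table1 (field) VALUES (?)",
                     [(v,) for v in values])
    words = [row[0] for row in conn.execute(SPLIT_SQL)]
    conn.close()
    return words
```

Calling split_rows(['Hello,world,!', 'Haloa,!', 'Have,a,nice,day,!']) should return the ten words in the order shown in the question. Because the recursion carries field_id rather than joining back on the text, two identical lines are split independently.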

+0

This is amazing. Here is an overview of the databases it can work in: https://en.wikipedia.org/wiki/Hierarchical_and_recursive_queries_in_SQL#Common_table_expression

0

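In Oracle you can use something like this (it may not be the most elegant, but it gives the result you want); simply replace tab with your_table_name. As written, it handles at most six fields per line and five rows: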

WITH 
tab2 AS (
SELECT t.field, 
     CASE WHEN INSTR(t.field, ',', 1, 1) > 0 AND regexp_count(t.field,',') >= 1 THEN INSTR(t.field, ',', 1, 1) ELSE NULL END AS pos1, 
     CASE WHEN INSTR(t.field, ',', 1, 2) > 0 AND regexp_count(t.field,',') >= 2 THEN INSTR(t.field, ',', 1, 2) ELSE NULL END AS pos2, 
     CASE WHEN INSTR(t.field, ',', 1, 3) > 0 AND regexp_count(t.field,',') >= 3 THEN INSTR(t.field, ',', 1, 3) ELSE NULL END AS pos3, 
     CASE WHEN INSTR(t.field, ',', 1, 4) > 0 AND regexp_count(t.field,',') >= 4 THEN INSTR(t.field, ',', 1, 4) ELSE NULL END AS pos4, 
     CASE WHEN INSTR(t.field, ',', 1, 5) > 0 AND regexp_count(t.field,',') >= 5 THEN INSTR(t.field, ',', 1, 5) ELSE NULL END AS pos5, 
     CASE WHEN INSTR(t.field, ',', 1, 6) > 0 AND regexp_count(t.field,',') >= 6 THEN INSTR(t.field, ',', 1, 6) ELSE NULL END AS pos6 
FROM tab t 
), 
tab3 AS (
SELECT SUBSTR(tt.field,1,tt.pos1-1) AS col1, 
     SUBSTR(tt.field,tt.pos1+1, CASE WHEN tt.pos2 IS NULL THEN LENGTH(tt.field) - tt.pos1 ELSE tt.pos2 - tt.pos1 - 1 END) AS col2, 
     SUBSTR(tt.field,tt.pos2+1, CASE WHEN tt.pos3 IS NULL THEN LENGTH(tt.field) - tt.pos2 ELSE tt.pos3 - tt.pos2 - 1 END) AS col3, 
     SUBSTR(tt.field,tt.pos3+1, CASE WHEN tt.pos4 IS NULL THEN LENGTH(tt.field) - tt.pos3 ELSE tt.pos4 - tt.pos3 - 1 END) AS col4, 
     SUBSTR(tt.field,tt.pos4+1, CASE WHEN tt.pos5 IS NULL THEN LENGTH(tt.field) - tt.pos4 ELSE tt.pos5 - tt.pos4 - 1 END) AS col5, 
     SUBSTR(tt.field,tt.pos5+1, CASE WHEN tt.pos6 IS NULL THEN LENGTH(tt.field) - tt.pos5 ELSE tt.pos6 - tt.pos5 - 1 END) AS col6 
     ,ROWNUM AS r 
FROM tab2 tt 
), 
tab4 AS (
SELECT ttt.col1 AS col FROM tab3 ttt WHERE r = 1 
UNION ALL SELECT ttt.col2 FROM tab3 ttt WHERE r = 1 
UNION ALL SELECT ttt.col3 FROM tab3 ttt WHERE r = 1 
UNION ALL SELECT ttt.col4 FROM tab3 ttt WHERE r = 1 
UNION ALL SELECT ttt.col5 FROM tab3 ttt WHERE r = 1 
UNION ALL SELECT ttt.col6 FROM tab3 ttt WHERE r = 1 
UNION ALL 
SELECT ttt.col1 FROM tab3 ttt WHERE r = 2 
UNION ALL SELECT ttt.col2 FROM tab3 ttt WHERE r = 2 
UNION ALL SELECT ttt.col3 FROM tab3 ttt WHERE r = 2 
UNION ALL SELECT ttt.col4 FROM tab3 ttt WHERE r = 2 
UNION ALL SELECT ttt.col5 FROM tab3 ttt WHERE r = 2 
UNION ALL SELECT ttt.col6 FROM tab3 ttt WHERE r = 2 
UNION ALL 
SELECT ttt.col1 FROM tab3 ttt WHERE r = 3 
UNION ALL SELECT ttt.col2 FROM tab3 ttt WHERE r = 3 
UNION ALL SELECT ttt.col3 FROM tab3 ttt WHERE r = 3 
UNION ALL SELECT ttt.col4 FROM tab3 ttt WHERE r = 3 
UNION ALL SELECT ttt.col5 FROM tab3 ttt WHERE r = 3 
UNION ALL SELECT ttt.col6 FROM tab3 ttt WHERE r = 3 
UNION ALL 
SELECT ttt.col1 FROM tab3 ttt WHERE r = 4 
UNION ALL SELECT ttt.col2 FROM tab3 ttt WHERE r = 4 
UNION ALL SELECT ttt.col3 FROM tab3 ttt WHERE r = 4 
UNION ALL SELECT ttt.col4 FROM tab3 ttt WHERE r = 4 
UNION ALL SELECT ttt.col5 FROM tab3 ttt WHERE r = 4 
UNION ALL SELECT ttt.col6 FROM tab3 ttt WHERE r = 4 
UNION ALL 
SELECT ttt.col1 FROM tab3 ttt WHERE r = 5 
UNION ALL SELECT ttt.col2 FROM tab3 ttt WHERE r = 5 
UNION ALL SELECT ttt.col3 FROM tab3 ttt WHERE r = 5 
UNION ALL SELECT ttt.col4 FROM tab3 ttt WHERE r = 5 
UNION ALL SELECT ttt.col5 FROM tab3 ttt WHERE r = 5 
UNION ALL SELECT ttt.col6 FROM tab3 ttt WHERE r = 5 
) 
SELECT col 
FROM tab4 
WHERE col IS NOT NULL 

It gives me this result:

1 Hello 
2 world 
3 ! 
4 Haloa 
5 ! 
6 Have 
7 a 
8 nice 
9 day 
10 ! 
0

FWIW, here is another Oracle-specific approach. Maybe it will at least give an idea or help future searchers.

SQL> with tbl(rownbr, col1) as (
      select 1, 'Hello,world,!'  from dual union 
      select 2, 'Haloa,!'   from dual union 
      select 3, 'Have,a,nice,day,!' from dual 
    ) 
    SELECT rownbr, column_value substring_nbr, 
     regexp_substr(col1, '(.*?)(,|$)', 1, column_value, null, 1) 
    FROM tbl, 
       TABLE(
        CAST(
        MULTISET(SELECT LEVEL 
           FROM dual 
           CONNECT BY LEVEL <= REGEXP_COUNT(col1, ',')+1 
          ) AS sys.OdciNumberList 
       ) 
       ) 
     order by rownbr, substring_nbr; 

    ROWNBR SUBSTRING_NBR REGEXP_SUBSTR(COL 
---------- ------------- ----------------- 
     1    1 Hello 
     1    2 world 
     1    3 ! 
     2    1 Haloa 
     2    2 ! 
     3    1 Have 
     3    2 a 
     3    3 nice 
     3    4 day 
     3    5 ! 

10 rows selected. 

SQL>