2016-09-23 64 views
3

SQL count consecutive rows

I have the following data in a table:

|event_id |starttime  |person_id|attended| 
|------------|-----------------|---------|--------| 
| 11512997-1 | 01-SEP-16 08:00 | 10001 | N  | 
| 11512997-2 | 01-SEP-16 10:00 | 10001 | N  | 
| 11512997-3 | 01-SEP-16 12:00 | 10001 | N  | 
| 11512997-4 | 01-SEP-16 14:00 | 10001 | N  | 
| 11512997-5 | 01-SEP-16 16:00 | 10001 | N  | 
| 11512997-6 | 01-SEP-16 18:00 | 10001 | Y  | 
| 11512997-7 | 02-SEP-16 08:00 | 10001 | N  | 
| 11512997-1 | 01-SEP-16 08:00 | 10002 | N  | 
| 11512997-2 | 01-SEP-16 10:00 | 10002 | N  | 
| 11512997-3 | 01-SEP-16 12:00 | 10002 | N  | 
| 11512997-4 | 01-SEP-16 14:00 | 10002 | Y  | 
| 11512997-5 | 01-SEP-16 16:00 | 10002 | N  | 
| 11512997-6 | 01-SEP-16 18:00 | 10002 | Y  | 
| 11512997-7 | 02-SEP-16 08:00 | 10002 | Y  | 

I would like to produce the following result, which returns, for each person, the maximum number of consecutive rows where attended = 'N':

|person_id|consec_missed_max| 
| 10001 | 5    | 
| 10002 | 3    | 

How can this be done in Oracle (or ANSI) SQL? Thanks!

Edit:

So far I have tried:

WITH t1 AS 
(SELECT t.person_id, 
    row_number() over(PARTITION BY t.person_id ORDER BY t.starttime) AS idx 
    FROM the_table t 
    WHERE t.attended = 'N'), 
t2 AS 
(SELECT person_id, MAX(idx) max_idx FROM t1 GROUP BY person_id) 
SELECT t1.person_id, COUNT(1) ct 
    FROM t1 
    JOIN t2 
    ON t1.person_id = t2.person_id 
GROUP BY t1.person_id; 
+0

Just added what I have tried so far. When it comes to using analytic functions, I am still not entirely sure how to go about it. – ubersnack

Answers

6

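Most of the work is done in the factored subquery "prep". You seem to be somewhat familiar with analytic functions, but that alone is not enough. The solution uses the so-called "tabibitosan" method to create groups of consecutive rows that share the same characteristic in one or more dimensions; in this case, you want each sequence of consecutive N rows to land in its own group. This is done by taking the difference of two ROW_NUMBER() calls: one partitioned by person only, the other partitioned by person and attended. Within an unbroken run of the same attended value both row numbers increase by one per row, so their difference stays constant; it changes as soon as the run is interrupted. Google "tabibitosan" if you want to read more about the idea.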

with 
    inputs (event_id, starttime, person_id, attended) as (
     select '11512997-1', to_date('01-SEP-16 08:00', 'dd-MON-yy hh24:mi'), 10001, 'N' from dual union all 
     select '11512997-2', to_date('01-SEP-16 10:00', 'dd-MON-yy hh24:mi'), 10001, 'N' from dual union all  
     select '11512997-3', to_date('01-SEP-16 12:00', 'dd-MON-yy hh24:mi'), 10001, 'N' from dual union all 
     select '11512997-4', to_date('01-SEP-16 14:00', 'dd-MON-yy hh24:mi'), 10001, 'N' from dual union all 
     select '11512997-5', to_date('01-SEP-16 16:00', 'dd-MON-yy hh24:mi'), 10001, 'N' from dual union all 
     select '11512997-6', to_date('01-SEP-16 18:00', 'dd-MON-yy hh24:mi'), 10001, 'Y' from dual union all 
     select '11512997-7', to_date('02-SEP-16 08:00', 'dd-MON-yy hh24:mi'), 10001, 'N' from dual union all 
     select '11512997-1', to_date('01-SEP-16 08:00', 'dd-MON-yy hh24:mi'), 10002, 'N' from dual union all 
     select '11512997-2', to_date('01-SEP-16 10:00', 'dd-MON-yy hh24:mi'), 10002, 'N' from dual union all 
     select '11512997-3', to_date('01-SEP-16 12:00', 'dd-MON-yy hh24:mi'), 10002, 'N' from dual union all 
     select '11512997-4', to_date('01-SEP-16 14:00', 'dd-MON-yy hh24:mi'), 10002, 'Y' from dual union all 
     select '11512997-5', to_date('01-SEP-16 16:00', 'dd-MON-yy hh24:mi'), 10002, 'N' from dual union all 
     select '11512997-6', to_date('01-SEP-16 18:00', 'dd-MON-yy hh24:mi'), 10002, 'Y' from dual union all 
     select '11512997-7', to_date('02-SEP-16 08:00', 'dd-MON-yy hh24:mi'), 10002, 'Y' from dual 
    ), 
     prep (starttime, person_id, attended, gp) as (
     select starttime, person_id, attended, 
       row_number() over (partition by person_id order by starttime) - 
        row_number() over (partition by person_id, attended 
             order by starttime) 
     from inputs 
    ), 
     counts (person_id, consecutive_absences) as (
     select person_id, count(*) 
     from prep 
     where attended = 'N' 
     group by person_id, gp 
    ) 
select person_id, max(consecutive_absences) as max_consecutive_absences 
from counts 
group by person_id 
order by person_id; 

OUTPUT:

PERSON_ID    MAX_CONSECUTIVE_ABSENCES 
---------- --------------------------------------- 
    10001          5 
    10002          3 
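To see why the difference of the two row numbers identifies the runs, it can help to look at the intermediate values. The following is only a sketch, not part of the answer above: it runs the same window-function expression through Python's built-in sqlite3 module (this assumes the bundled SQLite is 3.25 or later, which is when window functions were added) on the rows for person 10002. The column aliases rn_all and rn_att are names chosen here for illustration, and the start times are rewritten as ISO text so that they sort correctly as strings.

```python
import sqlite3

# Rows for person 10002 only; starttime is ISO text so string order = time order.
rows = [
    ("2016-09-01 08:00", 10002, "N"),
    ("2016-09-01 10:00", 10002, "N"),
    ("2016-09-01 12:00", 10002, "N"),
    ("2016-09-01 14:00", 10002, "Y"),
    ("2016-09-01 16:00", 10002, "N"),
    ("2016-09-01 18:00", 10002, "Y"),
    ("2016-09-02 08:00", 10002, "Y"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE inputs (starttime TEXT, person_id INTEGER, attended TEXT)"
)
conn.executemany("INSERT INTO inputs VALUES (?, ?, ?)", rows)

# rn_all numbers every row of the person; rn_att numbers rows per attended value.
# Their difference (gp) is constant inside each unbroken run.
query = """
SELECT starttime, attended,
       ROW_NUMBER() OVER (PARTITION BY person_id ORDER BY starttime) AS rn_all,
       ROW_NUMBER() OVER (PARTITION BY person_id, attended
                          ORDER BY starttime) AS rn_att,
       ROW_NUMBER() OVER (PARTITION BY person_id ORDER BY starttime)
     - ROW_NUMBER() OVER (PARTITION BY person_id, attended
                          ORDER BY starttime) AS gp
FROM inputs
ORDER BY starttime
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

The three leading N rows all get gp = 0 and the isolated N row gets gp = 1, so grouping the N rows by (person_id, gp) and counting yields runs of length 3 and 1. Note that gp is only meaningful together with the attended value, which is why the query above filters on attended = 'N' before grouping.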
+0

Works perfectly, thank you! – ubersnack

0

If you are using Oracle 12c, you can use MATCH_RECOGNIZE.

Data:

CREATE TABLE data AS 
SELECT * 
FROM (
with inputs (event_id, starttime, person_id, attended) as (
    select '11512997-1', to_date('01-SEP-16 08:00', 'dd-MON-yy hh24:mi'), 10001, 'N' from dual union all 
    select '11512997-2', to_date('01-SEP-16 10:00', 'dd-MON-yy hh24:mi'), 10001, 'N' from dual union all  
    select '11512997-3', to_date('01-SEP-16 12:00', 'dd-MON-yy hh24:mi'), 10001, 'N' from dual union all 
    select '11512997-4', to_date('01-SEP-16 14:00', 'dd-MON-yy hh24:mi'), 10001, 'N' from dual union all 
    select '11512997-5', to_date('01-SEP-16 16:00', 'dd-MON-yy hh24:mi'), 10001, 'N' from dual union all 
    select '11512997-6', to_date('01-SEP-16 18:00', 'dd-MON-yy hh24:mi'), 10001, 'Y' from dual union all 
    select '11512997-7', to_date('02-SEP-16 08:00', 'dd-MON-yy hh24:mi'), 10001, 'N' from dual union all 
    select '11512997-1', to_date('01-SEP-16 08:00', 'dd-MON-yy hh24:mi'), 10002, 'N' from dual union all 
    select '11512997-2', to_date('01-SEP-16 10:00', 'dd-MON-yy hh24:mi'), 10002, 'N' from dual union all 
    select '11512997-3', to_date('01-SEP-16 12:00', 'dd-MON-yy hh24:mi'), 10002, 'N' from dual union all 
    select '11512997-4', to_date('01-SEP-16 14:00', 'dd-MON-yy hh24:mi'), 10002, 'Y' from dual union all 
    select '11512997-5', to_date('01-SEP-16 16:00', 'dd-MON-yy hh24:mi'), 10002, 'N' from dual union all 
    select '11512997-6', to_date('01-SEP-16 18:00', 'dd-MON-yy hh24:mi'), 10002, 'Y' from dual union all 
    select '11512997-7', to_date('02-SEP-16 08:00', 'dd-MON-yy hh24:mi'), 10002, 'Y' from dual 
    ) 
SELECT * FROM inputs 
); 

And the query:

SELECT PERSON_ID, MAX(LEN) AS MAX_ABSENCES_IN_ROW 
FROM data 
MATCH_RECOGNIZE (
    PARTITION BY PERSON_ID 
    ORDER BY STARTTIME 
    MEASURES FINAL COUNT(*) AS len 
    ALL ROWS PER MATCH 
    PATTERN(a b*) 
    DEFINE b AS attended = a.attended 
) 
WHERE attended = 'N' 
GROUP BY PERSON_ID; 

Output:

"PERSON_ID","MAX_ABSENCES_IN_ROW" 
10001,5 
10002,3 
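As a rough illustration of what the FINAL keyword changes when ALL ROWS PER MATCH is used, the sketch below mimics the two counting modes in plain Python for person 10002. It is not Oracle code: each run of equal attended values stands in for one match of the pattern (a b*), and the list names running and final are chosen here for illustration.

```python
from itertools import groupby

# attended flags for person 10002, already ordered by starttime
attended = list("NNNYNYY")

running, final = [], []
for _, run in groupby(attended):      # one run of equal values ~ one match of (a b*)
    n = len(list(run))
    running.extend(range(1, n + 1))   # running count: rows seen so far in the match
    final.extend([n] * n)             # final count: size of the whole match, on every row

print(running)
print(final)
```

With the final count, every row of a match carries the full length of its run, which is the value the outer MAX(LEN) then aggregates after the WHERE clause keeps only the 'N' rows.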

Edit:

As @mathguy pointed out, it can be rewritten as:

SELECT PERSON_ID, MAX(LEN) AS MAX_ABSENCES_IN_ROW 
FROM data 
MATCH_RECOGNIZE (
    PARTITION BY PERSON_ID 
    ORDER BY STARTTIME 
    MEASURES COUNT(*) AS len 
    PATTERN(a+) 
    DEFINE a AS attended = 'N' 
) 
GROUP BY PERSON_ID; 
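For readers without access to Oracle 12c, the logic of the query above (one output row per maximal run of 'N' rows, then the maximum per person) can be sketched procedurally. This is a plain-Python analogy under the assumption that the rows are already sorted by person_id and starttime; the function name max_absence_runs is hypothetical.

```python
from itertools import groupby

# (person_id, attended) pairs, already ordered by person_id and starttime
data = [(10001, a) for a in "NNNNNYN"] + [(10002, a) for a in "NNNYNYY"]


def max_absence_runs(rows):
    """Return {person_id: length of the longest run of consecutive 'N' rows}."""
    longest = {}
    # Grouping on the whole (person_id, attended) pair starts a new run
    # whenever either the person or the attended flag changes.
    for (person, attended), run in groupby(rows):
        if attended == "N":
            length = sum(1 for _ in run)
            longest[person] = max(longest.get(person, 0), length)
    return longest


result = max_absence_runs(data)
print(result)
```

As with the SQL versions, a person who has no 'N' rows at all simply does not appear in the result.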
+0

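Too complicated. You don't need "ALL ROWS PER MATCH". Just return one row per match, containing the "COUNT" for that match. Then there should be no "WHERE" clause. Instead, "PATTERN" should be "a+" and the "DEFINE" clause should be "DEFINE a AS attended = 'N'". This will be a more efficient solution (as comparing the plans shows). – mathguy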

+1

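I edited to remove the word "FINAL" in front of "COUNT(*)". When you return "ALL ROWS PER MATCH", "count()" behaves like an analytic function unless you qualify it with "final". But when you return **one** row per match (the default), there is no distinction between "running" and "final". Oracle does not throw a syntax error when you use "final (something)" with "ONE ROW PER MATCH", but "final" does not belong there. – mathguy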