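Finding clusters of time intervals

I have a table with multiple entries. Each entry consists of a start datetime and an end datetime.

I want to find clusters of entries in the following way:

If an entry starts before the previous entry ends, both are part of the same cluster. This is a kind of overlap problem.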

Example:

id   start                end
1    2007-04-11 15:34:02  2007-05-11 13:09:01
2    2007-06-13 15:42:39  2009-07-21 11:30:00
3    2007-11-26 14:30:02  2007-12-11 14:09:07
4    2008-02-14 08:52:11  2010-02-23 16:00:00

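What I want as output:

id   start                end
1    2007-04-11 15:34:02  2007-05-11 13:09:01
2-4  2007-06-13 15:42:39  2010-02-23 16:00:00

I had a solution that sorted by start and then did some calculations with ROW_NUMBER and LAG/LEAD and so on. The problem is the special case of row 4: it overlaps row 2 but does not come directly after it (row 3 sits in between), so my approach does not recognize it...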
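The special case can be reproduced outside of SQL. The following is a minimal Python sketch, not the asker's actual code: the function names `cluster_by_previous_row` and `cluster_by_running_max` are hypothetical, and a new cluster is assumed to start whenever a row's start is not strictly before the end it is compared against. The first function mimics a LAG-style comparison with the previous row only; the second keeps the latest end seen so far in the current cluster.

```python
from datetime import datetime

# Sample rows from the question: (id, start, end)
ROWS = [
    (1, datetime(2007, 4, 11, 15, 34, 2), datetime(2007, 5, 11, 13, 9, 1)),
    (2, datetime(2007, 6, 13, 15, 42, 39), datetime(2009, 7, 21, 11, 30, 0)),
    (3, datetime(2007, 11, 26, 14, 30, 2), datetime(2007, 12, 11, 14, 9, 7)),
    (4, datetime(2008, 2, 14, 8, 52, 11), datetime(2010, 2, 23, 16, 0, 0)),
]


def cluster_by_previous_row(rows):
    """LAG-style check: compare each start only with the previous row's end."""
    clusters = []
    prev_end = None
    for row_id, start, end in sorted(rows, key=lambda r: r[1]):
        if prev_end is None or start >= prev_end:
            clusters.append([row_id])      # no overlap with previous row: new cluster
        else:
            clusters[-1].append(row_id)
        prev_end = end                     # only the immediately preceding row is remembered
    return clusters


def cluster_by_running_max(rows):
    """Compare each start with the latest end seen so far in the current cluster."""
    clusters = []
    max_end = None
    for row_id, start, end in sorted(rows, key=lambda r: r[1]):
        if max_end is None or start >= max_end:
            clusters.append([row_id])      # nothing seen so far reaches this start
            max_end = end
        else:
            clusters[-1].append(row_id)
            max_end = max(max_end, end)    # a nested row must not shrink the cluster end
    return clusters


print(cluster_by_previous_row(ROWS))   # [[1], [2, 3], [4]]
print(cluster_by_running_max(ROWS))    # [[1], [2, 3, 4]]
```

Row 3 ends before row 4 starts, so a comparison against the previous row alone splits row 4 off, while the running maximum still carries row 2's end.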

Is there a good solution for this in SQL? Maybe I am missing something?

Source

2015-04-17 tuxmania

This question has been solved on Stack Overflow before.

OK, here is a solution with a recursive CTE:

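-- Sample data: s = start, e = end (dates only, for brevity)
create table t(id int, s date, e date);

insert into t values
(1, '20070411', '20070511'),
(2, '20070613', '20090721'),
(3, '20071126', '20071211'),
(4, '20080214', '20100223');

with cte as (
    -- Anchor: rows whose start does not fall inside any other row; each opens a cluster
    select id, s, e, id as rid, s as rs, e as re
    from t
    where not exists (select * from t ti where t.s > ti.s and t.s < ti.e)

    union all

    -- Recursive step: attach rows that start inside a row already in the cluster,
    -- carrying the cluster id (rid), its start (rs) and the larger end (re)
    select t.*,
           c.rid,
           c.rs,
           case when t.e > c.re then t.e else c.re end
    from t
    join cte c on t.s > c.s and t.s < c.e
)
select min(id) minid, max(id) maxid, min(rs) startdate, max(re) enddate
from cte
group by rid;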

Output:

minid  maxid  startdate   enddate
1      1      2007-04-11  2007-05-11
2      4      2007-06-13  2010-02-23

Fiddle: http://sqlfiddle.com/#!6/2d6d3/10

Source

2015-04-17 11:49:20

Works. It is not very efficient, but it works :). I was hoping it could be done with window functions... but never mind. – tuxmania

Try...

select a.id ,a.start,a.end,b.id,b.start,b.end 
from tab a 
cross join tab b 
where a.start between b.start and b.end 
order by a.start, a.end

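We have to check every row against all other rows, just like using a loop with an inner loop. For this we do a cross join.

Then we use the BETWEEN ... AND operator to keep only the pairs where one row's start falls within the other row's interval.

Source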

2015-04-17 10:19:10

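Adding some explanation might help the OP. – deW1

Doesn't work, sorry. – tuxmania

@tuxmania please explain why it doesn't work.

To answer this question, check for overlaps to determine which rows start a new group. Then, for each row, count the number of such group starts up to and including that row's start to define a group, and aggregate by this value.

Assuming you have no duplicate times, this should set the flag: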

-- [start] and [end] are bracketed because END is a reserved word in T-SQL
select e.*,
       (case when not exists (select 1
                              from entries e2
                              where e2.[start] < e.[start] and e2.[end] > e.[start]
                             )
             then 1 else 0
        end) as BeginsIsland
from entries e;

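The following then does the cumulative sum and the aggregation, assuming SQL Server 2012+ (this can easily be adapted to earlier versions, but this is the simpler code):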

with e as (
      -- Flag = 1 when no other entry covers this entry's start, i.e. a new island begins
      select e.*,
             (case when not exists (select 1
                                    from entries e2
                                    where e2.[start] < e.[start] and e2.[end] > e.[start]
                                   )
                   then 1 else 0
              end) as BeginIslandFlag
      from entries e
     )
select (case when min(id) = max(id) then cast(max(id) as varchar(255))
             else cast(min(id) as varchar(255)) + '-' + cast(max(id) as varchar(255))
        end) as ids,
       min([start]) as [start], max([end]) as [end]
from (select e.*, sum(BeginIslandFlag) over (order by [start]) as grp  -- running total = group number
      from e
     ) e
group by grp;
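For readers who want to trace the three steps (flag, running total, aggregate) without a SQL Server instance, here is a minimal Python sketch of the same idea. It is an illustration under assumptions, not a translation guaranteed to match the query in every edge case: the name `merge_islands` is hypothetical, dates are truncated to days, and, as in the answer, start times are assumed to be unique.

```python
from datetime import date
from itertools import accumulate

# (id, start, end), dates only for brevity
entries = [
    (1, date(2007, 4, 11), date(2007, 5, 11)),
    (2, date(2007, 6, 13), date(2009, 7, 21)),
    (3, date(2007, 11, 26), date(2007, 12, 11)),
    (4, date(2008, 2, 14), date(2010, 2, 23)),
]


def merge_islands(rows):
    rows = sorted(rows, key=lambda r: r[1])
    # Step 1: flag is 1 when no other row covers this row's start (the NOT EXISTS test)
    flags = [
        0 if any(s2 < s and e2 > s for _, s2, e2 in rows) else 1
        for _, s, _ in rows
    ]
    # Step 2: running total of the flags gives the group number of each row
    groups = list(accumulate(flags))
    # Step 3: aggregate ids, earliest start and latest end per group
    merged = {}
    for (row_id, s, e), g in zip(rows, groups):
        ids, lo, hi = merged.get(g, ([], s, e))
        merged[g] = (ids + [row_id], min(lo, s), max(hi, e))
    result = []
    for ids, lo, hi in merged.values():
        label = str(min(ids)) if min(ids) == max(ids) else f"{min(ids)}-{max(ids)}"
        result.append((label, lo, hi))
    return result


for label, start, end in merge_islands(entries):
    print(label, start, end)
# 1 2007-04-11 2007-05-11
# 2-4 2007-06-13 2010-02-23
```

The flags for the sample rows are 1, 1, 0, 0, so the running totals are 1, 2, 2, 2 and rows 2 to 4 end up in the same group.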

Source

2015-04-17 10:38:45

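Gordon, I don't think this will work, because it will flag rows 1 and 2 as 0 and rows 3 and 4 as 1, won't it?

Sorry. – tuxmania

@tuxmania . . . It probably works better with 'not exists', which is the correct logic.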
