2009-09-09 64 views
2

Splitting rows on date range intersections in SQL

I have a SQL Server 2005 database that contains a table named Membership.

The table schema is:

PersonID int, Surname nvarchar(30), FirstName nvarchar(30), Description nvarchar(100), StartDate datetime, EndDate datetime

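I'm currently working on a grid feature that shows a breakdown of memberships by person. One of the requirements is to split membership rows wherever their date ranges intersect. The intersection must be bound by Surname and FirstName, i.e. splits only occur among membership records that share the same Surname and FirstName.

Sample data in the table: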

18 Smith John Poker Club 01/01/2009 NULL 
18 Smith John Library  05/01/2009 18/01/2009 
18 Smith John Gym   10/01/2009 28/01/2009 
26 Adams Jane Pilates  03/01/2009 16/02/2009

Expected result set:

18 Smith John Poker Club     01/01/2009 04/01/2009 
18 Smith John Poker Club/Library  05/01/2009 09/01/2009 
18 Smith John Poker Club/Library/Gym 10/01/2009 18/01/2009 
18 Smith John Poker Club/Gym   19/01/2009 28/01/2009 
18 Smith John Poker Club     29/01/2009 NULL 
26 Adams Jane Pilates      03/01/2009 16/02/2009

Does anyone have any idea how I could write a stored procedure that returns a result set with the breakdown described above?

+0

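How does your design handle multiple members with the same surname/first name? The sample data you've provided could refer to three different people called John Smith, which isn't beyond the realms of possibility. – 2009-09-09 07:06:58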

+0

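That's a valid point; I've edited my question to reflect that possibility. I do store an ID for each person, but I wasn't thinking about duplicate names when I wrote the question. Cheers for the feedback. – user168369 2009-09-09 07:42:39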

+0

There is a PersonID - I'm completely ignoring the name part until the final output select – MartW 2009-09-09 07:43:35

Answers

2

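The problem you will run into is that a T-SQL solution to this will not scale well as the data set grows. The code below uses a series of temporary tables to solve the problem. It uses a numbers table to split each date range entry into its individual days. This is where it won't scale, mainly because your open-ended ranges (NULL EndDate) are effectively infinite, so you have to substitute a fixed date in the future to limit the expansion to a feasible length of time. You would probably see better performance from a dates or calendar table with one row per day and appropriate indexing.

Once the ranges are split, the descriptions are merged using XML PATH so that each day in a range series lists all of the descriptions that apply to it. Row numbering by PersonID and date allows the first and last row of each range to be found with two NOT EXISTS checks: one finds rows for which no previous row exists with a matching PersonID and Descriptions set, and the other finds rows for which no next row exists with a matching PersonID and Descriptions set.

This result set is then renumbered with ROW_NUMBER so that the rows can be paired up to build the final result.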
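As a rough sketch of the calendar-table idea: the table name dbo.Calendar and the 2008 to 2012 window are assumptions chosen to match the range used in this answer, and the script relies on master..spt_values (type 'P') holding at least the integers 0 to 1826, as the answer's own code already does. The 'yyyymmdd' literals are interpreted the same way regardless of the session's DATEFORMAT setting.

```sql
-- Hypothetical calendar table: one row per day, clustered on the date.
CREATE TABLE dbo.Calendar
(theDate datetime NOT NULL PRIMARY KEY CLUSTERED);

-- Populate 01 Jan 2008 through 31 Dec 2012 (assumed window).
-- master..spt_values is undocumented; it is used here only as a
-- one-off source of consecutive integers.
INSERT INTO dbo.Calendar (theDate)
SELECT DATEADD(dd, number, '20080101')
FROM master..spt_values
WHERE type = N'P'
  AND number <= DATEDIFF(dd, '20080101', '20121231');

-- Expand each membership range into days by joining on the indexed column.
-- Open-ended ranges are capped at the last day in the calendar window.
SELECT s.PersonID, s.Description, c.theDate
FROM Schedule AS s
JOIN dbo.Calendar AS c
  ON c.theDate >= s.StartDate
 AND c.theDate <= ISNULL(s.EndDate, '20121231');
```

Because the calendar is a permanent, indexed table, the per-day expansion becomes a range join instead of recomputing DATEADD over the numbers source on every run.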

/* 
SET DATEFORMAT dmy 
USE tempdb; 
GO 
CREATE TABLE Schedule 
(PersonID int, 
Surname nvarchar(30), 
FirstName nvarchar(30), 
Description nvarchar(100), 
StartDate datetime, 
EndDate datetime) 
GO 
INSERT INTO Schedule VALUES (18, 'Smith', 'John', 'Poker Club', '01/01/2009', NULL) 
INSERT INTO Schedule VALUES (18, 'Smith', 'John', 'Library', '05/01/2009', '18/01/2009') 
INSERT INTO Schedule VALUES (18, 'Smith', 'John', 'Gym', '10/01/2009', '28/01/2009') 
INSERT INTO Schedule VALUES (26, 'Adams', 'Jane', 'Pilates', '03/01/2009', '16/02/2009') 
GO 

*/ 

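-- Expand every membership range into one row per day.
-- Open-ended ranges (NULL EndDate) are capped at 31 Dec 2012.
-- 'yyyymmdd' literals do not depend on the session DATEFORMAT.
SELECT 
PersonID, 
Description, 
theDate 
INTO #SplitRanges 
FROM Schedule, (SELECT DATEADD(dd, number, '20080101') AS theDate 
    FROM master..spt_values 
    WHERE type = N'P') AS DayTab 
WHERE theDate >= StartDate 
    AND theDate <= isnull(EndDate, '20121231') 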

-- Merge the descriptions active on each day into one '/'-separated string.
SELECT 
ROW_NUMBER() OVER (ORDER BY PersonID, theDate) AS rowid, 
PersonID, 
theDate, 
STUFF((
    SELECT '/' + Description 
    FROM #SplitRanges AS s 
    WHERE s.PersonID = sr.PersonID 
    AND s.theDate = sr.theDate 
    FOR XML PATH('') 
), 1, 1,'') AS Descriptions 
INTO #MergedDescriptions 
FROM #SplitRanges AS sr 
GROUP BY PersonID, theDate 


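-- Keep only the first and last day of each run of identical descriptions,
-- then renumber the surviving rows.
SELECT 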
ROW_NUMBER() OVER (ORDER BY PersonID, theDate) AS ID, 
* 
INTO #InterimResults 
FROM 
(
SELECT * 
FROM #MergedDescriptions AS t1 
WHERE NOT EXISTS 
    (SELECT 1 
    FROM #MergedDescriptions AS t2 
    WHERE t1.PersonID = t2.PersonID 
    AND t1.RowID - 1 = t2.RowID 
    AND t1.Descriptions = t2.Descriptions) 
UNION ALL 
SELECT * 
FROM #MergedDescriptions AS t1 
WHERE NOT EXISTS 
    (SELECT 1 
    FROM #MergedDescriptions AS t2 
    WHERE t1.PersonID = t2.PersonID 
    AND t1.RowID = t2.RowID - 1 
    AND t1.Descriptions = t2.Descriptions) 
) AS t 

-- Stand-in for a normalized person table.
SELECT DISTINCT 
PersonID, 
Surname, 
FirstName 
INTO #DistinctPerson 
FROM Schedule 

-- Pair each range's first row with its last row and attach the names.
SELECT 
t1.PersonID, 
dp.Surname, 
dp.FirstName, 
t1.Descriptions, 
t1.theDate AS StartDate, 
CASE 
    WHEN t2.theDate = '20121231' THEN NULL 
    ELSE t2.theDate 
END AS EndDate 
FROM #DistinctPerson AS dp 
JOIN #InterimResults AS t1 
ON t1.PersonID = dp.PersonID 
JOIN #InterimResults AS t2 
ON t2.PersonID = t1.PersonID 
    AND t1.ID + 1 = t2.ID 
    AND t1.Descriptions = t2.Descriptions 

DROP TABLE #SplitRanges 
DROP TABLE #MergedDescriptions 
DROP TABLE #DistinctPerson 
DROP TABLE #InterimResults 

/* 

DROP TABLE Schedule 

*/ 

The above solution will also handle gaps between additional descriptions, so if you were to add another description for PersonID 18 that leaves a gap:

INSERT INTO Schedule VALUES (18, 'Smith', 'John', 'Gym', '10/02/2009', '28/02/2009') 

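It will fill the gap appropriately. As pointed out in the comments, you shouldn't have name information in this table; it should be normalized into a person table that can be joined to the final result. I simulated that other table by using SELECT DISTINCT to build a temporary table for the JOIN.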
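A minimal sketch of what that normalization might look like. The table names dbo.Person and dbo.PersonMembership are hypothetical, chosen only for illustration; the column types follow the schema in the question.

```sql
-- Hypothetical normalized layout: names are stored once per person.
CREATE TABLE dbo.Person
(PersonID int NOT NULL PRIMARY KEY,
 Surname nvarchar(30) NOT NULL,
 FirstName nvarchar(30) NOT NULL);

CREATE TABLE dbo.PersonMembership
(PersonID int NOT NULL REFERENCES dbo.Person (PersonID),
 Description nvarchar(100) NOT NULL,
 StartDate datetime NOT NULL,
 EndDate datetime NULL);  -- NULL means the membership is still open

-- Names are joined in only at the final output step.
SELECT p.PersonID, p.Surname, p.FirstName,
       m.Description, m.StartDate, m.EndDate
FROM dbo.PersonMembership AS m
JOIN dbo.Person AS p
  ON p.PersonID = m.PersonID
ORDER BY p.PersonID, m.StartDate;
```

With a layout like this, the #DistinctPerson temporary table in the script above would no longer be needed; the final SELECT could join to the person table directly.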

1

Try this

SET DATEFORMAT dmy 
DECLARE @Membership TABLE( 
    PersonID int, 
    Surname  nvarchar(16), 
    FirstName nvarchar(16), 
    Description nvarchar(16), 
    StartDate datetime, 
    EndDate  datetime) 
INSERT INTO @Membership VALUES (18, 'Smith', 'John', 'Poker Club', '01/01/2009', NULL) 
INSERT INTO @Membership VALUES (18, 'Smith', 'John','Library', '05/01/2009', '18/01/2009') 
INSERT INTO @Membership VALUES (18, 'Smith', 'John','Gym', '10/01/2009', '28/01/2009') 
INSERT INTO @Membership VALUES (26, 'Adams', 'Jane','Pilates', '03/01/2009', '16/02/2009') 

--Program Starts 
declare @enddate datetime 
--Handle the extreme case where every EndDate is NULL (i.e. all memberships for all members are still in progress):
-- in that case use an arbitrary date, e.g. '31/12/2009'; otherwise use one day past the highest EndDate 
select @enddate = case when max(enddate) is null then '31/12/2009' else max(enddate) + 1 end from @Membership 

--Fill the null enddates 
; with fillNullEndDates_cte as 
(
    select 
      row_number() over(partition by PersonId order by PersonId) RowNum 
      ,PersonId 
      ,Surname 
      ,FirstName 
      ,Description 
      ,StartDate 
      ,isnull(EndDate,@enddate) EndDate 
    from @Membership 
) 
--Generate a date calender 
, generateCalender_cte as 
(
    select 
     1 as CalenderRows 
     ,min(startdate) DateValue 
    from @Membership 
     union all 
     select 
      CalenderRows+1 
      ,DateValue + 1 
     from generateCalender_cte 
     where DateValue + 1 <= @enddate 
) 
--Generate Missing Dates based on Membership 
,datesBasedOnMemberships_cte as 
(
    select 
      t.RowNum 
      ,t.PersonId 
      ,t.Surname 
      ,t.FirstName 
      ,t.Description   
      , d.DateValue 
      ,d.CalenderRows 
    from generateCalender_cte d 
    join fillNullEndDates_cte t ON d.DateValue between t.startdate and t.enddate 
) 
--Generate Description Based On Membership Dates 
, descriptionBasedOnMembershipDates_cte as 
(
    select  
     PersonID 
     ,Surname 
     ,FirstName 
     ,stuff((
      select '/' + Description 
      from datesBasedOnMemberships_cte d1 
      where d1.PersonID = d2.PersonID 
      and d1.DateValue = d2.DateValue 
      for xml path('') 
     ), 1, 1,'') as Description 
     , DateValue 
     ,CalenderRows 
    from datesBasedOnMemberships_cte d2 
    group by PersonID, Surname,FirstName,DateValue,CalenderRows 
) 
--Grouping based on membership dates 
,groupByMembershipDates_cte as 
(
    select d.*, 
    CalenderRows - row_number() over(partition by Description order by PersonID, DateValue) AS [Group] 
    from descriptionBasedOnMembershipDates_cte d 
) 
select PersonId 
,Surname 
,FirstName 
,Description 
,convert(varchar(10), convert(datetime, min(DateValue)), 103) as StartDate 
,case when max(DateValue)= @enddate then null else convert(varchar(10), convert(datetime, max(DateValue)), 103) end as EndDate 
from groupByMembershipDates_cte 
group by [Group],PersonId,Surname,FirstName,Description 
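-- Sort on the underlying datetime: the StartDate alias is a dd/mm/yyyy string
-- and would sort incorrectly across months.
order by PersonId, min(DateValue) 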
option(maxrecursion 0) 
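
The grouping step in the query above appears to rely on the common "gaps and islands" trick: within a run of consecutive days, the calendar row number and a per-description ROW_NUMBER increase in step, so their difference stays constant, and after a gap the difference changes. A minimal sketch on made-up data (the @Days table variable and its values are purely illustrative):

```sql
-- Illustrative data: calendar rows 1-3 and 7-8 are occupied, 4-6 are a gap.
DECLARE @Days TABLE (CalendarRow int, Description nvarchar(16));
INSERT INTO @Days VALUES (1, 'Gym');
INSERT INTO @Days VALUES (2, 'Gym');
INSERT INTO @Days VALUES (3, 'Gym');
INSERT INTO @Days VALUES (7, 'Gym');
INSERT INTO @Days VALUES (8, 'Gym');

-- CalendarRow minus the running row number is constant inside each
-- consecutive run (0 for rows 1-3, 3 for rows 7-8), so it serves as a group key.
SELECT Description,
       MIN(CalendarRow) AS FirstRow,
       MAX(CalendarRow) AS LastRow
FROM (
    SELECT CalendarRow,
           Description,
           CalendarRow
             - ROW_NUMBER() OVER (PARTITION BY Description
                                  ORDER BY CalendarRow) AS Grp
    FROM @Days
) AS d
GROUP BY Description, Grp
ORDER BY FirstRow;
-- Returns two rows: (Gym, 1, 3) and (Gym, 7, 8)
```

The same idea carries over to the real query, where CalenderRows plays the role of CalendarRow and the merged Description string identifies each run.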
0

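[Only many, many years later.]

I created a stored procedure that aligns and breaks segments within partitions of a single table. You can then use those aligned breaks to pivot the descriptions into a ragged column using a subquery and XML PATH.

See if the following helps:

  1. Documentation: https://github.com/Quebe/SQL-Algorithms/blob/master/Temporal/Date%20Segment%20Manipulation/DateSegments_AlignWithinTable.md

  2. Stored procedure: https://github.com/Quebe/SQL-Algorithms/blob/master/Temporal/Date%20Segment%20Manipulation/DateSegments_AlignWithinTable.sql

For example, your call might look like: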

EXEC dbo.DateSegments_AlignWithinTable 
@tableName = 'tableName', 
@keyFieldList = 'PersonID', 
@nonKeyFieldList = 'Description', 
@effectivveDateFieldName = 'StartDate', 
@terminationDateFieldName = 'EndDate' 

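You'll want to capture the result (which is a table) into another table or temporary table (assume it is called "AlignedDataTable" in the example below). You can then pivot using a subquery.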

SELECT 
    PersonID, StartDate, EndDate, 

    SUBSTRING ((SELECT ',' + [Description] FROM AlignedDataTable AS innerTable 
     WHERE 
      innerTable.PersonID = AlignedDataTable.PersonID 
      AND (innerTable.StartDate = AlignedDataTable.StartDate) 
      AND (innerTable.EndDate = AlignedDataTable.EndDate) 
     ORDER BY id 
     FOR XML PATH ('')), 2, 999999999999999) AS IdList 

FROM AlignedDataTable 
GROUP BY PersonID, StartDate, EndDate 
ORDER BY PersonID, StartDate