2017-09-13 83 views
1

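SQL Complex Transformation

I have the data below. What I want is that anyone who moved GP because of a change of address should have a start date and an end date for that period, where the end date is one day less than the next start date. How do I write this query?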

DECLARE @Tab TABLE(Local_Patient_Identifier VARCHAR(70),  
    NHS_Number VARCHAR(70), GMP VARCHAR(70), Practice_Code_GP VARCHAR(70), CDS_Date DATE) 
INSERT INTO @Tab VALUES 
('A111111111', '8BFD000', 'G111111', 'N77777', '2016-05-23'), 
('A111111111', '8BFD000', 'G222222', 'N77777', '2016-06-13'), 
('A111111111', '8BFD000', 'G222222', 'N77777', '2016-06-13'), 
('A111111111', '8BFD000', 'G3333333', 'ZZ44444', '2017-02-09'), 
('A111111111', '8BFD000', 'G3333333', 'ZZ44444', '2017-03-06'), 
('A111111111', '8BFD000', 'G3333333', 'ZZ44444', '2017-03-15'), 
('A111111111', '8BFD000', 'G3333333', 'ZZ44444', '2017-03-29'), 
('A111111111', '8BFD000', 'G3333333', 'ZZ44444', '2017-05-10'), 
('A111111112', '8BFD002', 'G3333332', 'JJ44444', '2015-05-21'), 
('A111111112', '8BFD002', 'G3333332', 'KK44445', '2016-05-02'), 
('A111111112', '8BFD002', 'G3333332', 'WW44444', '2017-02-13') 
SELECT * FROM @Tab 

Expected output: [image]

+0

What happens to rows 4 through 8? Why are those rows not part of the output? –

+0

@KannanKandasamy Each time a person changes GP, a new period starts. The end date for the current GP will be NULL. GP is the GMP column – JonWay

+0

So if `GMP` changes or `Practice_Code_GP` changes, that should trigger a new period? – Xedni

Answers

1

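You can use ROW_NUMBER to get this result:

-- Run in the same batch as the question's DECLARE @Tab / INSERT.
;WITH Cte AS (
    -- Number the surviving rows per patient so consecutive periods can be joined.
    SELECT *,
           JoinKey = ROW_NUMBER() OVER (PARTITION BY Local_Patient_Identifier ORDER BY CDS_Date)
    FROM (
        -- Keep only the earliest row for each patient / GMP / practice combination.
        SELECT *,
               RowN = ROW_NUMBER() OVER (PARTITION BY Local_Patient_Identifier, GMP, Practice_Code_GP ORDER BY CDS_Date)
        FROM @Tab
    ) a
    WHERE a.RowN = 1
)
SELECT c1.Local_Patient_Identifier, c1.NHS_Number, c1.GMP, c1.Practice_Code_GP,
       c1.CDS_Date AS StartDate,
       DATEADD(DAY, -1, c2.CDS_Date) AS EndDate  -- one day before the next period starts
FROM Cte c1
LEFT JOIN Cte c2
    ON c1.Local_Patient_Identifier = c2.Local_Patient_Identifier
   AND c1.JoinKey = c2.JoinKey - 1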

Output as below:

+--------------------------+------------+----------+------------------+------------+------------+
| Local_Patient_Identifier | NHS_Number | GMP      | Practice_Code_GP | StartDate  | EndDate    |
+--------------------------+------------+----------+------------------+------------+------------+
| A111111111               | 8BFD000    | G111111  | N77777           | 2016-05-23 | 2016-06-12 |
| A111111111               | 8BFD000    | G222222  | N77777           | 2016-06-13 | 2017-02-08 |
| A111111111               | 8BFD000    | G3333333 | ZZ44444          | 2017-02-09 | NULL       |
| A111111112               | 8BFD002    | G3333332 | JJ44444          | 2015-05-21 | 2016-05-01 |
| A111111112               | 8BFD002    | G3333332 | KK44445          | 2016-05-02 | 2017-02-12 |
| A111111112               | 8BFD002    | G3333332 | WW44444          | 2017-02-13 | NULL       |
+--------------------------+------------+----------+------------------+------------+------------+

If you are using SQL Server >= 2012, you can use the LEAD window function instead.
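On SQL Server 2012 or later, LEAD can pick up the next period's start date directly, so the self-join on the CTE is no longer needed. The following is a minimal sketch rather than the answerer's own code. It assumes the question's @Tab has been declared and populated in the same batch, and the CTE name FirstSeen is made up for illustration.

```sql
-- Sketch only: requires SQL Server 2012 or later for LEAD.
-- Assumes @Tab from the question exists in the same batch.
;WITH FirstSeen AS (
    SELECT *,
           RowN = ROW_NUMBER() OVER (
                      PARTITION BY Local_Patient_Identifier, GMP, Practice_Code_GP
                      ORDER BY CDS_Date)
    FROM @Tab
)
SELECT Local_Patient_Identifier,
       NHS_Number,
       GMP,
       Practice_Code_GP,
       StartDate = CDS_Date,
       -- LEAD returns NULL for the patient's latest period, and
       -- DATEADD on NULL stays NULL, so the current GP has no end date.
       EndDate = DATEADD(DAY, -1,
                     LEAD(CDS_Date) OVER (
                         PARTITION BY Local_Patient_Identifier
                         ORDER BY CDS_Date))
FROM FirstSeen
WHERE RowN = 1;
```

Because the WHERE filter is applied before LEAD is evaluated, LEAD only sees the first row of each patient/GMP/practice combination, and the CTE is referenced just once.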

+0

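As a general rule, you want to avoid referencing a CTE more than once. The reason is that the code in the CTE is executed each time it is referenced. It is better to "cache" the CTE results in a #TempTable or @TableVariable and then reference that multiple times... –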

+0

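I differ on this idea... again, it depends on the volume of data selected in the CTE... the reason is that a CTE is stored internally in memory without statistics... if it exceeds the memory limit, it spills to tempdb by default, still without statistics. If the volume is small compared with the maximum memory allocated, I would still prefer a CTE... if there is a large volume of records, then we can go for a temp table –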

+0

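You can differ all you want; it doesn't make it any less true. If the CTE is the most expensive part of the query (and in this case it is, because of the sort), you incur that cost every time the CTE is referenced. –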

1

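Here's how I went about it. Because I got tired of typing the long names, I renamed the columns when recreating the table, but I've aliased them on the way out so they are closer to what you expected. I also converted the dates to UK format (DD/MM/YYYY) to match your output.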

declare @tab table 
(
    LPI varchar(70),  
    NHSNum varchar(70), 
    GMP varchar(70), 
    GP varchar(70), 
    CDSDate date 
) 
insert into @Tab 
values 
    ('A111111111', '8BFD000', 'G111111', 'N77777', '2016-05-23'), 
    ('A111111111', '8BFD000', 'G222222', 'N77777', '2016-06-13'), 
    ('A111111111', '8BFD000', 'G222222', 'N77777', '2016-06-13'), 
    ('A111111111', '8BFD000', 'G3333333', 'ZZ44444', '2017-02-09'), 
    ('A111111111', '8BFD000', 'G3333333', 'ZZ44444', '2017-03-06'), 
    ('A111111111', '8BFD000', 'G3333333', 'ZZ44444', '2017-03-15'), 
    ('A111111111', '8BFD000', 'G3333333', 'ZZ44444', '2017-03-29'), 
    ('A111111111', '8BFD000', 'G3333333', 'ZZ44444', '2017-05-10'), 
    ('A111111112', '8BFD002', 'G3333332', 'JJ44444', '2015-05-21'), 
    ('A111111112', '8BFD002', 'G3333332', 'KK44445', '2016-05-02'), 
    ('A111111112', '8BFD002', 'G3333332', 'WW44444', '2017-02-13') 


;with src as 
(
    select 
     RID = row_number() over (partition by LPI, NHSNum order by min(CDSDate)),  
     LPI, 
     NHSNum, 
     GMP, 
     GP, 
     MinDate = min(CDSDate) 
    from @tab 
    group by  
     LPI, 
     NHSNum, 
     GMP, 
     GP 
) 
select 
    LocalPatientIdentifier = a.LPI, 
    NHSNumber = a.NHSNum, 
    GMP = a.GMP, 
    PracticeCodeGP = a.GP, 
    StartDate = convert(varchar(50), a.MinDate, 103), 
    EndDate = convert(varchar(50), dateadd(day, -1, b.MinDate), 103) 
from src a 
left outer join src b 
    on a.LPI = b.LPI 
     and a.NHSNum = b.NHSNum 
     and a.RID = b.RID - 1 
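
The GROUP BY above keeps only the earliest date for each patient/GMP/GP combination, so a patient who later returned to an earlier GP would not get a second period for it. The sample data has no such case, so this may not matter here. If it does, one option is to compare each row with the patient's previous row. This is a sketch under stated assumptions, not part of the original answer: it needs SQL Server 2012 or later for LAG and LEAD, it must run in the same batch as the @tab declaration above, it assumes GMP and GP are never NULL, and the names flagged, PrevGMP and PrevGP are made up for illustration.

```sql
-- Sketch only: SQL Server 2012+ (LAG/LEAD), same batch as @tab above.
-- Assumes GMP and GP are never NULL, so a NULL PrevGMP can only mean
-- "first row for this patient".
;with flagged as
(
    select
        LPI,
        NHSNum,
        GMP,
        GP,
        CDSDate,
        PrevGMP = lag(GMP) over (partition by LPI, NHSNum order by CDSDate),
        PrevGP  = lag(GP)  over (partition by LPI, NHSNum order by CDSDate)
    from @tab
)
select
    LocalPatientIdentifier = LPI,
    NHSNumber = NHSNum,
    GMP,
    PracticeCodeGP = GP,
    StartDate = CDSDate,
    -- LEAD runs after the WHERE filter, so it only sees rows that start a period.
    EndDate = dateadd(day, -1,
                  lead(CDSDate) over (partition by LPI, NHSNum order by CDSDate))
from flagged
where PrevGMP is null          -- first row for the patient
   or GMP <> PrevGMP           -- GMP changed
   or GP <> PrevGP;            -- practice changed
```

The dates here are left as date values; wrap them in convert(varchar(50), ..., 103) as in the query above if the UK format is wanted.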
1

The following should work on SQL Server 2008 R2...

IF OBJECT_ID('tempdb..#Tab', 'U') IS NOT NULL 
DROP TABLE #Tab; 

CREATE TABLE #Tab (
    Local_Patient_Identifier VARCHAR(70),  
    NHS_Number VARCHAR(70), 
    GMP VARCHAR(70), 
    Practice_Code_GP VARCHAR(70), 
    CDS_Date DATE 
    ); 
INSERT #Tab (Local_Patient_Identifier, NHS_Number, GMP, Practice_Code_GP, CDS_Date) VALUES 
    ('A111111111', '8BFD000', 'G111111', 'N77777', '2016-05-23'), 
    ('A111111111', '8BFD000', 'G222222', 'N77777', '2016-06-13'), 
    ('A111111111', '8BFD000', 'G222222', 'N77777', '2016-06-13'), 
    ('A111111111', '8BFD000', 'G3333333', 'ZZ44444', '2017-02-09'), 
    ('A111111111', '8BFD000', 'G3333333', 'ZZ44444', '2017-03-06'), 
    ('A111111111', '8BFD000', 'G3333333', 'ZZ44444', '2017-03-15'), 
    ('A111111111', '8BFD000', 'G3333333', 'ZZ44444', '2017-03-29'), 
    ('A111111111', '8BFD000', 'G3333333', 'ZZ44444', '2017-05-10'), 
    ('A111111112', '8BFD002', 'G3333332', 'JJ44444', '2015-05-21'), 
    ('A111111112', '8BFD002', 'G3333332', 'KK44445', '2016-05-02'), 
    ('A111111112', '8BFD002', 'G3333332', 'WW44444', '2017-02-13'); 

-- SELECT * FROM #Tab t 

--====================================================================== 

IF OBJECT_ID('tempdb..#ChangeData', 'U') IS NOT NULL 
DROP TABLE #ChangeData; 

WITH 
    cte_AddRN AS (
     SELECT 
      t.Local_Patient_Identifier, 
      t.NHS_Number, 
      t.GMP, 
      t.Practice_Code_GP, 
      t.CDS_Date, 
      RN = ROW_NUMBER() OVER (PARTITION BY t.Local_Patient_Identifier, t.GMP, t.Practice_Code_GP ORDER BY t.CDS_Date) 
     FROM 
      #Tab t 
     ) 
SELECT 
    ar.Local_Patient_Identifier, 
    ar.NHS_Number, 
    ar.GMP, 
    ar.Practice_Code_GP, 
    ar.CDS_Date, 
    RN = ROW_NUMBER() OVER (PARTITION BY ar.Local_Patient_Identifier ORDER BY ar.CDS_Date) 
    INTO #ChangeData 
FROM 
    cte_AddRN ar 
WHERE 
    ar.RN = 1; 

-- SELECT * FROM #ChangeData cd 

SELECT 
    cd1.Local_Patient_Identifier, 
    cd1.NHS_Number, 
    cd1.GMP, 
    cd1.Practice_Code_GP, 
    StartDate = cd1.CDS_Date, 
    EndDate = cd2.CDS_Date 
FROM 
    #ChangeData cd1 
    LEFT JOIN #ChangeData cd2 
     ON cd1.Local_Patient_Identifier = cd2.Local_Patient_Identifier 
     AND cd1.RN = cd2.RN - 1; 

Results...

Local_Patient_Identifier NHS_Number GMP      Practice_Code_GP StartDate  EndDate
------------------------ ---------- -------- ---------------- ---------- ----------
A111111111               8BFD000    G111111  N77777           2016-05-23 2016-06-13
A111111111               8BFD000    G222222  N77777           2016-06-13 2017-02-09
A111111111               8BFD000    G3333333 ZZ44444          2017-02-09 NULL
A111111112               8BFD002    G3333332 JJ44444          2015-05-21 2016-05-02
A111111112               8BFD002    G3333332 KK44445          2016-05-02 2017-02-13
A111111112               8BFD002    G3333332 WW44444          2017-02-13 NULL
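
Note that in these results each EndDate equals the next StartDate, whereas the question asks for the end date to be one day earlier. A minimal adjustment, assuming #ChangeData has already been built by the script above, is to wrap the joined date in DATEADD:

```sql
-- Sketch: same final SELECT as above, with the end date moved back one day.
-- Assumes #ChangeData exists from the preceding script.
SELECT
    cd1.Local_Patient_Identifier,
    cd1.NHS_Number,
    cd1.GMP,
    cd1.Practice_Code_GP,
    StartDate = cd1.CDS_Date,
    EndDate = DATEADD(DAY, -1, cd2.CDS_Date)  -- stays NULL for the current GP
FROM
    #ChangeData cd1
    LEFT JOIN #ChangeData cd2
        ON cd1.Local_Patient_Identifier = cd2.Local_Patient_Identifier
        AND cd1.RN = cd2.RN - 1;
```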