2011-04-02 67 views
1

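Deduplicating recurring patterns based on timestamp

OK. First, let me apologize profusely if this question has already been covered. I did look, but none of the solutions I found addressed the specifics of my problem.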

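I have a table with over 160 million rows of data tracking employee/server conditions. I want to create a subset of this data that removes the duplicates occurring throughout, but preserves the order in which changes happened. For most employees the reduction will be from 700 rows (and growing) down to 1.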

Here is a simple example of what I am trying to get:

Given: 

RowID Employee Server Timestamp 
----- -------- ------ --------- 
5  E000001 Serv-B May01 
4  E000001 Serv-A Apr01 
3  E000001 Serv-B Mar01 
2  E000001 Serv-A Feb01 
1  E000001 Serv-A Jan01 

Doing a "Min(Timestamp) Group By Employee, Server" would yield: 
Employee Server Timestamp 
-------- ------ --------- 
E000001 Serv-B Mar01 
E000001 Serv-A Jan01 
. 
What I need is: 
Employee Server Timestamp 
-------- ------ --------- 
E000001 Serv-B May01 
E000001 Serv-A Apr01 
E000001 Serv-B Mar01 
E000001 Serv-A Jan01 

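The table and the process that feeds it do not belong to our team, so I can't influence a solution there, and I'd rather not get stuck with a copy of the whole thing. Given the size of the table, I can't realistically take a cursor/RBAR (row-by-agonizing-row) approach. If backed into a corner I could write an application to do this, but I was wondering whether any of the gods of SQoLympus have any wisdom on doing it in a stored procedure. Thanks in advance!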
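
For reference, the application-side fallback mentioned above could look roughly like the following Python sketch. It assumes the rows arrive already ordered by employee and then by timestamp (for example, from an ordered database cursor) and keeps only the first row of each consecutive run of the same employee/server pair. The function name `first_of_each_run`, the tuple layout, and the year 2001 in the dates are illustrative assumptions, not part of the original table.

```python
from itertools import groupby
from operator import itemgetter

# Sample rows from the question as (RowID, Employee, Server, Timestamp).
# The year is an assumption; the question only gives month and day.
rows = [
    (5, "E000001", "Serv-B", "2001-05-01"),
    (4, "E000001", "Serv-A", "2001-04-01"),
    (3, "E000001", "Serv-B", "2001-03-01"),
    (2, "E000001", "Serv-A", "2001-02-01"),
    (1, "E000001", "Serv-A", "2001-01-01"),
]


def first_of_each_run(ordered_rows):
    """Yield the first row of every consecutive (Employee, Server) run.

    ordered_rows must already be sorted by employee, then timestamp.
    """
    for _, run in groupby(ordered_rows, key=itemgetter(1, 2)):
        yield next(run)


# Sorting in memory is only for this small sample; a real run over
# millions of rows would stream rows in order instead.
ordered = sorted(rows, key=lambda r: (r[1], r[3]))
changes = list(first_of_each_run(ordered))

for row in changes:
    print(row)
```

Because `groupby` only compares neighbouring rows, the generator holds one run at a time, so memory use does not depend on the table size as long as the input is streamed in order.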

Edit: This is SQL Server 2008 - sorry for not mentioning it.

+0

Which RDBMS and version? What data type is `Timestamp`? – 2011-04-02 19:31:42

Answers

1

If this is SQL Server (and assuming I have understood your requirement correctly):

/*Set up test table*/ 
DECLARE @T TABLE (
    RowID  INT, 
    Employee CHAR(7), 
    [Server] CHAR(6), 
    [timestamp] DATETIME); 

INSERT INTO @T 
SELECT 5,'E000001','Serv-B', '20010501' UNION ALL 
SELECT 4,'E000001','Serv-A', '20010401' UNION ALL 
SELECT 3,'E000001','Serv-B', '20010301' UNION ALL 
SELECT 2,'E000001','Serv-A', '20010201' UNION ALL 
SELECT 1,'E000001','Serv-A', '20010101'; 

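/*Grp is constant within each run of consecutive rows for the same Employee and Server*/ 
WITH cte 
    AS (SELECT ROW_NUMBER() OVER (PARTITION BY Employee ORDER BY RowID) - 
       ROW_NUMBER() OVER (PARTITION BY Employee, [Server] 
             ORDER BY RowID) AS Grp, 
       * 
     FROM @T), 
    /*Grp values can repeat across servers, so [Server] must be part of the partition*/ 
    cte2 
    AS (SELECT *, 
       ROW_NUMBER() OVER (PARTITION BY Employee, [Server], Grp ORDER BY RowID) AS 
       Rn 
     FROM cte) 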

/* Edit: Actually - You want a SELECT not a DELETE I think? 
DELETE FROM cte2 WHERE Rn > 1*/ 

SELECT RowID, Employee, [Server], [timestamp] 
FROM cte2 
WHERE Rn = 1 
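
To see why the difference of the two row numbers marks each run, the following Python sketch recomputes `Grp` by hand for the sample data. It is only an illustration of the arithmetic, not a replacement for the query; the function name `island_keys` and the tuple layout are assumptions made for this example.

```python
from collections import defaultdict

# Sample rows ordered by RowID: (RowID, Employee, Server).
sample = [
    (1, "E000001", "Serv-A"),
    (2, "E000001", "Serv-A"),
    (3, "E000001", "Serv-B"),
    (4, "E000001", "Serv-A"),
    (5, "E000001", "Serv-B"),
]


def island_keys(ordered_rows):
    """Return (RowID, Server, Grp) for rows already ordered by RowID.

    Grp = row number within the employee
        - row number within the (employee, server) pair.
    """
    per_employee = defaultdict(int)
    per_pair = defaultdict(int)
    result = []
    for row_id, employee, server in ordered_rows:
        per_employee[employee] += 1
        per_pair[(employee, server)] += 1
        grp = per_employee[employee] - per_pair[(employee, server)]
        result.append((row_id, server, grp))
    return result


for row in island_keys(sample):
    print(row)

# A different ordering (A, B, A) shows that Grp alone can collide
# across servers: rows 2 (Serv-B) and 3 (Serv-A) both get Grp = 1.
aba = [(1, "E1", "Serv-A"), (2, "E1", "Serv-B"), (3, "E1", "Serv-A")]
print(island_keys(aba))
```

Within one server, `Grp` stays the same while rows are consecutive and changes whenever another server's rows come in between. The second example shows that two different servers can share a `Grp` value, so a run is identified by the server and `Grp` together rather than by `Grp` alone.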
+0

Definitely one of the gods of SQoLympus! Thank you very much! – 2011-04-05 21:47:41

0

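You didn't say which database you are using, but if this is Oracle, for example, you can use the lag and lead analytic functions to refer to the previous or next row.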

select employee, server, timestamp 
from 
    (select employee, server, timestamp, 
    -- previous server for the same employee in time order (NULL on the first row) 
    lag(server) over (partition by employee order by timestamp) prev_server 
    from your_table -- placeholder for the actual table name 
    ) 
where prev_server is null or server <> prev_server 
+0

Unfortunately, these functions haven't made it into SQL Server yet. – 2011-04-02 23:19:30