2015-01-20 96 views
0

For the data set given below, I want to delete the rows that have the later timestamp, i.e. remove duplicates based on a condition.

**37C1Z2990E5E0 (TRXID) should be UNIQUE** in the data set below:

    JKLAMMSDF123 20141112 20141117 5000.0 P 1.22 RT101018 *2014-11-12 10:10:26* 37C1Z2990E5E0 101018 
    JKLAMMSDF123 20141110 20141114 5000.0 P 1.22 RT161002 *2014-11-12 10:11:33* 37C1Z2990E5E0 161002 

-- More rows 
+0

You can't have the same PK value twice in one table. Is this a denormalized data set? – 2015-01-20 21:19:56

+0

Are you only interested in results whose timestamp falls [BETWEEN](https://msdn.microsoft.com/en-us/library/ms187922.aspx) two other values? – ryanyuyu 2015-01-20 21:20:38

+0

I mean that we can treat TRXID as the unique value and not allow duplicates of it. – SHinny 2015-01-20 21:21:31

Answers

1

Try this:

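    -- For every TRXID that occurs more than once, find its latest timestamp
    ;WITH DATA AS
    (
        SELECT TRXID, MAX(YourTimestampColumn) AS TS
        FROM YourTable
        GROUP BY TRXID
        HAVING COUNT(*) > 1  -- only TRXIDs that are duplicated
    )
    -- Delete the row that carries that latest timestamp
    DELETE T
    FROM YourTable AS T
    INNER JOIN DATA AS D
        ON T.TRXID = D.TRXID
        AND T.YourTimestampColumn = D.TS;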
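
The query above removes only the single latest row for each duplicated TRXID, so a TRXID with three or more rows would still have duplicates after one run. A minimal sketch of a variant that keeps just the earliest row per TRXID, assuming SQL Server and the same placeholder names (`YourTable`, `YourTimestampColumn`); the CTE name `FIRST_SEEN` is made up for this illustration:

```sql
-- Sketch only: YourTable / YourTimestampColumn are placeholders,
-- FIRST_SEEN is a hypothetical CTE name.
;WITH FIRST_SEEN AS
(
    -- Earliest timestamp recorded for each TRXID
    SELECT TRXID, MIN(YourTimestampColumn) AS TS
    FROM YourTable
    GROUP BY TRXID
)
DELETE T
FROM YourTable AS T
INNER JOIN FIRST_SEEN AS F
    ON T.TRXID = F.TRXID
    AND T.YourTimestampColumn > F.TS;  -- every row later than the earliest one
```

Rows that share the exact same earliest timestamp for a TRXID would all survive this delete, so it is only a complete fix when that earliest timestamp is not tied.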
+0

This selects all rows, not only the duplicates... – SHinny 2015-01-20 21:43:18

+0

You can try it now. – dario 2015-01-20 21:47:31

+0

Thanks for the awesome solution. – SHinny 2015-01-21 15:26:59

0

Select the MIN of the timestamp column and group by all the other columns.

    SELECT MIN(TIMESTAMP), C1, C2, C3...
    FROM YOUR_TABLE
    GROUP BY C1, C2, C3...
0

I would do this with a window function and a CTE.

To check what the result will look like after the duplicates are removed, use this.

    ;WITH DATA
         AS (SELECT *,
                    ROW_NUMBER() OVER (PARTITION BY TRXID ORDER BY YourTimestampColumn) AS rn
             FROM YourTable)
    SELECT *
    FROM DATA
    WHERE rn = 1

To delete the duplicates, use this.

    ;WITH DATA
         AS (SELECT *,
                    ROW_NUMBER() OVER (PARTITION BY TRXID ORDER BY YourTimestampColumn) AS rn
             FROM YourTable)
    DELETE FROM DATA
    WHERE rn > 1

This will work even if you have more than one duplicate for the same TRXID.
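
Once the duplicates are gone, a unique constraint would stop new ones from being inserted, which matches the requirement that TRXID be unique. A minimal sketch, assuming SQL Server and the same `YourTable` placeholder; the constraint name `UQ_YourTable_TRXID` is hypothetical:

```sql
-- Sketch only: YourTable is a placeholder and
-- UQ_YourTable_TRXID is a made-up constraint name.
ALTER TABLE YourTable
    ADD CONSTRAINT UQ_YourTable_TRXID UNIQUE (TRXID);
```

The statement fails while duplicate TRXID values are still present, so it has to run after the delete.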