2011-05-03 68 views
0

I want to consolidate a set of records:

(id) / (referenceid)

1   10 
1   11 
2   11 
2   10 
3   10 
3   11 
3   12 

The result of the query should be:

1   10 
1   11 
3   10 
3   11 
3   12 

Because id = 1 and id = 2 have the same set of referenceids, {10, 11}, they should be consolidated into one. The referenceids for id = 3 are different, so it should not be merged.
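To make the rule concrete, here is a minimal Python sketch of the expected behavior, useful as a reference when checking any SQL solution. It is only an illustration: the function name `consolidate` is made up for this example, and it assumes that "consolidate" means keeping the smallest id for each distinct set of referenceids, as in the sample result above.

```python
from collections import defaultdict

# Sample data from the question: (id, referenceid) pairs.
rows = [(1, 10), (1, 11), (2, 11), (2, 10), (3, 10), (3, 11), (3, 12)]


def consolidate(rows):
    """Keep only the smallest id for each distinct set of referenceids."""
    refs_by_id = defaultdict(set)
    for id_, ref in rows:
        refs_by_id[id_].add(ref)

    # Map each distinct reference set to the smallest id that owns it.
    keeper = {}
    for id_ in sorted(refs_by_id):
        keeper.setdefault(frozenset(refs_by_id[id_]), id_)

    kept_ids = set(keeper.values())
    return sorted((i, r) for i, r in set(rows) if i in kept_ids)


print(consolidate(rows))
```

Sets are compared regardless of order, so id = 2 (stored as 11, 10) is treated as a copy of id = 1.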

What is the best way to accomplish this?


Which database product and version? – Thomas 2011-05-03 02:23:54


Database: SQLite, version 3.7.5 – jajo87 2011-05-03 02:31:04

Answers

1
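-- Build a comma-separated "signature" of the referenceids for each id,
-- then keep only the smallest id for each distinct signature.
Select id, referenceid
From MyTable
Where id In (
    Select Min(Z.id) As id
    From (
        Select Z1.id, Group_Concat(Z1.referenceid) As signature
        From (
            -- sorted so that equal sets are intended to yield equal signatures
            Select id, referenceid
            From MyTable
            Order By id, referenceid
            ) As Z1
        Group By Z1.id
        ) As Z
    Group By Z.signature
    )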
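The signature approach above can be tried from Python's built-in `sqlite3` module. One caveat: the SQLite documentation describes the order of elements concatenated by `Group_Concat` as arbitrary, so relying on the inner `Order By` works in practice but is not formally guaranteed. The sketch below sidesteps that by registering a small user-defined aggregate that sorts before joining. The names `sorted_concat` and `SortedConcat` are made up for this example, and it assumes the referenceids are non-NULL integers.

```python
import sqlite3


class SortedConcat:
    """Hypothetical aggregate: joins a group's values in sorted order."""

    def __init__(self):
        self.values = []

    def step(self, value):
        self.values.append(value)

    def finalize(self):
        return ",".join(str(v) for v in sorted(self.values))


conn = sqlite3.connect(":memory:")
conn.create_aggregate("sorted_concat", 1, SortedConcat)
conn.execute("CREATE TABLE MyTable (id INTEGER, referenceid INTEGER)")
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?)",
    [(1, 10), (1, 11), (2, 11), (2, 10), (3, 10), (3, 11), (3, 12)],
)

QUERY = """
SELECT id, referenceid
FROM MyTable
WHERE id IN (
    -- smallest id for each distinct signature
    SELECT MIN(Z.id)
    FROM (
        SELECT id, sorted_concat(referenceid) AS signature
        FROM MyTable
        GROUP BY id
    ) AS Z
    GROUP BY Z.signature
)
ORDER BY id, referenceid
"""

result = conn.execute(QUERY).fetchall()
print(result)
```

Because the aggregate sorts its inputs itself, the signature no longer depends on the order in which SQLite feeds rows to it.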

Thank you very much, Thomas – jajo87 2011-05-03 06:21:46

0
-- generate count of elements for each distinct id 
with Counts as (
    select 
     id, 
     count(1) as ReferenceCount 
    from 
     tblReferences R 
    group by 
     R.id 
) 
-- generate every pairing of two different id's, along with 
-- their counts, and how many are equivalent between the two 
,Pairings as (
    select 
     R1.id as id1 
     ,R2.id as id2 
     ,C1.ReferenceCount as count1 
     ,C2.ReferenceCount as count2 
     ,sum(case when R1.referenceid = R2.referenceid then 1 else 0 end) as samecount 
    from 
     tblReferences R1 join Counts C1 on R1.id = C1.id 
    cross join 
     tblReferences R2 join Counts C2 on R2.id = C2.id 
    where 
     R1.id < R2.id 
    group by 
     R1.id, C1.ReferenceCount, R2.id, C2.ReferenceCount 
) 
-- generate the list of ids that are safe to remove by picking 
-- out any id's that have the same number of matches, and same 
-- size of list, which means their reference lists are identical. 
-- since id2 > id1, we can safely remove id2 as a copy of id1, and 
-- the smallest id in each group of identical lists will be left 
,RemovableIds as (
    select 
     distinct id2 as id 
    from 
     Pairings P 
    where 
     P.count1 = P.count2 and P.count1 = P.samecount 
) 
-- validate the results by just selecting to see which id's 
-- will be removed. can also include id in the query above 
-- to see which id was identified as the copy 

select id from RemovableIds R 

-- comment out `select` above and uncomment `delete` below to 
-- remove the records after verifying they are correct! 

--delete from tblReferences where id in (select id from RemovableIds) 
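Since the asker is on SQLite 3.7.5, note that SQLite only gained `WITH` (common table expressions) in version 3.8.3, so the query above would need rewriting there. The following is a rough sketch of the same pairwise-comparison idea without CTEs, run through Python's `sqlite3` module. It is not the answer's exact query: it joins only on matching referenceids and compares the match count with each id's own row count, and it assumes each (id, referenceid) pair appears at most once in the table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tblReferences (id INTEGER, referenceid INTEGER)")
conn.executemany(
    "INSERT INTO tblReferences VALUES (?, ?)",
    [(1, 10), (1, 11), (2, 11), (2, 10), (3, 10), (3, 11), (3, 12)],
)

# For every pair id1 < id2, count the referenceids they share. If that
# count equals the size of both lists, the lists are identical and id2
# can be removed as a copy of id1.
REMOVABLE_IDS = """
SELECT DISTINCT R2.id
FROM tblReferences AS R1
JOIN tblReferences AS R2
  ON R1.id < R2.id AND R1.referenceid = R2.referenceid
GROUP BY R1.id, R2.id
HAVING COUNT(*) = (SELECT COUNT(*) FROM tblReferences WHERE id = R1.id)
   AND COUNT(*) = (SELECT COUNT(*) FROM tblReferences WHERE id = R2.id)
ORDER BY R2.id
"""

# Step 1: inspect which ids would be removed.
removable = [row[0] for row in conn.execute(REMOVABLE_IDS)]
print(removable)

# Step 2: delete them once the list looks right.
conn.executemany(
    "DELETE FROM tblReferences WHERE id = ?", [(i,) for i in removable]
)
remaining = conn.execute(
    "SELECT id, referenceid FROM tblReferences ORDER BY id, referenceid"
).fetchall()
print(remaining)
```

Selecting the ids first and deleting in a second step mirrors the verify-then-delete flow suggested in the comments above.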

Thanks, mellamokb – jajo87 2011-05-03 06:21:33