How can I delete duplicates in MySQL using Rails?

Relation

[id,user_id,status] 
1,2,sent_reply 
1,2,sent_mention 
1,3,sent_mention 
1,4,sent_reply 
1,4,sent_mention

I'm looking for a way to delete the duplicates so that only the following rows remain:

1,2,sent_reply 
1,3,sent_mention 
1,4,sent_reply

(Preferably using Rails.)

Source

2011-04-12 MikeMarsian

Do you want to return only the unique items, or do you want to delete all the duplicates? – fl00r 2011-04-12 16:33:02

Also, all of your relations have the same id. – fl00r 2011-04-12 16:34:45

So you need just a single (id, user_id) pair, regardless of status? How do you decide which status to keep? The last record? The first? A random one? – 2011-04-12 16:44:31

-1

This is best done in SQL. But if you prefer to use Rails:

(Relation.all - Relation.all.uniq_by{|r| [r.user_id, r.status]}).each{ |d| d.destroy }

or

ids = Relation.all.uniq_by{|r| [r.user_id, r.status]}.map(&:id) 
Relation.where("id NOT IN (?)", ids).destroy_all # or delete_all, which is faster

But I don't like solutions like this :D
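To see how the choice of uniqueness key affects the result, here is a minimal plain-Ruby sketch run on the question's sample rows, with no database involved. `Row` and `keep_first` are hypothetical names introduced for illustration, and `Array#uniq` with a block stands in for `uniq_by` (an assumption for readers on newer Ruby or Rails versions). Keying on `[user_id, status]` treats all five sample rows as distinct, whereas keying on `[id, user_id]` yields the three rows the question asks for.

```ruby
# Hypothetical stand-in for a Relation record; not an ActiveRecord model.
Row = Struct.new(:id, :user_id, :status)

ROWS = [
  Row.new(1, 2, "sent_reply"),
  Row.new(1, 2, "sent_mention"),
  Row.new(1, 3, "sent_mention"),
  Row.new(1, 4, "sent_reply"),
  Row.new(1, 4, "sent_mention")
].freeze

# Keep the first row seen for each key; the block decides what counts
# as a duplicate.
def keep_first(rows, &key)
  rows.uniq(&key)
end

by_user_and_status = keep_first(ROWS) { |r| [r.user_id, r.status] }
by_id_and_user     = keep_first(ROWS) { |r| [r.id, r.user_id] }

puts by_user_and_status.size
puts by_id_and_user.map { |r| r.to_a.join(",") }
```

With the sample data, only the second key removes anything, so the key passed to the block should match whatever the question means by "duplicate".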

Source

2011-04-12 16:43:40 fl00r

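This would be very slow and memory-hungry (my relations table has 100,000+ rows). Is there a more SQL-ish way to do this? At this point, wrapping it in Rails isn't very important. – MikeMarsian 2011-04-14 09:33:33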
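One SQL-oriented possibility, sketched under stated assumptions: the table is named `relations`, and it has a unique, orderable column, hypothetically called `pk` below, that the sample data does not show. With such a column, a MySQL multi-table `DELETE` with a self-join can remove every row that has an earlier row sharing the same `(id, user_id)`, entirely inside the database. The helper name `duplicate_delete_sql` is made up for illustration; it only builds the statement.

```ruby
# Builds a MySQL self-join DELETE that keeps, for each combination of
# key_columns, the row with the smallest value in unique_column.
# ASSUMPTION: unique_column (here "pk") is a unique, orderable column;
# the question does not show one, so this name is hypothetical.
# Identifiers are interpolated unquoted, so pass trusted names only.
def duplicate_delete_sql(table:, key_columns:, unique_column:)
  same_key = key_columns.map { |c| "older.#{c} = newer.#{c}" }.join(" AND ")
  <<~SQL.strip
    DELETE newer
    FROM #{table} AS newer
    INNER JOIN #{table} AS older
      ON #{same_key}
     AND newer.#{unique_column} > older.#{unique_column}
  SQL
end

sql = duplicate_delete_sql(
  table: "relations",
  key_columns: %w[id user_id],
  unique_column: "pk"
)
puts sql

# In a Rails app the string could then be handed to the connection, e.g.:
#   ActiveRecord::Base.connection.execute(sql)
```

Because the work happens inside MySQL, no rows are loaded into Ruby, which speaks to the memory concern. How fast it runs on 100,000+ rows likely depends on whether the key columns are indexed.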

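I know this is way late, but I found a good way to do it using Rails 3. There may well be better approaches, and I don't know how this would perform on 100,000+ rows of data, but it should get you on the right track.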

# Get a hash of all id/user_id pairs and how many records of each pair 
counts = ModelName.group([:id, :user_id]).count 
# => {[1, 2]=>2, [1, 3]=>1, [1, 4]=>2} 

# Keep only those pairs that have more than one record 
dupes = counts.select{|attrs, count| count > 1} 
# => {[1, 2]=>2, [1, 4]=>2} 

# Map objects by the attributes we have 
object_groups = dupes.map do |attrs, count| 
    ModelName.where(:id => attrs[0], :user_id => attrs[1]) 
end 

# Take each group and #destroy the records you want. 
# Or call #delete instead to save time if you don't need ActiveRecord callbacks 
# Here I'm just keeping the first one I find. 
object_groups.each do |group| 
    group.each_with_index do |object, index| 
        object.destroy unless index == 0 
    end 
end
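The same count, filter, and keep-the-first flow can be tried without a database. The sketch below is plain Ruby on the question's sample rows as arrays. `SAMPLE` and `doomed_rows` are hypothetical names, and `group_by` stands in for the `group(...).count` query, so it illustrates only the logic, not how ActiveRecord performs it.

```ruby
# Sample rows from the question as [id, user_id, status] triples.
SAMPLE = [
  [1, 2, "sent_reply"],
  [1, 2, "sent_mention"],
  [1, 3, "sent_mention"],
  [1, 4, "sent_reply"],
  [1, 4, "sent_mention"]
].freeze

# Returns the rows that would be destroyed: everything after the first
# row in each (id, user_id) group that has more than one member.
def doomed_rows(rows)
  groups = rows.group_by { |id, user_id, _status| [id, user_id] }
  dupes  = groups.select { |_key, members| members.size > 1 }
  dupes.values.flat_map { |members| members.drop(1) }
end

doomed_rows(SAMPLE).each { |row| puts row.join(",") }
```

Note that "first" here means first in array order. In a real query the order is not guaranteed unless an explicit `order` is given, so which status survives may vary.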

Source

2012-10-23 21:42:45
