Optimizing a Rails WHERE NOT IN query

Here is the query as it is written in Rails:

User.limit(20).
    where.not(id: to_skip, number_of_photos: 0).
    where(age: @user.seeking_age_min..@user.seeking_age_max).
    tagged_with(@user.seeking_traits, on: :trait, any: true).
    tagged_with(@user.seeking_gender, on: :trait, any: true).ids

And here is the EXPLAIN ANALYZE output. Note that the id <> ALL(...) part has been shortened; it contains about 10K IDs.

Limit (cost=23.32..5331.16 rows=20 width=1698) (actual time=2237.871..2243.709 rows=20 loops=1) 
    -> Nested Loop Semi Join (cost=23.32..875817.48 rows=3300 width=1698) (actual time=2237.870..2243.701 rows=20 loops=1) 
     -> Merge Semi Join (cost=22.89..857813.95 rows=8311 width=1702) (actual time=463.757..2220.691 rows=1351 loops=1) 
       Merge Cond: (users.id = users_trait_taggings_356a192.taggable_id) 
       -> Index Scan using users_pkey on users (cost=0.29..834951.51 rows=37655 width=1698) (actual time=455.122..2199.322 rows=7866 loops=1) 
        Index Cond: (id IS NOT NULL) 
        Filter: ((number_of_photos <> 0) AND (age >= 18) AND (age <= 99) AND (id <> ALL ('{7066,7065,...,15624,23254}'::integer[]))) 
        Rows Removed by Filter: 7652 
       -> Index Only Scan using taggings_idx on taggings users_trait_taggings_356a192 (cost=0.42..22767.59 rows=11393 width=4) (actual time=0.048..16.009 rows=4554 loops=1) 
        Index Cond: ((tag_id = 2) AND (taggable_type = 'User'::text) AND (context = 'trait'::text)) 
        Heap Fetches: 4554 
     -> Index Scan using index_taggings_on_taggable_id_and_taggable_type_and_context on taggings users_trait_taggings_5df4b2a (cost=0.42..2.16 rows=1 width=4) (actual time=0.016..0.016 rows=0 loops=1351) 
       Index Cond: ((taggable_id = users.id) AND ((taggable_type)::text = 'User'::text) AND ((context)::text = 'trait'::text)) 
       Filter: (tag_id = ANY ('{4,6}'::integer[])) 
       Rows Removed by Filter: 2 
Total runtime: 2243.913 ms

Complete version here.

Index Scan using users_pkey on users looks like the problem: the index scan takes a long time, even though there are indexes on age, number_of_photos, and id:

add_index "users", ["age"], name: "index_users_on_age", using: :btree 
add_index "users", ["number_of_photos"], name: "index_users_on_number_of_photos", using: :btree

to_skip is an array of the user IDs to skip, that is, to exclude from the results. A user has many skips. Each skip has a partner_id.

So to fetch to_skip I do:

to_skip = @user.skips.pluck(:partner_id)

I tried isolating the query to just:

sql = User.limit(20).
    where.not(id: to_skip, number_of_photos: 0).
    where(age: @user.seeking_age_min..@user.seeking_age_max).to_sql

And I got the same problem with EXPLAIN ANALYZE. Again, the list of user IDs is truncated:

Limit (cost=0.00..435.34 rows=20 width=1698) (actual time=0.219..4.844 rows=20 loops=1) 
    -> Seq Scan on users (cost=0.00..819629.38 rows=37655 width=1698) (actual time=0.217..4.838 rows=20 loops=1) 
     Filter: ((id IS NOT NULL) AND (number_of_photos <> 0) AND (age >= 18) AND (age <= 99) AND (id <> ALL ('{7066,7065,...,15624,23254}'::integer[]))) 
     Rows Removed by Filter: 6 
Total runtime: 5.044 ms

Complete version here.

Any ideas on how to optimize this query in Rails + Postgres?

EDIT: Here are the relevant models:

User model

class User < ActiveRecord::Base 
    acts_as_messageable required: :body, # default [:topic, :body] 
         dependent: :destroy 

    has_many :skips, :dependent => :destroy 

    acts_as_taggable # Alias for acts_as_taggable_on :tags 
    acts_as_taggable_on :seeking_gender, :trait, :seeking_race 
    scope :by_updated_date, -> { 
    order("updated_at DESC") 
    } 
end 

# schema 

create_table "users", force: :cascade do |t| 
    t.string "email", default: "", null: false 
    t.datetime "created_at", null: false 
    t.datetime "updated_at", null: false 
    t.text  "skips", array: true 
    t.integer "number_of_photos", default: 0 
    t.integer "age" 
end 

add_index "users", ["age"], name: "index_users_on_age", using: :btree 
add_index "users", ["email"], name: "index_users_on_email", unique: true, using: :btree 
add_index "users", ["number_of_photos"], name: "index_users_on_number_of_photos", using: :btree 
add_index "users", ["updated_at"], name: "index_users_on_updated_at", order: {"updated_at"=>:desc}, using: :btree

Skips model

class Skip < ActiveRecord::Base 
    belongs_to :user 
end 

# schema 

create_table "skips", force: :cascade do |t| 
    t.integer "user_id" 
    t.integer "partner_id" 
    t.datetime "created_at", null: false 
    t.datetime "updated_at", null: false 
end 

add_index "skips", ["partner_id"], name: "index_skips_on_partner_id", using: :btree 
add_index "skips", ["user_id"], name: "index_skips_on_user_id", using: :btree

Source

2016-07-30 Vu Tran

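Please post the code for all the relevant models.

Added them. Please let me know if they are sufficient.

You have both a dedicated 'Skip' model and a 'Users.skips' array field. What is the reason for the latter?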

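The speed problem is probably caused by to_skip (a long list of IDs, about 60 KB) being passed in as an array. If so, the solution is to rewrite it as the result of a subquery, so that Postgres can optimize the query better.

When building to_skip, try using select instead of pluck. pluck returns an array, which is then passed to the main query. select, in turn, returns an ActiveRecord::Relation, whose SQL can be embedded in the main query, which may make it more efficient.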

to_skip = @user.skips.select(:partner_id)
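To make the difference concrete, here is a rough sketch in plain Ruby (no Rails needed) that compares the size of the SQL text in the two cases. The strings are written by hand and only imitate what Active Record would generate; the helper names, the ID range, and user_id = 1 are assumptions made up for illustration.

```ruby
# Hand-built SQL strings that only imitate Active Record's output.
# The helper names and the ID values are hypothetical.

# Roughly what the query looks like when to_skip is a Ruby array (pluck):
# every ID is written into the statement itself.
def inline_not_in_sql(ids)
  list = ids.map { |id| Integer(id) }.join(", ")
  'SELECT "users".* FROM "users" WHERE "users"."id" NOT IN (' + list + ") LIMIT 20"
end

# Roughly what the query looks like when to_skip is a relation (select):
# the IDs stay in the database and are referenced through a subquery.
def subquery_not_in_sql(user_id)
  'SELECT "users".* FROM "users" WHERE "users"."id" NOT IN (' \
    'SELECT "skips"."partner_id" FROM "skips" ' \
    'WHERE "skips"."user_id" = ' + Integer(user_id).to_s + ") LIMIT 20"
end

ids = (7_000...17_000).to_a # 10,000 made-up partner IDs
inline = inline_not_in_sql(ids)
subquery = subquery_not_in_sql(1)

puts "inline list: #{inline.bytesize} bytes"
puts "subquery:    #{subquery.bytesize} bytes"
```

With the array, the statement grows with every skipped user and has to be sent and parsed on each request; with the subquery, its size stays constant no matter how many skips exist.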

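Until the model code is posted, it is hard to give more specific suggestions. The general direction I would explore is to merge all the related steps into a single query, so that the database can do the optimization.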

UPDATE

With select, the Active Record query would look like this (I left out the taggable parts, since they do not seem to affect performance much):

User.limit(20). 
    where.not(id: @user.skips.select(:partner_id), number_of_photos: 0). 
    where(age: 0..25)

This is the SQL query that gets executed. Note how the IDs to skip are now fetched by a subquery:

SELECT "users".* FROM "users" 
    WHERE ("users"."number_of_photos" != 0) 
    AND ("users"."id" NOT IN (
     SELECT "skips"."partner_id" 
     FROM "skips" 
     WHERE "skips"."user_id" = 1 
    )) 
    AND ("users"."age" BETWEEN 0 AND 25) 
    LIMIT 20

Try running your query this way and see how it affects performance.
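One caveat worth checking with the subquery form: in the posted schema, skips.partner_id is nullable, and in SQL a NOT IN over a list that contains a NULL never evaluates to true. The sketch below models that three-valued logic in plain Ruby, with nil standing in for UNKNOWN; the method names and the sample IDs are made up for illustration and are not part of any library.

```ruby
# Model of SQL three-valued logic for `value NOT IN (list)`.
# nil stands in for UNKNOWN; value itself is assumed to be non-NULL.
def sql_not_in(value, list)
  return false if list.include?(value) # definite match, so NOT IN is false
  return nil if list.include?(nil)     # a NULL might be a match: UNKNOWN
  true
end

# WHERE keeps a row only when the condition is exactly true.
def where_not_in(ids, list)
  ids.select { |id| sql_not_in(id, list) == true }
end

user_ids = [1, 2, 3, 4] # hypothetical users.id values

p where_not_in(user_ids, [2, 3])      # => [1, 4]
p where_not_in(user_ids, [2, 3, nil]) # => []
```

If skips rows with a NULL partner_id can exist, filtering them out inside the subquery, for example with @user.skips.where.not(partner_id: nil).select(:partner_id), should avoid the empty result.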

Source

2016-07-30 11:14:23

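Thanks. I'll give it a try. Meanwhile, I have also added the model code and schema.

Actually, how do I write the query using select(:partner_id)? to_skip = @user.skips.select(:partner_id) followed by result = User.limit(20).where.not(id: to_skip).ids doesn't seem to work.

See the update to my answer.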
