2014-07-07

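I am querying for records located within a specific geographic area, with some joins and filtering on top. Why is PostgreSQL not using my indexes, and which indexes would you suggest?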

This is my query:

SELECT "events".*
FROM "events"
INNER JOIN "albums" ON "albums"."event_id" = "events"."id"
INNER JOIN "photos" ON "photos"."album_id" = "albums"."id"
WHERE "events"."deleted_at" IS NULL
  AND "albums"."deleted_at" IS NULL
  AND "photos"."deleted_at" IS NULL
  AND (events.latitude BETWEEN -44.197088742316055 AND -23.22003941183816
       AND events.longitude BETWEEN 133.226480859375 AND 165.570230859375)
GROUP BY events.id
HAVING count(albums.id) > 0
ORDER BY start_date DESC

I have the following indexes:

events:

"events_pkey" PRIMARY KEY, btree (id) 
"index_events_on_deleted_at" btree (deleted_at) 
"index_events_on_latitude_and_longitude" btree (latitude, longitude) 

albums:

"albums_pkey" PRIMARY KEY, btree (id) 
"index_albums_on_deleted_at" btree (deleted_at) 
"index_albums_on_event_id" btree (event_id) 

photos:

"photos_pkey" PRIMARY KEY, btree (id) 
"index_photos_on_album_id" btree (album_id) 
"index_photos_on_deleted_at" btree (deleted_at) 

Running EXPLAIN ANALYZE gives the result below, and I don't see any of my indexes being used. I don't know how to force it to use them. Can anyone help me optimize this?

Sort (cost=4057.46..4057.84 rows=150 width=668) (actual time=556.114..556.187 rows=76 loops=1) 
    Sort Key: events.start_date 
    Sort Method: quicksort Memory: 78kB 
    -> HashAggregate (cost=4050.16..4052.04 rows=150 width=668) (actual time=555.667..555.783 rows=76 loops=1) 
     Filter: (count(albums.id) > 0) 
     -> Hash Join (cost=76.14..3946.54 rows=20724 width=668) (actual time=3.675..467.578 rows=48050 loops=1) 
       Hash Cond: (photos.album_id = albums.id) 
       -> Seq Scan on photos (cost=0.00..3441.87 rows=59013 width=4) (actual time=0.008..169.206 rows=60599 loops=1) 
        Filter: (deleted_at IS NULL) 
       -> Hash (cost=74.10..74.10 rows=163 width=668) (actual time=3.633..3.633 rows=318 loops=1) 
        Buckets: 1024 Batches: 1 Memory Usage: 176kB 
        -> Hash Join (cost=49.80..74.10 rows=163 width=668) (actual time=1.195..2.519 rows=318 loops=1) 
          Hash Cond: (albums.event_id = events.id) 
          -> Seq Scan on albums (cost=0.00..21.47 rows=321 width=8) (actual time=0.011..0.458 rows=321 loops=1) 
           Filter: (deleted_at IS NULL) 
          -> Hash (cost=47.92..47.92 rows=150 width=664) (actual time=1.151..1.151 rows=195 loops=1) 
           Buckets: 1024 Batches: 1 Memory Usage: 126kB 
           -> Seq Scan on events (cost=0.00..47.92 rows=150 width=664) (actual time=0.007..0.488 rows=195 loops=1) 
             Filter: ((deleted_at IS NULL) AND (latitude >= (-44.1970887423161)::double precision) AND (latitude <= (-23.2200394118382)::double precision) AND (longitude >= 133.226480859375::double precision) AND (longitude <= 165.570230859375::double precision)) 
Total runtime: 556.459 ms 

Thanks!!

EDIT: Thanks for the links. I have tried disabling seqscan. Now my plan is:

Sort (cost=5565.73..5566.10 rows=150 width=46) (actual time=451.208..451.290 rows=76 loops=1) 
    Sort Key: (date(events.start_date)) 
    Sort Method: quicksort Memory: 31kB 
    -> GroupAggregate (cost=0.00..5560.31 rows=150 width=46) (actual time=2.990..450.850 rows=76 loops=1) 
     Filter: (count(albums.id) > 0) 
     -> Nested Loop (cost=0.00..5454.44 rows=20724 width=46) (actual time=0.077..278.319 rows=48050 loops=1) 
       -> Merge Join (cost=0.00..205.35 rows=163 width=46) (actual time=0.051..2.856 rows=318 loops=1) 
        Merge Cond: (events.id = albums.event_id) 
        -> Index Scan using events_pkey on events (cost=0.00..118.72 rows=150 width=42) (actual time=0.024..0.792 rows=195 loops=1) 
          Filter: ((deleted_at IS NULL) AND (latitude >= (-44.1970887423161)::double precision) AND (latitude <= (-23.2200394118382)::double precision) AND (longitude >= 133.226480859375::double precision) AND (longitude <= 165.570230859375::double precision)) 
        -> Index Scan using index_albums_on_event_id on albums (cost=0.00..83.83 rows=321 width=8) (actual time=0.017..0.832 rows=321 loops=1) 
          Filter: (deleted_at IS NULL) 
       -> Index Scan using index_photos_on_album_id on photos (cost=0.00..30.27 rows=155 width=4) (actual time=0.010..0.409 rows=151 loops=318) 
        Index Cond: (album_id = albums.id) 
        Filter: (deleted_at IS NULL) 
Total runtime: 451.562 ms 

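Still, the indexes are not fully used, especially for the latitude and longitude conditions. Is my index set up correctly?

EDIT: After reading the answers at http://stackoverflow.com/questions/8228326/how-can-i-avoid-postgresql-sometimes-choosing-a-bad-query-plan-for-one-of-two-ne I think it behaves like this because my query was returning all of the records. I then narrowed the conditions, and the new query plan is: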

Sort (cost=786.18..786.22 rows=19 width=668) (actual time=3.754..3.755 rows=2 loops=1) 
    Sort Key: events.start_date 
    Sort Method: quicksort Memory: 25kB 
    -> HashAggregate (cost=785.54..785.77 rows=19 width=668) (actual time=3.700..3.703 rows=2 loops=1) 
     Filter: (count(albums.id) > 0) 
     -> Nested Loop (cost=48.39..765.51 rows=2670 width=668) (actual time=1.116..2.968 rows=543 loops=1) 
       -> Hash Join (cost=48.39..89.25 rows=21 width=668) (actual time=1.093..1.128 rows=3 loops=1) 
        Hash Cond: (events.id = albums.event_id) 
        -> Bitmap Heap Scan on events (cost=9.42..49.44 rows=19 width=664) (actual time=0.061..0.080 rows=9 loops=1) 
          Recheck Cond: ((latitude >= (-33.7474111086624)::double precision) AND (latitude <= (-33.581678187556)::double precision) AND (longitude >= 151.193933862305::double precision) AND (longitude <= 151.44661940918::double precision)) 
          Filter: (deleted_at IS NULL) 
          -> Bitmap Index Scan on index_events_on_latitude_and_longitude (cost=0.00..9.42 rows=28 width=0) (actual time=0.050..0.050 rows=9 loops=1) 
           Index Cond: ((latitude >= (-33.7474111086624)::double precision) AND (latitude <= (-33.581678187556)::double precision) AND (longitude >= 151.193933862305::double precision) AND (longitude <= 151.44661940918::double precision)) 
        -> Hash (cost=34.95..34.95 rows=321 width=8) (actual time=0.992..0.992 rows=321 loops=1) 
          Buckets: 1024 Batches: 1 Memory Usage: 13kB 
          -> Bitmap Heap Scan on albums (cost=14.74..34.95 rows=321 width=8) (actual time=0.069..0.570 rows=321 loops=1) 
           Recheck Cond: (deleted_at IS NULL) 
           -> Bitmap Index Scan on index_albums_on_deleted_at (cost=0.00..14.66 rows=321 width=0) (actual time=0.056..0.056 rows=321 loops=1) 
             Index Cond: (deleted_at IS NULL) 
       -> Index Scan using index_photos_on_album_id on photos (cost=0.00..30.27 rows=155 width=4) (actual time=0.014..0.273 rows=181 loops=3) 
        Index Cond: (album_id = albums.id) 
        Filter: (deleted_at IS NULL) 
Total runtime: 3.958 ms 

And the time is much lower!! Any suggestions?
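
One way to check this reasoning is to measure how selective the bounding box is. The sketch below reuses the coordinates from the first query; the column aliases are made up for illustration. In the plans above the wide box matched 195 events while the narrow one matched only 9. When most of a small table matches, a sequential scan is usually cheaper than going through index_events_on_latitude_and_longitude, so the planner skipping the index is expected behaviour.

```sql
-- Compare the number of live events inside the bounding box with the total.
-- total_events and events_in_box are illustrative aliases.
SELECT count(*) AS total_events,
       count(CASE WHEN latitude  BETWEEN -44.197088742316055 AND -23.22003941183816
                   AND longitude BETWEEN 133.226480859375 AND 165.570230859375
                  THEN 1 END) AS events_in_box
FROM events
WHERE deleted_at IS NULL;
```

If events_in_box is close to total_events, the index on (latitude, longitude) has little to offer for that particular box.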

Also related: http://stackoverflow.com/questions/14554302/postgres-query-optimization-forcing-a-index-scan and http://www.postgresql.org/docs/current/static/indexes-examine.html – pozs

Hi @pozs, I tried disabling 'seqscan'. Updated the query plan. –

I see. By the way, your query plan has not been updated yet. – pozs

Answers

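This is mainly because you have no LIMIT clause in your query. Without a LIMIT, you are always asking for all the data in your tables, so consulting their indexes as well is inefficient.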

SQLFiddle 1, 2 vs. 3, 4
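
As a sketch of what adding a LIMIT would look like, here is the query from the question with a limit appended. The value 20 is an arbitrary assumption, not something taken from the question.

```sql
SELECT "events".*
FROM "events"
INNER JOIN "albums" ON "albums"."event_id" = "events"."id"
INNER JOIN "photos" ON "photos"."album_id" = "albums"."id"
WHERE "events"."deleted_at" IS NULL
  AND "albums"."deleted_at" IS NULL
  AND "photos"."deleted_at" IS NULL
  AND (events.latitude BETWEEN -44.197088742316055 AND -23.22003941183816
       AND events.longitude BETWEEN 133.226480859375 AND 165.570230859375)
GROUP BY events.id
HAVING count(albums.id) > 0
ORDER BY start_date DESC
LIMIT 20;  -- arbitrary page size, chosen only for illustration
```

Whether the planner then prefers an index still depends on the table statistics and on how many rows match the conditions.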

Also note that FOREIGN KEY constraints do not add indexes (whereas UNIQUE and PRIMARY KEY constraints do). So you might need to add indexes on albums.event_id and photos.album_id.

SQLFiddle 3 vs. 4
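
A minimal sketch of checking for and creating those foreign-key indexes follows. The index names are the ones listed in the question. If the first query already shows them, skip the CREATE INDEX statements, since creating an index under an existing name fails with an error.

```sql
-- List the indexes that already exist on the two child tables.
SELECT tablename, indexname, indexdef
FROM pg_indexes
WHERE tablename IN ('albums', 'photos');

-- Create the foreign-key indexes only if the listing above does not show them.
CREATE INDEX index_albums_on_event_id ON albums (event_id);
CREATE INDEX index_photos_on_album_id ON photos (album_id);
```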

Sorting (if present) can use indexes too. In your query, that means an index on events.start_date.
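
Such an index could be created as sketched below. The index name is hypothetical and simply follows the naming pattern of the existing indexes.

```sql
-- Hypothetical index name; a plain b-tree can be scanned backward,
-- so it can also serve ORDER BY start_date DESC.
CREATE INDEX index_events_on_start_date ON events (start_date);
```

In the plans shown in the question the sort runs after the aggregation, so the planner may still sort explicitly rather than read rows in index order.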

Thanks a lot @pozs. Unfortunately I can't use 'LIMIT' in my app right now, because I would need to make a lot of front-end changes. But it does help! :) –

Also, I already have indexes on those two 'FK's. By the way, let me add an index on start_date! That helped me a lot. Thanks! –

Adding 'LIMIT' didn't help. I also added an index for the sort, and I also tried removing the sort, but the time stays at 450-500 ms. –