Slow PostgreSQL query with (incorrect?) indexes

I have an events table with 30 million rows. The following query takes 25 seconds to return:

SELECT DISTINCT "events"."id", "calendars"."user_id" 
FROM "events" 
LEFT JOIN "calendars" ON "events"."calendar_id" = "calendars"."id" 
WHERE "events"."deleted_at" is null 
AND tstzrange('2016-04-21T12:12:36-07:00', '2016-04-21T12:22:36-07:00') @> lower(time_range) 
AND ("status" is null or (status->>'pre_processed') IS NULL)

status is a jsonb column with an index on status->>'pre_processed'. Below are the other indexes created on the events table. time_range is of type TSTZRANGE.

CREATE INDEX events_time_range_idx ON events USING gist (time_range); 
CREATE INDEX events_lower_time_range_index on events(lower(time_range)); 
CREATE INDEX events_upper_time_range_index on events(upper(time_range)); 
CREATE INDEX events_calendar_id_index on events (calendar_id)

I'm definitely out of my comfort zone here and am trying to reduce the query time. Here is the output of EXPLAIN ANALYZE:

HashAggregate (cost=7486635.89..7486650.53 rows=1464 width=48) (actual time=26989.272..26989.306 rows=98 loops=1)
  Group Key: events.id, calendars.user_id
  -> Nested Loop Left Join (cost=0.42..7486628.57 rows=1464 width=48) (actual time=316.110..26988.941 rows=98 loops=1)
        -> Seq Scan on events (cost=0.00..7475629.43 rows=1464 width=50) (actual time=316.049..26985.344 rows=98 loops=1)
              Filter: ((deleted_at IS NULL) AND ((status IS NULL) OR ((status ->> 'pre_processed'::text) IS NULL)) AND ('["2016-04-21 19:12:36+00","2016-04-21 19:22:36+00")'::tstzrange @> lower(time_range)))
              Rows Removed by Filter: 31592898
        -> Index Scan using calendars_pkey on calendars (cost=0.42..7.50 rows=1 width=48) (actual time=0.030..0.031 rows=1 loops=98)
              Index Cond: (events.calendar_id = (id)::text)
Planning time: 1.468 ms
Execution time: 26989.370 ms

Here is the EXPLAIN ANALYZE output with the events.deleted_at part of the query removed:

HashAggregate (cost=7487382.57..7487398.33 rows=1576 width=48) (actual time=23880.466..23880.503 rows=115 loops=1)
  Group Key: events.id, calendars.user_id
  -> Nested Loop Left Join (cost=0.42..7487374.69 rows=1576 width=48) (actual time=16.612..23880.114 rows=115 loops=1)
        -> Seq Scan on events (cost=0.00..7475629.43 rows=1576 width=50) (actual time=16.576..23876.844 rows=115 loops=1)
              Filter: (((status IS NULL) OR ((status ->> 'pre_processed'::text) IS NULL)) AND ('["2016-04-21 19:12:36+00","2016-04-21 19:22:36+00")'::tstzrange @> lower(time_range)))
              Rows Removed by Filter: 31592881
        -> Index Scan using calendars_pkey on calendars (cost=0.42..7.44 rows=1 width=48) (actual time=0.022..0.023 rows=1 loops=115)
              Index Cond: (events.calendar_id = (id)::text)
Planning time: 0.372 ms
Execution time: 23880.571 ms

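I added the index on the status column. Everything else already existed, and I'm not sure how to proceed. Any suggestions on how to get the query time down to a more manageable number?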

2016-04-22 FajitaNachos

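The structure of the events and calendars tables would be helpful. It would also help if you could post the EXPLAIN ANALYZE output rather than just EXPLAIN. – e4c5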

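@e4c5 Thanks. I added the EXPLAIN ANALYZE output. I can add the structure later. I did mention that the fields I'm querying are TSTZRANGE and JSONB. deleted_at is just a timestamp. – FajitaNachos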

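Not sure you need '@> lower(time_range)'; wouldn't "overlaps" do? 'where ... @> time_range' could use the main index on that column. Also, which condition removes most of the rows: the one on 'status', the one on 'time_range', or the one on 'deleted_at'? –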
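The comment's suggestion could be sketched as follows. Both forms test the whole range rather than only its lower bound, so they may return a different set of rows than the original condition; which one fits depends on what the query is meant to find. The simplified SELECT list is only for illustration.

```sql
-- Events whose range overlaps the ten-minute window. The && operator is
-- one of the operators the GiST index events_time_range_idx can serve.
SELECT "id"
FROM "events"
WHERE "time_range" && tstzrange('2016-04-21T12:12:36-07:00', '2016-04-21T12:22:36-07:00');

-- Events whose range lies entirely inside the window (range contains range).
SELECT "id"
FROM "events"
WHERE tstzrange('2016-04-21T12:12:36-07:00', '2016-04-21T12:22:36-07:00') @> "time_range";
```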

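A B-tree index on lower(time_range) can only be used for conditions involving the <, <=, =, >= and > operators. The @> operator may rely on these internally, but as far as the planner is concerned, this range check is a black box, so it can't make use of the index.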

You will need to reformulate your condition in terms of the B-tree operators, i.e.:

lower(time_range) >= '2016-04-21T12:12:36-07:00' AND 
lower(time_range) < '2016-04-21T12:22:36-07:00'
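Applied to the query from the question, the rewrite might look like the sketch below. It assumes the window is half-open, which matches the default '[)' bounds of the two-argument tstzrange() constructor used in the original query. The explicit ::timestamptz casts are an addition for clarity and are not part of the original.

```sql
-- Same query as in the question, with the range-containment test replaced
-- by two plain comparisons that a B-tree index on lower(time_range) can serve.
SELECT DISTINCT "events"."id", "calendars"."user_id"
FROM "events"
LEFT JOIN "calendars" ON "events"."calendar_id" = "calendars"."id"
WHERE "events"."deleted_at" IS NULL
  -- inclusive lower bound, exclusive upper bound
  AND lower("events"."time_range") >= '2016-04-21T12:12:36-07:00'::timestamptz
  AND lower("events"."time_range") <  '2016-04-21T12:22:36-07:00'::timestamptz
  AND ("status" IS NULL OR ("status"->>'pre_processed') IS NULL);
```

Prefixing the statement with EXPLAIN ANALYZE should show whether the planner picks events_lower_time_range_index instead of a sequential scan.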

2016-04-25 23:41:02

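You're the man. With that small change it's down to 68ms. I suspected it had something to do with the way I was querying on the time range, but I'm well out of my element here. It's now using an index scan: 'Index Scan using events_lower_time_range_index on events (cost=0.57..2177.94 rows=5 width=50) (actual time=0.019..0.186 rows=98 loops=1)'. It says I can award you the bounty in 2 hours. Thanks! – FajitaNachos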

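The first time I change the time range, the query takes quite a while (10s, 30s, 16s), but all subsequent queries for that time range take <1s. Are there other tweaks that would help avoid this, or is that just the way Postgres works internally? – FajitaNachos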

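That's probably due to caching; the first query has to hit the disk, but subsequent queries find all of the data in RAM. Increasing the cache size (by raising ['shared_buffers'](http://www.postgresql.org/docs/current/static/runtime-config-resource.html#GUC-SHARED-BUFFERS) and/or adding more RAM) may help, but there's no silver bullet. –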
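One way to check whether caching explains the difference is to look at buffer statistics. The sketch below uses the BUFFERS option of EXPLAIN on a simplified query (an illustration, not the full query from the question). pg_prewarm is a contrib extension that may not be installed on a given server, so the last two statements are an assumption about the environment.

```sql
-- Current size of PostgreSQL's own buffer cache.
SHOW shared_buffers;

-- In the output, "shared hit" counts pages found in shared_buffers and
-- "read" counts pages that had to be requested from the operating system.
EXPLAIN (ANALYZE, BUFFERS)
SELECT "id"
FROM "events"
WHERE lower("time_range") >= '2016-04-21T12:12:36-07:00'::timestamptz
  AND lower("time_range") <  '2016-04-21T12:22:36-07:00'::timestamptz;

-- Optional: load the index into the cache ahead of time
-- (requires the pg_prewarm contrib module to be available).
CREATE EXTENSION IF NOT EXISTS pg_prewarm;
SELECT pg_prewarm('events_lower_time_range_index');
```

A first run dominated by "read" and a second run dominated by "shared hit" would be consistent with the cold-cache explanation.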

So, add an index on events.deleted_at to get rid of the nasty sequential scan. What does it look like after that?

2016-04-25 19:23:55 Nathan

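It will take a while to add an index on this table, so in the meantime I just removed the '"events"."deleted_at" is null' condition from the WHERE clause, and the query still returns in about the same amount of time. I've posted the EXPLAIN ANALYZE output. It looks like the sequential scan is still running. – FajitaNachos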
