Slow PostgreSQL query with (incorrect?) indexes
I have an events table with 30 million rows. The following query takes about 25 seconds to return.
SELECT DISTINCT "events"."id", "calendars"."user_id"
FROM "events"
LEFT JOIN "calendars" ON "events"."calendar_id" = "calendars"."id"
WHERE "events"."deleted_at" is null
AND tstzrange('2016-04-21T12:12:36-07:00', '2016-04-21T12:22:36-07:00') @> lower(time_range)
AND ("status" is null or (status->>'pre_processed') IS NULL)
status is a jsonb column with an index on status->>'pre_processed'. time_range is of type TSTZRANGE. Here are the other indexes created on the events table.
CREATE INDEX events_time_range_idx ON events USING gist (time_range);
CREATE INDEX events_lower_time_range_index on events(lower(time_range));
CREATE INDEX events_upper_time_range_index on events(upper(time_range));
CREATE INDEX events_calendar_id_index on events (calendar_id);
I'm definitely out of my comfort zone here and am trying to reduce the query time. Here is the output of EXPLAIN ANALYZE:
HashAggregate (cost=7486635.89..7486650.53 rows=1464 width=48) (actual time=26989.272..26989.306 rows=98 loops=1)
  Group Key: events.id, calendars.user_id
  -> Nested Loop Left Join (cost=0.42..7486628.57 rows=1464 width=48) (actual time=316.110..26988.941 rows=98 loops=1)
        -> Seq Scan on events (cost=0.00..7475629.43 rows=1464 width=50) (actual time=316.049..26985.344 rows=98 loops=1)
              Filter: ((deleted_at IS NULL) AND ((status IS NULL) OR ((status ->> 'pre_processed'::text) IS NULL)) AND ('["2016-04-21 19:12:36+00","2016-04-21 19:22:36+00")'::tstzrange @> lower(time_range)))
              Rows Removed by Filter: 31592898
        -> Index Scan using calendars_pkey on calendars (cost=0.42..7.50 rows=1 width=48) (actual time=0.030..0.031 rows=1 loops=98)
              Index Cond: (events.calendar_id = (id)::text)
Planning time: 1.468 ms
Execution time: 26989.370 ms
Here is the EXPLAIN ANALYZE output with the events.deleted_at part of the query removed:
HashAggregate (cost=7487382.57..7487398.33 rows=1576 width=48) (actual time=23880.466..23880.503 rows=115 loops=1)
  Group Key: events.id, calendars.user_id
  -> Nested Loop Left Join (cost=0.42..7487374.69 rows=1576 width=48) (actual time=16.612..23880.114 rows=115 loops=1)
        -> Seq Scan on events (cost=0.00..7475629.43 rows=1576 width=50) (actual time=16.576..23876.844 rows=115 loops=1)
              Filter: (((status IS NULL) OR ((status ->> 'pre_processed'::text) IS NULL)) AND ('["2016-04-21 19:12:36+00","2016-04-21 19:22:36+00")'::tstzrange @> lower(time_range)))
              Rows Removed by Filter: 31592881
        -> Index Scan using calendars_pkey on calendars (cost=0.42..7.44 rows=1 width=48) (actual time=0.022..0.023 rows=1 loops=115)
              Index Cond: (events.calendar_id = (id)::text)
Planning time: 0.372 ms
Execution time: 23880.571 ms
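I added the index on the status column. Everything else already existed, and I'm not sure how to proceed. Any suggestions on how to bring the query time down to a more manageable number?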
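The structure of the events and calendars tables would be helpful. It would also help if you could post the EXPLAIN ANALYZE output rather than just EXPLAIN. – e4c5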
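@e4c5 Thanks. I added the EXPLAIN ANALYZE output. I can add the table structure later. I did mention that the fields I'm querying on are TSTZRANGE and JSONB. deleted_at is just a timestamp. – FajitaNachos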
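I'm not sure you need '@> lower(time_range)'; wouldn't an "overlaps" check do? 'where ... @> time_range' could use the main index on that column. Also, which condition removes most of the rows: the one on 'status', the one on 'time_range', or the one on 'deleted_at'? –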
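One thing the plans above suggest: the Seq Scan filter applies @> with a constant range on the left and lower(time_range) on the right. A plain btree index such as events_lower_time_range_index is generally only considered for comparison operators (<, <=, =, >=, >) on the indexed expression. A minimal sketch, assuming the default '[)' bounds of tstzrange(a, b), rewrites that condition as two comparisons. Whether the planner then picks the index depends on statistics and the other predicates, so this would need checking with EXPLAIN ANALYZE on the real table.

```sql
-- Sketch only: the same filter written with plain comparisons on lower(time_range).
-- tstzrange(a, b) defaults to '[)' bounds, so "range @> x" means a <= x AND x < b.
SELECT DISTINCT "events"."id", "calendars"."user_id"
FROM "events"
LEFT JOIN "calendars" ON "events"."calendar_id" = "calendars"."id"
WHERE "events"."deleted_at" IS NULL
  AND lower(time_range) >= '2016-04-21T12:12:36-07:00'::timestamptz
  AND lower(time_range) <  '2016-04-21T12:22:36-07:00'::timestamptz
  AND ("status" IS NULL OR (status->>'pre_processed') IS NULL);
```

The expression in the WHERE clause has to match the indexed expression, lower(time_range), for the existing index to be a candidate. The status and deleted_at conditions would then be applied as filters on the rows the index returns.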