Why does Postgres refuse to use a composite index in some setups?

So, I have a table that looks like this:

                                  Table "public.rule_traffic"
          Column       |  Type   |                       Modifiers
    -------------------+---------+--------------------------------------------------------
     id                | bigint  | not null default nextval('rule_traffic_seq'::regclass)
     device_id         | integer | not null
     version_id        | integer | not null
     policy_name       | text    |
     rule_uid          | uuid    | not null
     traffic_hash_code | bigint  | not null
     action            | integer |

It has these indexes:

"rule_traffic_pkey" PRIMARY KEY, btree (id) 
"unique_device_id_version_id_policy_name_uid_in_rule_traffic" UNIQUE, btree (device_id, version_id, policy_name, rule_uid)

When I run a test query on my setup (and on many others), it looks like I'm actually using the defined index unique_device_id_version_id_policy_name_uid_in_rule_traffic:

                   QUERY PLAN 
HashAggregate (cost=8.29..8.30 rows=1 width=56) (actual time=1.563..1.563 rows=0 loops=1) 
-> Index Scan using unique_device_id_version_id_policy_name_uid_in_rule_traffic on rule_traffic this_ (cost=0.00..8.28 rows=1 width=56) (actual time=1.558..1.558 rows=0 loops=1) 
    Index Cond: ((device_id = 11) AND (policy_name IS NULL)) 
    Filter: ((rule_uid = 'f6c0dc29-e741-4f9a-adf1-f11d18768af3'::uuid) OR (rule_uid = 'c1a12087-2d85-4e44-a115-f9cad7ec915e'::uuid)) 
Total runtime: 1.704 ms

But there is one setup with a completely different query plan (a sequential scan):

                    QUERY PLAN 
HashAggregate (cost=150538.23..150538.25 rows=2 width=56) (actual time=2403.600..2403.601 rows=2 loops=1) 
-> Seq Scan on rule_traffic this_ (cost=0.00..150538.20 rows=4 width=56) (actual time=2354.481..2403.573 rows=2 loops=1) 
    Filter: ((policy_name IS NULL) AND (device_id = 11) AND ((rule_uid = 'f6c0dc29-e741-4f9a-adf1-f11d18768af3'::uuid) OR (rule_uid = 'c1a12087-2d85-4e44-a115-f9cad7ec915e'::uuid))) 
Total runtime: 2403.661 ms

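I tried running VACUUM FULL and ANALYZE on the table, with no effect.

Does anyone have any idea why Postgres decides not to use the composite index?

Update 1:

Trying to force it not to use a sequential scan. First, the query as it normally runs: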

securetrack=# explain analyze select max(this_.id) as y0_, this_.rule_uid as y1_, this_.policy_name as y2_ from rule_traffic this_ where this_.device_id=11 and ((this_.rule_uid='f6c0dc29-e741-4f9a-adf1-f11d18768af3' and this_.policy_name is null) OR (this_.rule_uid = 'c1a12087-2d85-4e44-a115-f9cad7ec915e' and this_.policy_name is null)) group by this_.rule_uid, this_.policy_name; 

QUERY PLAN 
HashAggregate (cost=209498.38..209498.40 rows=2 width=56) (actual time=2475.980..2475.981 rows=2 loops=1) 
    -> Seq Scan on rule_traffic this_ (cost=0.00..209498.35 rows=4 width=56) (actual time=1631.945..2475.950 rows=3 loops=1) 
    Filter: ((policy_name IS NULL) AND (device_id = 11) AND ((rule_uid = 'f6c0dc29-e741-4f9a-adf1-f11d18768af3'::uuid) OR (rule_uid = 'c1a12087-2d85-4e44-a115-f9cad7ec915e'::uuid))) 
Total runtime: 2476.038 ms 
(4 rows)

Setting enable_seqscan = false:

securetrack=# SET enable_seqscan=false; 
SET 
securetrack=# explain analyze select max(this_.id) as y0_, this_.rule_uid as y1_, this_.policy_name as y2_ from rule_traffic this_ where this_.device_id=11 and ((this_.rule_uid='f6c0dc29-e741-4f9a-adf1-f11d18768af3' and this_.policy_name is null) OR (this_.rule_uid = 'c1a12087-2d85-4e44-a115-f9cad7ec915e' and this_.policy_name is null)) group by this_.rule_uid, this_.policy_name; 
                          QUERY PLAN 
HashAggregate (cost=371469.08..371469.10 rows=2 width=56) (actual time=2936.608..2936.610 rows=2 loops=1) 
    -> Bitmap Heap Scan on rule_traffic this_ (cost=197981.02..371469.05 rows=4 width=56) (actual time=2308.843..2936.577 rows=3 loops=1) 
    Recheck Cond: ((device_id = 11) AND (policy_name IS NULL)) 
    Filter: ((rule_uid = 'f6c0dc29-e741-4f9a-adf1-f11d18768af3'::uuid) OR (rule_uid = 'c1a12087-2d85-4e44-a115-f9cad7ec915e'::uuid)) 
    -> Bitmap Index Scan on unique_device_id_version_id_policy_name_uid_in_rule_traffic (cost=0.00..197981.02 rows=5774287 width=0) (actual time=1283.603..1283.603 rows=5849739 loops=1) 
      Index Cond: ((device_id = 11) AND (policy_name IS NULL)) 
Total runtime: 2936.680 ms 
(7 rows)

It looks like the cost is actually higher. How is that possible?

2017-02-19 yairo

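Strange. Could you try what happens if you execute 'SET enable_seqscan = false;' before running your query? If it still does a sequential scan, there is some reason it cannot use the index; otherwise it simply thinks using the index is a bad idea. By the way, which PostgreSQL version is this? – Eelke

PostgreSQL is doing the right thing here.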

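If you look at the query plan where index use is forced, you will see that the index scan finds 5849739 rows for (device_id = 11) AND (policy_name IS NULL), and every one of those rows has to be fetched from the table and rechecked, only for the rule_uid filter to keep 3 of them.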
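To see this selectivity directly, a counting query along the following lines may help. It is only a sketch that reuses the table and literal values from the question; the output column names are made up for illustration, and the query itself reads the whole table.

```sql
-- Compare how many rows each part of the WHERE clause lets through.
-- Column aliases (total_rows, index_cond_rows, final_rows) are illustrative only.
SELECT count(*) AS total_rows,
       sum(CASE WHEN device_id = 11 AND policy_name IS NULL
                THEN 1 ELSE 0 END) AS index_cond_rows,
       sum(CASE WHEN device_id = 11 AND policy_name IS NULL
                 AND rule_uid IN ('f6c0dc29-e741-4f9a-adf1-f11d18768af3',
                                  'c1a12087-2d85-4e44-a115-f9cad7ec915e')
                THEN 1 ELSE 0 END) AS final_rows
FROM rule_traffic;
```

If index_cond_rows is a large share of total_rows, as the bitmap index scan above suggests, the part of the condition the index can use does very little filtering on this setup.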

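Scanning such a large part of the index and then rechecking all the table rows it points to is more expensive than a sequential scan of the whole table (sequential reads are usually faster than random reads).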
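If the query needs to be fast on this setup, one option that is not part of the discussion above would be an index whose leading columns match the selective predicates. The following is a sketch under that assumption: the index name and column choice are hypothetical, and whether the planner picks it up depends on the data and the PostgreSQL version.

```sql
-- Hypothetical index: name and columns are assumptions, not from the thread.
-- The idea is that (device_id, rule_uid) narrows the scan to a handful of rows
-- instead of every row for device 11 with a NULL policy_name.
CREATE INDEX rule_traffic_device_rule_uid_idx
    ON rule_traffic (device_id, rule_uid);

-- Refresh planner statistics after creating the index.
ANALYZE rule_traffic;

-- Check whether the plan changes for the query from the question.
EXPLAIN ANALYZE
SELECT max(this_.id) AS y0_, this_.rule_uid AS y1_, this_.policy_name AS y2_
FROM rule_traffic this_
WHERE this_.device_id = 11
  AND ((this_.rule_uid = 'f6c0dc29-e741-4f9a-adf1-f11d18768af3' AND this_.policy_name IS NULL)
    OR (this_.rule_uid = 'c1a12087-2d85-4e44-a115-f9cad7ec915e' AND this_.policy_name IS NULL))
GROUP BY this_.rule_uid, this_.policy_name;
```

An extra index costs disk space and slows down writes, so it is worth confirming with the plan output that it is actually used before keeping it.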

Using EXPLAIN (ANALYZE, BUFFERS) is helpful here, because it tells you how many database blocks were actually accessed.
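For the query in the question, that could look roughly like this. It is a sketch that assumes a PostgreSQL version recent enough to accept the parenthesized option list; run it once with default settings and once with sequential scans disabled to compare the two plans.

```sql
-- Plan chosen by the planner, with block access counts per plan node.
EXPLAIN (ANALYZE, BUFFERS)
SELECT max(this_.id) AS y0_, this_.rule_uid AS y1_, this_.policy_name AS y2_
FROM rule_traffic this_
WHERE this_.device_id = 11
  AND ((this_.rule_uid = 'f6c0dc29-e741-4f9a-adf1-f11d18768af3' AND this_.policy_name IS NULL)
    OR (this_.rule_uid = 'c1a12087-2d85-4e44-a115-f9cad7ec915e' AND this_.policy_name IS NULL))
GROUP BY this_.rule_uid, this_.policy_name;

-- Same query with sequential scans discouraged, for this session only.
SET enable_seqscan = off;

EXPLAIN (ANALYZE, BUFFERS)
SELECT max(this_.id) AS y0_, this_.rule_uid AS y1_, this_.policy_name AS y2_
FROM rule_traffic this_
WHERE this_.device_id = 11
  AND ((this_.rule_uid = 'f6c0dc29-e741-4f9a-adf1-f11d18768af3' AND this_.policy_name IS NULL)
    OR (this_.rule_uid = 'c1a12087-2d85-4e44-a115-f9cad7ec915e' AND this_.policy_name IS NULL))
GROUP BY this_.rule_uid, this_.policy_name;

-- Restore the default so later queries are planned normally.
RESET enable_seqscan;
```

Each plan node then carries a Buffers line with the number of shared blocks hit in cache and read from disk, which makes it easier to compare how much of the table and index each plan touches.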

2017-02-20 07:56:36

Thanks! That explains it! – yairo
