2011-04-18

Is it possible to distribute the LIMIT clause to the subqueries? I join the results of this query:

SELECT 
     twitter_personas.id 
    , 'TwitterPersona' 
    , twitter_personas.name 
    FROM twitter_personas 
UNION ALL 
    SELECT 
     facebook_personas.id 
    , 'FacebookPersona' 
    , facebook_personas.name 
    FROM facebook_personas 
-- and more UNION ALL statements pertaining to my other services 

to a table of scores. The JOIN itself is not the problem, but the query plan is "wrong": PostgreSQL finds the top 50 scores and then joins them against the entire view above, which means it does a huge amount of work even though we only care about the top 50. And 50 is a variable: it may change (depending on UI concerns, possibly pagination at some point, yada yada).
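For reference, the slow query has roughly this shape (a sketch, not the exact statement: `personas_view` stands for the UNION ALL view above, and `xs` is the scores table):

```sql
-- Hypothetical shape of the slow query: join the unioned view to the
-- scores table and take the overall top 50.
SELECT personas_view.id
     , personas_view.type
     , personas_view.name
     , xs.value
FROM personas_view
INNER JOIN xs ON xs.persona_id = personas_view.id
ORDER BY xs.value DESC
LIMIT 50;
```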

A query in which I apply the limit directly inside each subquery runs very fast:

SELECT 
    personas.id 
    , personas.type 
    , personas.name 
    , xs.value 
FROM (
    SELECT 
        twitter_personas.id 
        , 'TwitterPersona' 
        , twitter_personas.name 
    FROM twitter_personas 
    WHERE id IN (
        SELECT persona_id 
        FROM xs 
        ORDER BY xs.value DESC 
        LIMIT 50) 
    UNION ALL 
    SELECT 
        facebook_personas.id 
        , 'FacebookPersona' 
        , facebook_personas.name 
    FROM facebook_personas 
    WHERE id IN (
        SELECT persona_id 
        FROM xs 
        ORDER BY xs.value DESC 
        LIMIT 50)) AS personas(id, type, name) 
INNER JOIN xs ON xs.persona_id = personas.id 
ORDER BY xs.value DESC 
LIMIT 50;

My question is: how can I push the 50 from the outer query down into the inner queries? The rewritten query executes very quickly (90 ms) compared with the original union, which takes 15 seconds to compute the full UNION ALL result set. Or perhaps there is a better way to do this altogether?
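Since the 50 is a variable, one option (a sketch using PostgreSQL's standard PREPARE/EXECUTE; the statement name `top_personas` is made up) is to bind it once as a parameter, so the same value flows into every subquery and the outer LIMIT:

```sql
-- Bind the page size once; $1 is reused in every branch and at the top.
PREPARE top_personas(int) AS
SELECT personas.id
     , personas.type
     , personas.name
     , xs.value
FROM (
    SELECT twitter_personas.id
         , 'TwitterPersona'
         , twitter_personas.name
    FROM twitter_personas
    WHERE id IN (
        SELECT persona_id FROM xs ORDER BY xs.value DESC LIMIT $1)
    UNION ALL
    SELECT facebook_personas.id
         , 'FacebookPersona'
         , facebook_personas.name
    FROM facebook_personas
    WHERE id IN (
        SELECT persona_id FROM xs ORDER BY xs.value DESC LIMIT $1)
) AS personas(id, type, name)
INNER JOIN xs ON xs.persona_id = personas.id
ORDER BY xs.value DESC
LIMIT $1;

EXECUTE top_personas(50);
```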

Here are my query plans, for reference. First the "bad" one, which takes almost 15 seconds:

                   QUERY PLAN 
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 
Limit (cost=0.00..31072.27 rows=50 width=176) (actual time=304.299..14403.551 rows=50 loops=1) 
    -> Subquery Scan personas_ranked (cost=0.00..253116556.67 rows=407303 width=176) (actual time=304.298..14403.511 rows=50 loops=1) 
     -> Nested Loop Left Join (cost=0.00..253112483.64 rows=407303 width=112) (actual time=304.297..14403.474 rows=50 loops=1) 
       -> Nested Loop (cost=0.00..252998394.22 rows=407303 width=108) (actual time=304.283..14402.815 rows=50 loops=1) 
        Join Filter: ("*SELECT* 1".id = xs.persona_id) 
        -> Index Scan Backward using xs_value_index on xs xs (cost=0.00..459.97 rows=10275 width=12) (actual time=0.013..0.208 rows=50 loops=1) 
        -> Append (cost=0.00..15458.35 rows=407303 width=88) (actual time=0.006..244.217 rows=398435 loops=50) 
          -> Subquery Scan "*SELECT* 1" (cost=0.00..15420.65 rows=406562 width=88) (actual time=0.006..199.945 rows=398434 loops=50) 
           -> Seq Scan on twitter_personas (cost=0.00..11355.02 rows=406562 width=88) (actual time=0.005..134.607 rows=398434 loops=50) 
          -> Subquery Scan "*SELECT* 2" (cost=0.00..14.88 rows=150 width=502) (actual time=0.002..0.002 rows=0 loops=49) 
           -> Seq Scan on email_personas (cost=0.00..13.38 rows=150 width=502) (actual time=0.001..0.001 rows=0 loops=49) 
          -> Subquery Scan "*SELECT* 3" (cost=0.00..21.80 rows=590 width=100) (actual time=0.001..0.001 rows=0 loops=49) 
           -> Seq Scan on facebook_personas (cost=0.00..15.90 rows=590 width=100) (actual time=0.001..0.001 rows=0 loops=49) 
          -> Subquery Scan "*SELECT* 4" (cost=0.00..1.03 rows=1 width=25) (actual time=0.018..0.019 rows=1 loops=49) 
           -> Seq Scan on web_personas (cost=0.00..1.02 rows=1 width=25) (actual time=0.017..0.018 rows=1 loops=49) 
       -> Index Scan using people_personas_pkey on people_personas (cost=0.00..0.27 rows=1 width=8) (actual time=0.007..0.007 rows=0 loops=50) 
        Index Cond: (people_personas.persona_id = "*SELECT* 1".id) 
Total runtime: 14403.711 ms 

And the rewritten query, which takes only 90 ms:

                      QUERY PLAN 
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ 
Limit (cost=2830.93..2831.05 rows=50 width=108) (actual time=83.914..83.925 rows=50 loops=1) 
    -> Sort (cost=2830.93..2832.30 rows=551 width=108) (actual time=83.912..83.918 rows=50 loops=1) 
     Sort Key: xs.value 
     Sort Method: top-N heapsort Memory: 28kB 
     -> Hash Join (cost=875.60..2812.62 rows=551 width=108) (actual time=8.394..79.326 rows=10275 loops=1) 
       Hash Cond: ("*SELECT* 1".id = xs.persona_id) 
       -> Append (cost=588.41..2509.59 rows=551 width=4) (actual time=5.078..69.901 rows=10275 loops=1) 
        -> Subquery Scan "*SELECT* 1" (cost=588.41..1184.14 rows=200 width=4) (actual time=5.078..42.428 rows=10274 loops=1) 
          -> Nested Loop (cost=588.41..1182.14 rows=200 width=4) (actual time=5.078..40.220 rows=10274 loops=1) 
           -> HashAggregate (cost=588.41..590.41 rows=200 width=4) (actual time=5.066..7.900 rows=10275 loops=1) 
             -> Index Scan Backward using xs_value_index on xs xs (cost=0.00..459.97 rows=10275 width=12) (actual time=0.005..2.079 rows=10275 loops=1) 
           -> Index Scan using twitter_personas_id_index on twitter_personas (cost=0.00..2.95 rows=1 width=4) (actual time=0.002..0.003 rows=1 loops=10275) 
             Index Cond: (twitter_personas.id = xs.persona_id) 
        -> Subquery Scan "*SELECT* 2" (cost=588.41..649.27 rows=200 width=4) (actual time=13.017..13.017 rows=0 loops=1) 
          -> Nested Loop (cost=588.41..647.27 rows=200 width=4) (actual time=13.016..13.016 rows=0 loops=1) 
           -> HashAggregate (cost=588.41..590.41 rows=200 width=4) (actual time=5.267..6.909 rows=10275 loops=1) 
             -> Index Scan Backward using xs_value_index on xs xs (cost=0.00..459.97 rows=10275 width=12) (actual time=0.007..2.292 rows=10275 loops=1) 
           -> Index Scan using facebook_personas_id_index on facebook_personas (cost=0.00..0.27 rows=1 width=4) (actual time=0.000..0.000 rows=0 loops=10275) 
             Index Cond: (facebook_personas.id = xs.persona_id) 
        -> Subquery Scan "*SELECT* 3" (cost=588.41..648.77 rows=150 width=4) (actual time=12.568..12.568 rows=0 loops=1) 
          -> Nested Loop (cost=588.41..647.27 rows=150 width=4) (actual time=12.566..12.566 rows=0 loops=1) 
           -> HashAggregate (cost=588.41..590.41 rows=200 width=4) (actual time=5.015..6.538 rows=10275 loops=1) 
             -> Index Scan Backward using xs_value_index on xs xs (cost=0.00..459.97 rows=10275 width=12) (actual time=0.002..2.065 rows=10275 loops=1) 
           -> Index Scan using email_personas_id_index on email_personas (cost=0.00..0.27 rows=1 width=4) (actual time=0.000..0.000 rows=0 loops=10275) 
             Index Cond: (email_personas.id = xs.persona_id) 
        -> Subquery Scan "*SELECT* 4" (cost=0.00..27.41 rows=1 width=4) (actual time=0.629..0.630 rows=1 loops=1) 
          -> Nested Loop Semi Join (cost=0.00..27.40 rows=1 width=4) (actual time=0.628..0.628 rows=1 loops=1) 
           Join Filter: (web_personas.id = xs.persona_id) 
           -> Seq Scan on web_personas (cost=0.00..1.01 rows=1 width=4) (actual time=0.003..0.003 rows=1 loops=1) 
           -> Index Scan Backward using xs_value_index on xs xs (cost=0.00..459.97 rows=10275 width=12) (actual time=0.002..0.421 rows=1518 loops=1) 
       -> Hash (cost=158.75..158.75 rows=10275 width=12) (actual time=3.307..3.307 rows=10275 loops=1) 
        -> Seq Scan on xs xs (cost=0.00..158.75 rows=10275 width=12) (actual time=0.006..1.563 rows=10275 loops=1) 
Total runtime: 84.066 ms 

Answers

The reason this does not work is that ORDER BY xs.value DESC is processed before the LIMIT: to know the first (or last) 50 entries, it must (logically) compute all of the entries first. If you put the limit into the branches of the union, you only get the first 50 of those rows that already made it into the top 50 of their persona type, which may be a different set. If that is acceptable to you, you can rewrite the query manually, as you did, but the database system cannot do it for you.
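A sketch of the two semantics (`branch_a` and `branch_b` are hypothetical tables): the outer ORDER BY must logically see every row, whereas a pushed-down LIMIT prunes each branch before the outer ordering ever runs:

```sql
-- Global top 50: ORDER BY is evaluated over the full union before LIMIT.
SELECT id, value
FROM (SELECT id, value FROM branch_a
      UNION ALL
      SELECT id, value FROM branch_b) AS u
ORDER BY value DESC
LIMIT 50;

-- Pushed-down LIMIT: each branch keeps only its own 50 survivors, and the
-- outer query ranks just those. In general this is not the same result set.
SELECT id, value
FROM ((SELECT id, value FROM branch_a ORDER BY value DESC LIMIT 50)
      UNION ALL
      (SELECT id, value FROM branch_b ORDER BY value DESC LIMIT 50)) AS u
ORDER BY value DESC
LIMIT 50;
```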


You are absolutely right: I would get the top 50 for each persona type, so 50 per type. From that smaller result set I would then extract the overall top 50. We are interested in the top 50 personas, whatever type they may be. Thanks for the analysis. – 2011-04-20 05:27:54


The planner joins all rows of xs to each table in the UNION because it cannot know in advance that the join will not affect the result set (and therefore which rows end up in the top 50).

Could you do it in two steps with a temporary table?

create temporary table top50 as 
select xs.persona_id 
, xs.value 
from xs 
order by value desc 
limit 50; 

select * 
from top50 
join personas_view on top50.persona_id = personas_view.id; 
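One possible refinement (an assumption about this workload, not part of the answer above): run ANALYZE on the temporary table before the join so the planner has real row counts, and re-apply the ordering in the final select, since the join does not preserve it:

```sql
-- Optional: give the planner statistics for the 50-row temporary table.
analyze top50;

select * 
from top50 
join personas_view on top50.persona_id = personas_view.id 
order by top50.value desc;
```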