2010-01-06 154 views
1

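SQLite query runs 10 times slower than the MS Access query

I have an 800MB MS Access database that I migrated to SQLite. The structure of the database is as follows (the SQLite database is about 330MB after migration):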

The Occurrence table has 1,600,000 records. It looks like this:

CREATE TABLE Occurrence 
(
SimulationID INTEGER, SimRunID INTEGER, OccurrenceID INTEGER, 
OccurrenceTypeID INTEGER, Period INTEGER, HasSucceeded BOOL, 
PRIMARY KEY (SimulationID, SimRunID, OccurrenceID) 
) 

It has the following indexes:

CREATE INDEX "Occurrence_HasSucceeded_idx" ON "Occurrence" ("HasSucceeded" ASC) 

CREATE INDEX "Occurrence_OccurrenceID_idx" ON "Occurrence" ("OccurrenceID" ASC) 

CREATE INDEX "Occurrence_SimRunID_idx" ON "Occurrence" ("SimRunID" ASC) 

CREATE INDEX "Occurrence_SimulationID_idx" ON "Occurrence" ("SimulationID" ASC) 

The OccurrenceParticipant table has 3.4 million records. It looks like this:

CREATE TABLE OccurrenceParticipant 
(
SimulationID INTEGER,  SimRunID INTEGER, OccurrenceID  INTEGER, 
RoleTypeID  INTEGER,  ParticipantID INTEGER 
) 

It has the following indexes:

CREATE INDEX "OccurrenceParticipant_OccurrenceID_idx" ON "OccurrenceParticipant" ("OccurrenceID" ASC) 

CREATE INDEX "OccurrenceParticipant_ParticipantID_idx" ON "OccurrenceParticipant" ("ParticipantID" ASC) 

CREATE INDEX "OccurrenceParticipant_RoleType_idx" ON "OccurrenceParticipant" ("RoleTypeID" ASC) 

CREATE INDEX "OccurrenceParticipant_SimRunID_idx" ON "OccurrenceParticipant" ("SimRunID" ASC) 

CREATE INDEX "OccurrenceParticipant_SimulationID_idx" ON "OccurrenceParticipant" ("SimulationID" ASC) 

The InitialParticipant table has 130 records. Its structure is:

CREATE TABLE InitialParticipant 
(
ParticipantID INTEGER PRIMARY KEY,  ParticipantTypeID INTEGER, 
ParticipantGroupID  INTEGER 
) 

The table has the following indexes:

CREATE INDEX "initialpart_participantTypeID_idx" ON "InitialParticipant" ("ParticipantGroupID" ASC) 

CREATE INDEX "initialpart_ParticipantID_idx" ON "InitialParticipant" ("ParticipantID" ASC) 

The ParticipantGroup table has 22 records. It looks like this:

CREATE TABLE ParticipantGroup (
ParticipantGroupID INTEGER, ParticipantGroupTypeID  INTEGER, 
Description varchar (50),  PRIMARY KEY( ParticipantGroupID ) 
) 

The table has the following index:

CREATE INDEX "ParticipantGroup_ParticipantGroupID_idx" ON "ParticipantGroup" ("ParticipantGroupID" ASC) 

The tmpSimArgs table has 18 records. It has the following structure:

CREATE TABLE tmpSimArgs (SimulationID varchar, SimRunID int(10)) 

and the following indexes:

CREATE INDEX tmpSimArgs_SimRunID_idx ON tmpSimArgs(SimRunID ASC) 

CREATE INDEX tmpSimArgs_SimulationID_idx ON tmpSimArgs(SimulationID ASC) 

The tmpPartArgs table has 80 records. It has the following structure:

CREATE TABLE tmpPartArgs(participantID INT) 

and the following index:

CREATE INDEX tmpPartArgs_participantID_idx ON tmpPartArgs(participantID ASC) 

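I have a query involving multiple INNER JOINs. The problem I am facing is that the Access version of the query takes about one second, whereas the SQLite version of the same query takes 10 seconds (about 10 times slower!). Migrating back to Access is not possible for me; SQLite is my only option.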

I am new to writing database queries, so these queries may look stupid. Please point out anything you see that is wrong or naive.

The Access query is (the entire query takes 1 second to execute):

SELECT ParticipantGroup.Description, Occurrence.SimulationID, Occurrence.SimRunID, Occurrence.Period, Count(OccurrenceParticipant.ParticipantID) AS CountOfParticipantID FROM 
( 
    ParticipantGroup INNER JOIN InitialParticipant ON ParticipantGroup.ParticipantGroupID = InitialParticipant.ParticipantGroupID 
) INNER JOIN 
(
tmpPartArgs INNER JOIN 
    (
    (
     tmpSimArgs INNER JOIN Occurrence ON (tmpSimArgs.SimRunID = Occurrence.SimRunID) AND (tmpSimArgs.SimulationID = Occurrence.SimulationID) 
    ) INNER JOIN OccurrenceParticipant ON (Occurrence.OccurrenceID = OccurrenceParticipant.OccurrenceID) AND (Occurrence.SimRunID = OccurrenceParticipant.SimRunID) AND (Occurrence.SimulationID = OccurrenceParticipant.SimulationID) 
) ON tmpPartArgs.participantID = OccurrenceParticipant.ParticipantID 
) ON InitialParticipant.ParticipantID = OccurrenceParticipant.ParticipantID WHERE (((OccurrenceParticipant.RoleTypeID)=52 Or (OccurrenceParticipant.RoleTypeID)=49)) AND Occurrence.HasSucceeded = True GROUP BY ParticipantGroup.Description, Occurrence.SimulationID, Occurrence.SimRunID, Occurrence.Period; 

The SQLite query is as follows (this query takes about 10 seconds):

SELECT ij1.Description, ij2.occSimulationID, ij2.occSimRunID, ij2.Period, Count(ij2.occpParticipantID) AS CountOfParticipantID FROM 
(
    SELECT ip.ParticipantGroupID AS ipParticipantGroupID, ip.ParticipantID AS ipParticipantID, ip.ParticipantTypeID, pg.ParticipantGroupID AS pgParticipantGroupID, pg.ParticipantGroupTypeID, pg.Description FROM ParticipantGroup as pg INNER JOIN InitialParticipant AS ip ON pg.ParticipantGroupID = ip.ParticipantGroupID 
) AS ij1 INNER JOIN 
(
    SELECT tpa.participantID AS tpaParticipantID, ij3.* FROM tmpPartArgs AS tpa INNER JOIN 
    (
     SELECT ij4.*, occp.SimulationID as occpSimulationID, occp.SimRunID AS occpSimRunID, occp.OccurrenceID AS occpOccurrenceID, occp.ParticipantID AS occpParticipantID, occp.RoleTypeID FROM 
      (
       SELECT tsa.SimulationID AS tsaSimulationID, tsa.SimRunID AS tsaSimRunID, occ.SimulationID AS occSimulationID, occ.SimRunID AS occSimRunID, occ.OccurrenceID AS occOccurrenceID, occ.OccurrenceTypeID, occ.Period, occ.HasSucceeded FROM tmpSimArgs AS tsa INNER JOIN Occurrence AS occ ON (tsa.SimRunID = occ.SimRunID) AND (tsa.SimulationID = occ.SimulationID) 
     ) AS ij4 INNER JOIN OccurrenceParticipant AS occp ON (occOccurrenceID =  occpOccurrenceID) AND (occSimRunID = occpSimRunID) AND (occSimulationID = occpSimulationID) 
    ) AS ij3 ON tpa.participantID = ij3.occpParticipantID 
) AS ij2 ON ij1.ipParticipantID = ij2.occpParticipantID WHERE (((ij2.RoleTypeID)=52 Or (ij2.RoleTypeID)=49)) AND ij2.HasSucceeded = 1 GROUP BY ij1.Description, ij2.occSimulationID, ij2.occSimRunID, ij2.Period; 

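I don't know what I am doing wrong here. I have all the indexes, but I suspect I am missing some key index that would do the trick. The interesting thing is that, before migrating, my "research" on SQLite suggested that it is faster, smaller and better than Access in every respect. Yet I can't seem to get SQLite to run this query faster than Access. I reiterate that I am new to SQLite and obviously don't have much knowledge or experience, so if any learned soul could help me out, it would be much appreciated.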

+1

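This query is giving me a headache. I can't figure out why you are doing all the subqueries (sub-selects). Can you explain in English (rather than SQL) what you are trying to return from the query? It would make your question easier to answer. – JohnFx 2010-01-06 23:37:25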

+0

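I will explain in English what each sub-select statement does. Since this comment box only holds 600 characters, I am posting the explanation as an answer to my question. – 2010-01-07 00:11:38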

Answers

0

I have put together a smaller, scaled-down version of my query. I hope it is clearer than the previous one.

SELECT5 * FROM 
(
SELECT4 FROM ParticipantGroup as pg INNER JOIN InitialParticipant AS ip ON pg.ParticipantGroupID = ip.ParticipantGroupID 
) AS ij1 INNER JOIN 
(
    SELECT3 * FROM tmpPartArgs AS tpa INNER JOIN 
     (
      SELECT2 * FROM 
       (
        SELECT1 * FROM tmpSimArgs AS tsa INNER JOIN Occurrence AS occ ON (tsa.SimRunID = occ.SimRunID) AND (tsa.SimulationID = occ.SimulationID) 
      ) AS ij4 INNER JOIN OccurrenceParticipant AS occp ON (occOccurrenceID =  occpOccurrenceID) AND (occSimRunID = occpSimRunID) AND (occSimulationID = occpSimulationID) 
    ) AS ij3 ON tpa.participantID = ij3.occpParticipantID 
) AS ij2 ON ij1.ipParticipantID = ij2.occpParticipantID WHERE (((ij2.RoleTypeID)=52 Or (ij2.RoleTypeID)=49)) AND ij2.HasSucceeded = 1 

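The application I am working on is a simulation application, and to understand the context of the query above I think a brief description of the application is necessary. Let us assume there is a planet with some initial resources and living agents. The planet is allowed to exist for 1000 years, and the actions performed by the agents are monitored and stored in the database. After 1000 years the planet is destroyed and re-created with the same set of initial resources and living agents as the first time. This (creating and destroying) is repeated 18 times, and all the actions performed by all the agents during those 1000 years are stored in the database.

Our entire experiment thus consists of 18 re-creations and is termed a "Simulation". Each of the 18 times the planet is re-created is termed a run, and each of the 1000 years of a run is termed a period. So a "Simulation" consists of 18 runs, and each run consists of 1000 periods. At the start of each run, we assign the "Simulation" an initial set of knowledge items and dynamic agents that interact with each other and with the items. Knowledge items are stored by agents in knowledge stores. A knowledge store is also considered a participating entity in our simulation, but this concept (knowledge stores) is not important here. Below I try to describe each SELECT statement and the tables involved.

SELECT1: I think this query could be replaced by the Occurrence table alone, since it does not do much. The Occurrence table stores the different actions taken by agents in each period of each run of a particular "Simulation". Normally each "Simulation" consists of 18 runs, and each run consists of 1000 periods. An agent may take an action in every period of every run of the "Simulation". However, the Occurrence table does not store any details about the agents that perform the actions. The Occurrence table may store data relating to multiple "Simulations".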

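SELECT2: This query simply returns the details of the actions performed in each period of each run of the "Simulation", together with the details of all participants of the "Simulation", such as their respective ParticipantID. The OccurrenceParticipant table stores a record for every participating entity of the Simulation, including agents, knowledge stores, knowledge items, etc.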

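SELECT3: This query returns only those records of the pseudo-table ij3 that are due to agents and knowledge items. All records in ij3 concerning knowledge stores are filtered out.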

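SELECT4: This query attaches the 'Description' field to each record of InitialParticipant. Note that the 'Description' column is an output column of the overall query. The InitialParticipant table contains a record for each agent and each knowledge item initially assigned to the "Simulation".

SELECT5: This final query returns the records whose participating entity (which may be an agent or a knowledge item) has a RoleType of 49 or 52.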

+3

Why not just edit your question instead of posting this as an answer? – 2010-01-07 03:31:21

2

I have reformatted your code (using my home-brew SQL formatter) to hopefully make it easier for others to read.

Reformatted query:

SELECT 
    ij1.Description, 
    ij2.occSimulationID, 
    ij2.occSimRunID, 
    ij2.Period, 
    Count(ij2.occpParticipantID) AS CountOfParticipantID 

FROM (

    SELECT 
     ip.ParticipantGroupID AS ipParticipantGroupID, 
     ip.ParticipantID AS ipParticipantID, 
     ip.ParticipantTypeID, 
     pg.ParticipantGroupID AS pgParticipantGroupID, 
     pg.ParticipantGroupTypeID, 
     pg.Description 

    FROM ParticipantGroup AS pg 

    INNER JOIN InitialParticipant AS ip 
      ON pg.ParticipantGroupID = ip.ParticipantGroupID 

) AS ij1 

INNER JOIN (

    SELECT 
     tpa.participantID AS tpaParticipantID, 
     ij3.* 

    FROM tmpPartArgs AS tpa 

    INNER JOIN (

     SELECT 
      ij4.*, 
      occp.SimulationID AS occpSimulationID, 
      occp.SimRunID AS occpSimRunID, 
      occp.OccurrenceID AS occpOccurrenceID, 
      occp.ParticipantID AS occpParticipantID, 
      occp.RoleTypeID 

     FROM (

      SELECT 
       tsa.SimulationID AS tsaSimulationID, 
       tsa.SimRunID AS tsaSimRunID, 
       occ.SimulationID AS occSimulationID, 
       occ.SimRunID AS occSimRunID, 
       occ.OccurrenceID AS occOccurrenceID, 
       occ.OccurrenceTypeID, 
       occ.Period, 
       occ.HasSucceeded 

      FROM tmpSimArgs AS tsa 

      INNER JOIN Occurrence AS occ 
        ON (tsa.SimRunID = occ.SimRunID) 
        AND (tsa.SimulationID = occ.SimulationID) 

     ) AS ij4 

     INNER JOIN OccurrenceParticipant AS occp 
       ON (occOccurrenceID = occpOccurrenceID) 
       AND (occSimRunID = occpSimRunID) 
       AND (occSimulationID = occpSimulationID) 

    ) AS ij3 
     ON tpa.participantID = ij3.occpParticipantID 

) AS ij2 
    ON ij1.ipParticipantID = ij2.occpParticipantID 

WHERE (

    (

     (ij2.RoleTypeID) = 52 
     OR 
     (ij2.RoleTypeID) = 49 

    ) 

) 
    AND ij2.HasSucceeded = 1 

GROUP BY 
    ij1.Description, 
    ij2.occSimulationID, 
    ij2.occSimRunID, 
    ij2.Period; 

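Like JohnFx (above), I was confused by the derived views. I don't think they are actually needed, especially since they are all inner joins. So below I have tried to reduce the complexity. Please review it and test the performance. I had to use a cross join with tmpSimArgs because it is only joined to Occurrence; I presume that is the desired behaviour.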

SELECT 
    pg.Description, 
    occ.SimulationID, 
    occ.SimRunID, 
    occ.Period, 
    COUNT(occp.ParticipantID) AS CountOfParticipantID 

FROM ParticipantGroup AS pg 

INNER JOIN InitialParticipant AS ip 
     ON pg.ParticipantGroupID = ip.ParticipantGroupID 

CROSS JOIN tmpSimArgs AS tsa 

INNER JOIN Occurrence AS occ 
     ON tsa.SimRunID = occ.SimRunID 
     AND tsa.SimulationID = occ.SimulationID 

INNER JOIN OccurrenceParticipant AS occp 
     ON occ.OccurrenceID = occp.OccurrenceID 
     AND occ.SimRunID = occp.SimRunID 
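     AND occ.SimulationID = occp.SimulationID 
     AND ip.ParticipantID = occp.ParticipantID -- same condition as ij1.ipParticipantID = ij2.occpParticipantID in the original query 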

INNER JOIN tmpPartArgs AS tpa 
     ON tpa.participantID = occp.ParticipantID 

WHERE occ.HasSucceeded = 1 
    AND (occp.RoleTypeID = 52 OR occp.RoleTypeID = 49) 

GROUP BY 
    pg.Description, 
    occ.SimulationID, 
    occ.SimRunID, 
    occ.Period; 
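
When testing the performance, it may help to look at the plan SQLite chooses. The join to OccurrenceParticipant matches on three columns, but the question only lists single-column indexes on that table, so a composite index is one thing worth trying. The following is only a sketch under assumptions: it uses Python's built-in sqlite3 module purely to make the SQL runnable, works on an empty in-memory copy of three of the tables, and the index name OccurrenceParticipant_occ_key_idx is made up for illustration. Whether the planner actually picks this index, and whether it helps, has to be checked on the real database.

```python
import sqlite3

# Empty in-memory copy of three tables from the question (schema as posted).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Occurrence (
    SimulationID INTEGER, SimRunID INTEGER, OccurrenceID INTEGER,
    OccurrenceTypeID INTEGER, Period INTEGER, HasSucceeded BOOL,
    PRIMARY KEY (SimulationID, SimRunID, OccurrenceID)
);
CREATE TABLE OccurrenceParticipant (
    SimulationID INTEGER, SimRunID INTEGER, OccurrenceID INTEGER,
    RoleTypeID INTEGER, ParticipantID INTEGER
);
CREATE TABLE tmpSimArgs (SimulationID varchar, SimRunID int(10));

-- Hypothetical composite index (name invented for this sketch) that
-- covers all three columns used to join Occurrence to OccurrenceParticipant.
CREATE INDEX OccurrenceParticipant_occ_key_idx
    ON OccurrenceParticipant (SimulationID, SimRunID, OccurrenceID);
""")

# Refresh the planner statistics; this matters on the populated database,
# on empty tables it is effectively a no-op.
conn.execute("ANALYZE")

plan_sql = """
EXPLAIN QUERY PLAN
SELECT occ.Period, COUNT(occp.ParticipantID)
FROM tmpSimArgs AS tsa
INNER JOIN Occurrence AS occ
    ON tsa.SimRunID = occ.SimRunID
    AND tsa.SimulationID = occ.SimulationID
INNER JOIN OccurrenceParticipant AS occp
    ON occ.OccurrenceID = occp.OccurrenceID
    AND occ.SimRunID = occp.SimRunID
    AND occ.SimulationID = occp.SimulationID
WHERE occ.HasSucceeded = 1
    AND occp.RoleTypeID IN (49, 52)
GROUP BY occ.Period
"""

# The last column of each EXPLAIN QUERY PLAN row is the readable detail text.
plan = [row[-1] for row in conn.execute(plan_sql)]
for step in plan:
    print(step)
```

Running the same EXPLAIN QUERY PLAN before and after creating the index, on the real data, shows whether a full table scan of OccurrenceParticipant turns into an index search.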
0

I suggest moving the ij2.RoleTypeID filter from the outermost query into ij3, using IN instead of OR, and moving the HasSucceeded filter into query ij4.
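
A minimal sketch of that suggestion, assuming a cut-down version of the query (only the ij4 and ij3 levels, grouped by Period) and a handful of made-up rows. It uses Python's built-in sqlite3 module only so that the two SQL variants can be run side by side; it checks that they return the same rows and says nothing about speed on the real data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Occurrence (
    SimulationID INTEGER, SimRunID INTEGER, OccurrenceID INTEGER,
    OccurrenceTypeID INTEGER, Period INTEGER, HasSucceeded BOOL,
    PRIMARY KEY (SimulationID, SimRunID, OccurrenceID)
);
CREATE TABLE OccurrenceParticipant (
    SimulationID INTEGER, SimRunID INTEGER, OccurrenceID INTEGER,
    RoleTypeID INTEGER, ParticipantID INTEGER
);
CREATE TABLE tmpSimArgs (SimulationID varchar, SimRunID int(10));
""")

# Made-up sample rows: only run 1 of simulation 1 is selected.
conn.execute("INSERT INTO tmpSimArgs VALUES (1, 1)")
conn.executemany("INSERT INTO Occurrence VALUES (?, ?, ?, ?, ?, ?)", [
    (1, 1, 1, 0, 1, 1),
    (1, 1, 2, 0, 1, 0),   # did not succeed
    (1, 1, 3, 0, 2, 1),
    (1, 2, 4, 0, 1, 1),   # run 2 is not in tmpSimArgs
])
conn.executemany("INSERT INTO OccurrenceParticipant VALUES (?, ?, ?, ?, ?)", [
    (1, 1, 1, 52, 10),
    (1, 1, 1, 49, 11),
    (1, 1, 1, 7, 12),     # role type is neither 49 nor 52
    (1, 1, 2, 52, 10),
    (1, 1, 3, 49, 10),
    (1, 2, 4, 52, 10),
])

# Shape of the question's query: both filters applied at the outermost level.
filters_outside = """
SELECT ij3.Period, COUNT(ij3.ParticipantID) AS CountOfParticipantID
FROM (
    SELECT ij4.Period AS Period, ij4.HasSucceeded AS HasSucceeded,
           occp.RoleTypeID AS RoleTypeID, occp.ParticipantID AS ParticipantID
    FROM (
        SELECT occ.SimulationID AS SimulationID, occ.SimRunID AS SimRunID,
               occ.OccurrenceID AS OccurrenceID, occ.Period AS Period,
               occ.HasSucceeded AS HasSucceeded
        FROM tmpSimArgs AS tsa
        INNER JOIN Occurrence AS occ
            ON tsa.SimRunID = occ.SimRunID
            AND tsa.SimulationID = occ.SimulationID
    ) AS ij4
    INNER JOIN OccurrenceParticipant AS occp
        ON ij4.OccurrenceID = occp.OccurrenceID
        AND ij4.SimRunID = occp.SimRunID
        AND ij4.SimulationID = occp.SimulationID
) AS ij3
WHERE (ij3.RoleTypeID = 52 OR ij3.RoleTypeID = 49) AND ij3.HasSucceeded = 1
GROUP BY ij3.Period
ORDER BY ij3.Period
"""

# Suggested shape: HasSucceeded filtered inside ij4, RoleTypeID filtered
# with IN at the level where OccurrenceParticipant is joined (ij3).
filters_pushed_down = """
SELECT ij4.Period, COUNT(occp.ParticipantID) AS CountOfParticipantID
FROM (
    SELECT occ.SimulationID AS SimulationID, occ.SimRunID AS SimRunID,
           occ.OccurrenceID AS OccurrenceID, occ.Period AS Period
    FROM tmpSimArgs AS tsa
    INNER JOIN Occurrence AS occ
        ON tsa.SimRunID = occ.SimRunID
        AND tsa.SimulationID = occ.SimulationID
    WHERE occ.HasSucceeded = 1
) AS ij4
INNER JOIN OccurrenceParticipant AS occp
    ON ij4.OccurrenceID = occp.OccurrenceID
    AND ij4.SimRunID = occp.SimRunID
    AND ij4.SimulationID = occp.SimulationID
WHERE occp.RoleTypeID IN (49, 52)
GROUP BY ij4.Period
ORDER BY ij4.Period
"""

rows_outside = conn.execute(filters_outside).fetchall()
rows_pushed_down = conn.execute(filters_pushed_down).fetchall()
print(rows_outside)
print(rows_pushed_down)
```

Both variants return the same counts on the sample rows. The pushed-down form simply makes it explicit that failed occurrences and unwanted role types can be discarded before the later joins; whether it is faster on the real tables should be measured there.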