2016-11-09 49 views
0

How can I improve SQL Server performance for a Hash Match outer join?

I am new to performance tuning, so I am not sure what my approach should be.

This is the query; it takes over 7 minutes to run.

INSERT INTO SubscriberToEncounterMapping(PatientEncounterID, InsuranceSubscriberID) 
    SELECT 
     PV.PatientVisitId AS PatientEncounterID, 
     InsSub.InsuranceSubscriberID 
    FROM 
     DB1.dbo.PatientVisit PV 
    JOIN 
     DB1.dbo.PatientVisitInsurance PVI ON PV.PatientVisitId = PVI.PatientVisitId 
    JOIN 
     DB1.dbo.PatientInsurance PatIns on PatIns.PatientInsuranceId = PVI.PatientInsuranceId 
    JOIN 
     DB1.dbo.PatientProfile PP On PP.PatientProfileId = PatIns.PatientProfileId 
    LEFT OUTER JOIN 
     DB1.dbo.Guarantor G ON PatIns.PatientProfileId = G.PatientProfileId 
    JOIN 
     Warehouse.dbo.InsuranceSubscriber InsSub ON InsSub.InsuranceCarriersID = PatIns.InsuranceCarriersId 
         AND InsSub.OrderForClaims = PatIns.OrderForClaims 
         AND ((InsSub.GuarantorID = G.GuarantorId) OR (InsSub.GuarantorID IS NULL AND G.GuarantorId IS NULL)) 
    JOIN 
     Warehouse.dbo.Encounter E ON E.PatientEncounterID = PV.PatientVisitId  

The execution plan shows a Hash Match (Right Outer Join) operator that accounts for 89% of the query cost.

There is no right outer join in the query, so I do not understand where the problem is.

How can I make the query more efficient?

Here are the Hash Match details: [screenshot]

+0

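First: I don't see your statement using most of the tables you join; your `SELECT` list only references `PV` and `InsSub`. Do you *really* need to join all these tables to get those two pieces of information? –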

+0

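Can you show the details of the Hash Match? What is the probe, and what is the output? It is not clear from the screenshot. My guess is that this predicate is causing your problem: `(InsSub.GuarantorID = G.GuarantorId) OR (InsSub.GuarantorID IS NULL AND G.GuarantorId IS NULL)`. You might want to consider using two queries and combining the results. An `OR` predicate like this often leads to a suboptimal plan, whereas two separate queries can make better use of indexes. – GarethD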
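
For anyone wanting to collect the details requested here, the following is a minimal sketch on throwaway table variables. The tables `@Ins` and `@Guar` and their columns are hypothetical stand-ins, not the real schema. `SET STATISTICS XML ON` makes SQL Server return the actual execution plan together with the result set; for a Hash Match operator, the plan lists the build and probe hash keys and any probe residual.

```sql
-- Hypothetical stand-in tables; names and columns are assumptions for illustration.
DECLARE @Ins  TABLE (PatientProfileId int, GuarantorID int NULL);
DECLARE @Guar TABLE (PatientProfileId int, GuarantorId int NULL);

INSERT INTO @Ins  VALUES (1, 10), (2, NULL);
INSERT INTO @Guar VALUES (1, 10);

SET STATISTICS XML ON;   -- return the actual plan alongside the result set

SELECT i.PatientProfileId, g.GuarantorId
FROM @Ins AS i
LEFT OUTER JOIN @Guar AS g
    ON g.PatientProfileId = i.PatientProfileId
OPTION (HASH JOIN);      -- force a hash join so the operator shows up in this tiny demo

SET STATISTICS XML OFF;
```

Note that a `LEFT OUTER JOIN` in the query text can appear in the plan as a Right Outer Join: the optimizer is free to swap the two inputs of the join, for example to build the hash table from the smaller one. The label alone therefore does not mean the plan differs logically from the query as written.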

+0

@GarethD Maybe use `EXISTS` in the `WHERE` clause instead of these two predicates in the join? – dfundako
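
One possible reading of this suggestion is the null-safe comparison idiom `EXISTS (SELECT a INTERSECT SELECT b)`, which treats two `NULL` values as equal. The sketch below uses hypothetical table variables (`@InsSub`, `@G`) with made-up rows. It only shows that the two predicates select the same rows; whether the rewritten form produces a better plan for the real query would need to be tested.

```sql
-- Hypothetical stand-in tables; not the real schema.
DECLARE @InsSub TABLE (InsuranceSubscriberID int, GuarantorID int NULL);
DECLARE @G      TABLE (GuarantorId int NULL);

INSERT INTO @InsSub VALUES (1, 10), (2, NULL), (3, 30);
INSERT INTO @G      VALUES (10), (NULL);

-- Original style: equality OR both sides NULL.
SELECT s.InsuranceSubscriberID
FROM @InsSub AS s
JOIN @G AS g
    ON s.GuarantorID = g.GuarantorId
    OR (s.GuarantorID IS NULL AND g.GuarantorId IS NULL);

-- Null-safe comparison: INTERSECT treats two NULLs as equal.
SELECT s.InsuranceSubscriberID
FROM @InsSub AS s
JOIN @G AS g
    ON EXISTS (SELECT s.GuarantorID INTERSECT SELECT g.GuarantorId);
```

Both statements return subscribers 1 and 2: subscriber 1 matches on the value 10, subscriber 2 matches on `NULL`, and subscriber 3 has no match.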

Answers

1

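To elaborate on my comment, you could try splitting it into two queries: the first matches on `GuarantorID`, and the second matches when it is `NULL` in `InsuranceSubscriber` and also `NULL` in `Guarantor`, or when the record is missing from `Guarantor` altogether: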

INSERT INTO SubscriberToEncounterMapping(PatientEncounterID, InsuranceSubscriberID) 
SELECT PV.PatientVisitId AS PatientEncounterID, InsSub.InsuranceSubscriberID 
FROM DB1.dbo.PatientVisit PV 
     JOIN DB1.dbo.PatientVisitInsurance PVI 
      ON PV.PatientVisitId = PVI.PatientVisitId 
     JOIN DB1.dbo.PatientInsurance PatIns 
      ON PatIns.PatientInsuranceId = PVI.PatientInsuranceId 
     JOIN DB1.dbo.PatientProfile PP 
      ON PP.PatientProfileId = PatIns.PatientProfileId 
     JOIN DB1.dbo.Guarantor G 
      ON PatIns.PatientProfileId = G.PatientProfileId 
     JOIN Warehouse.dbo.InsuranceSubscriber InsSub 
      ON InsSub.InsuranceCarriersID = PatIns.InsuranceCarriersId 
      AND InsSub.OrderForClaims = PatIns.OrderForClaims 
      AND InsSub.GuarantorID = G.GuarantorId 
     JOIN Warehouse.dbo.Encounter E 
      ON E.PatientEncounterID = PV.PatientVisitId 
UNION ALL 
SELECT PV.PatientVisitId AS PatientEncounterID, InsSub.InsuranceSubscriberID 
FROM DB1.dbo.PatientVisit PV 
     JOIN DB1.dbo.PatientVisitInsurance PVI 
      ON PV.PatientVisitId = PVI.PatientVisitId 
     JOIN DB1.dbo.PatientInsurance PatIns 
      ON PatIns.PatientInsuranceId = PVI.PatientInsuranceId 
     JOIN DB1.dbo.PatientProfile PP 
      ON PP.PatientProfileId = PatIns.PatientProfileId 
     JOIN Warehouse.dbo.InsuranceSubscriber InsSub 
      ON InsSub.InsuranceCarriersID = PatIns.InsuranceCarriersId 
      AND InsSub.OrderForClaims = PatIns.OrderForClaims 
      AND InsSub.GuarantorID IS NULL 
     JOIN Warehouse.dbo.Encounter E 
      ON E.PatientEncounterID = PV.PatientVisitId 
WHERE NOT EXISTS 
     ( SELECT 1 
      FROM DB1.dbo.Guarantor G 
      WHERE PatIns.PatientProfileId = G.PatientProfileId 
      AND  InsSub.GuarantorID IS NOT NULL 
     ); 
+0

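This is definitely a lot faster! But the records returned differ from the original query, so I will have to hold off for now. Still, this is definitely the way to go!! –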
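
One possible cause of the differing row counts: in the second branch, the `NOT EXISTS` subquery tests `InsSub.GuarantorID IS NOT NULL`, which can never be true because the join already requires `InsSub.GuarantorID IS NULL`. The subquery therefore never returns a row, and the branch also keeps patients who do have a guarantor. The sketch below reproduces this on hypothetical, simplified table variables and assumes that `GuarantorId` is never `NULL` in stored `Guarantor` rows, so a `NULL` can only come from the outer join.

```sql
-- Hypothetical, simplified stand-ins for the real tables.
DECLARE @PatIns TABLE (PatientProfileId int, CarrierId int);
DECLARE @G      TABLE (PatientProfileId int, GuarantorId int NOT NULL);
DECLARE @InsSub TABLE (InsuranceSubscriberID int, CarrierId int, GuarantorID int NULL);

INSERT INTO @PatIns VALUES (1, 100), (2, 100);
INSERT INTO @G      VALUES (1, 10);            -- profile 2 has no guarantor row
INSERT INTO @InsSub VALUES (501, 100, 10), (502, 100, NULL);

-- Original predicate shape: returns (1, 501) and (2, 502).
SELECT p.PatientProfileId, s.InsuranceSubscriberID
FROM @PatIns AS p
LEFT JOIN @G AS g
    ON g.PatientProfileId = p.PatientProfileId
JOIN @InsSub AS s
    ON s.CarrierId = p.CarrierId
   AND (s.GuarantorID = g.GuarantorId
        OR (s.GuarantorID IS NULL AND g.GuarantorId IS NULL));

-- Second branch as posted: returns (1, 502) and (2, 502); (1, 502) is extra.
SELECT p.PatientProfileId, s.InsuranceSubscriberID
FROM @PatIns AS p
JOIN @InsSub AS s
    ON s.CarrierId = p.CarrierId
   AND s.GuarantorID IS NULL
WHERE NOT EXISTS (SELECT 1
                  FROM @G AS g
                  WHERE g.PatientProfileId = p.PatientProfileId
                    AND s.GuarantorID IS NOT NULL);

-- Second branch with the test moved to the Guarantor row: returns (2, 502) only.
SELECT p.PatientProfileId, s.InsuranceSubscriberID
FROM @PatIns AS p
JOIN @InsSub AS s
    ON s.CarrierId = p.CarrierId
   AND s.GuarantorID IS NULL
WHERE NOT EXISTS (SELECT 1
                  FROM @G AS g
                  WHERE g.PatientProfileId = p.PatientProfileId
                    AND g.GuarantorId IS NOT NULL);
```

Combined with the first branch, which contributes (1, 501), the adjusted version matches the original result on this toy data. If `Guarantor.GuarantorId` can actually be `NULL`, or a profile can have several guarantor rows, the two forms may still differ and should be compared on real data.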

-2

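I would reorder the joins based on each join's ability to reduce the number of records returned. Whichever join reduces the number of records the most will improve efficiency. Then perform the outer join. Table locking can also be a problem, so add `(NOLOCK)` to avoid waiting on locked records.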

Maybe something like this will work with a little tweaking.

INSERT INTO SubscriberToEncounterMapping (
    PatientEncounterID 
    , InsuranceSubscriberID 
    ) 
SELECT PV.PatientVisitId AS PatientEncounterID 
    , InsSub.InsuranceSubscriberID 
FROM DB1.dbo.PatientVisit PV WITH (NOLOCK) 
INNER JOIN Warehouse.dbo.Encounter E WITH (NOLOCK) 
    ON E.PatientEncounterID = PV.PatientVisitId 
INNER JOIN DB1.dbo.PatientVisitInsurance PVI WITH (NOLOCK) 
    ON PV.PatientVisitId = PVI.PatientVisitId 
INNER JOIN DB1.dbo.PatientInsurance PatIns WITH (NOLOCK) 
    ON PatIns.PatientInsuranceId = PVI.PatientInsuranceId 
INNER JOIN DB1.dbo.PatientProfile PP WITH (NOLOCK) 
    ON PP.PatientProfileId = PatIns.PatientProfileId 
INNER JOIN Warehouse.dbo.InsuranceSubscriber InsSub WITH (NOLOCK) 
    ON InsSub.InsuranceCarriersID = PatIns.InsuranceCarriersId 
     AND InsSub.OrderForClaims = PatIns.OrderForClaims 
LEFT JOIN DB1.dbo.Guarantor G WITH (NOLOCK) 
    ON PatIns.PatientProfileId = G.PatientProfileId 
     AND (
      (InsSub.GuarantorID = G.GuarantorId) 
      OR (
       InsSub.GuarantorID IS NULL 
       AND G.GuarantorId IS NULL 
       ) 
      ) 
+1

How does adding `NOLOCK` affect the hash join operator in the execution plan? – dfundako

+1

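The order in which joins are written has no bearing on the order in which they are executed (unless you use `OPTION (FORCE ORDER)`), so this makes no difference. You may also want to read [Bad habits: Putting NOLOCK everywhere](https://blogs.sentryone.com/aaronbertrand/bad-habits-nolock-everywhere/). It is not a magic performance fix and should be used sparingly, generally only by people who understand and accept the risks. – GarethD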
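
The point about join order can be checked with a small sketch like the one below. The temp tables `#Small` and `#Large` are hypothetical and exist only for illustration. Comparing the actual plans of the three statements shows whether the optimizer picked the same physical order for the first two; the exact plans depend on statistics and the SQL Server version.

```sql
-- Hypothetical temp tables for illustration only.
CREATE TABLE #Small (Id int PRIMARY KEY);
CREATE TABLE #Large (Id int PRIMARY KEY, SmallId int);

INSERT INTO #Small VALUES (1), (2);
INSERT INTO #Large VALUES (1, 1), (2, 1), (3, 2), (4, 2);

-- Logically identical inner joins written in opposite orders; the optimizer
-- is free to choose the same physical join order for both.
SELECT l.Id FROM #Large AS l JOIN #Small AS s ON s.Id = l.SmallId;
SELECT l.Id FROM #Small AS s JOIN #Large AS l ON s.Id = l.SmallId;

-- With the hint, the join order written in the FROM clause is preserved.
SELECT l.Id
FROM #Large AS l
JOIN #Small AS s ON s.Id = l.SmallId
OPTION (FORCE ORDER);

DROP TABLE #Small, #Large;
```

All three statements return the same four rows; only the physical plan may differ.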

+0

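I have found that join order does matter: you can let the optimizer do its job, or you can optimize the joins yourself. I agree that `NOLOCK` may not be needed or ideal, but if something is locked, the query will run faster because it does not wait on the lock. If the hints do not help, remove them. The Hash Match will always be there, but reducing the size of the record sets in the operation should help. – KH1229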