How to improve SQL Server performance of a Hash Match outer join

I am new to performance issues, so I'm not sure what my approach should be.

This is the query; it takes more than 7 minutes to run.

INSERT INTO SubscriberToEncounterMapping(PatientEncounterID, InsuranceSubscriberID) 
    SELECT 
     PV.PatientVisitId AS PatientEncounterID, 
     InsSub.InsuranceSubscriberID 
    FROM 
     DB1.dbo.PatientVisit PV 
    JOIN 
     DB1.dbo.PatientVisitInsurance PVI ON PV.PatientVisitId = PVI.PatientVisitId 
    JOIN 
     DB1.dbo.PatientInsurance PatIns on PatIns.PatientInsuranceId = PVI.PatientInsuranceId 
    JOIN 
     DB1.dbo.PatientProfile PP On PP.PatientProfileId = PatIns.PatientProfileId 
    LEFT OUTER JOIN 
     DB1.dbo.Guarantor G ON PatIns.PatientProfileId = G.PatientProfileId 
    JOIN 
     Warehouse.dbo.InsuranceSubscriber InsSub ON InsSub.InsuranceCarriersID = PatIns.InsuranceCarriersId 
         AND InsSub.OrderForClaims = PatIns.OrderForClaims 
         AND ((InsSub.GuarantorID = G.GuarantorId) OR (InsSub.GuarantorID IS NULL AND G.GuarantorId IS NULL)) 
    JOIN 
     Warehouse.dbo.Encounter E ON E.PatientEncounterID = PV.PatientVisitId

The execution plan shows a Hash Match (Right Outer Join) operator that accounts for 89% of the query cost.

There is no right outer join in the query, so I don't understand where the problem lies.
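For context, a join written as LEFT OUTER JOIN can show up in a plan as a Right Outer Join: the optimizer is free to swap the two inputs of a hash join (typically so that the smaller input builds the hash table), and a left outer join with its inputs swapped is a right outer join. The sketch below only illustrates that the two forms are logically equivalent; the table variables and their data are hypothetical, not the schema from the question.

```sql
-- Hypothetical toy tables, not the tables from the question.
DECLARE @Profile   TABLE (ProfileId int PRIMARY KEY);
DECLARE @Guarantor TABLE (GuarantorId int PRIMARY KEY, ProfileId int);

INSERT INTO @Profile (ProfileId) VALUES (1), (2), (3);
INSERT INTO @Guarantor (GuarantorId, ProfileId) VALUES (10, 1), (11, 3);

-- Written as a LEFT OUTER JOIN: every profile row is preserved.
SELECT P.ProfileId, G.GuarantorId
FROM @Profile AS P
LEFT OUTER JOIN @Guarantor AS G
    ON G.ProfileId = P.ProfileId;

-- Same rows with the inputs swapped and the join written as RIGHT OUTER JOIN.
SELECT P.ProfileId, G.GuarantorId
FROM @Guarantor AS G
RIGHT OUTER JOIN @Profile AS P
    ON G.ProfileId = P.ProfileId;
```

Both statements return the same three rows, with a NULL GuarantorId for profile 2, so the operator name in the plan does not by itself indicate a problem with how the query was written.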

How can I make the query more efficient?

Here are the Hash Match details:

Source

2016-11-09 Gloria Santin

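First: I don't see your 'SELECT' list using any table other than ..... and the one aliased 'InsSub'. Do you *really* need to join all of these tables to get those two pieces of information? –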

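Can you show the details of the hash match? What is the probe, and what is the output? It is not clear from the screenshot. I would guess that this predicate is causing your problem: '(InsSub.GuarantorID = G.GuarantorId) OR (InsSub.GuarantorID IS NULL AND G.GuarantorId IS NULL)'. You may want to consider using two queries and combining the results. An OR predicate like this usually leads to a suboptimal plan, whereas two separate queries can make better use of indexes. – GarethD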

@GarethD Perhaps use EXISTS in the WHERE clause instead of these two predicates in the join? – dfundako

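To elaborate on my comment, you could try splitting it into two queries: the first matches on GuarantorID, and the second matches when it is NULL in InsuranceSubscriber and in Guarantor, or when the record is missing from Guarantor entirely: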

INSERT INTO SubscriberToEncounterMapping(PatientEncounterID, InsuranceSubscriberID) 
SELECT PV.PatientVisitId AS PatientEncounterID, InsSub.InsuranceSubscriberID 
FROM DB1.dbo.PatientVisit PV 
     JOIN DB1.dbo.PatientVisitInsurance PVI 
      ON PV.PatientVisitId = PVI.PatientVisitId 
     JOIN DB1.dbo.PatientInsurance PatIns 
      ON PatIns.PatientInsuranceId = PVI.PatientInsuranceId 
     JOIN DB1.dbo.PatientProfile PP 
      ON PP.PatientProfileId = PatIns.PatientProfileId 
     JOIN DB1.dbo.Guarantor G 
      ON PatIns.PatientProfileId = G.PatientProfileId 
     JOIN Warehouse.dbo.InsuranceSubscriber InsSub 
      ON InsSub.InsuranceCarriersID = PatIns.InsuranceCarriersId 
      AND InsSub.OrderForClaims = PatIns.OrderForClaims 
      AND InsSub.GuarantorID = G.GuarantorId 
     JOIN Warehouse.dbo.Encounter E 
      ON E.PatientEncounterID = PV.PatientVisitId 
UNION ALL 
SELECT PV.PatientVisitId AS PatientEncounterID, InsSub.InsuranceSubscriberID 
FROM DB1.dbo.PatientVisit PV 
     JOIN DB1.dbo.PatientVisitInsurance PVI 
      ON PV.PatientVisitId = PVI.PatientVisitId 
     JOIN DB1.dbo.PatientInsurance PatIns 
      ON PatIns.PatientInsuranceId = PVI.PatientInsuranceId 
     JOIN DB1.dbo.PatientProfile PP 
      ON PP.PatientProfileId = PatIns.PatientProfileId 
     JOIN Warehouse.dbo.InsuranceSubscriber InsSub 
      ON InsSub.InsuranceCarriersID = PatIns.InsuranceCarriersId 
      AND InsSub.OrderForClaims = PatIns.OrderForClaims 
      AND InsSub.GuarantorID IS NULL 
     JOIN Warehouse.dbo.Encounter E 
      ON E.PatientEncounterID = PV.PatientVisitId 
WHERE NOT EXISTS 
     ( SELECT 1 
      FROM DB1.dbo.Guarantor G 
      WHERE PatIns.PatientProfileId = G.PatientProfileId 
      AND  InsSub.GuarantorID IS NOT NULL 
     );

Source

2016-11-09 17:26:54 GarethD

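This is definitely much faster! But it returns different records than the original query. So I will have to hold off on it for now, but this is definitely the way to go!! –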
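One way to track down such a difference is to capture both result sets and compare them with EXCEPT in each direction. The sketch below does this on hypothetical toy tables (simplified stand-ins for PatientInsurance, Guarantor and InsuranceSubscriber, with made-up data and a made-up CarrierId column). On this toy data the second branch as posted returns an extra row: its NOT EXISTS subquery tests 'InsSub.GuarantorID IS NOT NULL', but the join already requires 'InsSub.GuarantorID IS NULL', so the subquery can never return a row. Whether that explains the difference on the real data is an assumption, not something confirmed in the thread.

```sql
-- Toy stand-ins for PatientInsurance, Guarantor and InsuranceSubscriber (hypothetical data).
DECLARE @PatIns    TABLE (PatientInsuranceId int PRIMARY KEY, PatientProfileId int, CarrierId int);
DECLARE @Guarantor TABLE (GuarantorId int PRIMARY KEY, PatientProfileId int);
DECLARE @InsSub    TABLE (InsuranceSubscriberID int PRIMARY KEY, CarrierId int, GuarantorID int NULL);

INSERT INTO @PatIns (PatientInsuranceId, PatientProfileId, CarrierId) VALUES (1, 100, 7), (2, 200, 7);
INSERT INTO @Guarantor (GuarantorId, PatientProfileId) VALUES (50, 100);  -- profile 200 has no guarantor
INSERT INTO @InsSub (InsuranceSubscriberID, CarrierId, GuarantorID) VALUES (900, 7, 50), (901, 7, NULL);

DECLARE @Original TABLE (PatientInsuranceId int, InsuranceSubscriberID int);
DECLARE @Split    TABLE (PatientInsuranceId int, InsuranceSubscriberID int);

-- Original shape: LEFT JOIN to Guarantor plus the OR predicate.
INSERT INTO @Original (PatientInsuranceId, InsuranceSubscriberID)
SELECT P.PatientInsuranceId, S.InsuranceSubscriberID
FROM @PatIns AS P
LEFT JOIN @Guarantor AS G
    ON G.PatientProfileId = P.PatientProfileId
JOIN @InsSub AS S
    ON S.CarrierId = P.CarrierId
   AND (S.GuarantorID = G.GuarantorId
        OR (S.GuarantorID IS NULL AND G.GuarantorId IS NULL));

-- Split shape, keeping the NOT EXISTS condition as posted.
INSERT INTO @Split (PatientInsuranceId, InsuranceSubscriberID)
SELECT P.PatientInsuranceId, S.InsuranceSubscriberID
FROM @PatIns AS P
JOIN @Guarantor AS G
    ON G.PatientProfileId = P.PatientProfileId
JOIN @InsSub AS S
    ON S.CarrierId = P.CarrierId
   AND S.GuarantorID = G.GuarantorId
UNION ALL
SELECT P.PatientInsuranceId, S.InsuranceSubscriberID
FROM @PatIns AS P
JOIN @InsSub AS S
    ON S.CarrierId = P.CarrierId
   AND S.GuarantorID IS NULL
WHERE NOT EXISTS
      ( SELECT 1
        FROM @Guarantor AS G
        WHERE G.PatientProfileId = P.PatientProfileId
          AND S.GuarantorID IS NOT NULL );

-- Rows only in the split result: (1, 901) on this toy data.
SELECT PatientInsuranceId, InsuranceSubscriberID FROM @Split
EXCEPT
SELECT PatientInsuranceId, InsuranceSubscriberID FROM @Original;

-- Rows only in the original result: none on this toy data.
SELECT PatientInsuranceId, InsuranceSubscriberID FROM @Original
EXCEPT
SELECT PatientInsuranceId, InsuranceSubscriberID FROM @Split;
```

On this toy data, changing the subquery condition to 'G.GuarantorId IS NOT NULL' makes both EXCEPT queries return no rows. EXCEPT ignores duplicate rows, so it is worth comparing COUNT(*) for the two result sets as well.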

-2

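I would reorder the joins based on each join's ability to reduce the number of records returned. Whichever join reduces the number of records returned will improve efficiency. Then perform the outer join. Also, table locking can always be an issue, so add (NOLOCK) to prevent records from being locked.

Maybe something like this will work with a little tweaking.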

INSERT INTO SubscriberToEncounterMapping (
    PatientEncounterID 
    , InsuranceSubscriberID 
    ) 
SELECT PV.PatientVisitId AS PatientEncounterID 
    , InsSub.InsuranceSubscriberID 
FROM DB1.dbo.PatientVisit PV WITH (NOLOCK) 
INNER JOIN Warehouse.dbo.Encounter E WITH (NOLOCK) 
    ON E.PatientEncounterID = PV.PatientVisitId 
INNER JOIN DB1.dbo.PatientVisitInsurance PVI WITH (NOLOCK) 
    ON PV.PatientVisitId = PVI.PatientVisitId 
INNER JOIN DB1.dbo.PatientInsurance PatIns WITH (NOLOCK) 
    ON PatIns.PatientInsuranceId = PVI.PatientInsuranceId 
INNER JOIN DB1.dbo.PatientProfile PP WITH (NOLOCK) 
    ON PP.PatientProfileId = PatIns.PatientProfileId 
INNER JOIN Warehouse.dbo.InsuranceSubscriber InsSub WITH (NOLOCK) 
    ON InsSub.InsuranceCarriersID = PatIns.InsuranceCarriersId 
     AND InsSub.OrderForClaims = PatIns.OrderForClaims 
LEFT JOIN DB1.dbo.Guarantor G WITH (NOLOCK) 
    ON PatIns.PatientProfileId = G.PatientProfileId 
     AND (
      (InsSub.GuarantorID = G.GuarantorId) 
      OR (
       InsSub.GuarantorID IS NULL 
       AND G.GuarantorId IS NULL 
       ) 
      )

Source

2016-11-09 17:12:56 KH1229

How does adding NOLOCK affect the hash join operator in the execution plan? – dfundako

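The order in which joins are written has no bearing on the order in which they are executed (unless you use 'OPTION (FORCE ORDER)'), so this makes no difference. You might also want to read [Bad habits: Putting NOLOCK everywhere](https://blogs.sentryone.com/aaronbertrand/bad-habits-nolock-everywhere/). It is not a magic performance fix and should be used sparingly, generally by people who understand and are aware of the risks. – GarethD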
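A minimal sketch of that point, using hypothetical toy table variables rather than the tables from the question: the first two statements list the same inner joins in a different order and are logically equivalent, so the optimizer may choose the same join order for both. Only the third statement, with the 'OPTION (FORCE ORDER)' query hint, asks SQL Server to keep the join order as written. Comparing the actual execution plans of the three statements shows whether the written order made any difference.

```sql
-- Hypothetical toy tables, not the tables from the question.
DECLARE @Visit          TABLE (VisitId int PRIMARY KEY);
DECLARE @Encounter      TABLE (EncounterId int PRIMARY KEY);
DECLARE @VisitInsurance TABLE (VisitId int, InsuranceId int);

INSERT INTO @Visit (VisitId) VALUES (1), (2), (3);
INSERT INTO @Encounter (EncounterId) VALUES (1), (3);
INSERT INTO @VisitInsurance (VisitId, InsuranceId) VALUES (1, 500), (2, 501), (3, 502);

-- Joins written in one order; the optimizer picks the execution order.
SELECT V.VisitId, VI.InsuranceId
FROM @Visit AS V
JOIN @VisitInsurance AS VI ON VI.VisitId = V.VisitId
JOIN @Encounter AS E ON E.EncounterId = V.VisitId;

-- Same inner joins written in a different order; same rows.
SELECT V.VisitId, VI.InsuranceId
FROM @Encounter AS E
JOIN @Visit AS V ON V.VisitId = E.EncounterId
JOIN @VisitInsurance AS VI ON VI.VisitId = V.VisitId;

-- Query hint that preserves the join order indicated by the query syntax.
SELECT V.VisitId, VI.InsuranceId
FROM @Encounter AS E
JOIN @Visit AS V ON V.VisitId = E.EncounterId
JOIN @VisitInsurance AS VI ON VI.VisitId = V.VisitId
OPTION (FORCE ORDER);
```

All three statements return the same rows; the hint only constrains how the plan is built, and forcing an order can just as easily make a query slower.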

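I have found that join order does matter, although if you want the optimizer to do its job, it can optimize the joins on its own. I agree that NOLOCK may not be needed or ideal, but if something is locked, the query will run faster by not waiting on the lock. If it doesn't help, remove them. The hash match will always be there, but reducing the size of the record sets going into the operation should help. – KH1229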
