2016-06-07 118 views
4

How to group-concatenate multiple columns?

Assume this table:

PurchaseID | Customer | Product  | Method
-----------|----------|----------|-------
1          | John     | Computer | Credit
2          | John     | Mouse    | Cash
3          | Will     | Computer | Credit
4          | Will     | Mouse    | Cash
5          | Will     | Speaker  | Cash
6          | Todd     | Computer | Credit

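I want to generate a report of what each customer bought and which payment methods they used.
But I want the report to have one row per customer, like this: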

Customer | Products                 | Methods
---------|--------------------------|-------------
John     | Computer, Mouse          | Credit, Cash
Will     | Computer, Mouse, Speaker | Credit, Cash
Todd     | Computer                 | Credit

What I have found so far is the group-concatenation pattern using the XML PATH method, like this:

SELECT 
    p.Customer, 
    -- STUFF(..., 1, 2, '') strips the leading ', ' separator
    STUFF(
     (SELECT ', ' + xp.Product 
     FROM Purchases xp 
     WHERE xp.Customer = p.Customer 
     FOR XML PATH('')), 1, 2, '') AS Products, 
    STUFF(
     (SELECT ', ' + xp.Method 
     FROM Purchases xp 
     WHERE xp.Customer = p.Customer 
     FOR XML PATH('')), 1, 2, '') AS Methods 
FROM Purchases p 
GROUP BY p.Customer 

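This gives me the result, but I am concerned about its speed.
At first glance there are three separate SELECTs here, two of which are multiplied by the number of purchase rows. That will eventually become slow.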

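So, is there a way to do this with better performance?
I also want to add more columns to aggregate. Should I repeat this STUFF() block for every column? That doesn't sound fast enough.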

Suggestions?

+0

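Well, you are denormalizing your data to do this, so performance is going to be a potential challenge. The XML method is the best way to denormalize data into a delimited list. –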

+0

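Be careful when using 'for xml path': if your data contains, for example, '&', it might surprise you. Aaron Bertrand did a [comparison](http://sqlperformance.com/2014/08/t-sql-queries/sql-server-grouped-concatenation) of different methods that you might want to look at. –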
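
To illustrate the '&' caveat, here is a minimal sketch. The table variable @demo and its sample rows are made up for this example and are not from the question. The first column uses plain FOR XML PATH(''), which returns XML-escaped text; the second adds the TYPE directive and reads the result back with .value(), a commonly used way to get the unescaped string.

```sql
-- Hypothetical sample data containing characters that XML escapes.
DECLARE @demo TABLE (Customer VARCHAR(50), Product VARCHAR(50));

INSERT INTO @demo (Customer, Product)
VALUES ('John', 'Keyboard & Mouse'),
       ('John', 'USB <-> HDMI');

SELECT d.Customer,
       -- Plain FOR XML PATH(''): '&' comes back as '&amp;', '<' as '&lt;', '>' as '&gt;'.
       STUFF((SELECT ', ' + x.Product
              FROM @demo x
              WHERE x.Customer = d.Customer
              FOR XML PATH('')), 1, 2, '') AS Entitized,
       -- TYPE returns an xml value; .value() converts it back to plain text.
       STUFF((SELECT ', ' + x.Product
              FROM @demo x
              WHERE x.Customer = d.Customer
              FOR XML PATH(''), TYPE).value('.', 'NVARCHAR(MAX)'), 1, 2, '') AS PlainText
FROM @demo d
GROUP BY d.Customer;
```

The TYPE/.value() form does extra work, so it may be worth measuring against the plain form if your data is known to contain no special characters.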

Answers

4

Just an idea:

DECLARE @t TABLE (
    Customer VARCHAR(50), 
    Product VARCHAR(50), 
    Method VARCHAR(50), 
    INDEX ix CLUSTERED (Customer) 
) 

INSERT INTO @t (Customer, Product, Method) 
VALUES 
    ('John', 'Computer', 'Credit'), 
    ('John', 'Mouse', 'Cash'), 
    ('Will', 'Computer', 'Credit'), 
    ('Will', 'Mouse', 'Cash'), 
    ('Will', 'Speaker', 'Cash'), 
    ('Todd', 'Computer', 'Credit') 

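SELECT t.Customer 
    -- x holds <a>, value</a> and <b>, value</b> elements; STUFF strips the leading ', '
    , STUFF(CAST(x.query('a/text()') AS NVARCHAR(MAX)), 1, 2, '') AS Product 
    , STUFF(CAST(x.query('b/text()') AS NVARCHAR(MAX)), 1, 2, '') AS Method 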
FROM (
    SELECT DISTINCT Customer 
    FROM @t 
) t 
OUTER APPLY (
    SELECT DISTINCT [a] = CASE WHEN id = 'a' THEN ', ' + val END 
        , [b] = CASE WHEN id = 'b' THEN ', ' + val END 
    FROM @t t2 
    CROSS APPLY (
     VALUES ('a', t2.Product) 
      , ('b', t2.Method) 
    ) t3 (id, val) 
    WHERE t2.Customer = t.Customer 
    FOR XML PATH(''), TYPE 
) t2 (x) 

Output:

Customer   Product                    Method
---------- -------------------------- ------------------
John       Computer, Mouse            Cash, Credit
Todd       Computer                   Credit
Will       Computer, Mouse, Speaker   Cash, Credit

For more of a performance gain, another idea:

IF OBJECT_ID('tempdb.dbo.#EntityValues') IS NOT NULL 
    DROP TABLE #EntityValues 

DECLARE @Values1 VARCHAR(MAX) 
     , @Values2 VARCHAR(MAX) 

SELECT Customer 
    , Product 
    , Method 
    , RowNum = ROW_NUMBER() OVER (PARTITION BY Customer ORDER BY 1/0) 
    , Values1 = CAST(NULL AS VARCHAR(MAX)) 
    , Values2 = CAST(NULL AS VARCHAR(MAX)) 
INTO #EntityValues 
FROM @t 

-- Running concatenation: each variable carries the string built so far on to the next row
UPDATE #EntityValues 
SET 
     @Values1 = Values1 = 
     CASE WHEN RowNum = 1 
      THEN Product 
      ELSE @Values1 + ', ' + Product 
     END 
    , @Values2 = Values2 = 
     CASE WHEN RowNum = 1 
      THEN Method 
      ELSE @Values2 + ', ' + Method 
     END 

SELECT Customer 
     , Values1 = MAX(Values1) 
     , Values2 = MAX(Values2) 
FROM #EntityValues 
GROUP BY Customer 

But it has some limitations (for example, duplicate values are not removed):

Customer      Values1                       Values2
------------- ----------------------------- ----------------------
John          Computer, Mouse               Credit, Cash
Todd          Computer                      Credit
Will          Computer, Mouse, Speaker      Credit, Cash, Cash

Also check out my old article about string aggregation:

http://www.codeproject.com/Articles/691102/String-Aggregation-in-the-World-of-SQL-Server

+0

A useful alternative approach. – niksofteng

+1

Hi Devart, I like that! – Shnugo

+0

@Shnugo Thanks :) Much appreciated... – Devart

1

This is one of the use cases for a recursive CTE (common table expression). You can learn more here: https://technet.microsoft.com/en-us/library/ms190766(v=sql.105).aspx

; 
WITH CTE1 (PurchaseID, Customer, Product, Method, RowID) 
AS 
(
    SELECT 
     PurchaseID, Customer, Product, Method, 
     ROW_NUMBER() OVER (PARTITION BY Customer ORDER BY Customer) 
    FROM 
     @tbl 
     /* This table holds the source data. I omitted declaring it and inserting 
     data into it because that's not important. */ 
) 
, CTE2 (PurchaseID, Customer, Product, Method, RowID) 
AS 
(
    SELECT 
     PurchaseID, Customer, 
     CONVERT(VARCHAR(MAX), Product), 
     CONVERT(VARCHAR(MAX), Method), 
     1 
    FROM 
     CTE1 
    WHERE 
     RowID = 1 
    UNION ALL 
    SELECT 
     CTE2.PurchaseID, CTE2.Customer, 
     CONVERT(VARCHAR(MAX), CTE2.Product + ',' + CTE1.Product), 
     CONVERT(VARCHAR(MAX), CTE2.Method + ',' + CTE1.Method), 
     CTE2.RowID + 1 
    FROM 
     CTE2 INNER JOIN CTE1 
      ON CTE2.Customer = CTE1.Customer 
      AND CTE2.RowID + 1 = CTE1.RowID 
) 

SELECT Customer, MAX(Product) AS Products, MAX(Method) AS Methods 
FROM CTE2 
GROUP BY Customer 

Output:

Customer Products               Methods
John     Computer,Mouse         Credit,Cash
Todd     Computer               Credit
Will     Computer,Mouse,Speaker Credit,Cash,Cash
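
One thing to watch with the recursive approach: by default SQL Server stops a recursive CTE after 100 recursion levels, so a customer with more than about a hundred purchases would make the query fail. Below is a minimal, self-contained sketch of the same idea, reduced to one column for brevity. The table variable @tbl is filled with the sample rows from the question; ordering by PurchaseID and the MAXRECURSION hint are my additions, not part of the answer above.

```sql
-- Sample data from the question, loaded into a table variable.
DECLARE @tbl TABLE (PurchaseID INT, Customer VARCHAR(50),
                    Product VARCHAR(50), Method VARCHAR(50));

INSERT INTO @tbl (PurchaseID, Customer, Product, Method)
VALUES (1, 'John', 'Computer', 'Credit'),
       (2, 'John', 'Mouse',    'Cash'),
       (3, 'Will', 'Computer', 'Credit'),
       (4, 'Will', 'Mouse',    'Cash'),
       (5, 'Will', 'Speaker',  'Cash'),
       (6, 'Todd', 'Computer', 'Credit');

WITH CTE1 AS (
    -- Number each customer's rows; ORDER BY PurchaseID (an assumption)
    -- makes the concatenation order deterministic.
    SELECT Customer, Product,
           ROW_NUMBER() OVER (PARTITION BY Customer ORDER BY PurchaseID) AS RowID
    FROM @tbl
),
CTE2 AS (
    -- Anchor: the first row of each customer.
    SELECT Customer, CONVERT(VARCHAR(MAX), Product) AS Products, RowID
    FROM CTE1
    WHERE RowID = 1
    UNION ALL
    -- Recursive step: append the next row's product to the running string.
    SELECT c2.Customer,
           CONVERT(VARCHAR(MAX), c2.Products + ',' + c1.Product),
           c1.RowID
    FROM CTE2 c2
    INNER JOIN CTE1 c1
        ON c1.Customer = c2.Customer
       AND c1.RowID = c2.RowID + 1
)
SELECT Customer, MAX(Products) AS Products
FROM CTE2
GROUP BY Customer
OPTION (MAXRECURSION 0);  -- 0 removes the default limit of 100 levels
```

Removing the limit only avoids the error; it does not make the recursion any cheaper, so the performance concern raised in the comments still applies.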
+2

Hi, @JamesZ posted a link to a [performance comparison](http://sqlperformance.com/2014/08/t-sql-queries/sql-server-grouped-concatenation) above. You might want to take a look at it. Your code works, but **performance is poor** ... – Shnugo

1

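Another solution for group concatenation is the CLR method; @Aaron Bertrand did a performance comparison of it here. If you can deploy CLR, download the free scripts from http://groupconcat.codeplex.com/. All the details are in the documentation. Your query then simply becomes: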

SELECT Customer,dbo.GROUP_CONCAT(product),dbo.GROUP_CONCAT(method) 
FROM Purchases 
GROUP BY Customer 

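This query is short and easy to remember and use. The XML method also does the job, but remembering the code is a bit difficult (at least for me), and problems such as XML entitization creep in. Those can be fixed, and some of the pitfalls are also described in his blog.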

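Also, from a performance point of view, using .query is time-consuming; I ran into the same performance problem. I hope you find my question at https://dba.stackexchange.com/questions/125771/multiple-column-concatenation useful: check the version 2 nested XML concatenation method given by kenneth fisher, or the unpivot/pivot method suggested by spaggettidba.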