2015-02-08 60 views
1

Find the most frequent value in SQL Server 2012

I want to find the product that each customer buys most often. My dataset looks like this:

CustomerID  ProdID FavouriteProduct 
    1    A    ? 
    1    A    ? 
    1    A    ? 
    1    B    ? 
    1    A    ? 
    1    A    ? 
    1    A    ? 
    1    B    ? 
    2    A    ? 
    2    AN    ? 
    2    G    ? 
    2    C    ? 
    2    C    ? 
    2    F    ? 
    2    D    ? 
    2    C    ? 

There are too many products, so I can't put them in a pivot table.

The result should look like this:

CustomerID  ProdID FavouriteProduct 
    1    A    A 
    1    A    A 
    1    A    A 
    1    B    A 
    1    A    A 
    1    A    A 
    1    A    A 
    1    B    A 
    2    A    C 
    2    AN    C 
    2    G    C 
    2    C    C 
    2    C    C 
    2    F    C 
    2    D    C 
    2    C    C 

The query might be something like this:

Update table 
set FavouriteProduct = (Select 
          CustomerID, Product, Max(Count(Product)) 
         From Table 
         group by CustomerID, Product) FP  
+0

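Pivot has nothing to do with this. First work out the query that returns each customer's favourite product. You're almost there. Then we can help with the update. – 2015-02-08 12:21:03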

+0

@Nick.McDermaid - I know. I only meant that if there were just three or four products, we could easily find the favourite product with a pivot table. But now? – Ariox66 2015-02-08 12:28:19

+1

Go to the bottom of this page, http://www.sql-server-performance.com/2006/find-frequent-values/, and see if you can modify the SQL to return a list of all customers with their favourite product. – 2015-02-08 12:31:44

Answers

1

Thanks Nick, I found a way to find the most frequent value. I'm sharing how it works:

-- [Table] is a placeholder name, bracketed because TABLE is a reserved word
SELECT CustomerID, ProductID, COUNT(*) AS Number
    FROM [Table] A
    GROUP BY CustomerID, ProductID
    HAVING COUNT(*) >= (SELECT MAX(Number)   -- this customer's highest count
                        FROM (SELECT CustomerID, ProductID, COUNT(*) AS Number
                              FROM [Table] B WHERE B.CustomerID = A.CustomerID
                              GROUP BY CustomerID, ProductID) C)
1

In case your SQL doesn't run fast enough and you also have a smaller table of customers, this might work better:

​​
2

Another way to get the most frequent product is to use row_number():

select customerid, productid, 
     max(case when seqnum = 1 then productid end) over (partition by customerid) as favoriteproductid 
from (select customerid, productid, count(*) as cnt, 
      row_number() over (partition by customerid order by count(*) desc) as seqnum 
     from customer c 
     group by customerid, productid 
    ) cp; 
1

This one, based on the example at the end of this page: http://www.sql-server-performance.com/2006/find-frequent-values/, might be faster:

SELECT CustomerID, ProdID, Cnt 
FROM 
(
    SELECT CustomerID, ProdID, COUNT(*) as Cnt, 
    RANK() OVER (
     PARTITION BY CustomerID 
     ORDER BY COUNT(*) DESC 
    ) AS Rnk 
    FROM YourTransactionTable 
    GROUP BY CustomerID, ProdID 
) x 
WHERE Rnk = 1 

This one uses the RANK() function. In this case you don't have to go back to the same table a second time (which means less work).
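One thing to keep in mind with RANK(): if two products tie for the highest count, both get rank 1, so that customer comes back twice. ROW_NUMBER() with an explicit tie-breaker returns exactly one row per customer. The sketch below is only an illustration, not part of the original answer. It runs the ranking query through Python's built-in sqlite3 module as a stand-in for SQL Server (this assumes SQLite 3.25 or later for window functions), and the `Purchases` table and its sample rows are made up for the demonstration.

```python
import sqlite3

# In-memory SQLite database as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Purchases (CustomerID INTEGER, ProdID TEXT)")

# Hypothetical data: customer 1 buys A and B twice each (a tie).
rows = [(1, "A"), (1, "A"), (1, "B"), (1, "B"), (2, "C"), (2, "C"), (2, "D")]
conn.executemany("INSERT INTO Purchases VALUES (?, ?)", rows)


def favourites(window_func, order_by):
    """Return (CustomerID, ProdID) rows ranked first by the given window function."""
    sql = f"""
        SELECT CustomerID, ProdID
        FROM (
            SELECT CustomerID, ProdID,
                   {window_func}() OVER (PARTITION BY CustomerID
                                         ORDER BY {order_by}) AS Rnk
            FROM (SELECT CustomerID, ProdID, COUNT(*) AS Cnt
                  FROM Purchases
                  GROUP BY CustomerID, ProdID) counts
        ) ranked
        WHERE Rnk = 1
        ORDER BY CustomerID, ProdID
    """
    return conn.execute(sql).fetchall()


# RANK() keeps every product that shares the top count.
with_rank = favourites("RANK", "Cnt DESC")
# ROW_NUMBER() with ProdID as tie-breaker keeps one product per customer.
with_row_number = favourites("ROW_NUMBER", "Cnt DESC, ProdID")

print(with_rank)
print(with_row_number)
```

If you need a single favourite per customer (for example, to feed an UPDATE), the ROW_NUMBER() form with a tie-breaker is the safer choice; which product should win a tie is a decision for you to make.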

Now, to update your existing data, I like to wrap my dataset in a WITH to make debugging a bit easier and the final update a bit simpler:

;WITH SRC AS
(
    -- one row per customer: the product with the highest purchase count
    SELECT CustomerID, ProdID, Cnt
    FROM
    (
        SELECT CustomerID, ProdID, COUNT(*) AS Cnt,
        RANK() OVER (PARTITION BY CustomerID
        ORDER BY COUNT(*) DESC) AS Rnk
        FROM TransactionTable
        GROUP BY CustomerID, ProdID
    ) x
    WHERE Rnk = 1
)
UPDATE FavouriteTable
SET Favourite = SRC.ProdID
FROM SRC
WHERE SRC.CustomerID = FavouriteTable.CustomerID
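To check the update end to end, here is a minimal sketch that loads the sample data from the question and fills in the favourite for each customer. It is an illustration under several assumptions: it uses Python's built-in sqlite3 module instead of SQL Server (SQLite 3.25 or later), it swaps the T-SQL `UPDATE ... FROM` for a correlated subquery because older SQLite versions lack `UPDATE ... FROM`, it uses ROW_NUMBER() with a tie-breaker rather than RANK() so each customer matches exactly one row, and the column layout of `FavouriteTable` is guessed.

```python
import sqlite3

# In-memory SQLite database as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TransactionTable (CustomerID INTEGER, ProdID TEXT)")
# Assumed layout: one row per customer with a Favourite column to fill in.
conn.execute(
    "CREATE TABLE FavouriteTable (CustomerID INTEGER PRIMARY KEY, Favourite TEXT)"
)

# Sample purchases taken from the question.
sample = [(1, p) for p in ["A", "A", "A", "B", "A", "A", "A", "B"]]
sample += [(2, p) for p in ["A", "AN", "G", "C", "C", "F", "D", "C"]]
conn.executemany("INSERT INTO TransactionTable VALUES (?, ?)", sample)
conn.executemany("INSERT INTO FavouriteTable (CustomerID) VALUES (?)", [(1,), (2,)])

# The CTE picks one top product per customer; the UPDATE copies it across.
conn.execute("""
    WITH SRC AS (
        SELECT CustomerID, ProdID
        FROM (
            SELECT CustomerID, ProdID,
                   ROW_NUMBER() OVER (PARTITION BY CustomerID
                                      ORDER BY Cnt DESC, ProdID) AS Rn
            FROM (SELECT CustomerID, ProdID, COUNT(*) AS Cnt
                  FROM TransactionTable
                  GROUP BY CustomerID, ProdID) counts
        ) ranked
        WHERE Rn = 1
    )
    UPDATE FavouriteTable
    SET Favourite = (SELECT SRC.ProdID
                     FROM SRC
                     WHERE SRC.CustomerID = FavouriteTable.CustomerID)
""")

# Read the result back as {CustomerID: Favourite}.
favourites = dict(conn.execute("SELECT CustomerID, Favourite FROM FavouriteTable"))
print(favourites)
```

Running the CTE's SELECT on its own first is an easy way to confirm it returns what you expect before you let the UPDATE touch any data.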
2

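To return the rows exactly as described in your question, you can use a table expression (I used a CTE in my example) to first return a popularity count, where a higher number means the product is more popular for that customer.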

WITH RankTable AS (
    SELECT 
    CustomerID, ProductID, COUNT(*) AS Popularity 
    FROM TableA 
    GROUP BY CustomerID, ProductID 
) 

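The full result set can then be returned by first inner joining the original table (TableA) to the table expression (RankTable), and then using a window function to produce the values in the FavoriteProduct column.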

SELECT 
    P.CustomerID 
    , P.ProductID 
    , FIRST_VALUE(P.ProductID) OVER(
     PARTITION BY R.CustomerID 
     ORDER BY R.Popularity DESC, R.ProductID) AS FavoriteProduct 
FROM TableA AS P 
    INNER JOIN RankTable AS R 
    ON P.CustomerID = R.CustomerID 
    AND P.ProductID= R.ProductID;
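As a quick sanity check, the CTE and the SELECT above can be run together against the sample data from the question. This sketch again uses Python's built-in sqlite3 module as a stand-in for SQL Server (an assumption; SQLite 3.25 or later is needed for FIRST_VALUE), with `TableA` created on the spot for the test.

```python
import sqlite3

# In-memory SQLite database as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TableA (CustomerID INTEGER, ProductID TEXT)")

# Sample purchases taken from the question.
sample = [(1, p) for p in ["A", "A", "A", "B", "A", "A", "A", "B"]]
sample += [(2, p) for p in ["A", "AN", "G", "C", "C", "F", "D", "C"]]
conn.executemany("INSERT INTO TableA VALUES (?, ?)", sample)

# CTE counts purchases per customer and product; the join plus FIRST_VALUE
# stamps each original row with that customer's most popular product.
rows = conn.execute("""
    WITH RankTable AS (
        SELECT CustomerID, ProductID, COUNT(*) AS Popularity
        FROM TableA
        GROUP BY CustomerID, ProductID
    )
    SELECT P.CustomerID,
           P.ProductID,
           FIRST_VALUE(P.ProductID) OVER (
               PARTITION BY R.CustomerID
               ORDER BY R.Popularity DESC, R.ProductID) AS FavoriteProduct
    FROM TableA AS P
    INNER JOIN RankTable AS R
        ON P.CustomerID = R.CustomerID
       AND P.ProductID = R.ProductID
""").fetchall()

# Collapse to the distinct (customer, favourite) pairs for a compact check.
favourite_pairs = {(customer, favourite) for customer, _, favourite in rows}
print(len(rows), sorted(favourite_pairs))
```

Every one of the original rows is kept, and each carries the favourite for its customer, which matches the result table shown in the question.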