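Here is my version. I really only presented it as a curiosity, to show another way of thinking about the problem. It turned out to be more useful than that, because it performed better than even Martin Smith's cool "grouped islands" solution. However, once he got rid of some overly expensive aggregate windowing functions and did real aggregates instead, his query started kicking butt.
Solution 1: Runs of 3 months or more, found by checking 1 month ahead and 1 month behind and using a semi-join against the result.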
WITH Months AS (
   -- One row per customer per calendar month with an order, numbered from January 2000
   SELECT DISTINCT
      O.CustID,
      Grp = DateDiff(Month, '20000101', O.OrderDate)
   FROM
      CustOrder O
), Anchors AS (
   -- Ind is the middle month of any three consecutive months that all have orders
   SELECT
      M.CustID,
      Ind = M.Grp + X.Offset
   FROM
      Months M
      CROSS JOIN (
         SELECT -1 UNION ALL SELECT 0 UNION ALL SELECT 1
      ) X (Offset)
   GROUP BY
      M.CustID,
      M.Grp + X.Offset
   HAVING
      Count(*) = 3
)
SELECT
   C.CustName,
   [Year] = Year(OrderDate),
   O.OrderDate
FROM
   Cust C
   INNER JOIN CustOrder O ON C.CustID = O.CustID
WHERE
   EXISTS (
      -- Semi-join: an order is returned once even if several anchor windows contain it
      SELECT 1
      FROM
         Anchors A
      WHERE
         O.CustID = A.CustID
         AND O.OrderDate >= DateAdd(Month, A.Ind, '19991201')
         AND O.OrderDate < DateAdd(Month, A.Ind, '20000301')
   )
ORDER BY
   C.CustName,
   OrderDate;
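To make the anchor step concrete, here is a minimal sketch on made-up data. The month numbers are hypothetical; only the `Months`/`Anchors` logic and the date arithmetic are taken from the query above. Customer 1 has orders in months 10, 11 and 12 (a run) and in month 15 (isolated). Each month "votes" for itself and its two neighbours, so only a month whose two neighbours also have orders collects three votes.

```sql
-- Hypothetical sample data standing in for the Months CTE (not from the original tables).
WITH Months (CustID, Grp) AS (
   SELECT 1, 10 UNION ALL
   SELECT 1, 11 UNION ALL
   SELECT 1, 12 UNION ALL
   SELECT 1, 15
), Anchors AS (
   -- Same grouping as Solution 1: keep month numbers that receive three votes
   SELECT
      M.CustID,
      Ind = M.Grp + X.Offset
   FROM
      Months M
      CROSS JOIN (
         SELECT -1 UNION ALL SELECT 0 UNION ALL SELECT 1
      ) X (Offset)
   GROUP BY
      M.CustID,
      M.Grp + X.Offset
   HAVING
      Count(*) = 3
)
SELECT
   A.CustID,
   A.Ind,
   -- The same date window that the final query compares OrderDate against
   WindowStart = DateAdd(Month, A.Ind, '19991201'),
   WindowEnd = DateAdd(Month, A.Ind, '20000301')
FROM
   Anchors A;
```

This should return a single anchor, Ind = 11, with a window running from 2000-11-01 up to (but not including) 2001-02-01, which is months 10 through 12. With a four-month run such as 10 to 13, both 11 and 12 would qualify and their windows would overlap, which is why Solution 1 filters with EXISTS instead of joining to Anchors.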
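Solution 2: Exact 3-month patterns. If a run is 4 months or longer, its values are excluded. This is done by checking 2 months ahead and 2 months behind (essentially looking for the pattern N, Y, Y, Y, N).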
WITH Months AS (
   -- One row per customer per calendar month with an order, numbered from January 2000
   SELECT DISTINCT
      O.CustID,
      Grp = DateDiff(Month, '20000101', O.OrderDate)
   FROM
      CustOrder O
), Anchors AS (
   SELECT
      M.CustID,
      Ind = M.Grp + X.Offset
   FROM
      Months M
      CROSS JOIN (
         SELECT -2 UNION ALL SELECT -1 UNION ALL SELECT 0 UNION ALL SELECT 1 UNION ALL SELECT 2
      ) X (Offset)
   GROUP BY
      M.CustID,
      M.Grp + X.Offset
   HAVING
      -- Exactly three months within two of Ind have orders, and they are Ind - 1, Ind, Ind + 1
      Count(*) = 3
      AND Min(X.Offset) = -1
      AND Max(X.Offset) = 1
)
SELECT
   C.CustName,
   [Year] = Year(OrderDate),
   O.OrderDate
FROM
   Cust C
   INNER JOIN CustOrder O ON C.CustID = O.CustID
   INNER JOIN Anchors A
      ON O.CustID = A.CustID
      AND O.OrderDate >= DateAdd(Month, A.Ind, '19991201')
      AND O.OrderDate < DateAdd(Month, A.Ind, '20000301')
ORDER BY
   C.CustName,
   OrderDate;
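The N, Y, Y, Y, N filter can be checked on a small made-up case. In the sketch below the month numbers are hypothetical: customer 1 has a run of exactly three months (10 to 12) and customer 2 has a run of four (20 to 23). Only the `Anchors` logic is taken from Solution 2.

```sql
-- Hypothetical sample data standing in for the Months CTE (not from the original tables).
WITH Months (CustID, Grp) AS (
   SELECT 1, 10 UNION ALL
   SELECT 1, 11 UNION ALL
   SELECT 1, 12 UNION ALL
   SELECT 2, 20 UNION ALL
   SELECT 2, 21 UNION ALL
   SELECT 2, 22 UNION ALL
   SELECT 2, 23
)
SELECT
   M.CustID,
   Ind = M.Grp + X.Offset
FROM
   Months M
   CROSS JOIN (
      SELECT -2 UNION ALL SELECT -1 UNION ALL SELECT 0 UNION ALL SELECT 1 UNION ALL SELECT 2
   ) X (Offset)
GROUP BY
   M.CustID,
   M.Grp + X.Offset
HAVING
   -- Three votes, all from offsets -1..1: the months two away on either side must be empty
   Count(*) = 3
   AND Min(X.Offset) = -1
   AND Max(X.Offset) = 1;
```

This should return one row, CustID 1 with Ind = 11. For customer 2, months 21 and 22 each collect four votes, while 20 and 23 collect three votes that include an offset of -2 or 2, so every candidate is rejected. Two exact-three-month windows for the same customer must be separated by at least one empty month, so they cannot overlap, which is presumably why a plain inner join to Anchors is safe here.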
Here is my table-loading script, in case anyone else wants to play:
IF Object_ID('CustOrder', 'U') IS NOT NULL DROP TABLE CustOrder
IF Object_ID('Cust', 'U') IS NOT NULL DROP TABLE Cust
GO
SET NOCOUNT ON
CREATE TABLE Cust (
   CustID int identity(1,1) NOT NULL PRIMARY KEY CLUSTERED,
   CustName varchar(100) UNIQUE
)
CREATE TABLE CustOrder (
   OrderID int identity(100, 1) NOT NULL PRIMARY KEY CLUSTERED,
   CustID int NOT NULL FOREIGN KEY REFERENCES Cust (CustID),
   OrderDate smalldatetime NOT NULL
)
DECLARE @i int
SET @i = 1000
-- Insert random six-letter names one at a time until 1000 unique customers exist
WHILE @i > 0 BEGIN
   WITH N AS (
      SELECT
         Nm =
            Char(Abs(Checksum(NewID())) % 26 + 65)
            + Char(Abs(Checksum(NewID())) % 26 + 97)
            + Char(Abs(Checksum(NewID())) % 26 + 97)
            + Char(Abs(Checksum(NewID())) % 26 + 97)
            + Char(Abs(Checksum(NewID())) % 26 + 97)
            + Char(Abs(Checksum(NewID())) % 26 + 97)
   )
   INSERT Cust
   SELECT N.Nm
   FROM N
   WHERE NOT EXISTS (
      SELECT 1
      FROM Cust C
      WHERE
         N.Nm = C.CustName
   )
   SET @i = @i - @@RowCount
END
-- @i is now 0; add orders for random customers on random days (within 10000 days of 1990-01-01) until there are 50000
WHILE @i < 50000 BEGIN
   INSERT CustOrder
   SELECT TOP (50000 - @i)
      Abs(Checksum(NewID())) % 1000 + 1,
      DateAdd(Day, Abs(Checksum(NewID())) % 10000, '19900101')
   FROM master.dbo.spt_values
   SET @i = @i + @@RowCount
END
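A quick sanity check after loading, offered as a sketch rather than part of the original script; the column aliases are made up for illustration.

```sql
-- Row counts and date range of the generated test data
SELECT
   Customers = (SELECT Count(*) FROM Cust),
   Orders = (SELECT Count(*) FROM CustOrder),
   FirstOrder = (SELECT Min(OrderDate) FROM CustOrder),
   LastOrder = (SELECT Max(OrderDate) FROM CustOrder);
```

If the script ran to completion, this should report 1000 customers and 50000 orders. The data is random, so the exact dates and the rows returned by the queries will differ from one load to the next.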
Performance
Here are some performance testing results for the 3-months-or-more queries:
Query     CPU   Reads   Duration
Martin 1  2297  299412  2348
Martin 2  625   285     809
Denis     3641  401     3855
Erik      1855  94727   2077
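This was only one run of each, but the numbers are fairly representative. It turns out that your query wasn't performing so badly after all, Denis. Martin's query beats the others hands down, but at first it was using some overly expensive windowing function strategies, which he has since fixed.
Of course, as I noted, Denis's query doesn't pull the right rows when a customer has two orders on the same day, so his query is out of contention unless he fixes it.
Also, different indexes could shake things up. I don't know.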
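One way to explore the index question is sketched below. The index name and column choice are assumptions, not something tested in this answer, and the answer does not say how the figures above were collected; `SET STATISTICS IO` and `SET STATISTICS TIME` are simply one built-in way to get comparable reads and CPU/elapsed times.

```sql
-- Hypothetical index: keyed on the two CustOrder columns that the queries above read
CREATE NONCLUSTERED INDEX IX_CustOrder_CustID_OrderDate
   ON CustOrder (CustID, OrderDate);
GO
SET STATISTICS IO ON;
SET STATISTICS TIME ON;
GO
-- Run each candidate query here, then compare the logical reads
-- and the CPU and elapsed times reported in the Messages output.
GO
SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

Whether such an index changes the ranking would have to be measured; the numbers in the table were taken with only the keys created by the loading script.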
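If the row '113,13-AUG-2007,1' were added to the orders table, what output would you want? One output block of 4 rows for AA, or two output blocks of 3 rows each? In other words, do you want "strictly three months at a time" or "three or more months at a time"? – 2010-09-19 00:40:00
Sorry, I prefer strictly three months. – Gopi 2010-09-20 15:22:57
Do you mean that a 4-month string should return 6 rows, one set for months 1, 2, 3 and another for months 2, 3, 4, or that all orders not in a run of exactly 3 months should simply be excluded? – ErikE 2010-09-20 17:04:06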