2009-07-24
3

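Jeff recently asked this question and got some great answers. SQL to determine distinct periods of consecutive access days?

Jeff's question revolves around finding users who have logged into the system on a number of consecutive days, using a database table structured as follows: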

 
Id  UserId CreationDate 
------ ------ ------------ 
750997  12 2009-07-07 18:42:20.723 
750998  15 2009-07-07 18:42:20.927 
751000  19 2009-07-07 18:42:22.283 

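Read that first for clarity, then...

I became interested in the problem of determining how many distinct (n)-day periods a user has.

Is it possible to write a fast SQL query that returns a list of users along with the number of distinct (n)-day periods each of them has?

EDIT: Per the comment below: if someone has 2 consecutive days, then a gap, then 4 consecutive days, then a gap, then 8 consecutive days, that would be 3 "distinct 4-day periods". The 8-day run should count as two back-to-back 4-day periods.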
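
To pin the counting rule down before looking at any SQL, here is a minimal Python sketch of it. The function name `distinct_periods` is made up for illustration; it assumes the lengths of the consecutive-day runs are already known, and that each run contributes only its whole n-day periods.

```python
def distinct_periods(run_lengths, n):
    """Count full n-day periods across runs of consecutive days.

    Each run contributes floor(length / n) periods; leftover days
    in a run are dropped and never carried over to another run.
    """
    return sum(length // n for length in run_lengths)


# The example from the edit: runs of 2, 4 and 8 days, with n = 4.
print(distinct_periods([2, 4, 8], 4))  # 0 + 1 + 2 -> prints 3
```

The answers differ mainly in how they find the run lengths; the final step is always this integer division followed by a sum per user.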

+0

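Could you elaborate on how you define a "distinct (n)-day period"? If someone has 2 consecutive days, then a gap, then 4 consecutive days, then a gap, then 8 consecutive days, is that 2 "distinct 4-day periods" or 3? (Does the 8-day run count as two back-to-back 4-day periods?) – MatBailie 2009-07-24 14:09:53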

回答

0

This works well against my test data.

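DECLARE @days int 
SET @days = 30 

-- Assumes CreationDate holds date-only values (no time part), so that
-- "previous day" / "next day" can be matched by equality.
-- l = first day of a run (no row for the previous day).
-- l.CreationDate is kept in the select list so that DISTINCT removes duplicate
-- rows per run, rather than merging two runs that happen to have the same cnt.
SELECT DISTINCT l.UserId, l.CreationDate as RunStart, (datediff(d, l.CreationDate, 
(
    -- Last day of the run: earliest day on or after l with no row for the next day
    SELECT min(a.CreationDate) as CreationDate 
    FROM UserHistory a 
     LEFT OUTER JOIN UserHistory b 
      ON a.CreationDate = dateadd(day, -1, b.CreationDate) AND 
      a.UserId = b.UserId 
    WHERE b.CreationDate IS NULL AND 
     a.CreationDate >= l.CreationDate AND 
     a.UserId = l.UserId 
))+1)/@days as cnt -- integer division: full @days-day periods in this run 
INTO #cnttmp 
FROM UserHistory l 
    LEFT OUTER JOIN UserHistory r 
     ON r.CreationDate = dateadd(day, -1, l.CreationDate) AND 
     r.UserId = l.UserId 
WHERE r.CreationDate IS NULL 
ORDER BY l.UserId 

-- Total number of full periods per user, skipping users with none
SELECT UserId, sum(cnt) 
FROM #cnttmp 
GROUP BY UserId 
HAVING sum(cnt) > 0 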
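
The query above finds run boundaries with two self-joins: a day starts a run when the previous day is missing, and a run ends at the earliest later day whose next day is missing. As a way to check the idea outside the database, here is a rough Python equivalent for a single user. The names `periods_by_run_boundaries` and `span` are invented for this sketch, and it assumes date-only values.

```python
from datetime import date, timedelta

ONE_DAY = timedelta(days=1)


def periods_by_run_boundaries(days, n):
    """days: iterable of datetime.date for one user; n: period length in days."""
    present = set(days)
    # Run starts: days whose previous day is absent (the outer l/r join).
    starts = sorted(d for d in present if d - ONE_DAY not in present)
    # Run ends: days whose next day is absent (the inner a/b join).
    ends = sorted(d for d in present if d + ONE_DAY not in present)
    total = 0
    for start in starts:
        end = min(e for e in ends if e >= start)  # earliest end on or after start
        total += ((end - start).days + 1) // n    # full n-day periods in the run
    return total


def span(first, count):
    """Helper for building sample data: `count` consecutive days from `first`."""
    return [first + timedelta(days=i) for i in range(count)]


# Runs of 2, 4 and 8 days separated by one-day gaps (sample data, not from the thread).
logins = span(date(2009, 7, 1), 2) + span(date(2009, 7, 4), 4) + span(date(2009, 7, 9), 8)
print(periods_by_run_boundaries(logins, 4))  # prints 3
```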
1

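My answer doesn't seem to have shown up yet...

Let me try again...

Rob Farley's answer to the original question has the handy benefit of including the number of consecutive days.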

with numberedrows as 
(
     select row_number() over (partition by UserID order by CreationDate) - cast(CreationDate-0.5 as int) as TheOffset, CreationDate, UserID 
     from tablename 
) 
select min(CreationDate), max(CreationDate), count(*) as NumConsecutiveDays, UserID 
from numberedrows 
group by UserID, TheOffset 
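
The reason grouping by TheOffset works: within a run of consecutive days, the row number and the day number both go up by one per row, so their difference stays constant, and it changes as soon as there is a gap. A minimal Python sketch of the same trick for one user's dates (the function name `consecutive_runs` is made up here, and it assumes one entry per calendar day, so duplicates are removed first):

```python
from datetime import date
from itertools import groupby


def consecutive_runs(days):
    """Return (first_day, last_day, length) for each run of consecutive dates."""
    ordered = sorted(set(days))
    # Row number minus day number is constant within a run of consecutive days.
    keyed = [(i - d.toordinal(), d) for i, d in enumerate(ordered, start=1)]
    runs = []
    for _, group in groupby(keyed, key=lambda pair: pair[0]):
        members = [d for _, d in group]
        runs.append((members[0], members[-1], len(members)))
    return runs


# Sample data: 7-9 July, then a gap, then 11-12 July.
sample = [date(2009, 7, d) for d in (7, 8, 9, 11, 12)]
for first, last, length in consecutive_runs(sample):
    print(first, last, length)
# 2009-07-07 2009-07-09 3
# 2009-07-11 2009-07-12 2
```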

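Using integer division, simply dividing the number of consecutive days by n gives the number of "distinct (n)-day periods" covered by each whole consecutive run. For n = 4:
- 2/4 = 0
- 4/4 = 1
- 8/4 = 2
- 9/4 = 2
- etc, etc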

So, here is my take on Rob's answer, adapted to your needs...
(I really like Rob's answer; go read the explanation, it's inspired thinking.)

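DECLARE @period_length int;
SET @period_length = 4; -- n, the period length in days (example value)

with 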
    numberedrows (
     UserID, 
     TheOffset 
    ) 
as 
(
    select 
     UserID, 
     row_number() over (partition by UserID order by CreationDate) 
      - DATEDIFF(DAY, 0, CreationDate) as TheOffset 
    from 
     tablename 
), 
    ConsecutiveCounts(
     UserID, 
     ConsecutiveDays 
    ) 
as 
(
    select 
     UserID, 
     count(*) as ConsecutiveDays 
    from 
     numberedrows 
    group by 
     UserID, 
     TheOffset 
) 
select 
    UserID, 
    SUM(ConsecutiveDays/@period_length) AS distinct_n_day_periods 
from 
    ConsecutiveCounts 
group by 
    UserID 

The only real difference is that I take Rob's result and then run it through another GROUP BY...
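
One caveat worth checking against your own data: the row-number offset only lines up if there is at most one row per user per day. If the table can hold several logins on the same day (an assumption; the sample rows carry timestamps, which suggests it might), the extra rows shift the row numbers and break runs apart. The Python sketch below mirrors the GROUP BY on the offset for one user and shows the effect; `run_lengths` and its `dedupe` flag are invented for this illustration.

```python
from collections import Counter
from datetime import datetime


def run_lengths(stamps, dedupe):
    """Run lengths via the row-number-minus-day-number offset.

    stamps: datetime values for one user. With dedupe=False every row is
    numbered, as in a query over raw login rows; with dedupe=True the rows
    are first reduced to one per calendar day.
    """
    days = [s.date() for s in stamps]
    if dedupe:
        days = set(days)
    ordered = sorted(days)
    # Counting rows per offset corresponds to GROUP BY TheOffset with count(*).
    offsets = Counter(i - d.toordinal() for i, d in enumerate(ordered, start=1))
    return sorted(offsets.values())


# Four consecutive days (7-10 July), with two logins on 8 July.
stamps = [
    datetime(2009, 7, 7, 9, 0),
    datetime(2009, 7, 8, 9, 0),
    datetime(2009, 7, 8, 17, 0),
    datetime(2009, 7, 9, 9, 0),
    datetime(2009, 7, 10, 9, 0),
]
print(run_lengths(stamps, dedupe=False))  # [2, 3] -> no full 4-day period
print(run_lengths(stamps, dedupe=True))   # [4]    -> one full 4-day period
```

In SQL, the corresponding fix would be to reduce the table to distinct (UserID, day) pairs before applying row_number().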

1

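So - I'm going to start from the query in my answer to the last question, which lists each run of consecutive days. Then I'll group that by UserID and NumConsecutiveDays to count how many runs of each length those users have.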

with numberedrows as 
(
     select row_number() over (partition by UserID order by CreationDate) - cast(CreationDate-0.5 as int) as TheOffset, CreationDate, UserID 
     from tablename 
) 
, 
runsOfDays as 
(
select min(CreationDate) as FirstDay, max(CreationDate) as LastDay, count(*) as NumConsecutiveDays, UserID -- CTE columns must be named 
from numberedrows 
group by UserID, TheOffset 
) 
select UserID, NumConsecutiveDays, count(*) as NumOfRuns 
from runsOfDays 
group by UserID, NumConsecutiveDays 
; 

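And of course, if you want to filter this to only consider runs of a certain length, put "where NumConsecutiveDays >= @days" in the last query.

Now, if you want to count a run of 16 consecutive days as three 5-day runs, then each run counts as NumConsecutiveDays/@runlength (integer division, so each result is rounded down). So now, instead of just counting how many runs there are of each length, use SUM. You could take the query above and use SUM(NumOfRuns * (NumConsecutiveDays/@runlength)), with the parentheses so that the division truncates per run before multiplying, but if you understand the logic, the query below is a bit easier.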

DECLARE @runlength int;
SET @runlength = 5; -- length of a run in days (example value)

with numberedrows as 
(
     select row_number() over (partition by UserID order by CreationDate) - cast(CreationDate-0.5 as int) as TheOffset, CreationDate, UserID 
     from tablename 
) 
, 
runsOfDays as 
(
select min(CreationDate) as FirstDay, max(CreationDate) as LastDay, count(*) as NumConsecutiveDays, UserID -- CTE columns must be named 
from numberedrows 
group by UserID, TheOffset 
) 
select UserID, sum(NumConsecutiveDays/@runlength) as NumOfRuns 
from runsOfDays 
where NumConsecutiveDays >= @runlength 
group by UserID 
; 
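
A note on the SUM(NumOfRuns * NumConsecutiveDays/@runlength) shortcut mentioned earlier: multiplication and division have the same precedence and are evaluated left to right, so without parentheses the days of all runs of one length are pooled before the integer division. The Python sketch below compares both readings; the function name and the sample counts are made up for illustration.

```python
def total_periods_two_step(runs_by_length, n, parenthesized):
    """runs_by_length: {run length in days: number of runs of that length}."""
    total = 0
    for length, count in runs_by_length.items():
        if parenthesized:
            total += count * (length // n)  # truncate per run, then multiply
        else:
            total += (count * length) // n  # left to right: pools days across runs
    return total


# Two 3-day runs and one 16-day run, counted in 5-day periods.
sample_runs = {3: 2, 16: 1}
print(total_periods_two_step(sample_runs, 5, parenthesized=True))   # prints 3
print(total_periods_two_step(sample_runs, 5, parenthesized=False))  # prints 4
```

The two 3-day runs should contribute nothing, which is what the parenthesized form and the single-step query above give. In that query, the WHERE clause is only a filter for convenience: shorter runs would add 0 to the sum anyway.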

Hope this helps

Rob