2016-09-30 98 views
2

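I have a SQL Server table named table1 with a timestamp column column_ts and a few other columns, say column1, column2 and column3. How do I select the row with the highest timestamp in each 30-minute window?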

The table looks like this:

column_ts     column1  column2  column3 
2016-09-30 00:04:00.000  number1  string1  integer1 
2016-09-30 00:24:00.000  number2  string2  integer2 
2016-09-30 00:29:00.000  number3  string3  integer3 
2016-09-30 00:44:00.000  number4  string4  integer4 
2016-09-30 00:48:00.000  number5  string5  integer5 
2016-09-30 01:04:00.000  number6  string6  integer6 
2016-09-30 01:24:00.000  number7  string7  integer7 
2016-09-30 01:54:00.000  number8  string8  integer8 
2016-09-30 01:59:00.000  number9  string9  integer9 

First, I select the records where column_ts >= 2016-09-30 00:00:00.000. Then, from each 30-minute window of column_ts, I want to select only the one row with the highest timestamp.

So, for the given data, the query should select only the following rows:

column_ts     column1  column2  column3 
2016-09-30 00:29:00.000  number3  string3  integer3 
2016-09-30 00:48:00.000  number5  string5  integer5 
2016-09-30 01:24:00.000  number7  string7  integer7 
2016-09-30 01:59:00.000  number9  string9  integer9 

In effect, I want to split column_ts into 30-minute windows such as:

1) 2016-09-30 00:00:00.000 - 2016-09-30 00:30:00.000
2) 2016-09-30 00:30:00.000 - 2016-09-30 01:00:00.000
3) 2016-09-30 01:00:00.000 - 2016-09-30 01:30:00.000
4) 2016-09-30 01:30:00.000 - 2016-09-30 02:00:00.000

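Finally, from each of these 30-minute windows I want to select the one row with the highest column_ts value.

I can't figure out how to generate the 30-minute windows from which I could select MAX(column_ts). Please suggest how I can do this.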

+0

Can you post the query you tried? It would help us find out where you went wrong. – mfredy

+0

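'Finally want to select one row having the highest value for column_ts from each of these 30 minute windows': how are these windows generated? It looks like you are simply generating 30-minute intervals for each day. You also need to post example output. Your post is not very clear. – TheGameiswar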

+0

Please show what your final output should look like. – TheGameiswar

Answers

3

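You can take the difference in minutes from an epoch and divide it by 30 to get 30-minute intervals. Because both operands are integers, the division discards the remainder, so every timestamp in the same half hour yields the same number.

This query gives each 30-minute slot together with the maximum column_ts in that slot: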

select dateadd(minute, datediff(minute, '1970-1-1',column_ts)/30*30,'1970-1-1') as timegroup, 
     MAX(column_ts) as max_time 
from table1 where column_ts >= '2016-09-30 00:00:00.000' 
group by datediff(minute, '1970-1-1', column_ts)/30 

The above produces:

timegroup     max_time 
2016-09-30 00:00:00.000  2016-09-30 00:29:00.000 
2016-09-30 00:30:00.000  2016-09-30 00:48:00.000 
2016-09-30 01:00:00.000  2016-09-30 01:24:00.000 
2016-09-30 01:30:00.000  2016-09-30 01:59:00.000 

Once you have that, you can use it as a subquery to get the result you are after:

select groups.timegroup, t.column_ts, t.column1, t.column2, t.column3 
from (
    select dateadd(minute, datediff(minute, '1970-1-1',column_ts)/30*30,'1970-1-1') as timegroup,MAX(column_ts) as max_time 
    from table1 where column_ts >= '2016-09-30 00:00:00.000' 
    group by datediff(minute, '1970-1-1', column_ts)/30 
) as groups 
inner join table1 t on t.column_ts = groups.max_time 

which produces

timegroup     column_ts     column1 column2 column3 
2016-09-30 00:00:00.000  2016-09-30 00:29:00.000  number3 string3 integer3 
2016-09-30 00:30:00.000  2016-09-30 00:48:00.000  number5 string5 integer5 
2016-09-30 01:00:00.000  2016-09-30 01:24:00.000  number7 string7 integer7 
2016-09-30 01:30:00.000  2016-09-30 01:59:00.000  number9 string9 integer9 
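
The bucketing arithmetic in these queries can be checked outside the database. Below is a minimal Python sketch, assuming the same 1970-01-01 epoch; `bucket_start` is a hypothetical helper name used only for illustration. It mirrors `datediff(minute, epoch, ts) / 30 * 30`, relying on integer division to drop the remainder.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)  # same reference date as the query


def bucket_start(ts, minutes=30):
    """Floor ts to the start of its window (hypothetical helper)."""
    # Whole minutes since the epoch, like datediff(minute, epoch, ts).
    elapsed = (ts - EPOCH) // timedelta(minutes=1)
    # Integer division discards the remainder; multiplying restores the scale.
    return EPOCH + timedelta(minutes=(elapsed // minutes) * minutes)


print(bucket_start(datetime(2016, 9, 30, 0, 48)))  # 2016-09-30 00:30:00
```

Since the epoch falls on midnight and a day has 1440 minutes, a multiple of 30, the windows line up with :00 and :30 of every hour.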
0

This can be done without window functions:

select max(column_ts) column_ts, column1, column2, column3 
from mytable 
where column_ts >= '2016-09-30 00:00:00.000' 
group by column1, column2, column3 

To get results that span multiple time brackets, group by the bracket as well:

select max(column_ts) column_ts, column1, column2, column3 
from mytable 
group by column1, column2, column3, <expression to calculate a unique value for each column_ts bracket> 
+0

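The 30-minute windows are something I have to create inside the query, which should just select one record from each window, the one with the highest column_ts. The windows themselves are not needed in the result. I have not generated the 30-minute windows yet and cannot figure out a way to do it. Expected result: column_ts column1 column2 column3 2016-09-30 00:29:00.000 number3 string3 integer3 2016-09-30 00:48:00.000 number5 string5 integer5 2016-09-30 01:24:00.000 number7 string7 integer7 2016-09-30 01:59:00.000 number9 string9 integer9 – 300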

+0

The expected result is the same as the one in the original question. Sorry, I could not format the expected result properly here. – 300

2

Assuming you are using SQL Server 2005+, here is the script:

use tempdb 
--drop table dbo.t 
create table dbo.t (column_ts datetime, column1 varchar(30), column2 varchar(30), column3 varchar(30)); 
go 
-- populate the table 
insert into dbo.t (column_ts, column1, column2, column3) 
select '2016-09-30 00:04:00.000','number1','string1','integer1' 
union all select '2016-09-30 00:24:00.000','number2','string2','integer2' 
union all select '2016-09-30 00:29:00.000','number3','string3','integer3' 
union all select '2016-09-30 00:44:00.000','number4','string4','integer4' 
union all select '2016-09-30 00:48:00.000','number5','string5','integer5' 
union all select '2016-09-30 01:04:00.000','number6','string6','integer6' 
union all select '2016-09-30 01:24:00.000','number7','string7','integer7' 
union all select '2016-09-30 01:54:00.000','number8','string8','integer8' 
union all select '2016-09-30 01:59:00.000','number9','string9','integer9'; 
go 

-- the query 
; with c as (
select section=datediff(minute, '2016-09-30', column_ts)/30, * from dbo.t 
) 
, c2 as (select rnk=rank() over (partition by section order by column_ts desc), * from c) 
select column_ts, column1, column2, column3 
from c2 
where rnk = 1; 

I did something similar before, when I needed to find the most expensive query in each 30-minute window after collecting performance data.
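
For readers who want to trace the logic by hand, here is a rough Python equivalent of the `rank() ... = 1` filter. It assumes rows are plain tuples whose first element is the timestamp and that no timestamp is earlier than `day_start`; `top_ranked` is a hypothetical name. Like `rank()`, it keeps every row tied for the latest timestamp in a section.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def top_ranked(rows, day_start, minutes=30):
    """Return rows whose timestamp is the latest in their section (ties kept)."""
    sections = defaultdict(list)
    for row in rows:
        # Same idea as datediff(minute, day_start, column_ts) / 30.
        section = (row[0] - day_start) // timedelta(minutes=minutes)
        sections[section].append(row)
    result = []
    for section in sorted(sections):
        latest = max(r[0] for r in sections[section])
        # rank() gives 1 to every row sharing the latest timestamp.
        result.extend(r for r in sections[section] if r[0] == latest)
    return result


rows = [
    (datetime(2016, 9, 30, 0, 4), "number1"),
    (datetime(2016, 9, 30, 0, 29), "number3"),
    (datetime(2016, 9, 30, 0, 48), "number5"),
]
print([r[1] for r in top_ranked(rows, datetime(2016, 9, 30))])  # ['number3', 'number5']
```

If exactly one row per window is required even when timestamps collide, a row_number()-style tie-breaker is needed instead.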

1

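I would generate an interval table and join it to your data. Then add a row_number() within each interval, ordered by column_ts descending, and return only the top row (RN = 1).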

DECLARE @Test TABLE (column_ts datetime, column1 varchar(50), column2 varchar(50), column3 varchar(50)) 
INSERT INTO @Test 
VALUES ('2016-09-30 00:04:00.000','number1','string1','integer1'), 
     ('2016-09-30 00:24:00.000','number2','string2','integer2'), 
     ('2016-09-30 00:29:00.000','number3','string3','integer3'), 
     ('2016-09-30 00:44:00.000','number4','string4','integer4'), 
     ('2016-09-30 00:48:00.000','number5','string5','integer5'), 
     ('2016-09-30 01:04:00.000','number6','string6','integer6'), 
     ('2016-09-30 01:24:00.000','number7','string7','integer7'), 
     ('2016-09-30 01:54:00.000','number8','string8','integer8'), 
     ('2016-09-30 01:59:00.000','number9','string9','integer9') 

DECLARE @TimeGrid TABLE (IntervalStart TIME, IntervalEnd TIME) 

DECLARE @MyTime TIME, @true BIT=1 

WHILE @true=1 
BEGIN 
    IF @MyTime IS NULL SET @MyTime = CONVERT(TIME,'00:00:00') 

    INSERT INTO @TimeGrid (IntervalStart,IntervalEnd) 
    SELECT @MyTime, DATEADD(NS,-100,DATEADD(MI,30,@MyTime)) 

    SET @MyTime=DATEADD(MI,30,@MyTime) 
    IF @MyTime= CONVERT(TIME,'00:00:00') 
     SET @true=0 
END 

;WITH X AS 
(
    SELECT * 
    FROM @Test T 
    JOIN @TimeGrid TG ON CONVERT(TIME,T.column_ts) BETWEEN TG.IntervalStart AND TG.IntervalEnd 
), Y AS 
    (
     SELECT *, 
       ROW_NUMBER() OVER(PARTITION BY IntervalStart ORDER BY column_ts DESC) AS RN 
     FROM X 
    ) 

SELECT column_ts, column1, column2, column3--, IntervalStart, IntervalEnd, RN 
FROM Y 
WHERE RN=1 
0

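I did this by generating the "intervals" table separately as a CTE. If you do this a lot, you may want to persist the intervals in a table so that you can join to them. You should also think about what you want to happen when two events have the same timestamp...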

DECLARE @theDayInQuestion datetime = '2016-09-30'; 
WITH ints 
AS (SELECT 
    0 AS n 
UNION ALL 
SELECT 
    n + 30 
FROM ints 
WHERE n + 30 < 1440), 

LastTimestampInEachInterval 
AS (SELECT 
    DATEADD(MINUTE, n, @theDayInQuestion) AS StartInterval, 
    DATEADD(MINUTE, n + 30, @theDayInQuestion) AS EndInterval, 
    MAX(t.column_ts) AS LastTimeStamp 


FROM ints 
LEFT JOIN t 
    ON t.column_ts >= DATEADD(MINUTE, n, @theDayInQuestion) --StartInterval (inclusive) 
    AND t.column_ts < DATEADD(MINUTE, n + 30, @theDayInQuestion) --EndInterval (exclusive) 
GROUP BY DATEADD(MINUTE, n, @theDayInQuestion), 
     DATEADD(MINUTE, n + 30, @theDayInQuestion)) 
SELECT 
    * 
FROM LastTimestampInEachInterval 
LEFT JOIN t 
    ON LastTimeStampInEachInterval.LastTimeStamp = t.column_ts 

(Warning: the script will not necessarily work tomorrow, since the day is hard-coded...)
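
The interval grid built by the recursive CTE can be sketched in Python as follows. The names `interval_grid` and `windows_containing` are hypothetical, and the membership test assumes half-open intervals (start included, end excluded) so that a timestamp sitting exactly on a boundary belongs to one window only; an inclusive range on both ends would place it in two.

```python
from datetime import datetime, timedelta


def interval_grid(day, minutes=30):
    """Yield (start, end) pairs covering one day, like the recursive CTE."""
    start = day
    end_of_day = day + timedelta(days=1)
    while start < end_of_day:
        yield start, start + timedelta(minutes=minutes)
        start += timedelta(minutes=minutes)


def windows_containing(ts, grid):
    """Half-open membership test: start <= ts < end."""
    return [(s, e) for s, e in grid if s <= ts < e]


grid = list(interval_grid(datetime(2016, 9, 30)))
print(len(grid))  # 48
```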

1
;WITH cte AS (
    SELECT 
     * 
     ,ROW_NUMBER() OVER (PARTITION BY 
       CASE 
        WHEN DATEPART(MINUTE,column_ts) >= 30 THEN DATEADD(MINUTE,30 - DATEPART(MINUTE,column_ts),column_ts) 
        ELSE DATEADD(MINUTE,- DATEPART(MINUTE,column_ts),column_ts) 
       END 
      ORDER BY column_ts DESC) as RowNumber 
    FROM 
     @Table1 
) 

SELECT * 
FROM 
    cte 
WHERE 
    RowNumber = 1 

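You could generate a table of 30-minute intervals as others have shown, but really you only need the hour mark when the minute is below 30, or the half-hour mark when it is 30 or more. That creates the grouping, so no recursive CTE is needed.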

CASE 
    WHEN DATEPART(MINUTE,column_ts) >= 30 THEN DATEADD(MINUTE,30 - DATEPART(MINUTE,column_ts),column_ts) 
    ELSE DATEADD(MINUTE,- DATEPART(MINUTE,column_ts),column_ts) 
END as HalfHourGroup 
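
The CASE expression can be mirrored in a few lines of Python to see which group a timestamp lands in. This is a sketch assuming the intended comparison is "minute of 30 or more"; `half_hour_group` is a hypothetical name. Note that only the minutes are adjusted, so any seconds in the timestamp survive into the group value.

```python
from datetime import datetime, timedelta


def half_hour_group(ts):
    """Snap ts back to :00 or :30 by shifting whole minutes only."""
    if ts.minute >= 30:
        # 30 - minute is zero or negative here, so this moves back to :30.
        return ts + timedelta(minutes=30 - ts.minute)
    # Below 30: move back to the top of the hour.
    return ts - timedelta(minutes=ts.minute)


print(half_hour_group(datetime(2016, 9, 30, 0, 48)))  # 2016-09-30 00:30:00
```

With data like the sample, which carries no seconds, every row in a half hour maps to the same value. If seconds were stored, they would have to be stripped as well before partitioning.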
1

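@petelids's answer looks fine, but I will offer an alternative that does not use a literal date in the calculation. You might even find that it reads a little more clearly. Based on your sample data, I assume you are not storing seconds. You could also drop the seconds from the output with a formatting option. Either way, seconds are irrelevant to the group by.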

select 
    dateadd(
     minute, 
     -datepart(minute, min(column_ts)) % 30, 
     min(column_ts) 
    ) as timegroup, 
    max(column_ts) as max_time_in_window 
from T 
group by 
    cast(column_ts as date), 
    datepart(hour, column_ts), 
    datepart(minute, column_ts)/30; 

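EDIT: On rereading your question, I realized that you want the whole row as your result. You can still use this approach, although the row_number() technique is probably more common these days and is likely to be very fast.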

select * from T 
where column_ts in (
    select max(column_ts) as max_time_in_window 
    from T 
    group by 
     cast(column_ts as date), 
     datepart(hour, column_ts), 
     datepart(minute, column_ts)/30 
); 

Or using row_number():

with data as (
    select *, 
     row_number() over (
      partition by 
       cast(column_ts as date), 
       datepart(hour, column_ts), 
       datepart(minute, column_ts)/30 
      order by 
       column_ts desc -- latest row in each window gets rn = 1 
     ) as rn 
    from T 
) 
select * 
from data 
where rn = 1; 
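
The grouping key used in these queries (date, hour, minute divided by 30) can be tried out in Python. This is a sketch under the assumption that rows are tuples with the timestamp first; `window_key` and `latest_per_window` are hypothetical names. It keeps exactly one row per window, the latest, which is what the question asks for.

```python
from datetime import datetime


def window_key(ts):
    """Same three parts as the SQL: date, hour, and which half of the hour."""
    return (ts.date(), ts.hour, ts.minute // 30)


def latest_per_window(rows):
    """Keep the row with the highest timestamp for each window key."""
    best = {}
    for row in rows:
        key = window_key(row[0])
        if key not in best or row[0] > best[key][0]:
            best[key] = row
    return [best[k] for k in sorted(best)]


rows = [
    (datetime(2016, 9, 30, 0, 4), "number1"),
    (datetime(2016, 9, 30, 0, 29), "number3"),
    (datetime(2016, 9, 30, 0, 44), "number4"),
    (datetime(2016, 9, 30, 0, 48), "number5"),
]
print([r[1] for r in latest_per_window(rows)])  # ['number3', 'number5']
```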