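SELECT / GROUP BY - segments of time (10 seconds, 30 seconds, etc.)

2010-06-21

I have a table (MySQL) that captures samples every n seconds. The table has many columns, but all that matters are two: a time stamp (of type TIMESTAMP) and a count (of type INT).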

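What I would like to do is get sums and averages of the count column over a range of times. For example, I record a sample every 2 seconds, but I would like the sum of the count column for all samples that fall within a 10-second or 30-second window.

Here is an example of the data: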

 
+---------------------+-----------------+
| time_stamp          | count           |
+---------------------+-----------------+
| 2010-06-15 23:35:28 |               1 |
| 2010-06-15 23:35:30 |               1 |
| 2010-06-15 23:35:30 |               1 |
| 2010-06-15 23:35:30 |             942 |
| 2010-06-15 23:35:30 |             180 |
| 2010-06-15 23:35:30 |               4 |
| 2010-06-15 23:35:30 |              52 |
| 2010-06-15 23:35:30 |              12 |
| 2010-06-15 23:35:30 |               1 |
| 2010-06-15 23:35:30 |               1 |
| 2010-06-15 23:35:33 |            1468 |
| 2010-06-15 23:35:33 |             247 |
| 2010-06-15 23:35:33 |               1 |
| 2010-06-15 23:35:33 |              81 |
| 2010-06-15 23:35:33 |              16 |
| 2010-06-15 23:35:35 |            1828 |
| 2010-06-15 23:35:35 |             214 |
| 2010-06-15 23:35:35 |              75 |
| 2010-06-15 23:35:35 |               8 |
| 2010-06-15 23:35:37 |            1799 |
| 2010-06-15 23:35:37 |              24 |
| 2010-06-15 23:35:37 |              11 |
| 2010-06-15 23:35:37 |               2 |
| 2010-06-15 23:35:40 |             575 |
| 2010-06-15 23:35:40 |               1 |
| 2010-06-17 10:39:35 |               2 |
| 2010-06-17 10:39:35 |               2 |
| 2010-06-17 10:39:35 |               1 |
| 2010-06-17 10:39:35 |               2 |
| 2010-06-17 10:39:35 |               1 |
| 2010-06-17 10:39:40 |              35 |
| 2010-06-17 10:39:40 |              19 |
| 2010-06-17 10:39:40 |              37 |
| 2010-06-17 10:39:42 |              64 |
| 2010-06-17 10:39:42 |               3 |
| 2010-06-17 10:39:42 |              31 |
| 2010-06-17 10:39:42 |               7 |
| 2010-06-17 10:39:42 |             246 |
+---------------------+-----------------+

The output I want (based on the data above) should look like this:

 
+---------------------+-----------------+
| 2010-06-15 23:35:00 |               1 | # This is the sum for the 00 - 30 seconds range
| 2010-06-15 23:35:30 |            7544 | # This is the sum for the 30 - 60 seconds range
| 2010-06-17 10:39:35 |             450 | # This is the sum for the 30 - 60 seconds range
+---------------------+-----------------+

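I have used GROUP BY to gather these numbers by the second or by the minute, but I can't seem to figure out the syntax to get a sub-minute or range-of-seconds GROUP BY to work correctly.

I am mostly going to use this query to siphon data from this table into another table.

Thanks!

Answers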

58

GROUP BY UNIX_TIMESTAMP(time_stamp) DIV 30

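or, if for some reason you wanted to group them in 20-second intervals, it would be DIV 20, and so on. To change the boundaries between GROUP BY values, you could use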

GROUP BY (UNIX_TIMESTAMP(time_stamp) + r) DIV 30

where r is a literal non-negative integer less than 30. So

GROUP BY (UNIX_TIMESTAMP(time_stamp) + 5) DIV 30

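should give you sums between hh:mm:25 and hh:mm:55 and between hh:mm:55 and hh:mm+1:25, because adding 5 makes each group start 5 seconds before a multiple of 30. To get sums between hh:mm:05 and hh:mm:35 and between hh:mm:35 and hh:mm+1:05, use r = 25 instead.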
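As a quick way to check where the group boundaries land for a given r, here is a minimal Python sketch of the same integer arithmetic. It assumes timestamps are plain Unix seconds, so that a minute always starts at a multiple of 60; the function names and the tiny sample list are made up for illustration and are not part of the answer.

```python
from collections import defaultdict


def bucket(ts, width=30, r=0):
    """Mirror of MySQL's (ts + r) DIV width for integer seconds."""
    return (ts + r) // width


def boundaries_within_minute(width=30, r=0):
    """Seconds-of-minute at which the bucket number changes."""
    return [s for s in range(60)
            if bucket(s, width, r) != bucket(s - 1, width, r)]


def sum_by_bucket(samples, width=30, r=0):
    """Sum the counts per bucket, like SUM(count) ... GROUP BY bucket."""
    totals = defaultdict(int)
    for ts, count in samples:
        totals[bucket(ts, width, r)] += count
    return dict(totals)


# Hypothetical (second, count) samples, not taken from the question.
samples = [(28, 1), (30, 942), (33, 1468), (61, 5)]

print(boundaries_within_minute(30, 0))   # boundaries with r = 0
print(boundaries_within_minute(30, 5))   # boundaries with r = 5
print(boundaries_within_minute(30, 25))  # boundaries with r = 25
print(sum_by_bucket(samples))
```

Running it for a few values of r before committing to a query is a cheap way to confirm that the offset shifts the boundaries in the direction you intended.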


Perfect! That's *exactly* what I needed! Thanks a bunch! – 2010-06-21 17:49:33

6

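I tried Hammerite's solution in my project, but it did not work well where samples were missing from the series. Here is an example of a query that is supposed to select the timestamp (ts), the user name and the average measure from metric_table, and group the results by 27-minute intervals: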

select 
    min(ts), 
    user_name, 
    sum(measure)/27 
from metric_table 
where 
    ts between date_sub('2015-03-17 00:00:00', INTERVAL 2160 MINUTE) and '2015-03-17 00:00:00' 

group by unix_timestamp(ts) div 1620, user_name 
order by ts, user_name 
; 

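Note: 27 minutes (in the select) = 1620 seconds (in the group by); 2160 minutes = 36 hours, i.e. 1.5 days (this is the time range).

When I ran the query against a time series where samples were recorded irregularly (in other words, for any given time stamp there was no guarantee of finding a measure value for every user name), the results were not stamped according to the interval (not placed at every 27 minutes). I suspect this was because min(ts) in some groups returned a time stamp later than the expected start of the interval (ts0 + i*interval). I modified the previous query into this one: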

select 
    from_unixtime(unix_timestamp(ts) - unix_timestamp(ts) mod 1620) as ts1, 
    user_name, 
    sum(measure)/27 
from metric_table 
where 
    ts between date_sub('2015-03-17 00:00:00', INTERVAL 2160 MINUTE) and '2015-03-17 00:00:00' 

group by ts1, user_name 
order by ts1, user_name 
; 

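It works correctly even when samples are missing. I think this is because, once the time arithmetic is moved into the select, ts1 is guaranteed to line up with the time steps.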
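The difference between the two labelling strategies can be sketched in a few lines of Python. This is only an illustration of the arithmetic, assuming integer Unix seconds; the sample timestamps and function names are hypothetical.

```python
INTERVAL = 1620  # 27 minutes, in seconds


def labels_min(timestamps, interval=INTERVAL):
    """Label each group with min(ts), as the first query does."""
    groups = {}
    for ts in timestamps:
        key = ts // interval  # same grouping as unix_timestamp(ts) div 1620
        groups[key] = min(ts, groups.get(key, ts))
    return [groups[k] for k in sorted(groups)]


def labels_floor(timestamps, interval=INTERVAL):
    """Label each group with ts - ts mod interval, as the second query does."""
    return sorted({ts - ts % interval for ts in timestamps})


# Hypothetical irregular samples: the 2nd and 3rd intervals have no sample
# at their start, so min(ts) drifts away from the interval boundary.
samples = [0, 600, 2000, 3100, 4000, 4860]

print(labels_min(samples))    # labels depend on which samples exist
print(labels_floor(samples))  # labels are always multiples of 1620
```

With min(ts) the label is whatever sample happens to come first in the group, whereas the floored label depends only on the interval, which is why it stays on the 27-minute grid.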


Thanks for posting this, it helped me a lot! – citysurrounded 2015-12-03 06:24:50


Great stuff! All I need now is a way to get a "zero" row when there are no samples in a time period... – 2016-11-22 18:21:11


@DanielRhodes did you ever figure that one out? – 2017-09-20 21:59:23

0

Quite strange, but using the solution from here:

Average of data for every 5 minutes in the given times

we can suggest something like this:

select convert(
(min(dt_record) div 50)*50 - 20*((convert(min(dt_record), datetime) div 50) mod 2),
datetime) as dt,
avg(1das4hrz) from `meteor-m2_msgi`
where dt_record >= '2016-11-13 05:00:00'
and dt_record < '2016-11-14 00:00:00'
group by convert(dt_record, datetime) div 50;

select (
convert(
min(dt_record), datetime) div 50)*50 - 20*(
(convert(min(dt_record), datetime) div 50) mod 2 
) as dt, 
avg(column) from `your_table` 
where dt_record>='2016-11-13 05:00:00' 
and dt_record < '2016-11-14 00:00:00' 
group by convert(dt_record, datetime) div 50; 

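50 is used because a normal half minute is 30 seconds, whereas in the 'INTEGER DATE FORMAT' (YYYYMMDDHHMMSS) a minute spans 100 units, so half of it corresponds to 50. Note that seconds only run from 00 to 59, so DIV 50 splits each minute into seconds 00-49 and 50-59 rather than into two even 30-second halves.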
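To see what DIV 50 does on the integer form of a DATETIME, here is a small Python sketch. It assumes MySQL evaluates the DATETIME as the integer YYYYMMDDHHMMSS in this numeric context; the helper names are invented for the illustration.

```python
from datetime import datetime


def as_mysql_int(dt):
    """DATETIME as the integer YYYYMMDDHHMMSS (assumed numeric form)."""
    return int(dt.strftime("%Y%m%d%H%M%S"))


def label(dt):
    """The expression (n div 50)*50 - 20*((n div 50) mod 2) from the query."""
    k = as_mysql_int(dt) // 50
    return k * 50 - 20 * (k % 2)


def seconds_per_group(minute_start):
    """Map each group label to the seconds of that minute it contains."""
    groups = {}
    for s in range(60):
        dt = minute_start.replace(second=s)
        groups.setdefault(label(dt), []).append(s)
    return groups


groups = seconds_per_group(datetime(2016, 11, 13, 5, 0, 0))
for lbl, secs in sorted(groups.items()):
    print(lbl, "holds seconds", secs[0], "to", secs[-1])
```

In this sketch the group labelled ...050030 only receives seconds 50 to 59, so the labels look like half minutes while the groups themselves are uneven. The same trick with DIV 500 does produce even 5-minute groups, because minutes 00-09 fill the whole 0-999 range.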

2

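Another solution.

To take the average over any interval you like, you can convert dt to a Unix timestamp, subtract its remainder modulo your interval (7 seconds in the example), and group by the result.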

select FROM_UNIXTIME(
    UNIX_TIMESTAMP(dt_record) - UNIX_TIMESTAMP(dt_record) mod 7 
) as dt, avg(1das4hrz) from `meteor-m2_msgi` 
where dt_record>='2016-11-13 05:00:00' 
and dt_record < '2016-11-13 05:02:00' 
group by FROM_UNIXTIME(
    UNIX_TIMESTAMP(dt_record) - UNIX_TIMESTAMP(dt_record) mod 7); 

To show how it works, I prepared a query that displays the intermediate calculations.

select dt_record, minute(dt_record) as mm, SECOND(dt_record) as ss, 
UNIX_TIMESTAMP(dt_record) as uxt, UNIX_TIMESTAMP(dt_record) mod 7 as ux7, 
FROM_UNIXTIME(
    UNIX_TIMESTAMP(dt_record) - UNIX_TIMESTAMP(dt_record) mod 7) as dtsub, 
column from `yourtable` where dt_record>='2016-11-13 05:00:00' 
and dt_record < '2016-11-13 05:02:00'; 

+---------------------+--------------------+
| dt                  | avg(column)        |
+---------------------+--------------------+
| 2016-11-13 04:59:43 |  25434.85714285714 |
| 2016-11-13 05:00:42 |  5700.728813559322 |
| 2016-11-13 05:01:41 |  950.1016949152543 |
| 2016-11-13 05:02:40 |  4671.220338983051 |
| 2016-11-13 05:03:39 | 25468.728813559323 |
| 2016-11-13 05:04:38 |  43883.52542372881 |
| 2016-11-13 05:05:37 | 24589.338983050846 |
+---------------------+--------------------+


+---------------------+----+----+------------+-----+---------------------+--------+
| dt_record           | mm | ss | uxt        | ux7 | dtsub               | column |
+---------------------+----+----+------------+-----+---------------------+--------+
| 2016-11-13 05:00:00 |  0 |  0 | 1479002400 |   1 | 2016-11-13 04:59:59 |  36137 |
| 2016-11-13 05:00:01 |  0 |  1 | 1479002401 |   2 | 2016-11-13 04:59:59 |  36137 |
| 2016-11-13 05:00:02 |  0 |  2 | 1479002402 |   3 | 2016-11-13 04:59:59 |  36137 |
| 2016-11-13 05:00:03 |  0 |  3 | 1479002403 |   4 | 2016-11-13 04:59:59 |  34911 |
| 2016-11-13 05:00:04 |  0 |  4 | 1479002404 |   5 | 2016-11-13 04:59:59 |  34911 |
| 2016-11-13 05:00:05 |  0 |  5 | 1479002405 |   6 | 2016-11-13 04:59:59 |  34911 |
| 2016-11-13 05:00:06 |  0 |  6 | 1479002406 |   0 | 2016-11-13 05:00:06 |  33726 |
| 2016-11-13 05:00:07 |  0 |  7 | 1479002407 |   1 | 2016-11-13 05:00:06 |  32581 |
| 2016-11-13 05:00:08 |  0 |  8 | 1479002408 |   2 | 2016-11-13 05:00:06 |  32581 |
| 2016-11-13 05:00:09 |  0 |  9 | 1479002409 |   3 | 2016-11-13 05:00:06 |  31475 |
+---------------------+----+----+------------+-----+---------------------+--------+
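The uxt, ux7 and dtsub columns above can be reproduced with a short Python sketch of the same arithmetic. The UTC+3 offset is an assumption inferred from the table (05:00:00 local maps to 1479002400); the function names are hypothetical stand-ins for the MySQL functions.

```python
from datetime import datetime, timedelta, timezone

# Assumed session time zone, inferred from the uxt column in the table.
TZ = timezone(timedelta(hours=3))


def floor_ts(uxt, interval=7):
    """uxt - uxt mod interval: start of the interval containing uxt."""
    return uxt - uxt % interval


def from_unixtime(uxt, tz=TZ):
    """Rough stand-in for FROM_UNIXTIME() in the assumed time zone."""
    return datetime.fromtimestamp(uxt, tz).strftime("%Y-%m-%d %H:%M:%S")


# One tuple per row: (dt_record, ux7, dtsub)
rows = [(from_unixtime(u), u % 7, from_unixtime(floor_ts(u)))
        for u in range(1479002400, 1479002410)]

for row in rows:
    print(row)
```

Because 7 does not divide 60, the intervals are anchored to the Unix epoch rather than to the start of a minute, which is why the first group here begins at 04:59:59.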

Can anyone suggest something faster?