2010-11-14 73 views
3

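MySQL: Counting the number of items in a range

I'm trying to analyze some web logs from my web server. I've loaded all of last week's logs into a MySQL database and am analyzing them there.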

I've generated a table of session IDs, and I use this MySQL query to get the length of each session:

SELECT 
     Log_Analysis_RecordsToSesions.sessionID, 
     ABS(TIMEDIFF(
       MIN(Log_Analysis_Records.date), 
       MAX(Log_Analysis_Records.date) 
     )) as session_length 
FROM 
     Log_Analysis_RecordsToSesions, 
     Log_Analysis_Records 
WHERE 
     Log_Analysis_RecordsToSesions.recordID=Log_Analysis_Records.recordID 
GROUP BY 
     sessionID; 


+-----------+----------------+ 
| sessionID | session_length | 
+-----------+----------------+ 
|   1 | 2031.000000 | 
|   2 | 1954.000000 | 
|   3 |  401.000000 | 
... 

What I'd like to do now is modify the statement so that it produces something like this:

Range (time)  Number of Sessions 
0 to 2   10 
2 to 4   4 
4 to 6   60 
... 

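Each range would be a fixed amount of time, and I want to count the number of sessions that fall within it. My first thought was to loop over the results in PHP, but that seems very time-consuming and rather ugly. Is there a way to do this in MySQL?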

Answers

0

I edited your post to add an alias, which makes the result more readable. Now, I think you could try something like this:

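SELECT 
     -- bucket label, e.g. '2030 to 2032' for a width of 2 
     CONCAT((session_length DIV 2) * 2, ' to ', (session_length DIV 2) * 2 + 2) AS `range`, 
     COUNT(*) AS number_of_sessions 
FROM (
     -- the original query: one row per session 
     SELECT 
          Log_Analysis_RecordsToSesions.sessionID, 
          ABS(TIMEDIFF(
            MIN(Log_Analysis_Records.date), 
            MAX(Log_Analysis_Records.date) 
          )) AS session_length 
     FROM 
          Log_Analysis_RecordsToSesions, 
          Log_Analysis_Records 
     WHERE 
          Log_Analysis_RecordsToSesions.recordID=Log_Analysis_Records.recordID 
     GROUP BY 
          Log_Analysis_RecordsToSesions.sessionID 
) AS dt 
GROUP BY 
     `range` 
ORDER BY MIN(session_length); 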
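
A hedged aside on units: when a TIME value such as the result of TIMEDIFF() is used as a number, MySQL appears to read it in HHMMSS form, so a session_length of 2031 would mean 20 minutes 31 seconds rather than 2031 seconds. If plain seconds are easier to bucket, a sketch along these lines may help. It assumes that Log_Analysis_Records.date is a DATETIME or TIMESTAMP column, which the question does not state, and the aliases rs, r and session_length_seconds are only illustrative.

```sql
-- Session length in whole seconds, one row per session.
-- Assumption: Log_Analysis_Records.date is DATETIME or TIMESTAMP.
SELECT
    rs.sessionID,
    -- TIMESTAMPDIFF(unit, start, end) returns end minus start in the given unit
    TIMESTAMPDIFF(SECOND, MIN(r.date), MAX(r.date)) AS session_length_seconds
FROM Log_Analysis_RecordsToSesions rs
JOIN Log_Analysis_Records r ON rs.recordID = r.recordID
GROUP BY rs.sessionID;
```

Because the earliest timestamp is passed first, the result is non-negative and no ABS() is needed.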
0

You may want to create another table and call it ranges:

CREATE TABLE ranges (
    `range` int 
); 

INSERT INTO ranges VALUES (2), (4), (6), (8); 

Then you may want to wrap your query as a derived table and LEFT JOIN it to the ranges table:

SELECT CONCAT(r.`range` - 2, ' to ', r.`range`) `range`, 
     COUNT(session_length) number_of_sessions 
FROM  ranges r 
LEFT JOIN (
    SELECT rs.sessionID, 
       ABS(TIMEDIFF(MIN(ar.date), MAX(ar.date))) session_length 
    FROM  Log_Analysis_RecordsToSesions rs 
    JOIN  Log_Analysis_Records ar ON (rs.recordID = ar.recordID) 
    GROUP BY rs.sessionID 
) dt ON (dt.session_length > r.`range` - 2 AND 
     dt.session_length <= r.`range`) 
GROUP BY r.`range`; 

As a test case, let's create a dummy table with a handful of session lengths like the ones in your example:

CREATE TABLE sessions (
    session_id  int, 
    session_length int 
); 

INSERT INTO sessions VALUES (1, 2031); 
INSERT INTO sessions VALUES (2, 1954); 
INSERT INTO sessions VALUES (3, 401); 
INSERT INTO sessions VALUES (4, 7505); 

Then we can do the following, assuming the ranges table has already been created:

SELECT CONCAT(r.`range` - 2, ' to ', r.`range`) `range`, 
     COUNT(session_length) number_of_sessions 
FROM  ranges r 
LEFT JOIN (
    SELECT session_id, session_length FROM sessions 
) dt ON (dt.session_length/1000 > r.`range` - 2 AND 
     dt.session_length/1000 <= r.`range`) 
GROUP BY r.`range`; 

Result:

+--------+--------------------+ 
| range  | number_of_sessions | 
+--------+--------------------+ 
| 0 to 2 |                  2 | 
| 2 to 4 |                  1 | 
| 4 to 6 |                  0 | 
| 6 to 8 |                  1 | 
+--------+--------------------+ 
4 rows in set (0.00 sec) 
0

Run this query over the table you generated:

SELECT 
    CONCAT((session_length div 2000)*2, ' to ', ((session_length+2000) div 2000)*2) AS `Range (time)`, 
    COUNT(*) AS `Number of sessions` 
FROM sessions 
GROUP BY session_length div 2000 
+0

Can you explain what the second line is doing? – sixtyfootersdude 2010-11-15 09:48:35

+0

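It concatenates the display value of the group used in the 'group by' with ' to ' and the display value of the next group (+ 2000 ms). The '* 2' is only for display: for example, 401 div 2000 and 1500 div 2000 are both '0' (the value used in the group by), while 2031 div 2000 is '1', which is a different group. – 2010-11-16 04:59:12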
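
To make the arithmetic in that explanation concrete, here is a small sketch that lists the grouping value and the display label for each row. It assumes the sessions test table from the earlier answer (lengths 2031, 1954, 401 and 7505, treated as milliseconds); the aliases bucket and label are only illustrative.

```sql
-- Assumes the sessions test table defined earlier (session_id, session_length).
-- bucket is the value the GROUP BY uses; label is what gets displayed.
SELECT
    session_length,
    session_length DIV 2000 AS bucket,
    CONCAT((session_length DIV 2000) * 2, ' to ',
           ((session_length + 2000) DIV 2000) * 2) AS label
FROM sessions
ORDER BY session_length;
```

With that test data, 401 and 1954 both get bucket 0 and the label '0 to 2', 2031 gets bucket 1 ('2 to 4'), and 7505 gets bucket 3 ('6 to 8'). Since the grouped query only emits buckets that contain at least one row, '4 to 6' would not appear in its output at all, unlike the LEFT JOIN against ranges, which reports it with a count of 0.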