2010-03-19 47 views
1

Replacing a self-join with analytic functions: how would I go about replacing the following self-join using analytics?

SELECT 
t1.col1 col1, 
t1.col2 col2, 
SUM((extract(hour FROM (t1.times_stamp - t2.times_stamp)) * 3600 + extract(minute FROM (t1.times_stamp - t2.times_stamp)) * 60 + extract(second FROM (t1.times_stamp - t2.times_stamp)))) div, 
COUNT(*) tot_count 
FROM tab1 t1, 
tab1 t2 
WHERE t2.col1  = t1.col1 
AND t2.col2 = t1.col2 
AND t2.col3  = t1.sequence_num 
AND t2.times_stamp  < t1.times_stamp 
AND t2.col4   = 3 
AND t1.col4   = 4 
AND t2.col5 NOT IN(103,123) 
AND t1.col5  != 549 
GROUP BY t1.col1, t1.col2 

Answers

1

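I'm fairly sure you won't be able to replace the self-join with analytics, because you are using an inter-row operation (t1.time_stamp - t2.time_stamp). Analytics can only access the values of the current row and the values of aggregate functions over a subset of rows (the windowing clause).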

See this article from Tom Kyte and this paper for further analysis of the limitations of analytics.

0

It almost looks like you could eliminate the self-join on t2 by replacing

t1.time_stamp - t2.time_stamp

with something like

t1.time_stamp - lag(t1.time_stamp) over (partition by col1, col2 order by time_stamp)

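The different filters on t1 and t2 (on col4 and col5) are what prevent you from doing this.
Analytic functions are applied after the where/group by of the main query, so you would need a single filter on t1 in order to use lag/lead to refer to the preceding or following row in the sequence.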
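
To illustrate the ordering point, here is a minimal sketch on made-up rows. The inline `demo` data, its values and the numeric `ts` column are assumptions for illustration only, not part of the question. Because the where clause runs first, lag only sees the rows that survive the filter:

```sql
-- Hypothetical three-row data set (Oracle syntax); only col1/col4 echo the question.
WITH demo AS (
    SELECT 1 AS col1, 3 AS col4, 10 AS ts FROM dual UNION ALL
    SELECT 1, 9, 20 FROM dual UNION ALL
    SELECT 1, 4, 30 FROM dual
)
SELECT col4,
       ts,
       -- LAG is evaluated after the WHERE clause below has removed the col4 = 9 row,
       -- so the col4 = 4 row is compared with ts = 10, not with ts = 20.
       ts - LAG(ts) OVER (PARTITION BY col1 ORDER BY ts) AS diff_from_prev
FROM demo
WHERE col4 IN (3, 4)
ORDER BY ts;
```

The first surviving row has no predecessor, so its difference is NULL. The col4 = 4 row gets a difference of 20, because the filtered-out row is invisible to the window.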

In addition, you would need to aggregate after the analytic function by pushing the sum/group by into an outer query:

select col1, col2, sum(timestamp_diff) from (
    select col1, col2, time_stamp - lag(time_stamp) over(.....) as timestamp_diff
    from tab1
    where ....
) group by col1, col2
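
For the query in the question, one possible shape is sketched below. It is not a drop-in replacement. It matches the self-join only under three assumptions that the question does not confirm:

- each col4 = 4 row has at most one matching col4 = 3 row;
- that row is the immediately preceding row in its partition;
- timestamps are distinct within a partition.

`pair_key`, `prev_col4` and `gap` are hypothetical helper names, and col3 and sequence_num are assumed to share a datatype.

```sql
-- Sketch only (Oracle syntax); see the assumptions stated above.
SELECT col1,
       col2,
       SUM(EXTRACT(HOUR FROM gap) * 3600
         + EXTRACT(MINUTE FROM gap) * 60
         + EXTRACT(SECOND FROM gap)) AS div,
       COUNT(*) AS tot_count
FROM (
    SELECT col1,
           col2,
           col4,
           -- col4 of the previous row, used to keep only "3 followed by 4" pairs
           LAG(col4) OVER (PARTITION BY col1, col2, pair_key
                           ORDER BY times_stamp) AS prev_col4,
           -- interval between this row and the previous row in the same partition
           times_stamp
             - LAG(times_stamp) OVER (PARTITION BY col1, col2, pair_key
                                      ORDER BY times_stamp) AS gap
    FROM (
        SELECT col1, col2, col4, times_stamp,
               -- hypothetical key standing in for the join t2.col3 = t1.sequence_num
               CASE WHEN col4 = 3 THEN col3 ELSE sequence_num END AS pair_key
        FROM tab1
        -- the two sets of filters from the self-join, merged into one predicate
        WHERE (col4 = 3 AND col5 NOT IN (103, 123))
           OR (col4 = 4 AND col5 != 549)
    )
)
WHERE col4 = 4
  AND prev_col4 = 3
GROUP BY col1, col2;
```

If a col4 = 4 row can have several earlier col4 = 3 matches, the self-join counts every pair while this version counts at most one, so the results would differ.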