I would suggest using a window function:
with
table1(activity_timestamp, activity) as (
    values
    ('2016-12-23 13:53:47.608561'::timestamp, 'details viewed'),
    ('2017-01-09 14:15:52.570397', 'details viewed'),
    ('2016-12-27 16:06:39.138994', 'details viewed'),
    ('2016-12-24 21:09:56.159436', 'details viewed')),
table2(activity_timestamp, activity) as (
    values
    ('2016-12-23 13:54:47.608561'::timestamp, 'reading'),
    ('2017-01-09 14:17:52.570397', 'reading'),
    ('2016-12-27 16:10:39.138994', 'reading'),
    ('2016-12-24 21:12:56.159436', 'reading'))
, lag AS (
    select
        *, lag(activity_timestamp) OVER (ORDER BY activity_timestamp)
    from (
        SELECT * FROM table1
        UNION SELECT * FROM table2
    ) AS a
) SELECT *, lag - activity_timestamp
FROM lag
WHERE activity = 'reading'
ORDER BY 1
;
The result is:
     activity_timestamp     | activity |            lag             | ?column?
----------------------------+----------+----------------------------+-----------
 2016-12-23 13:54:47.608561 | reading  | 2016-12-23 13:53:47.608561 | -00:01:00
 2016-12-24 21:12:56.159436 | reading  | 2016-12-24 21:09:56.159436 | -00:03:00
 2016-12-27 16:10:39.138994 | reading  | 2016-12-27 16:06:39.138994 | -00:04:00
 2017-01-09 14:17:52.570397 | reading  | 2017-01-09 14:15:52.570397 | -00:02:00
(4 rows)
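The differences above are negative because the query computes lag - activity_timestamp. If a positive, named difference is preferred, the subtraction can be reversed and the columns aliased. This is only a sketch of that variation: the names ordered, previous_ts and diff are illustrative, and it assumes table1 and table2 exist with the columns (activity_timestamp, activity) used above.

```sql
-- Sketch: same window-function idea, with named columns and a positive interval.
-- Assumes table1 and table2 exist with columns (activity_timestamp, activity).
WITH ordered AS (
    SELECT *,
           -- timestamp of the row immediately before this one, in time order
           lag(activity_timestamp) OVER (ORDER BY activity_timestamp) AS previous_ts
    FROM (
        SELECT * FROM table1
        UNION
        SELECT * FROM table2
    ) AS a
)
SELECT activity_timestamp,
       activity,
       previous_ts,
       activity_timestamp - previous_ts AS diff  -- later minus earlier, so the interval is positive
FROM ordered
WHERE activity = 'reading'
ORDER BY activity_timestamp;
```

Note that lag() returns the previous row regardless of its activity, so this only measures the gap to a 'details viewed' row when each 'reading' row is directly preceded by one in time order.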
To compare this with the other proposed versions, I created the following script:
CREATE TABLE table1 AS
    SELECT '2016-01-01'::timestamp + '1 min'::interval * (random() * 10 + 1) AS activity_timestamp,
           'dv'::text AS activity
    FROM generate_series(1, 100000);
CREATE TABLE table2 AS
    SELECT activity_timestamp + '1 min'::interval * (random()) AS activity_timestamp,
           'r'::text AS activity
    FROM table1;
CREATE INDEX i1 ON table1 (activity_timestamp DESC);
CREATE INDEX i2 ON table2 (activity_timestamp DESC);
-- Proposed by Abelisto
explain analyze
select
    *,
    activity_timestamp - (select max(activity_timestamp)
                          from table1 as t1
                          where t2.activity_timestamp > t1.activity_timestamp
                         ) as diff
from table2 as t2 order by activity_timestamp, activity;
-- Gordon Linoff - repaired
explain analyze
select date_part('minutes', a.activity_timestamp - b.activity_timestamp),
       a.activity_timestamp, b.activity_timestamp
from table1 a left join
     table2 b
     on a.activity_timestamp < b.activity_timestamp + interval '20 minute' and
        a.activity_timestamp > b.activity_timestamp
order by b.activity_timestamp;
-- My own version
explain analyze
WITH lag AS (
    select
        *, lag(activity_timestamp) OVER (ORDER BY activity_timestamp)
    from (
        SELECT * FROM table1
        UNION SELECT * FROM table2
    ) AS a
) SELECT *, lag - activity_timestamp
FROM lag
WHERE activity = 'reading'
ORDER BY 1;
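One caveat about the last query: the generated test data labels its rows 'dv' and 'r', so a filter on 'reading' matches no rows in these tables. A sketch of the same query with the filter adjusted to the generated data might look as follows. It has not been rerun here, so the timings reported in this answer do not describe it.

```sql
-- Sketch: window-function query with the filter matched to the generated data.
-- Assumes the table1/table2 created by the script above ('dv' and 'r' labels).
EXPLAIN ANALYZE
WITH lag AS (
    SELECT *,
           lag(activity_timestamp) OVER (ORDER BY activity_timestamp)
    FROM (
        SELECT * FROM table1
        UNION
        SELECT * FROM table2
    ) AS a
)
SELECT *, lag - activity_timestamp
FROM lag
WHERE activity = 'r'   -- table2 rows are labelled 'r' in the generated data
ORDER BY 1;
```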
Gordon's query took too long to finish (I did not want to wait for it). Abelisto:
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=53399.41..53649.41 rows=100000 width=56) (actual time=944.918..957.470 rows=100000 loops=1)
Sort Key: t2.activity_timestamp, t2.activity
Sort Method: external merge Disk: 4104kB
-> Seq Scan on table2 t2 (cost=0.00..41675.09 rows=100000 width=56) (actual time=0.068..874.282 rows=100000 loops=1)
SubPlan 2
-> Result (cost=0.39..0.40 rows=1 width=8) (actual time=0.008..0.008 rows=1 loops=100000)
InitPlan 1 (returns $1)
-> Limit (cost=0.29..0.39 rows=1 width=8) (actual time=0.008..0.008 rows=1 loops=100000)
-> Index Only Scan using i1 on table1 t1 (cost=0.29..3195.63 rows=33167 width=8) (actual time=0.008..0.008 rows=1 loops=100000)
Index Cond: ((activity_timestamp IS NOT NULL) AND (activity_timestamp < t2.activity_timestamp))
Heap Fetches: 100000
Planning time: 0.392 ms
Execution time: 961.594 ms
(13 rows)
My own:
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=39214.47..39216.97 rows=1000 width=64) (actual time=325.461..325.461 rows=0 loops=1)
Sort Key: lag.activity_timestamp
Sort Method: quicksort Memory: 25kB
CTE lag
-> WindowAgg (cost=28162.14..34662.14 rows=200000 width=48) (actual time=131.906..265.747 rows=199982 loops=1)
-> Unique (cost=28162.14..29662.14 rows=200000 width=40) (actual time=131.900..200.937 rows=199982 loops=1)
-> Sort (cost=28162.14..28662.14 rows=200000 width=40) (actual time=131.899..167.072 rows=200000 loops=1)
Sort Key: table1.activity_timestamp, table1.activity
Sort Method: external merge Disk: 4000kB
-> Append (cost=0.00..5082.00 rows=200000 width=40) (actual time=0.007..27.569 rows=200000 loops=1)
-> Seq Scan on table1 (cost=0.00..1541.00 rows=100000 width=40) (actual time=0.007..8.584 rows=100000 loops=1)
-> Seq Scan on table2 (cost=0.00..1541.00 rows=100000 width=40) (actual time=0.007..7.248 rows=100000 loops=1)
-> CTE Scan on lag (cost=0.00..4502.50 rows=1000 width=64) (actual time=325.458..325.458 rows=0 loops=1)
Filter: (activity = 'reading'::text)
Rows Removed by Filter: 199982
Planning time: 0.103 ms
Execution time: 327.737 ms
(17 rows)
For comparison, I also ran all the queries with 1000 rows. Abelisto:
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=469.71..472.21 rows=1000 width=56) (actual time=8.817..8.882 rows=1000 loops=1)
Sort Key: t2.activity_timestamp, t2.activity
Sort Method: quicksort Memory: 103kB
-> Seq Scan on table2 t2 (cost=0.00..419.89 rows=1000 width=56) (actual time=0.058..8.441 rows=1000 loops=1)
SubPlan 2
-> Result (cost=0.39..0.40 rows=1 width=8) (actual time=0.008..0.008 rows=1 loops=1000)
InitPlan 1 (returns $1)
-> Limit (cost=0.28..0.39 rows=1 width=8) (actual time=0.008..0.008 rows=1 loops=1000)
-> Index Only Scan using i1 on table1 t1 (cost=0.28..38.91 rows=332 width=8) (actual time=0.007..0.007 rows=1 loops=1000)
Index Cond: ((activity_timestamp IS NOT NULL) AND (activity_timestamp < t2.activity_timestamp))
Heap Fetches: 1000
Planning time: 0.311 ms
Execution time: 8.948 ms
(13 rows)
Gordon:
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=21087.07..21364.85 rows=111111 width=24) (actual time=439.142..528.240 rows=452961 loops=1)
Sort Key: b.activity_timestamp
Sort Method: external merge Disk: 15016kB
-> Nested Loop Left Join (cost=0.28..9493.05 rows=111111 width=24) (actual time=0.056..280.036 rows=452961 loops=1)
-> Seq Scan on table1 a (cost=0.00..16.00 rows=1000 width=8) (actual time=0.007..0.114 rows=1000 loops=1)
-> Index Only Scan using i2 on table2 b (cost=0.28..7.81 rows=111 width=8) (actual time=0.006..0.171 rows=453 loops=1000)
Index Cond: (activity_timestamp < a.activity_timestamp)
Filter: (a.activity_timestamp < (activity_timestamp + '00:20:00'::interval))
Heap Fetches: 452952
Planning time: 0.102 ms
Execution time: 545.139 ms
(11 rows)
My own:
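If you only have timestamps, you cannot have a "record", because there could be an unlimited number of records. Please add the result you want and an explanation of the result you have.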
Is an "a" activity followed by a "b" activity?
Yes, the records in the first table are logged 5-10 minutes before those in the second.