Monday, March 12, 2012

group data by time interval

What SQL will group timed events, so I can see how many records were recorde
d
in each 10 minute interval. There may be no data for some 10 minute
intervals, in which case I must report zero.
for example...with raw data in a table as
2006/06/19 12:00:00 record 1
2006/06/19 12:05:00 record 2
2006/06/19 12:21:00 record 3
2006/06/19 12:22:00 record 4
2006/06/19 12:23:00 record 5
2006/06/19 12:24:00 record 6
result should be
2 records for 12:00-12:10
0 records for 12:10-12:20
4 records for 12:20-12:30Well, should it be inclusive? e.g. what if there are three records with
2006-06-19 12:10:00? Which bucket do they fall in? The first, the second,
both?
"Tony" <Tony@.discussions.microsoft.com> wrote in message
news:B406B58A-DCA1-4AC4-828B-C057F91B0FCF@.microsoft.com...
> What SQL will group timed events, so I can see how many records were
> recorded
> in each 10 minute interval. There may be no data for some 10 minute
> intervals, in which case I must report zero.
> for example...with raw data in a table as
> 2006/06/19 12:00:00 record 1
> 2006/06/19 12:05:00 record 2
> 2006/06/19 12:21:00 record 3
> 2006/06/19 12:22:00 record 4
> 2006/06/19 12:23:00 record 5
> 2006/06/19 12:24:00 record 6
>
> result should be
> 2 records for 12:00-12:10
> 0 records for 12:10-12:20
> 4 records for 12:20-12:30
>|||Here's a start, perhaps.
SET NOCOUNT ON;
SET ANSI_WARNINGS OFF;
CREATE TABLE #blat (ts SMALLDATETIME);
INSERT #blat SELECT '20060619 12:00:00'
UNION ALL SELECT '20060619 12:05:00'
UNION ALL SELECT '20060619 12:21:00'
UNION ALL SELECT '20060619 12:22:00'
UNION ALL SELECT '20060619 12:23:00'
UNION ALL SELECT '20060619 12:24:00';
SELECT TOP 144 id = IDENTITY(INT,1,1)
INTO #foo
FROM sysobjects a
CROSS JOIN sysobjects b;
SELECT [start] = CONVERT(CHAR(5), [start], 108),
[end] = CONVERT(CHAR(5), [end], 108),
[count] = COALESCE(COUNT(b.ts), 0)
FROM
(
SELECT
[start] = DATEADD(MINUTE, (id-1)*10, 0+DATEDIFF(DAY, 0, GETDATE())),
[end] = DATEADD(MINUTE, id*10, 0+DATEDIFF(DAY, 0, GETDATE()))
FROM #foo
) f
LEFT OUTER JOIN
#blat b
ON
b.ts >= f.[start]
AND b.ts < f.[end]
WHERE
f.[end] > '20060619 12:00'
AND f.[end] <= '20060619 12:30'
GROUP BY CONVERT(CHAR(5), [start], 108),
CONVERT(CHAR(5), [end], 108)
ORDER BY 1;
DROP TABLE #blat, #foo;
"Tony" <Tony@.discussions.microsoft.com> wrote in message
news:B406B58A-DCA1-4AC4-828B-C057F91B0FCF@.microsoft.com...
> What SQL will group timed events, so I can see how many records were
> recorded
> in each 10 minute interval. There may be no data for some 10 minute
> intervals, in which case I must report zero.
> for example...with raw data in a table as
> 2006/06/19 12:00:00 record 1
> 2006/06/19 12:05:00 record 2
> 2006/06/19 12:21:00 record 3
> 2006/06/19 12:22:00 record 4
> 2006/06/19 12:23:00 record 5
> 2006/06/19 12:24:00 record 6
>
> result should be
> 2 records for 12:00-12:10
> 0 records for 12:10-12:20
> 4 records for 12:20-12:30
>|||>> What SQL will group timed events, so I can see how many records [sic] were rec
orded in each 10 minute interval. <<
Populate a table with the desired time slots, thus:
CREATE TABLE ReportTimeslots
(range_name CHAR(18) NOT NULL PRIMARY KEY,
start_time DATETIME NOT NULL,
end_time DATETIME NOT NULL,
CHECK (start_time < end_time));
Then do your query.
SELECT R.range_name, COUNT(*) AS event_cnt
FROM ReportTimeslots AS R
LEFT OUTER JOIN
Events AS E
ON E.event_time BETWEEN R.start_time AND R.end_time
GROUP BY R.range_name;
Another trick is a VIEW that sets itself each day by using the
CURRENT_TIMESTAMP for the range_name and does a little temporal math to
get the proper start and end times.|||Aaron,
Using your script as a start, I came up with this:
select cnt,
dateadd(mi, min10*10, '20060619') mfrom_inclusive,
dateadd(mi, min10*10 + 10, '20060619') mto_exclusive
from(
select count(*) cnt, datediff(mi, '20060619', ts)/10 min10 from
#blat
group by datediff(mi, '20060619', ts)/10
) t|||Yes, clever, and I considered inlining it in a similar way, but the
following row would be missing from the result:
0 records for 12:10-12:20
"Alexander Kuznetsov" <AK_TIREDOFSPAM@.hotmail.COM> wrote in message
news:1150744942.340640.153600@.h76g2000cwa.googlegroups.com...
> Aaron,
> Using your script as a start, I came up with this:
> select cnt,
> dateadd(mi, min10*10, '20060619') mfrom_inclusive,
> dateadd(mi, min10*10 + 10, '20060619') mto_exclusive
> from(
> select count(*) cnt, datediff(mi, '20060619', ts)/10 min10 from
> #blat
> group by datediff(mi, '20060619', ts)/10
> ) t
>|||You can write a TVF to generate your time-series and then left-outer-join it
with the original data to preserve groups.
Conor
"Aaron Bertrand [SQL Server MVP]" <ten.xoc@.dnartreb.noraa> wrote in message
news:uyqgdZ9kGHA.4444@.TK2MSFTNGP02.phx.gbl...
> Yes, clever, and I considered inlining it in a similar way, but the
> following row would be missing from the result:
> 0 records for 12:10-12:20
>
> "Alexander Kuznetsov" <AK_TIREDOFSPAM@.hotmail.COM> wrote in message
> news:1150744942.340640.153600@.h76g2000cwa.googlegroups.com...
>|||I agree, thanks for the feedback. This takes care of it:
select coalesce(cnt, 0),
dateadd(mi, (id-1)*10, '20060619') mfrom_inclusive,
dateadd(mi, (id-1)*10 + 10, '20060619') mto_exclusive
from #foo left join (
select count(*) cnt, datediff(mi, '20060619', ts)/10 min10 from
#blat
group by datediff(mi, '20060619', ts)/10
) t
on #foo.id = t.min10+1
I would be interesting to compare the performance of both queries on
different hardware.|||Hi Alexander,
The query written is neat and clean
But would it work for boundary value '
like if I reduce the interval from 10 to 5 minute and it show 1 rows in
count instead of 2.
for 12:05:00
With warm regards
Alexander Kuznetsov wrote:
> Aaron,
> Using your script as a start, I came up with this:
> select cnt,
> dateadd(mi, min10*10, '20060619') mfrom_inclusive,
> dateadd(mi, min10*10 + 10, '20060619') mto_exclusive
> from(
> select count(*) cnt, datediff(mi, '20060619', ts)/10 min10 from
> #blat
> group by datediff(mi, '20060619', ts)/10
> ) t

No comments:

Post a Comment