Monday, March 26, 2012

Grouping consecutive rows of data

This is driving me crazy, I have the following table
Unit Line ProductCode StartTime EndTime
-- -- -- --
--
3 1 120064 2002-08-30 12:31:26.810 2003-05-27
11:39:51.157
3 1 283724 2003-05-27 11:39:51.157 2003-05-27
12:51:52.423
3 1 283724 2003-05-27 12:51:52.423 2003-05-30
07:12:38.997
3 1 285775 2003-05-30 07:12:38.997 2003-06-16
10:37:01.813
3 1 230571 2003-06-16 10:37:01.813 2003-07-04
12:40:32.097
3 1 260775 2003-07-04 12:40:32.097 2003-07-07
14:25:18.483
3 1 260775 2003-07-07 14:25:18.483 2003-07-08
07:02:04.303
3 1 265775 2003-07-08 07:02:04.303 2003-07-14
12:37:42.513
3 1 215690 2003-07-14 12:37:42.513 2003-07-14
12:38:23.450
3 1 255779 2003-07-14 12:38:23.450 2003-07-15
08:09:30.030
3 1 255779 2003-07-15 08:09:30.030 2003-07-15
16:05:53.323
3 1 205716 2003-07-15 16:05:53.323 2003-07-15
16:06:17.527
3 1 203716 2003-07-15 16:06:17.527 2003-07-18
10:53:48.927
3 1 203716 2003-07-18 10:53:48.927 2003-07-18
10:55:17.677
3 1 102001 2003-07-18 10:55:17.677 2003-07-18
10:55:54.190
3 1 203716 2003-07-18 10:55:54.190 2003-07-18
15:40:57.113
3 1 203716 2003-07-18 15:40:57.113 2003-07-18
15:41:05.660
3 1 203716 2003-07-18 15:41:05.660 2003-07-18
16:22:23.050
3 1 203716 2003-07-18 16:22:23.050 2003-07-23
16:25:37.407
I want to be able to group each Unit, Line and ProductCode together so
I can pick up the StartTime and EndTime for consecutive product runs.
i.e.ProductCode 203716 starts at 2003-07-15 16:06:17.527 and runs
until 2003-07-18 10:55:17.677.
This is then replaced by ProductCode 102001 which runs from 2003-07-18
10:55:17.677 until 2003-07-18 10:55:54.190 and is then replaced by
203716 which runs from 2003-07-18 10:55:54.190 until 2003-07-23
16:25:37.407.
I tried the following view:
SELECT TOP 100 PERCENT Unit, Line, ProductCode, MIN(StartTime) AS
StartTime, MAX(EndTime) AS EndTime FROM D_ProductionLog
GROUP BY Unit, Line, ProductCode
ORDER BY StartTime, Unit, Line
But it takes the first time a Product is run and the last time it was
run i.e.
for ProductCode 203716
StartTime 2003-07-15 16:06:17.527
EndTime 2003-07-23 16:25:37.407
Instead of two records.
Many thanks
JimThis is a multi-part message in MIME format.
--=_NextPart_000_014B_01C35111.500514B0
Content-Type: text/plain;
charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable
Try:
select
Unit
, Line
, ProductCode
, Type
, case when Type =3D 1 then min (StartTime) else max (EndTime)
from
MyTable
cross join
(
select 1 as Type
union all
select 2
) as x
group by
Unit
, Line
, ProductCode
, Type
-- Tom
---
Thomas A. Moreau, BSc, PhD, MCSE, MCDBA
SQL Server MVP
Columnist, SQL Server Professional
Toronto, ON Canada
www.pinnaclepublishing.com/sql
"Jim" <jim.holmes@.devro-casings.com> wrote in message =news:68dfae14.0307230741.2f1a9b49@.posting.google.com...
This is driving me crazy, I have the following table
Unit Line ProductCode StartTime EndTime
-- -- -- --
--
3 1 120064 2002-08-30 12:31:26.810 2003-05-27
11:39:51.157
3 1 283724 2003-05-27 11:39:51.157 2003-05-27
12:51:52.423
3 1 283724 2003-05-27 12:51:52.423 2003-05-30
07:12:38.997
3 1 285775 2003-05-30 07:12:38.997 2003-06-16
10:37:01.813
3 1 230571 2003-06-16 10:37:01.813 2003-07-04
12:40:32.097
3 1 260775 2003-07-04 12:40:32.097 2003-07-07
14:25:18.483
3 1 260775 2003-07-07 14:25:18.483 2003-07-08
07:02:04.303
3 1 265775 2003-07-08 07:02:04.303 2003-07-14
12:37:42.513
3 1 215690 2003-07-14 12:37:42.513 2003-07-14
12:38:23.450
3 1 255779 2003-07-14 12:38:23.450 2003-07-15
08:09:30.030
3 1 255779 2003-07-15 08:09:30.030 2003-07-15
16:05:53.323
3 1 205716 2003-07-15 16:05:53.323 2003-07-15
16:06:17.527
3 1 203716 2003-07-15 16:06:17.527 2003-07-18
10:53:48.927
3 1 203716 2003-07-18 10:53:48.927 2003-07-18
10:55:17.677
3 1 102001 2003-07-18 10:55:17.677 2003-07-18
10:55:54.190
3 1 203716 2003-07-18 10:55:54.190 2003-07-18
15:40:57.113
3 1 203716 2003-07-18 15:40:57.113 2003-07-18
15:41:05.660
3 1 203716 2003-07-18 15:41:05.660 2003-07-18
16:22:23.050
3 1 203716 2003-07-18 16:22:23.050 2003-07-23
16:25:37.407
I want to be able to group each Unit, Line and ProductCode together so
I can pick up the StartTime and EndTime for consecutive product runs.
i.e.ProductCode 203716 starts at 2003-07-15 16:06:17.527 and runs
until 2003-07-18 10:55:17.677.
This is then replaced by ProductCode 102001 which runs from 2003-07-18
10:55:17.677 until 2003-07-18 10:55:54.190 and is then replaced by
203716 which runs from 2003-07-18 10:55:54.190 until 2003-07-23
16:25:37.407.
I tried the following view:
SELECT TOP 100 PERCENT Unit, Line, ProductCode, MIN(StartTime) AS
StartTime, MAX(EndTime) AS EndTime FROM D_ProductionLog
GROUP BY Unit, Line, ProductCode
ORDER BY StartTime, Unit, Line
But it takes the first time a Product is run and the last time it was
run i.e.
for ProductCode 203716 StartTime 2003-07-15 16:06:17.527 EndTime 2003-07-23 16:25:37.407
Instead of two records.
Many thanks
Jim
--=_NextPart_000_014B_01C35111.500514B0
Content-Type: text/html;
charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
&

Try:
select
=Unit
, =Line
, ProductCode
, =Type
, case when =Type =3D 1 then min (StartTime) else max (EndTime)
from
=MyTable
cross join
(
select 1 as =Type union all select 2) as x
group by
=Unit
, =Line
, ProductCode
, =Type-- Tom
---T=homas A. Moreau, BSc, PhD, MCSE, MCDBASQL Server MVPColumnist, SQL =Server ProfessionalToronto, ON Canadahttp://www.pinnaclepublishing.com/sql">www.pinnaclepublishing.com=/sql
"Jim" wrote in message news:68dfae=14.0307230741.2f1a9b49@.posting.google.com...This is driving me crazy, I have the following tableUnit =Line ProductCode StartTime &nbs=p; EndTime-- -- -- ----3  =; 1 120064 2002-08-30 =12:31:26.810 2003-05-2711:39:51.1573 =1 283724 2003-05-27 11:39:51.157 2003-05-2712:51:52.4233 =1 283724 2003-05-27 12:51:52.423 2003-05-3007:12:38.9973 =1 285775 2003-05-30 07:12:38.997 2003-06-1610:37:01.8133 =1 230571 2003-06-16 10:37:01.813 2003-07-0412:40:32.0973 =1 260775 2003-07-04 12:40:32.097 2003-07-0714:25:18.4833 =1 260775 2003-07-07 14:25:18.483 2003-07-0807:02:04.3033 =1 265775 2003-07-08 07:02:04.303 2003-07-1412:37:42.5133 =1 215690 2003-07-14 12:37:42.513 2003-07-1412:38:23.4503 =1 255779 2003-07-14 12:38:23.450 2003-07-1508:09:30.0303 =1 255779 2003-07-15 08:09:30.030 2003-07-1516:05:53.3233 =1 205716 2003-07-15 16:05:53.323 2003-07-1516:06:17.5273 =1 203716 2003-07-15 16:06:17.527 2003-07-1810:53:48.9273 =1 203716 2003-07-18 10:53:48.927 2003-07-1810:55:17.6773 =1 102001 2003-07-18 10:55:17.677 2003-07-1810:55:54.1903 =1 203716 2003-07-18 10:55:54.190 2003-07-1815:40:57.1133 =1 203716 2003-07-18 15:40:57.113 2003-07-1815:41:05.6603 =1 203716 2003-07-18 15:41:05.660 2003-07-1816:22:23.0503 =1 203716 2003-07-18 16:22:23.050 2003-07-2316:25:37.407I want to be able to group each Unit, =Line and ProductCode together soI can pick up the StartTime and EndTime for consecutive product runs.i.e.ProductCode 203716 starts at 2003-07-15 =16:06:17.527 and runsuntil 2003-07-18 10:55:17.677.This is =then replaced by ProductCode 102001 which runs from =2003-07-1810:55:17.677 until 2003-07-18 10:55:54.190 and is then replaced by203716 which runs =from 2003-07-18 10:55:54.190 until 2003-07-2316:25:37.407.I tried =the following view:SELECT TOP 100 PERCENT =Unit, Line, ProductCode, MIN(StartTime) ASStartTime, MAX(EndTime) AS =EndTime FROM D_ProductionLogGROUP BY Unit, Line, ProductCodeORDER =BY StartTime, Unit, LineBut it takes the first time a Product is =run and the last time it wasrun i.e.for ProductCode 203716 StartTime 2003-07-15 16:06:17.527 EndTime 2003-07-23 16:25:37.407 Instead =of two records.Many thanksJim

--=_NextPart_000_014B_01C35111.500514B0--|||I am confused, do you want to get one record per ProductCode or you just
want to order them together?
"Jim" <jim.holmes@.devro-casings.com> wrote in message
news:68dfae14.0307230741.2f1a9b49@.posting.google.com...
> This is driving me crazy, I have the following table
>
> Unit Line ProductCode StartTime EndTime
> -- -- -- --
> --
> 3 1 120064 2002-08-30 12:31:26.810 2003-05-27
> 11:39:51.157
> 3 1 283724 2003-05-27 11:39:51.157 2003-05-27
> 12:51:52.423
> 3 1 283724 2003-05-27 12:51:52.423 2003-05-30
> 07:12:38.997
> 3 1 285775 2003-05-30 07:12:38.997 2003-06-16
> 10:37:01.813
> 3 1 230571 2003-06-16 10:37:01.813 2003-07-04
> 12:40:32.097
> 3 1 260775 2003-07-04 12:40:32.097 2003-07-07
> 14:25:18.483
> 3 1 260775 2003-07-07 14:25:18.483 2003-07-08
> 07:02:04.303
> 3 1 265775 2003-07-08 07:02:04.303 2003-07-14
> 12:37:42.513
> 3 1 215690 2003-07-14 12:37:42.513 2003-07-14
> 12:38:23.450
> 3 1 255779 2003-07-14 12:38:23.450 2003-07-15
> 08:09:30.030
> 3 1 255779 2003-07-15 08:09:30.030 2003-07-15
> 16:05:53.323
> 3 1 205716 2003-07-15 16:05:53.323 2003-07-15
> 16:06:17.527
> 3 1 203716 2003-07-15 16:06:17.527 2003-07-18
> 10:53:48.927
> 3 1 203716 2003-07-18 10:53:48.927 2003-07-18
> 10:55:17.677
> 3 1 102001 2003-07-18 10:55:17.677 2003-07-18
> 10:55:54.190
> 3 1 203716 2003-07-18 10:55:54.190 2003-07-18
> 15:40:57.113
> 3 1 203716 2003-07-18 15:40:57.113 2003-07-18
> 15:41:05.660
> 3 1 203716 2003-07-18 15:41:05.660 2003-07-18
> 16:22:23.050
> 3 1 203716 2003-07-18 16:22:23.050 2003-07-23
> 16:25:37.407
> I want to be able to group each Unit, Line and ProductCode together so
> I can pick up the StartTime and EndTime for consecutive product runs.
> i.e.ProductCode 203716 starts at 2003-07-15 16:06:17.527 and runs
> until 2003-07-18 10:55:17.677.
> This is then replaced by ProductCode 102001 which runs from 2003-07-18
> 10:55:17.677 until 2003-07-18 10:55:54.190 and is then replaced by
> 203716 which runs from 2003-07-18 10:55:54.190 until 2003-07-23
> 16:25:37.407.
> I tried the following view:
> SELECT TOP 100 PERCENT Unit, Line, ProductCode, MIN(StartTime) AS
> StartTime, MAX(EndTime) AS EndTime FROM D_ProductionLog
> GROUP BY Unit, Line, ProductCode
> ORDER BY StartTime, Unit, Line
> But it takes the first time a Product is run and the last time it was
> run i.e.
> for ProductCode 203716
> StartTime 2003-07-15 16:06:17.527
> EndTime 2003-07-23 16:25:37.407
> Instead of two records.
> Many thanks
> Jim|||Greetings,
Unless you breathe,sniff,snort and live S2k sql queries like
this are difficult to construct.An easy way to solve this
problem is to create a column with the same value for every
combination of Unit,Line and ProductCode based on the sort
order of StartTime.This new column or rank can then be used
as a grouping column to find the min and max times for each
Unit,Line and ProductCode combination.Current versions of
Oracle and DB2 have this sql99 functionality,S2k does not.
create table #D_ProductionLog(Unit int,Line int,
ProductCode int,StartTime datetime,EndTime datetime)
go
insert #D_ProductionLog values(3,1,120064,'2002-08-30
12:31:26.810','2003-05-27 11:39:51.157')
insert #D_ProductionLog values(3,1,283724,'2003-05-27
11:39:51.157','2003-05-27 12:51:52.423')
insert #D_ProductionLog values(3,1,283724,'2003-05-27
12:51:52.423','2003-05-30 07:12:38.997')
insert #D_ProductionLog values(3,1,285775,'2003-05-30
07:12:38.997','2003-06-16 10:37:01.813')
insert #D_ProductionLog values(3,1,230571,'2003-06-16
10:37:01.813','2003-07-04 12:40:32.097')
insert #D_ProductionLog values(3,1,260775,'2003-07-04
12:40:32.097','2003-07-07 14:25:18.483')
insert #D_ProductionLog values(3,1,260775,'2003-07-07
14:25:18.483','2003-07-08 07:02:04.303')
insert #D_ProductionLog values(3,1,265775,'2003-07-08
07:02:04.303','2003-07-14 12:37:42.513')
insert #D_ProductionLog values(3,1,215690,'2003-07-14
12:37:42.513','2003-07-14 12:38:23.450')
insert #D_ProductionLog values(3,1,255779,'2003-07-14
12:38:23.450','2003-07-15 08:09:30.030')
insert #D_ProductionLog values(3,1,255779,'2003-07-15
08:09:30.030','2003-07-15 16:05:53.323')
insert #D_ProductionLog values(3,1,205716,'2003-07-15
16:05:53.323','2003-07-15 16:06:17.527')
insert #D_ProductionLog values(3,1,203716,'2003-07-15
16:06:17.527','2003-07-18 10:53:48.927')
insert #D_ProductionLog values(3,1,203716,'2003-07-18
10:53:48.927','2003-07-18 10:55:17.677')
insert #D_ProductionLog values(3,1,102001,'2003-07-18
10:55:17.677','2003-07-18 10:55:54.190')
insert #D_ProductionLog values(3,1,203716,'2003-07-18
10:55:54.190','2003-07-18 15:40:57.113')
insert #D_ProductionLog values(3,1,203716,'2003-07-18
15:40:57.113','2003-07-18 15:41:05.660')
insert #D_ProductionLog values(3,1,203716,'2003-07-18
15:41:05.660','2003-07-18 16:22:23.050')
insert #D_ProductionLog values(3,1,203716,'2003-07-18
16:22:23.050','2003-07-23 16:25:37.407')
You can use the RAC utility for S2k to easily create this
virtual column.The 'drank' column binds together the same Unit,Line
and ProductCode combinations in the sort order of StartTime.
Exec Rac
@.transform='_dummy_',@.style='121',@.datelen='25',
@.rows='Unit & Line & ProductCode & StartTime(date) & EndTime(date)',
@.rowsort='StartTime & Unit & Line & ProductCode',
@.pvtcol='Report Mode',
@.from='#D_ProductionLog',
-- create the virtual column drank using the @.rowindicators parameter.
@.rowindicators='ProductCode{drank}',@.counterdatatype='integer',
@.grand_totals='n',@.rowbreak='n',@.racheck='y',@.defaultexceptions='dumy'
This result shows drank* column and how it binds.It is simply an
incrementing integer for each combination.
Unit Line ProductCode StartTime EndTime
drank*
-- -- -- -- --
-- --
3 1 120064 2002-08-30 12:31:26.810 2003-05-27 11:39:51.157
1
3 1 283724 2003-05-27 11:39:51.157 2003-05-27 12:51:52.423
2
3 1 283724 2003-05-27 12:51:52.423 2003-05-30 07:12:38.997
2
3 1 285775 2003-05-30 07:12:38.997 2003-06-16 10:37:01.813
3
3 1 230571 2003-06-16 10:37:01.813 2003-07-04 12:40:32.097
4
3 1 260775 2003-07-04 12:40:32.097 2003-07-07 14:25:18.483
5
3 1 260775 2003-07-07 14:25:18.483 2003-07-08 07:02:04.303
5
3 1 265775 2003-07-08 07:02:04.303 2003-07-14 12:37:42.513
6
3 1 215690 2003-07-14 12:37:42.513 2003-07-14 12:38:23.450
7
3 1 255779 2003-07-14 12:38:23.450 2003-07-15 08:09:30.030
8
3 1 255779 2003-07-15 08:09:30.030 2003-07-15 16:05:53.323
8
3 1 205716 2003-07-15 16:05:53.323 2003-07-15 16:06:17.527
9
3 1 203716 2003-07-15 16:06:17.527 2003-07-18 10:53:48.927
10
3 1 203716 2003-07-18 10:53:48.927 2003-07-18 10:55:17.677
10
3 1 102001 2003-07-18 10:55:17.677 2003-07-18 10:55:54.190
11
3 1 203716 2003-07-18 10:55:54.190 2003-07-18 15:40:57.113
12
3 1 203716 2003-07-18 15:40:57.113 2003-07-18 15:41:05.660
12
3 1 203716 2003-07-18 15:41:05.660 2003-07-18 16:22:23.050
12
3 1 203716 2003-07-18 16:22:23.050 2003-07-23 16:25:37.407
12
Using the drank column in a GROUP BY easily solves the problem in
a single execution of RAC.
Exec Rac
@.transform='_dummy_',@.style='121',@.datelen='25',
@.rows='Unit & Line & ProductCode & StartTime(date) & EndTime(date)',
@.rowsort='StartTime & Unit & Line & ProductCode',
@.pvtcol='Report Mode',
@.from='#D_ProductionLog',
@.rowindicators='ProductCode{drank}',@.counterdatatype='integer',
@.grand_totals='n',@.rowbreak='n',@.racheck='y',
-- Use a simple group by query to solve the problem.
-- You could also include the count of each combination if desired.
@.select='select Unit,Line,ProductCode,MIN(StartTime) AS StartTime,
MAX(EndTime) AS EndTime
from rac
group by drank,Unit,Line,ProductCode
order by drank'
Unit Line ProductCode StartTime EndTime
-- -- -- -- --
--
3 1 120064 2002-08-30 12:31:26.810 2003-05-27 11:39:51.157
3 1 283724 2003-05-27 11:39:51.157 2003-05-30 07:12:38.997
3 1 285775 2003-05-30 07:12:38.997 2003-06-16 10:37:01.813
3 1 230571 2003-06-16 10:37:01.813 2003-07-04 12:40:32.097
3 1 260775 2003-07-04 12:40:32.097 2003-07-08 07:02:04.303
3 1 265775 2003-07-08 07:02:04.303 2003-07-14 12:37:42.513
3 1 215690 2003-07-14 12:37:42.513 2003-07-14 12:38:23.450
3 1 255779 2003-07-14 12:38:23.450 2003-07-15 16:05:53.323
3 1 205716 2003-07-15 16:05:53.323 2003-07-15 16:06:17.527
3 1 203716 2003-07-15 16:06:17.527 2003-07-18 10:55:17.677
3 1 102001 2003-07-18 10:55:17.677 2003-07-18 10:55:54.190
3 1 203716 2003-07-18 10:55:54.190 2003-07-23 16:25:37.407
RAC v2.2 and QALite released.
www.rac4sql.net

No comments:

Post a Comment