Monday, March 12, 2012

GROUP BY, GROUP BY, and DISTINCT (Alejandro, you are a genius.)

Alejandro, you are a genius. Thanks! That's exactly what I needed.
(And a very elegant, short, simple solution, too!)
I was even going to try to add a new 4th column... but with (another) twist.
This does NOT work:
select
colA, colB, count(distinct colC), sum(distinct colD)
from
t
group by
colA, colB
order by
colA, colB
The "twist" is that colD's SUM... has to obey colC's "distinct'ness".
(It doesn't matter if colD has distinct values from itself.)
So the output for 24-Jan-2005 would be:
24-Jan-2005 Bill 2 120
The SUM(colD) column needs to be 120 (not 60), because 2 unique values were
in
the colC column. (Doesn't matter than colD has all the same values for
that day.)
Given this new data for input...
22-Jan-2005 Bill BA39833J 30
22-Jan-2005 Bill RJ3399K 30
22-Jan-2005 Bill KL9833LL 60
22-Jan-2005 Bill BA39833J 60
23-Jan-2005 Bill HP54599K 30
23-Jan-2005 Bill AA9833LL 90
23-Jan-2005 Bill BA90330Q 30
24-Jan-2005 Bill BA8993PPQ 60
24-Jan-2005 Bill BA8993PPQ 60
24-Jan-2005 Bill XX93939 60
24-Jan-2005 Bill BA8993PPQ 60
24-Jan-2005 Bill BA8993PPQ 60
(Thanks again)
-- Original Message --
From: "Alejandro Mesa" <AlejandroMesa@.discussions.microsoft.com>
Newsgroups: microsoft.public.sqlserver.programming
Sent: Monday, January 31, 2005 7:47 PM
Subject: RE: GROUP BY, GROUP BY, and DISTINCT
> Try,
> select
> colA,
> colB,
> count(distinct colC)
> from
> t
> group by
> colA,
> colB
> order by
> colA,
> colB
> AMB
> ""A_Michigan_User"" wrote:
>In this particular case, your desired sum() can be 120. But that isn't
accurate if the values for colD are different between the rows.
Anyway, here are 2 probable scenarios. Pick whichever you want for your
case.
--#1 only the min
select colA,colB,count(*),sum(colD)
from
(select
colA, colB, colC, min(colD) colD
from
t
group by
colA, colB,colC) derived
order by
colA, colB
--#2 only the max
select colA,colB,count(*),sum(colD)
from
(select
colA, colB, colC, max(colD) colD
from
t
group by
colA, colB,colC) derived
order by
colA, colB
-oj
""A_Michigan_User"" <Michigan_RE_M0VE@.ameritech.net> wrote in message
news:eOCaWWLCFHA.2540@.TK2MSFTNGP09.phx.gbl...
> Alejandro, you are a genius. Thanks! That's exactly what I needed.
> (And a very elegant, short, simple solution, too!)
> I was even going to try to add a new 4th column... but with (another)
> twist.
> This does NOT work:
> select
> colA, colB, count(distinct colC), sum(distinct colD)
> from
> t
> group by
> colA, colB
> order by
> colA, colB
> The "twist" is that colD's SUM... has to obey colC's "distinct'ness".
> (It doesn't matter if colD has distinct values from itself.)
> So the output for 24-Jan-2005 would be:
> 24-Jan-2005 Bill 2 120
> The SUM(colD) column needs to be 120 (not 60), because 2 unique values
> were in
> the colC column. (Doesn't matter than colD has all the same values for
> that day.)
> Given this new data for input...
> 22-Jan-2005 Bill BA39833J 30
> 22-Jan-2005 Bill RJ3399K 30
> 22-Jan-2005 Bill KL9833LL 60
> 22-Jan-2005 Bill BA39833J 60
> 23-Jan-2005 Bill HP54599K 30
> 23-Jan-2005 Bill AA9833LL 90
> 23-Jan-2005 Bill BA90330Q 30
> 24-Jan-2005 Bill BA8993PPQ 60
> 24-Jan-2005 Bill BA8993PPQ 60
> 24-Jan-2005 Bill XX93939 60
> 24-Jan-2005 Bill BA8993PPQ 60
> 24-Jan-2005 Bill BA8993PPQ 60
> (Thanks again)
> -- Original Message --
> From: "Alejandro Mesa" <AlejandroMesa@.discussions.microsoft.com>
> Newsgroups: microsoft.public.sqlserver.programming
> Sent: Monday, January 31, 2005 7:47 PM
> Subject: RE: GROUP BY, GROUP BY, and DISTINCT
>
>|||Your code is probably right... but when I plugged in my column-names, and
my other requirements... I get this error:
Err #-2147217900
[Microsoft][ODBC SQL Server Driver][SQL Server]
No column was specified for column 1 of 'DerivedTable'.
(My Direvied-table seems to work fine, when run by itself... but
not when I nest it inside the outer SELECT code.)
SELECT
CONVERT(VarChar(10), DubbedOn, 110), -- I only need the date, not the
time
LEFT(DubbedBy,3), -- I only need the user's initials,
not the full name
COUNT(Clip), -- Count these. I used the fieldname,
instead of "*"
SUM(MyLen) -- Total up these
FROM
(
SELECT
CONVERT(VarChar(10), DubbedOn, 110),
LEFT(DubbedBy,3),
Clip,
MIN(ShwLen) AS MyLen -- I added the "AS" for this alias
FROM
MainTable -- The "real" table
GROUP BY
CONVERT(VarChar(10), DubbedOn, 110),
LEFT(DubbedBy,3),Clip
) AS MyDerivedTable -- I added the "AS" for this
derived-table
ORDER BY
CONVERT(VarChar(10), DubbedOn, 110),
LEFT(DubbedBy,3)
Am I close?
Thanks again
"oj" <nospam_ojngo@.home.com> wrote in message
news:u0jBxcOCFHA.3588@.TK2MSFTNGP11.phx.gbl...
> In this particular case, your desired sum() can be 120. But that isn't
> accurate if the values for colD are different between the rows.
Each ColC value... uniquely matches a ColD value.
(So MIN or MAX should both work.)

> Anyway, here are 2 probable scenarios. Pick whichever you want for your
> case.
> --#1 only the min
> select colA,colB,count(*),sum(colD)
> from
> (select
> colA, colB, colC, min(colD) colD
> from
> t
> group by
> colA, colB,colC) derived
> order by
> colA, colB
> --#2 only the max
> select colA,colB,count(*),sum(colD)
> from
> (select
> colA, colB, colC, max(colD) colD
> from
> t
> group by
> colA, colB,colC) derived
> order by
> colA, colB
>
> --
> -oj
>
> ""A_Michigan_User"" <Michigan_RE_M0VE@.ameritech.net> wrote in message
> news:eOCaWWLCFHA.2540@.TK2MSFTNGP09.phx.gbl...
>|||On Thu, 3 Feb 2005 13:29:43 -0500, "A_Michigan_User" wrote:

>Your code is probably right... but when I plugged in my column-names, and
>my other requirements... I get this error:
>Err #-2147217900
>[Microsoft][ODBC SQL Server Driver][SQL Server]
>No column was specified for column 1 of 'DerivedTable'.
>(My Direvied-table seems to work fine, when run by itself... but
>not when I nest it inside the outer SELECT code.)
Hi A_Michigan_User,
That your derived table runs by itself is good - it is one of the
requirements for derived tables. The other requirement is that all columns
should have a name. In your query, the columns Clip and MIN(ShwLen) AS
MyLen have a name; the columns CONVERT(...) and LEFT(...) don't. You can
use AS to alias these columns as well; in my example below, I'll use
another technique with the same effect.
The outer query can't use the columns from the table(s) used in the
derived query, only the ouput that you'd get if the derived query is run
by itself. So you can't use LEFT(DubbedBy,3) in the outer query, as
DubbedBy is not expoesed to the outer query - instead, you give this
column a name in the derived table and use that name in the outer query.
Besides, there is also no need to use LEFT(..) in the outer query, as the
results from the derived table is already trimmed down to the first three
characters.
Finally, when you change COUNT(*) to COUNT(Clip), you should be aware of
the difference between these two expressions. COUNT(*) gives the number of
rows, period. COUNT(Clip) will result in the number of rows with a
non-NULL value for Clip. Using COUNT(column name) instead of COUNT(*) can
result in a less efficient execution plan, as SQL Server will have to
check if the value is NULL or not. Of course, if your requirements are
such that you *need* to exclude the NULLs from the count, than you should
use the column name :-)
Here's my attempt to rewrite your query.
SELECT
DubbedOnDate,
DubbedByInitials,
COUNT(Clip), -- or COUNT(*) '
SUM(MyLen)
FROM
(
SELECT
CONVERT(VarChar(10), DubbedOn, 110),
LEFT(DubbedBy,3),
Clip,
MIN(ShwLen) -- Note: no AS here!!
FROM
MainTable -- The "real" table
GROUP BY
CONVERT(VarChar(10), DubbedOn, 110),
LEFT(DubbedBy,3),
Clip
) AS MyDerivedTable (DubbedOnDate, DubbedByInitials, Clip, MyLen)
-- Note how the columns are named in the above AS
-- I could just as well have used AS on each column and
-- supplied only an alias for the derived table here.
GROUP BY -- You had ORDER BY, but I assume this was a typo!
DubbedOnDate,
DubbedByInitials,
(untested)
Best, Hugo
--
(Remove _NO_ and _SPAM_ to get my e-mail address)

No comments:

Post a Comment