Friday, February 24, 2012

group by

Hi.

I have read a lot, but obviously not the right material.
A pointer to a reference would be fine, or an answer. I did in fact get an Oracle
solution that I am still researching, but I was wondering about either a generic
or an MS SQL Server solution.

I'd like to use sql to find out:

given a metadata table
where object is a table name
attribute is a col name,
keyseq is if the attribute is
part of a primary key

AND

given (by rules, not db design) that each object should have only a single
attribute with a given non-null value in keyseq

AND

table metatable(object varchar2, attribute varchar2, keyseq int)
table1, col1, 1
table1, col2, 2
table1, col3, null
table1, col4, null
table2, col1, 1
table2, col2, 1
table2, col3, null
table2, col4, null
table3, col1, 1
table3, col2, 2
table3, col3, null
table3, col4, null

I'd like to easily find out which tables have more than one attribute for any
keyseq that is not null, and which attributes have the problem.

Is it possible?

I started off with select object, attribute, keyseq from metatable
where keyseq is not null, and got a few thousand rows.

Then I tried a count(*) and group by (I wish I knew this aggregate stuff
better) like this:

select object, keyseq, count(keyseq) from metatable
where keyseq is not null
group by object, keyseq
order by (count(keyseq))

and got a few thousand rows, where the very last row told me the object that
had more than one attribute with a single keyseq value.

So yes, I can figure it out, but I was hoping/wondering if there was a better
way, using just sql, that would return what object/attribute had the
'violation'.

I thought I might be able to use the above query in some sort of
subquery/exists combination, but I'm at a loss.

Some helpful person (thanks m cadot) in the Oracle group gave me some
guidance and made an initial suggestion which I show here, and a more advanced
solution I'm not showing because I don't understand it, and it might be oracle
specific.

select object, keyseq, count(keyseq)
from metatable
where keyseq is not null
group by object, keyseq
having count(*) > 1 --<-- just this line to add
order by count(keyseq)

thanks for your time and knowledge.

Thanks
Jeff Kish|||This should work. WHERE limits which cases go into the tally.
HAVING limits which results are output after the tally is completed.

In this case, it limits the output to those object-keyseq combinations
where the count is greater than one, which I believe is what you're
after.
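To see the difference between the two filters on the sample rows from the question, here is a minimal sketch. It assumes SQLite (through Python's standard sqlite3 module) as a stand-in for SQL Server or Oracle, and the column types are my own guess; only the table and column names come from the original post.

```python
import sqlite3

# Sample rows from the original post; None stands for NULL.
ROWS = [
    ("table1", "col1", 1), ("table1", "col2", 2),
    ("table1", "col3", None), ("table1", "col4", None),
    ("table2", "col1", 1), ("table2", "col2", 1),
    ("table2", "col3", None), ("table2", "col4", None),
    ("table3", "col1", 1), ("table3", "col2", 2),
    ("table3", "col3", None), ("table3", "col4", None),
]

conn = sqlite3.connect(":memory:")
# Column types are assumed for this sketch.
conn.execute("CREATE TABLE metatable (object TEXT, attribute TEXT, keyseq INTEGER)")
conn.executemany("INSERT INTO metatable VALUES (?, ?, ?)", ROWS)

# WHERE only: every (object, keyseq) group is reported with its count.
all_groups = conn.execute("""
    SELECT object, keyseq, COUNT(*)
    FROM metatable
    WHERE keyseq IS NOT NULL
    GROUP BY object, keyseq
    ORDER BY object, keyseq
""").fetchall()

# WHERE plus HAVING: only groups with more than one row are output.
violations = conn.execute("""
    SELECT object, keyseq, COUNT(*)
    FROM metatable
    WHERE keyseq IS NOT NULL
    GROUP BY object, keyseq
    HAVING COUNT(*) > 1
    ORDER BY object, keyseq
""").fetchall()

print(len(all_groups), "groups before HAVING")
print(violations)
```

With the sample data, the first query returns one row per object/keyseq pair, while the second keeps only the pair that is shared by two attributes.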

> Some helpful person (thanks m cadot) in the Oracle group gave me some
> guidance and made an initial suggestion which I show here, and a more advanced
> solution I'm not showing because I don't understand it, and it might be oracle
> specific.
> select object, keyseq, count(keyseq)
> from metatable
> where keyseq is not null
> group by object, keyseq
> having count(*) > 1 --<-- just this line to add
> order by count(keyseq)|||Jeff Kish (jeff.kish@.mro.com) writes:
> Some helpful person (thanks m cadot) in the Oracle group gave me some
> guidance and made an initial suggestion which I show here, and a more
> advanced solution I'm not showing because I don't understand it, and it
> might be oracle specific.
> select object, keyseq, count(keyseq)
> from metatable
> where keyseq is not null
> group by object, keyseq
> having count(*) > 1 --<-- just this line to add
> order by count(keyseq)

It can't be any more standard SQL than this. The above SELECT should
run on any RDBMS that supports SQL.

HAVING is like WHERE, but is applied after the GROUP BY, and thus
permits filters with aggregate functions.
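A small sketch of why the aggregate condition has to go in HAVING rather than WHERE. It assumes SQLite via Python's sqlite3 module as a convenient test bed (other products word the error differently), and uses a cut-down version of the sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Column types are assumed for this sketch.
conn.execute("CREATE TABLE metatable (object TEXT, attribute TEXT, keyseq INTEGER)")
conn.executemany(
    "INSERT INTO metatable VALUES (?, ?, ?)",
    [("table2", "col1", 1), ("table2", "col2", 1), ("table3", "col1", 1)],
)

# An aggregate in WHERE is rejected: WHERE is applied before any groups exist.
try:
    conn.execute(
        "SELECT object, keyseq FROM metatable "
        "WHERE COUNT(*) > 1 GROUP BY object, keyseq"
    )
    where_rejected = False
except sqlite3.OperationalError as exc:
    where_rejected = True
    print("WHERE with aggregate failed:", exc)

# The same condition in HAVING is evaluated once per group.
having_rows = conn.execute(
    "SELECT object, keyseq FROM metatable "
    "GROUP BY object, keyseq HAVING COUNT(*) > 1"
).fetchall()
print(having_rows)
```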

--
Erland Sommarskog, SQL Server MVP, esquel@.sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pr...oads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodin...ions/books.mspx|||On Mon, 19 Jun 2006 21:58:38 +0000 (UTC), Erland Sommarskog
<esquel@.sommarskog.se> wrote:

>Jeff Kish (jeff.kish@.mro.com) writes:
>> Some helpful person (thanks m cadot) in the Oracle group gave me some
>> guidance and made an initial suggestion which I show here, and a more
>> advanced solution I'm not showing because I don't understand it, and it
>> might be oracle specific.
>>
>> select object, keyseq, count(keyseq)
>> from metatable
>> where keyseq is not null
>> group by object, keyseq
>> having count(*) > 1 --<-- just this line to add
>> order by count(keyseq)
>It can't be any more standard SQL than this. The above SELECT should
>run on any RDBMS that supports SQL.
>HAVING is like WHERE, but is applied after the GROUP BY, and thus
>permits filters with aggregate functions.
How would I add the attribute column so I know exactly which ones are
of interest? That was the most difficult thing for me to
figure out.

Thanks|||Jeff Kish (kishjjrjj@.charter.net) writes:
> How would I add the attribute column so I know exactly which ones are
> of interest? That was the most difficult thing for me to
> figure out.

Not sure that I understood your original question, but try:

SELECT a.object, a.attribute, a.keyseq
FROM   metatable a
JOIN   (SELECT object, keyseq
        FROM   metatable
        WHERE  keyseq IS NOT NULL
        GROUP  BY object, keyseq
        HAVING COUNT(*) > 1) AS b ON a.object = b.object
                                 AND a.keyseq = b.keyseq

The thing in parentheses is a derived table. You could think of a
derived table as a temp table within the query, but one that is never
materialised, or even necessarily computed as written: the optimizer may
recast the computation order to get performance. This is a great tool for
writing complex queries efficiently.

And, yes, this is ANSI-SQL that should run on most RDBMS. (It's not
as vintage as HAVING though.)
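For anyone who wants to try the derived-table join on the sample rows from the question, here is a minimal sketch. It assumes SQLite through Python's sqlite3 module as a stand-in for SQL Server; the column types and the ORDER BY are my additions, the rest is the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Column types are assumed for this sketch.
conn.execute("CREATE TABLE metatable (object TEXT, attribute TEXT, keyseq INTEGER)")
conn.executemany(
    "INSERT INTO metatable VALUES (?, ?, ?)",
    [
        ("table1", "col1", 1), ("table1", "col2", 2), ("table1", "col3", None),
        ("table2", "col1", 1), ("table2", "col2", 1), ("table2", "col3", None),
        ("table3", "col1", 1), ("table3", "col2", 2), ("table3", "col3", None),
    ],
)

# The derived table b finds the duplicated (object, keyseq) pairs;
# joining back to metatable recovers the attributes involved.
offenders = conn.execute("""
    SELECT a.object, a.attribute, a.keyseq
    FROM metatable a
    JOIN (SELECT object, keyseq
          FROM metatable
          WHERE keyseq IS NOT NULL
          GROUP BY object, keyseq
          HAVING COUNT(*) > 1) AS b
      ON a.object = b.object
     AND a.keyseq = b.keyseq
    ORDER BY a.object, a.attribute
""").fetchall()

for row in offenders:
    print(row)
```

Rows with a NULL keyseq never match in the join, because NULL = NULL is not true, so only the attributes that actually share a key position come back.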

--
Erland Sommarskog, SQL Server MVP, esquel@.sommarskog.se|||On Tue, 20 Jun 2006 22:04:41 +0000 (UTC), Erland Sommarskog
<esquel@.sommarskog.se> wrote:

>Jeff Kish (kishjjrjj@.charter.net) writes:
>> How would I add the attribute column so I know exactly which ones are
>> of interest? That was the most difficult thing for me to
>> figure out.
>Not sure that I understood your original question, but try:
> SELECT a.object, a.attribute, a.keyseq
> FROM metatable a
> JOIN (SELECT object, keyseq
> FROM metatable
> WHERE keyseq IS NOT NULL
> GROUP BY object, keyseq
> HAVING COUNT(*) > 1) AS b ON a.object = b.object
> AND a.keyseq = b.keyseq
>The thing in parentheses is a derived table. You could think of a
>derived table as a temp table within the query, but one that is never
>materialised, or even necessarily computed as written: the optimizer may
>recast the computation order to get performance. This is a great tool for
>writing complex queries efficiently.
>And, yes, this is ANSI-SQL that should run on most RDBMS. (It's not
>as vintage as HAVING though.)
Wow.
I'll have to study up some more on the 'ON' and 'AS' syntax.

Thanks so much Erland.

I appreciate both the answer and the education.
kind regards|||Jeff Kish (kishjjrjj@.charter.net) writes:
> Wow.
> I'll have to study up some more on the 'ON' and 'AS' syntax.

The AS is optional and is only for defining an alias for the derived
table. The alias is always mandatory, though.

The ON is part of the newer ANSI syntax for joins, which was introduced to
handle left joins, but once you get used to it, you will use it for
all your joins, because it makes the queries so much clearer.
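To compare the two join spellings side by side, here is a rough sketch. It assumes SQLite via Python's sqlite3 module rather than SQL Server, a cut-down copy of the sample data, and a helper constant DUPLICATES that I made up to avoid repeating the derived table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Column types are assumed for this sketch.
conn.execute("CREATE TABLE metatable (object TEXT, attribute TEXT, keyseq INTEGER)")
conn.executemany(
    "INSERT INTO metatable VALUES (?, ?, ?)",
    [("table1", "col1", 1), ("table1", "col2", 2),
     ("table2", "col1", 1), ("table2", "col2", 1)],
)

# Hypothetical helper: the derived table shared by both spellings of the join.
DUPLICATES = """(SELECT object, keyseq
                 FROM metatable
                 WHERE keyseq IS NOT NULL
                 GROUP BY object, keyseq
                 HAVING COUNT(*) > 1)"""

# Older comma syntax: join condition mixed into WHERE, alias written without AS.
comma_join = conn.execute(f"""
    SELECT a.object, a.attribute, a.keyseq
    FROM metatable a, {DUPLICATES} b
    WHERE a.object = b.object AND a.keyseq = b.keyseq
    ORDER BY a.object, a.attribute
""").fetchall()

# JOIN ... ON syntax: join condition kept next to the join, alias written with AS.
on_join = conn.execute(f"""
    SELECT a.object, a.attribute, a.keyseq
    FROM metatable a
    JOIN {DUPLICATES} AS b
      ON a.object = b.object AND a.keyseq = b.keyseq
    ORDER BY a.object, a.attribute
""").fetchall()

print(comma_join == on_join, on_join)
```

Both forms return the same rows here. One portability note: Oracle, as far as I know, does not accept AS before a table alias, so leaving the keyword out is the safer spelling if the query has to run there as well.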

--
Erland Sommarskog, SQL Server MVP, esquel@.sommarskog.se|||On Wed, 21 Jun 2006 06:56:24 +0000 (UTC), Erland Sommarskog
<esquel@.sommarskog.se> wrote:

>Jeff Kish (kishjjrjj@.charter.net) writes:
>> Wow.
>> I'll have to study up some more on the 'ON' and 'AS' syntax.
>The AS is optional and is only for defining an alias for the derived
>table. The alias is always mandatory, though.
>The ON is part of the newer ANSI syntax for joins, which was introduced to
>handle left joins, but once you get used to it, you will use it for
>all your joins, because it makes the queries so much clearer.
study study study.. it never ends...
Thanks again. It's a bonus that it works for SQL Server 2000 as well as, I
imagine, the newer version.
Regards
Jeff Kish
