Wednesday, March 7, 2012

group by issue

Alright, so I'm trying to bind a list of pictures to a repeater but the sql I'm trying to write is killing me. The goal is simple: Write a query that returns all the pictures where the selected people are tagged but only return those pictures where all selected people are in them. So far I have the following SQL statement.

SELECT picture.id, picture.name, picture.album_id,
tag.id , tag.picture_id, tag.people_id
FROM picture LEFT OUTER JOIN
tag ON picture.id = tag.picture_id
WHERE (tag.people_id = '1') OR
(tag.people_id = '3')
GROUP BY picture.id
HAVING (COUNT(picture.id) > 1)

Now the problem is that you have to include all the columns that appear in the SELECT statement in the GROUP BY clause otherwise it errors out (this query will do that) but if I include them all then I get zero results when I know I should have 1 result given this particular query's desired result. Now when I was working with mySQL you could leave out columns in the GROUP BY and mySQL would handle it but seems SQL Server is not so forgiving.

Any ideas?

Without seeing the data, etc, it's hard to say for sure, but I suspect that the group by is not your problem (and btw, a group by without including the group by fields in the select list makes absolutely no sense, even though I know some databases may implicitely add the columsn for you -- not sure how mySQL handles it). I suspect you may be running into a subtle problem with the new ansi syntax for outer joins (maybe I'm wrong) when you include a test on the outer table. Seehttp://www.databasejournal.com/features/mssql/article.php/1438001 and scroll down to "outer join gotchas"

|||

I think I have the outer join right...see below

SELECT picture.name, picture.id, folders.pictures + picture.name AS filename
FROM folders CROSS JOIN
picture LEFT OUTER JOIN
tag ON picture.id = tag.picture_id
WHERE (tag.people_id = '1') OR
(tag.people_id = '3')

Produces (just test data):

nameidfilenameCopy of Copy of Copy of IMG_0052.jpg90~\Pictures\Copy of Copy of Copy of IMG_0052.jpgCopy of Copy of Copy of IMG_0052.jpg90~\Pictures\Copy of Copy of Copy of IMG_0052.jpgCopy of Copy of Copy of IMG_0053.jpg91~\Pictures\Copy of Copy of Copy of IMG_0053.jpgCopy of Copy of Copy of IMG_0054.jpg92~\Pictures\Copy of Copy of Copy of IMG_0054.jpg

Now this produces an output of 4 results because person 1 & 3 are both tagged in two pictures with one of those pictures containing both people. Now what I need to do is tweak the query so that it only returns that one record, the one that is duplicated. And now that I've typed this all out something has come to me so I tried the following query.

SELECT picture.name, picture.id, folders.pictures + picture.name AS filename
FROM folders CROSS JOIN
picture LEFT OUTER JOIN
tag ON picture.id = tag.picture_id
WHERE (tag.people_id = '1') OR
(tag.people_id = '3')
GROUP BY picture.name, picture.id, folders.pictures, picture.name
HAVING (COUNT(picture.id) > 1)

And that did it. Finally got the number that I was looking for. And the below query produces the count on the previous page to notify the user how many results the above query will return when they click search

SELECT COUNT(DISTINCT picture.name) FROM picture left OUTER JOIN tag ON picture.id = tag.picture_id where ( people_id = '1' or people_id = '3' ) GROUP BY picture.name HAVING (COUNT(picture.name) > 1)

Thanks for making me think a little more into it. Sometimes you get so stuck in one way of thinking it takes something else to get you to think about it from a different perspective, plus the fact I always have trouble with group by clauses

No comments:

Post a Comment