Monday, March 12, 2012

GROUP BY with multiple columns

Hi all,
Iam havin a rather complex query and need to add another column in the
resultset. That new column is a COUNT aggregation and I need to use the GROU
P
BY clause. Below is the query that I tried. However, there is a problem with
text, ntext or image columns being in the GROUP BY clause. Is there another
way?
QUERY:
--
SELECT
p.id, p.title, p.state, p.infile, p.description, p.price, p.link,
p.match_id, p.shop_id,
p.small_image_id, p.big_image_id, s.state AS small_state, b.state AS
big_state, m.category_name,
m.subcategory_id, COUNT(v.filter_value_id) FROM
cds_products p
JOIN
cds_matched_categories m
ON
p.match_id = m.id
JOIN
cds_small_images s
ON
p.small_image_id = s.id
JOIN
cds_big_images b
ON
p.big_image_id = b.id
JOIN
cds_product_2_filter_values v
ON
p.id = v.product_id
WHERE p.shop_id = 66
GROUP BY p.id, p.title, p.state, p.infile, p.description, p.price, p.link,
p.match_id, p.shop_id,
p.small_image_id, p.big_image_id, s.state, b.state, m.category_name,
m.subcategory_id
SCHEMA:
--
CREATE TABLE [dbo].[cds_big_images] (
[id] [int] IDENTITY (1, 1) NOT NULL ,
[path] [varchar] (255) COLLATE Latin1_General_CI_AS NOT NULL ,
[state] [int] NOT NULL ,
[shop_id] [int] NOT NULL
) ON [PRIMARY]
GO
CREATE TABLE [dbo].[cds_matched_categories] (
[id] [int] NOT NULL ,
[shop_id] [int] NOT NULL ,
[category_name] [nvarchar] (255) COLLATE Latin1_General_CI_AS NOT NULL ,
[subcategory_id] [int] NULL ,
[state] [int] NOT NULL
) ON [PRIMARY]
GO
CREATE TABLE [dbo].[cds_product_2_filter_values] (
[product_id] [varchar] (50) COLLATE Latin1_General_CI_AS NOT NULL ,
[filter_value_id] [int] NOT NULL
) ON [PRIMARY]
GO
CREATE TABLE [dbo].[cds_products] (
[id] [varchar] (50) COLLATE Latin1_General_CI_AS NOT NULL ,
[title] [varchar] (255) COLLATE Latin1_General_CI_AS NOT NULL ,
[state] [int] NOT NULL ,
[infile] [int] NOT NULL ,
[description] [text] COLLATE Latin1_General_CI_AS NULL ,
[price] [money] NOT NULL ,
[link] [text] COLLATE Latin1_General_CI_AS NOT NULL ,
[match_id] [int] NOT NULL ,
[shop_id] [int] NOT NULL ,
[small_image_id] [int] NOT NULL ,
[big_image_id] [int] NOT NULL
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
GO
CREATE TABLE [dbo].[cds_small_images] (
[id] [int] IDENTITY (1, 1) NOT NULL ,
[path] [varchar] (255) COLLATE Latin1_General_CI_AS NOT NULL ,
[state] [int] NOT NULL ,
[shop_id] [int] NOT NULL
) ON [PRIMARY]
GOYou cannot group by large objects.
May be you do something like this..
Instead of grouping by the all the columns
you can join with a subquery.
instead of joining with cds_product_2_filter_values
join it with
(select product_id, count(filter_value_id) as tot_count from
cds_product_2_filter_values) as v
and then directly select tot_count.
I don't know the busniess or the type of relationship with the table.
May be you can think in these lines and try to find the count without
grouping by the text col.
Hope this helps.
-Omnibuzz (The SQL GC)
http://omnibuzz-sql.blogspot.com/
"Reik" wrote:

> Hi all,
> Iam havin a rather complex query and need to add another column in the
> resultset. That new column is a COUNT aggregation and I need to use the GR
OUP
> BY clause. Below is the query that I tried. However, there is a problem wi
th
> text, ntext or image columns being in the GROUP BY clause. Is there anothe
r
> way?
> QUERY:
> --
> SELECT
> p.id, p.title, p.state, p.infile, p.description, p.price, p.link,
> p.match_id, p.shop_id,
> p.small_image_id, p.big_image_id, s.state AS small_state, b.state AS
> big_state, m.category_name,
> m.subcategory_id, COUNT(v.filter_value_id) FROM
> cds_products p
> JOIN
> cds_matched_categories m
> ON
> p.match_id = m.id
> JOIN
> cds_small_images s
> ON
> p.small_image_id = s.id
> JOIN
> cds_big_images b
> ON
> p.big_image_id = b.id
> JOIN
> cds_product_2_filter_values v
> ON
> p.id = v.product_id
> WHERE p.shop_id = 66
> GROUP BY p.id, p.title, p.state, p.infile, p.description, p.price, p.link,
> p.match_id, p.shop_id,
> p.small_image_id, p.big_image_id, s.state, b.state, m.category_name,
> m.subcategory_id
>
>
> SCHEMA:
> --
> CREATE TABLE [dbo].[cds_big_images] (
> [id] [int] IDENTITY (1, 1) NOT NULL ,
> [path] [varchar] (255) COLLATE Latin1_General_CI_AS NOT NULL ,
> [state] [int] NOT NULL ,
> [shop_id] [int] NOT NULL
> ) ON [PRIMARY]
> GO
> CREATE TABLE [dbo].[cds_matched_categories] (
> [id] [int] NOT NULL ,
> [shop_id] [int] NOT NULL ,
> [category_name] [nvarchar] (255) COLLATE Latin1_General_CI_AS NOT NULL ,
> [subcategory_id] [int] NULL ,
> [state] [int] NOT NULL
> ) ON [PRIMARY]
> GO
> CREATE TABLE [dbo].[cds_product_2_filter_values] (
> [product_id] [varchar] (50) COLLATE Latin1_General_CI_AS NOT NULL ,
> [filter_value_id] [int] NOT NULL
> ) ON [PRIMARY]
> GO
> CREATE TABLE [dbo].[cds_products] (
> [id] [varchar] (50) COLLATE Latin1_General_CI_AS NOT NULL ,
> [title] [varchar] (255) COLLATE Latin1_General_CI_AS NOT NULL ,
> [state] [int] NOT NULL ,
> [infile] [int] NOT NULL ,
> [description] [text] COLLATE Latin1_General_CI_AS NULL ,
> [price] [money] NOT NULL ,
> [link] [text] COLLATE Latin1_General_CI_AS NOT NULL ,
> [match_id] [int] NOT NULL ,
> [shop_id] [int] NOT NULL ,
> [small_image_id] [int] NOT NULL ,
> [big_image_id] [int] NOT NULL
> ) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
> GO
> CREATE TABLE [dbo].[cds_small_images] (
> [id] [int] IDENTITY (1, 1) NOT NULL ,
> [path] [varchar] (255) COLLATE Latin1_General_CI_AS NOT NULL ,
> [state] [int] NOT NULL ,
> [shop_id] [int] NOT NULL
> ) ON [PRIMARY]
> GO
>|||If you didn't mind truncating a little data in the resultset you could
CAST() the 2 text columns (description & link) to VARCHAR(8000) in the
select list and the group by clause. Do those columns both really need
to be more than 8000 characters? I'm assuming link is a URL or some
kind of reference to another product.
Has dbo.cds_products got a primary key? I assume the id column is the
primary key. If so then you could pull all the stuff from
dbo.cds_products out into a main query and join in the GROUP BY stuff in
a derived table like this (untested):
SELECT
prod.id, prod.title, prod.state, prod.infile, prod.description,
prod.price, prod.link, prod.match_id, prod.shop_id,
prod.small_image_id, prod.big_image_id,
d.small_state, d.big_state, d.category_name, d.subcategory_id,
d.filter_count
FROM dbo.cds_products as prod
INNER JOIN
(
SELECT
p.id, s.state AS small_state, b.state AS big_state,
m.category_name,
m.subcategory_id, COUNT(v.filter_value_id) as filter_count
FROM cds_products as p
INNER JOIN cds_matched_categories AS m ON p.match_id = m.id
INNER JOIN cds_small_images AS s ON p.small_image_id = s.id
INNER JOIN cds_big_images AS b ON p.big_image_id = b.id
INNER JOIN cds_product_2_filter_values as v ON p.id =
v.product_id
WHERE p.shop_id = 66
GROUP BY p.id, s.state, b.state, m.category_name,
m.subcategory_id
) AS d ON d.id = prod.id
That way you don't need to GROUP BY the text columns as the
cds_products.id column is enough to do the grouping from that table.
*mike hodgson*
http://sqlnerd.blogspot.com
Reik wrote:

>Hi all,
>Iam havin a rather complex query and need to add another column in the
>resultset. That new column is a COUNT aggregation and I need to use the GRO
UP
>BY clause. Below is the query that I tried. However, there is a problem wit
h
>text, ntext or image columns being in the GROUP BY clause. Is there another
>way?
>QUERY:
>--
>SELECT
> p.id, p.title, p.state, p.infile, p.description, p.price, p.link,
>p.match_id, p.shop_id,
> p.small_image_id, p.big_image_id, s.state AS small_state, b.state AS
>big_state, m.category_name,
> m.subcategory_id, COUNT(v.filter_value_id) FROM
> cds_products p
>JOIN
> cds_matched_categories m
>ON
> p.match_id = m.id
>JOIN
> cds_small_images s
>ON
> p.small_image_id = s.id
>JOIN
> cds_big_images b
>ON
> p.big_image_id = b.id
>JOIN
> cds_product_2_filter_values v
>ON
> p.id = v.product_id
>WHERE p.shop_id = 66
>GROUP BY p.id, p.title, p.state, p.infile, p.description, p.price, p.link,
>p.match_id, p.shop_id,
> p.small_image_id, p.big_image_id, s.state, b.state, m.category_name,
> m.subcategory_id
>
>
>SCHEMA:
>--
>CREATE TABLE [dbo].[cds_big_images] (
> [id] [int] IDENTITY (1, 1) NOT NULL ,
> [path] [varchar] (255) COLLATE Latin1_General_CI_AS NOT NULL ,
> [state] [int] NOT NULL ,
> [shop_id] [int] NOT NULL
> ) ON [PRIMARY]
>GO
>CREATE TABLE [dbo].[cds_matched_categories] (
> [id] [int] NOT NULL ,
> [shop_id] [int] NOT NULL ,
> [category_name] [nvarchar] (255) COLLATE Latin1_General_CI_AS NOT NULL ,
> [subcategory_id] [int] NULL ,
> [state] [int] NOT NULL
> ) ON [PRIMARY]
>GO
>CREATE TABLE [dbo].[cds_product_2_filter_values] (
> [product_id] [varchar] (50) COLLATE Latin1_General_CI_AS NOT NULL ,
> [filter_value_id] [int] NOT NULL
> ) ON [PRIMARY]
>GO
>CREATE TABLE [dbo].[cds_products] (
> [id] [varchar] (50) COLLATE Latin1_General_CI_AS NOT NULL ,
> [title] [varchar] (255) COLLATE Latin1_General_CI_AS NOT NULL ,
> [state] [int] NOT NULL ,
> [infile] [int] NOT NULL ,
> [description] [text] COLLATE Latin1_General_CI_AS NULL ,
> [price] [money] NOT NULL ,
> [link] [text] COLLATE Latin1_General_CI_AS NOT NULL ,
> [match_id] [int] NOT NULL ,
> [shop_id] [int] NOT NULL ,
> [small_image_id] [int] NOT NULL ,
> [big_image_id] [int] NOT NULL
> ) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
>GO
>CREATE TABLE [dbo].[cds_small_images] (
> [id] [int] IDENTITY (1, 1) NOT NULL ,
> [path] [varchar] (255) COLLATE Latin1_General_CI_AS NOT NULL ,
> [state] [int] NOT NULL ,
> [shop_id] [int] NOT NULL
> ) ON [PRIMARY]
>GO
>
>|||Worked excellent thanks. One more question. You were right about these two
text columns. Somehow I didnt realize a varchar column could be more than 25
5
chars. Is there a rule of thumb whever you wanna use a large varchar column
or a text column? In my example, the descriptions will likely be less than
2000 chars and the URL's in the link column might be up to 300 chars. Should
I stick with a varchar column or is there any performance/storage drawback
with that?
"Mike Hodgson" wrote:

> If you didn't mind truncating a little data in the resultset you could
> CAST() the 2 text columns (description & link) to VARCHAR(8000) in the
> select list and the group by clause. Do those columns both really need
> to be more than 8000 characters? I'm assuming link is a URL or some
> kind of reference to another product.
> Has dbo.cds_products got a primary key? I assume the id column is the
> primary key. If so then you could pull all the stuff from
> dbo.cds_products out into a main query and join in the GROUP BY stuff in
> a derived table like this (untested):
> SELECT
> prod.id, prod.title, prod.state, prod.infile, prod.description,
> prod.price, prod.link, prod.match_id, prod.shop_id,
> prod.small_image_id, prod.big_image_id,
> d.small_state, d.big_state, d.category_name, d.subcategory_id,
> d.filter_count
> FROM dbo.cds_products as prod
> INNER JOIN
> (
> SELECT
> p.id, s.state AS small_state, b.state AS big_state,
> m.category_name,
> m.subcategory_id, COUNT(v.filter_value_id) as filter_count
> FROM cds_products as p
> INNER JOIN cds_matched_categories AS m ON p.match_id = m.i
d
> INNER JOIN cds_small_images AS s ON p.small_image_id = s.i
d
> INNER JOIN cds_big_images AS b ON p.big_image_id = b.id
> INNER JOIN cds_product_2_filter_values as v ON p.id =
> v.product_id
> WHERE p.shop_id = 66
> GROUP BY p.id, s.state, b.state, m.category_name,
> m.subcategory_id
> ) AS d ON d.id = prod.id
> That way you don't need to GROUP BY the text columns as the
> cds_products.id column is enough to do the grouping from that table.
> --
> *mike hodgson*
> http://sqlnerd.blogspot.com
>
> Reik wrote:
>
>|||varchar column is better than text column anytime.. and you can store to a
max of 8000 characters. But your table page size is 8 k. So, you rowsize
cannot go beyond 8k bytes. And varchar is better than text performance wise,
manageability and you can apply more functions on it :)
--
-Omnibuzz (The SQL GC)
http://omnibuzz-sql.blogspot.com/|||Nice. Then I will change the datatype of these two columns to varchar. Thank
s!
"Omnibuzz" wrote:

> varchar column is better than text column anytime.. and you can store to a
> max of 8000 characters. But your table page size is 8 k. So, you rowsize
> cannot go beyond 8k bytes. And varchar is better than text performance wis
e,
> manageability and you can apply more functions on it :)
> --
> -Omnibuzz (The SQL GC)
> http://omnibuzz-sql.blogspot.com/
>

No comments:

Post a Comment