Wednesday, March 7, 2012

Group By in Xml

Hi everybody!
First of all, execute me for my english, because I'm not English ...
I've got this kind Xml stored in a column (tmx) of my db (the name of
the table is TA_TMX) :
<tmx>
<header>
...
</header>
<body>
<tu>
<tuv lang=3D"EN">
<seg> Hello </seg>
</tuv>
<tuv lang=3D"Es">
<seg> Hola </seg>
</tuv>
</tu>
<tu>
<tuv lang=3D"EN">
<seg> Bye </seg>
</tuv>
<tuv lang=3D"Es">
<seg> Adios </seg>
</tuv>
</tu>
</body>
</tmx>
I want to do a query to obtain this:
IdTu Language Text
-- -- --
1 EN Hello
1 ES Hola
2 EN Bye
2 ES Adios
I mean, I want to get the language (lang attribute of tuv node) and the
text (text of seg node) of each tuv node. I've done this like this and
it works.
select ref.value('@.lang[1]','nvarchar(10)') as Language,
ref.value('seg[1]','nvarchar(max)') as Text
from ta_tmx cross apply tmx.nodes('/tmx/body/tu/tuv') as
R(ref)
where id_tmx=3D1
Besides I would like to know if it's possible to get the number of tu
node which belongs that tuv node to. So, =BFIs it possible to group by
tu node and get that column of numbers'
Thanks!Here's a solution, although I suspect there may
be a better way.
It requires a numbers table (created on the fly here).
with Digits(Num) as
(select 0 union all select 1 union all select 2 union all
select 3 union all select 4 union all select 5 union all
select 6 union all select 7 union all select 8 union all
select 9),
Numbers(Num) as
(select cast(a.Num + 10*b.Num as int)
from Digits a,Digits b)
select Numbers.Num,
ref.value('@.lang[1]','nvarchar(10)') as Language,
ref.value('seg[1]','nvarchar(max)') as Text
from ta_tmx
cross join Numbers
cross apply
tmx.nodes('/tmx/body/tu[position()=sql:column("Numbers.Num")]/tuv') as
R(ref)|||Slight improvement, change
cross join Numbers
to
inner join Numbers on Num between 1 and
tmx.value('count(/tmx/body/tu)','int')|||A simpler way
select ref.value('for $a in .. return count(../../*[. << $a]) + 1',
'int') as IdTu,
ref.value('@.lang[1]','nvarchar(10)') as Language,
ref.value('seg[1]','nvarchar(max)') as Text
from ta_tmx
cross apply tmx.nodes('/tmx/body/tu/tuv') as R(ref)
where id_tmx=1|||It works. Thank you very much.
Would you mind explaining me what you're doing here ?
" ref.value('for $a in .. return count(../../*[. << $a]) + 1', 'int')",
because I don't understand it.
Thanks in advance for your help.
Regards.|||for $a in .. return count(../../*[. << $a]) + 1
Here's a brief explanation of the above
The query uses nodes('/tmx/body/tu/tuv')
which generates subtrees from tuv downwards
such that . starts at the tuv level.
.. is the nodes parent.
"for a$ in .." assigns a$ to the current nodes parent (the tu node).
"count(../../*[. << $a])" returns the number of nodes
at the "tu" level that precedes the current node. The
../../* simply navigates back to the root from the "tu" node.
As this count won't include the current node we add one to give
the desired result.

No comments:

Post a Comment