[tmql-wg] Result set requirements

Robert Barta rho@bigpond.net.au
Sat, 21 Feb 2004 10:37:48 +1000


On Fri, Feb 20, 2004 at 11:38:48PM +0100, Rani Pinchuk wrote:
> > In AsTMa? we do not allow to "generate arbitrary text". We only allow
> > 
> >   - lists
> >   - XML as an internal representation, and
> >   - TM data (again in internal representation)

> Indeed I was thinking that the idea is to create any textual format and 
> not only the three formats you pointed out.
> 
> But when looking at the example in the tutorial of AsTMa?:
> 
> function test (map $m) as xml return 
> <albums group="garbage">{
>    forall [ $a (album)
>             bn: $bn ] in $m
>    return
>       <album id="{$a}">{$bn}</album>
> }</albums>
> 
> I cannot help thinking about a similar example, like generating html:
> 
> function test2 (map $m) as text return
> <table><tr>{
>    forall [ $a (album)
>             bn: $bn ] in $m
>    return 
>       <td>{$a}</td><td>{$bn}</td></tr><tr>
> }</tr></table>

Could be that you are lured into thinking this. :-) The fact is that
it is XML in both cases and that the AsTMa? parser will, for instance,
refuse to accept this:

function test2 (map $m) as text return
<table><tr>{
   forall [ $a (album)
            bn: $bn ] in $m
   return
      <td>{$a}<td>{$bn}</tr>   # superfluous </tr> here and <td> not closed
}</tr></table>

as this will never be able to generate well-formed XML.
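
To make the difference concrete, here is a minimal sketch in Python. It
uses the standard xml.etree.ElementTree parser, not the AsTMa? parser,
and it only checks one expansion of each template rather than the
template itself. The sample values (a1, Version 2.0) and the helper
name is_well_formed are made up for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment: str) -> bool:
    """Return True if the string parses as a well-formed XML document."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


# One expansion of the accepted test2 template (hypothetical sample values):
# prefix "<table><tr>", one row, suffix "</tr></table>".
good = "<table><tr>" + "<td>a1</td><td>Version 2.0</td></tr><tr>" + "</tr></table>"

# One expansion of the rejected template: <td> is never closed and the
# row carries an extra </tr>.
bad = "<table><tr>" + "<td>a1<td>Version 2.0</tr>" + "</tr></table>"

print(is_well_formed(good))
print(is_well_formed(bad))
```

The first fragment parses and the second raises a parse error, which is
the kind of failure a template check can rule out before any query runs.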

It goes without saying (I like that phrase) that adopting particular
XML vocabularies does not lead anywhere.

> or generating comma separated records:
> 
> function test3 (map $m) as text return
> {
>    forall [ $a (album)
>             bn: $bn ] in $m
>    return 
>       {$a},{$bn}
> }

Right, but AsTMa? does NOT allow this. You will have to open a tuple
like this:

 function test3 (map $m) as list return {
    forall [ $a (album)
             bn: $bn ] in $m
    return
       ({$a},{$bn})
 }

So the () indicates that you plan to return a list of tuples.
If you start with <something, then AsTMa? assumes you are using
XML; otherwise it is TM (AsTMa=) content.
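
Read literally, that rule is a dispatch on the first character of the
returned content. The Python sketch below is only one reading of the
paragraph above, not the actual AsTMa? grammar. The function name
guess_content_type and the AsTMa= sample line are hypothetical.

```python
def guess_content_type(constructor: str) -> str:
    """Classify returned content by its first non-blank character.

    Assumption: '<' means XML, '(' means a list of tuples, and
    anything else is treated as TM (AsTMa=) content.
    """
    text = constructor.lstrip()
    if text.startswith("<"):
        return "xml"
    if text.startswith("("):
        return "list"
    return "tm"


print(guess_content_type('<album id="{$a}">{$bn}</album>'))
print(guess_content_type("({$a},{$bn})"))
print(guess_content_type("{$a} (album)"))  # hypothetical AsTMa= content
```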

> I mean - unless I missed something, the _syntax_ of AsTMa? does not
> limit the output only to XML (apart from "as xml" declaration) and
> if there is such a limitation, I find it maybe a bit artificial.

Indeed, AsTMa? DOES limit you to these content types:

   http://astma.it.bond.edu.au/astma%3F-spec.dbk?section=7

The reasoning is: To (a) work with TMs, which encourage/force you to
structure your content, and (b) work with XML, which forces you to
structure your content, and then (c) allow the generation of arbitrary
text seems rather hypocritical to me :-)

> If, for example, XML is generated by defining a DTD and tying the result
> set to different elements, the language is really built to generate only
> XML and, in my opinion, is also simpler, because the fact that we are
> narrowing down to XML only is used to make it simpler for the user.
> Example (a totally imaginary extension to Toma which is probably not that
> consistent, but hopefully will make sense):
> 
> # define dtd for the xml to be produced:
> define dtd 'albums':
>   <?xml version="1.0"?>
>   <!DOCTYPE albums [
>         <!ELEMENT albums (album)*>
>         <!ATTLIST albums group CDATA #REQUIRED>
>         <!ELEMENT album (#PCDATA)>
>         <!ATTLIST album id CDATA #REQUIRED>
>   ]>
> ;
> 
> # for clarity: without the presentation, the query looks like:
> # select $a.id, $a.bn where exists $a;
> # 
> select 'garbage' as attribute(albums, group),
>        $a.id as attribute(album, id),
>        $a.bn as element_data(album)
>        with dtd 'albums'
> where exists $a;

This is exactly what AsTMa? does, except that it does not require a
commitment to DTD/WXS/Relax/Schematron/... and it has the
content inlined. I found that convenient at the time.

I could imagine that Toma uses XPath:

  select 'garbage' into /albums/@group,
         $a.id     into /albums/album/@id,
         $a.bn     into /albums/album/text()
         returns # type information here
           <!ELEMENT albums (album)*>
           <!ATTLIST albums group CDATA #REQUIRED>
           <!ELEMENT album (#PCDATA)>
           <!ATTLIST album id CDATA #REQUIRED>

and the returns clause could be optional.
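
To show what those three "into" targets would produce, here is a
minimal Python sketch with the standard xml.etree.ElementTree module.
It is not Toma: the paths are hard-coded rather than parsed, and the
function name build_albums and the sample rows are made up.

```python
import xml.etree.ElementTree as ET


def build_albums(rows, group="garbage"):
    """Build the <albums> tree from (id, base name) result rows."""
    # 'garbage' into /albums/@group
    albums = ET.Element("albums", {"group": group})
    for album_id, base_name in rows:
        # $a.id into /albums/album/@id
        album = ET.SubElement(albums, "album", {"id": album_id})
        # $a.bn into /albums/album/text()
        album.text = base_name
    return albums


# Hypothetical result set of the select.
rows = [("a1", "Garbage"), ("a2", "Version 2.0")]
print(ET.tostring(build_albums(rows), encoding="unicode"))
```

Each result row becomes one album element, so the row structure of the
result set maps directly onto the repeated (album)* in the DTD.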

> Here the fact that we use a DTD means that, built into the syntax,
> we limit the language to producing XML. Hopefully this leads to simpler
> code ....

That was the idea. But as Lars pointed out: "It depends". In some
cases I would rather have a list. Maybe I hate working with XML,
or do not like the overhead, or do not need the structure because my
content is just a simple string. Then forcing XML onto a user would be
rather impolite.

In some cases I _definitely_ want a TM as output. I am into content
syndication and without that feature all my solutions and ideas would
not work. If I cannot express a TM -> TM transformation which converts
content between two different ontologies, I would need dedicated
software for each of these transformations. A nightmare.

\rho