[tmql-wg] Result set requirements

Sat, 28 Feb 2004 19:36:40 +0100

> And my claim is that you cannot do that if you want TMQL have other
> results than lists of topic map items. And even then you have
> specified implicitly a 'list of topic map items':
> 
>    select $basename, $occurrence, ......
> 
> in contrast to, say,
> 
>    select $occurrence, $basename
It is true that any query language must be able to deliver its results
somehow. The question is how complex you make this delivery.

I see a kind of scale here. You can start with something extremely simple,
like a single string. You can continue with something like what SQL gives.
You can continue further by, let's say, adding the ability to deliver
arbitrary XML. The next step might be the ability to deliver any format or
structure. You can even go on to let the user write algorithms within the
query language.
Something like this (obviously, very approximate):

 ---+--------+---------+----------+-----------+---------
   single   SQL       XML        Any       algorithms
   string   like                Format

The further we go to the right on this scale, the more complex and the
harder to learn the language becomes. Besides, the separation that I am
speaking about becomes impossible, because the mix is built into the
language.

I am not totally sure where we should end up on that scale. I guess I 
would prefer somewhere between SQL-like and XML.

You gave good reasons to include (any) XML. However, I think the query
language could do without it. I have two reasons for that:
1. It seems that in AsTMa (as well as in TMTL) the creation of XML
   (or other formats) is based on processing strings. So actually the 
   separation is easy to achieve:

   <albums>{
        forall $t [ $a (album)
                    bn: $bn ] in $m
        return
       <album id="{$a}">{$bn}</album>
   }
   </albums>

   This could also be written as:

   template: 
    <albums>
      <while condition="loop_over_query_results">
        <album id="$a">$b</album>
      </while>
    </albums>

   where "loop_over_query_results" is a callback function that gets the 
   query from a phrasebook of queries, runs it, and places the results in
   $a and $b (possibly with some extra processing of the results
   before sending them to the template). 

2. I am not sure what the correct way to generate XML like the 
   above is. Maybe, if we decide in the end that TMQL can return topic
   objects and association objects as excerpts of XTM (so XML elements),
   we should use XSLT to generate other XML formats. But maybe this is
   too complex, and we should do something like you do in AsTMa, or
   maybe we should do it using a DTD and XPath (actually your ideas 
   as well), as described in
   http://www.spaceapplications.com/toma/Toma.html#xml
   So unless someone really knows what the correct way to do it is,
   I think it is wrong to include it in TMQL.
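
To make the separation from point 1 concrete, here is a minimal Python
sketch. Everything in it is an assumption made for illustration: the
PHRASEBOOK dictionary, the run_query stand-in (which returns canned rows
instead of calling a real engine), and the use of Python's string.Template
in place of the <while> template. The query text is just the AsTMa pattern
from above, stored as an opaque string.

```python
from string import Template
from xml.sax.saxutils import escape

# Hypothetical phrasebook: every query is written once, keyed by name.
PHRASEBOOK = {
    "albums": "forall $t [ $a (album) bn: $bn ] in $m",
}

# The output format lives apart from the query text.
ROW_TEMPLATE = Template('<album id="$a">$b</album>')


def run_query(query_text):
    """Stand-in for a real query engine; returns canned rows (assumption)."""
    return [
        {"a": "album-1", "b": "First Album"},
        {"a": "album-2", "b": "Second & Last"},
    ]


def loop_over_query_results(name):
    """Look the query up in the phrasebook, run it, and yield one
    XML-escaped row of template variables per result."""
    for row in run_query(PHRASEBOOK[name]):
        yield {key: escape(value, {'"': "&quot;"}) for key, value in row.items()}


def render(name):
    """Fill the row template once per result and wrap the rows in <albums>."""
    rows = "".join(
        ROW_TEMPLATE.substitute(row) for row in loop_over_query_results(name)
    )
    return f"<albums>{rows}</albums>"


print(render("albums"))
```

The query text appears only in the phrasebook and the markup only in the
template, so either one can change without touching the other.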

> My reasoning is, that most people are familiar with list and XML
> data-structures and that there are enough programmatic means (even
> sometimes within the language) to post-process these.
True, and this is why I am not totally sure that we should not include 
XML output in TMQL. Maybe it is a real must in a modern language. Maybe
what we are missing is a standard for how to do this kind of thing
"correctly". Anyway, this is why I included DTD+XPath->XML in the
missing/future section of the Toma description.

> So, I guess, if a topic has a type, you include the type topic. If an
> association has a particular role you add the topic node for that? Will
> this not result in rather big submaps. And - in some cases which I could
> construct - every submap would be identical with the original map :-)
This is correct. I think that TMQL should be able to generate a topic
map that can be read back into a topic map engine (so nothing is missing
that would keep it from being a complete topic map). BTW, it is not my
idea - I am not sure who wrote it, but someone posted it on the tmql-wg
mailing list, and I found it very sound.
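
As a rough illustration of what "nothing is missing" could mean, here is a
small Python sketch. It assumes a toy model, not any actual TMQL or XTM
data model: each topic id maps to the ids of the topics it refers to (its
type, the role topics of its associations, and so on). The function name
and the sample data are hypothetical.

```python
def self_contained_submap(references, selected):
    """Return the selected topics plus everything they refer to,
    directly or indirectly, so that the result has no dangling references."""
    closed = set()
    pending = list(selected)
    while pending:
        topic = pending.pop()
        if topic in closed:
            continue
        closed.add(topic)
        pending.extend(references.get(topic, ()))
    return closed


# Toy map: topic id -> ids of the topics it refers to (assumed data).
references = {
    "abbey-road": ["album"],
    "album": ["recording"],
    "recording": [],
    "beatles": ["band"],
    "band": [],
}

print(sorted(self_contained_submap(references, ["abbey-road"])))
print(sorted(self_contained_submap(references, ["abbey-road", "beatles"])))
```

The second call returns every topic in the toy map, which is the case you
describe: for some selections the submap is identical to the original map.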

> I think I understand your concern, namely, if someone has a particular
> query like
> 
>    forall [ some pattern P ]
>    return
>        some constructor C
> 
> that you want to factor out the way the data is used in the constructor
> and avoid the repetition of the pattern P
> 
>    forall [ some pattern P ]
>    return
>        another constructor D
> 
> That would only be possible if you store the captured data which was
> detected in the pattern in an intermediate store which you can then
> reuse with different constructors C and D.
> 
> In any serious language that intermediate store is SO COMPLEX that
> reusing it becomes definitely more expensive than simply repeating
> the pattern.
I am not sure that we understood each other. My only gain from the 
separation is that the final application is more readable and
maintainable. The price I pay is performance and complexity.

It is very similar to the usual way one deals with error messages:
you can hard-code them into your application, which makes it  
faster to code and to run. Or you can place them in a separate file,
which makes your code more readable and maintainable at the price of 
some overhead (performance and complexity) in getting the error
message from the other file.   

So I don't try to avoid running the same query. I try to avoid
hard-coding the same query. The same goes for the code that generates 
the output - I try to avoid mixing the two, and to avoid hard-coding
them in more than one place. 
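
The error message analogy can be sketched in a few lines of Python. This is
only an illustration under assumptions: the catalog keys, the message
texts, and the message() helper are all made up, and the catalog is inlined
as a JSON string so that the sketch runs as-is, whereas in practice it
would sit in its own file.

```python
import json

# Hypothetical message catalog; in a real application this text would be
# loaded from a separate file rather than embedded in the code.
CATALOG_JSON = """
{
  "file_missing": "Cannot open file '{path}'.",
  "bad_query": "Query '{name}' is not in the phrasebook."
}
"""

MESSAGES = json.loads(CATALOG_JSON)


def message(key, **details):
    """Look a message up by key and fill in its details."""
    return MESSAGES[key].format(**details)


print(message("file_missing", path="albums.xtm"))
print(message("bad_query", name="albums"))
```

The application code only ever mentions the keys, so the wording of a
message can be changed in one place. The lookup is the overhead mentioned
above.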

Rani

-- 
Rani Pinchuk
Software Engineer
Space Applications Services
Leuvensesteenweg, 325
B-1932 Zaventem
Belgium

Tel.: + 32 2 721 54 84
Fax.: + 32 2 721 54 44

http://www.spaceapplications.com