[tmql-wg] Result set requirements

Sat, 28 Feb 2004 21:48:16 +1000

On Sat, Feb 28, 2004 at 11:26:22AM +0100, Rani Pinchuk wrote:
> 
> > OK, but isn't it ALWAYS the case that - if you would like to convert
> > from one kind of storage into another kind of storage - there MUST be
> > something which defines this connection?

> Sure, I just suggest not to build that "something" into the TMQL.

And my claim is that you cannot do that if you want TMQL to have
results other than lists of topic map items. And even then you have
implicitly specified a 'list of topic map items':

   select $basename, $occurrence, ......

in contrast to, say,

   select $occurrence, $basename

> Because when it is not in the TMQL, it gives the user of TMQL the
> opportunity to separate the queries (what data is retrieved) from
> the definition of how that data is delivered.

No. Because if the user does not specify HOW the data (which has
been detected within the query) should be embedded in the outgoing
data structure (list, XML, whatever), then how is that data organized?

As a list?

And how are the list items then mapped into the outgoing data structure?
With another template (or skin in your paper)?

If so, we would squeeze a multidimensional data structure through a
list, just to embed it into a text template which - in case it is XML -
would have to be post-parsed to make it accessible to the application.

This does not sound too good to me :-)
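
To make the round trip concrete, here is a minimal Python sketch. The
data and the names flatten/regroup are made up for illustration, and
nothing here is TMQL. It only shows that a nested result has to be
taken apart into rows and then put together again on the other side:

```python
# Hypothetical nested result: each topic carries several names and occurrences.
nested = {
    "topic-a": {"basename": ["A", "Alpha"],
                "occurrence": ["http://example.org/a"]},
    "topic-b": {"basename": ["B"],
                "occurrence": ["http://example.org/b1", "http://example.org/b2"]},
}

def flatten(result):
    """Squeeze the nested result into a flat list of (topic, kind, value) rows."""
    return [(topic, kind, value)
            for topic, fields in result.items()
            for kind, values in fields.items()
            for value in values]

def regroup(rows):
    """What the receiving side must do to get the structure back."""
    result = {}
    for topic, kind, value in rows:
        result.setdefault(topic, {}).setdefault(kind, []).append(value)
    return result

rows = flatten(nested)
```

The structure survives here only because regroup() knows what the three
columns mean. That knowledge is exactly the "something" which defines
the connection between the two representations.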

> You could say that we can add it to TMQL in a way that keep the WHAT
> and the HOW separated, but usually in the different environments you
> will find other techniques for the HOW, so I don't think it is good
> to enforce another technique on the users.

My reasoning is that most people are familiar with list and XML
data structures and that there are enough programmatic means (sometimes
even within the language) to post-process these.

> > What would be the difference between a 'TM slice' and a set of
> > topics/association objects?

> "TM slice" is what I call a sub topic map. A topic map that contains all
> the topics and/or associations from the query AND all the other topics
> and associations to complete it to a topic map that can be used as a
> stand alone topic map.

So, I guess, if a topic has a type, you include the type topic. If an
association has a particular role, you add the topic node for that? Will
this not result in rather big submaps? And - in some cases which I could
construct - every submap would be identical to the original map :-)
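
A small Python sketch of what I mean. I am assuming here that
"completing" a slice means following every reference (type, role type,
player) transitively. That is my guess at the definition, not
something taken from your paper, and the map itself is invented:

```python
# Hypothetical map: each item lists the items it refers to
# (its type, and for an association its role type and players).
topic_map = {
    "alice": ["person"],
    "bob": ["person"],
    "person": ["class"],
    "class": [],
    "knows": ["alice", "bob", "acquaintance"],
    "acquaintance": ["class"],
}

def slice_closure(tm, seeds):
    """Return the seeds plus everything reachable from them via references."""
    seen = set()
    stack = list(seeds)
    while stack:
        item = stack.pop()
        if item in seen:
            continue
        seen.add(item)
        stack.extend(tm.get(item, ()))
    return seen

small = slice_closure(topic_map, ["alice"])
big = slice_closure(topic_map, ["knows"])
```

A query returning only the topic "alice" gives a slice of three items,
but a query returning the single association "knows" already pulls in
the complete map.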

> > What I cannot see is that you try to separate this into different
> > documents.

> That separation served me so well in the past that I co-authored the 
> following:  
> http://jerry.cs.uiuc.edu/~plop/plop2k/proceedings/Sharon/Sharon.pdf

> Both explain why to separate and how.

I do not think so. What you describe there are templates. They, btw,
suffer from the problem that they need 'if' and 'loop' statements. So,
if you already have an if statement and a loop statement in your
development language (in our case TMQL), then you would need yet
another language for the template. By splitting this into two languages
you get no advantage but incur the costs of two languages.

I think I understand your concern, namely, if someone has a particular
query like

   forall [ some pattern P ]
   return
       some constructor C

that you want to factor out the way the data is used in the constructor
and avoid repeating the pattern P in

   forall [ some pattern P ]
   return
       another constructor D

That would only be possible if you store the captured data which was
detected in the pattern in an intermediate store which you can then
reuse with different constructors C and D.

In any serious language that intermediate store is SO COMPLEX that
reusing it becomes definitely more expensive than simply repeating
the pattern.

And, worse, you would undermine a lot of optimization opportunities.
Think about this example:

   forall [ $p (person)
            bn: $bn ]
   return
        {$bn}

If the language processor could not see the constructor, it would have
to keep all captured data and could not throw away the binding for
$p. Yet $p is useless here, because it is never used for output.
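
As a rough Python sketch of that optimization: the data layout, the
function names and the idea of passing the constructor as a list of
variable names are all my own simplifications, not a proposal for how
a TMQL processor has to work.

```python
# Hypothetical store: topic id -> type and base names.
topics = {
    "t1": {"type": "person", "basenames": ["Alice", "Al"]},
    "t2": {"type": "city", "basenames": ["Vienna"]},
}

def match_person_pattern(store, needed):
    """Match [ $p (person) bn: $bn ], binding only the variables in `needed`."""
    for topic_id, topic in store.items():
        if topic["type"] != "person":
            continue
        for bn in topic["basenames"]:
            binding = {}
            if "$p" in needed:
                binding["$p"] = topic_id   # skipped when the constructor ignores $p
            if "$bn" in needed:
                binding["$bn"] = bn
            yield binding

def run_query(store, constructor):
    """`constructor` is the list of variables the return clause outputs."""
    needed = set(constructor)              # visible only because the
                                           # constructor is part of the query
    return [tuple(b[var] for var in constructor)
            for b in match_person_pattern(store, needed)]

names_only = run_query(topics, ["$bn"])
```

With the constructor in view, the matcher never materializes $p. With
the constructor hidden in another document, `needed` would have to be
every variable in the pattern.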

\rho