# [tmql-wg] Result set requirements

Robert Barta rho@bigpond.net.au
Sun, 28 Mar 2004 13:30:37 +1000

On Tue, Mar 16, 2004 at 10:54:34AM +0100, Lars Marius Garshol wrote:
> If we knew that $A was a topic of type 'person' and that $B and $C had
> to be dates we could do things more efficiently, but there are some
> other considerations:
>
>  - if we require type declarations we make the programmer's job much
>    harder,

As my mother said: "No pain, no gain." To a certain extent the query
processor can try to build indices based on the most frequent queries,
but the more information you put in, the ...

>  - declaring the types will in general restrict the use of a
>    function/inference rule to situations where the types fit, (an
>    excellent example is Java, where the same methods are often defined
>    X times for X different types), and

...more you make it specific, but faster. An acceptable tradeoff
situation, I think.

>  - if you can establish the types used for a particular invocation you
>    can internally create a new instantiation of the function/rule
>    where the types have been upgraded and if necessary do this for all
>    the different type combinations that turn up.

This is polymorphism and late vs. early binding?

> The main downside to having an implicit typing approach like this, I
> think, is that the performance model of the language tends to become
> very complex. What this means is that users may find that minor tweaks
> to schemas or queries cause huge performance differences in practice
> in ways that are very difficult for them to predict, and similarly
> that which queries run fast on which implementations may also be very
> difficult to predict.

Yes, most probably, the behaviour will be rather 'chaotic', in the
mathematical sense. OTOH, after 30 years of database development for
RDBMSes, we still see these Oracle gurus asking obscene amounts of
money just to tweak your DB and to optimize your queries. :-)

So why not have this with TM databases?
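
To make the "new instantiation per type combination" idea from the quoted text concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the names `SpecializingRule`, `make_before` and `before` are hypothetical, and nothing here is taken from TMQL or any existing implementation. The wrapper looks at the argument types of each invocation, builds a specialised variant the first time a type combination turns up, and reuses it afterwards.

```python
from datetime import date


class SpecializingRule:
    """Caches one specialised variant of a rule per argument-type combination."""

    def __init__(self, make_variant):
        # make_variant is a (hypothetical) factory: tuple of types -> callable
        self._make_variant = make_variant
        self._variants = {}

    def __call__(self, *args):
        # The key is the combination of types seen in this invocation.
        key = tuple(type(a) for a in args)
        variant = self._variants.get(key)
        if variant is None:
            # First time we see this combination: instantiate a new variant.
            variant = self._make_variant(key)
            self._variants[key] = variant
        return variant(*args)

    def known_combinations(self):
        return set(self._variants)


def make_before(types):
    """Hypothetical specialiser for a 'before' rule."""
    if types == (date, date):
        # Types are known to be dates: compare them directly.
        return lambda a, b: a < b
    # Generic fallback: compare the string forms.
    return lambda a, b: str(a) < str(b)


before = SpecializingRule(make_before)
```

Calling `before(date(2004, 3, 16), date(2004, 3, 28))` and then `before("a", "b")` leaves two cached variants; further calls with two dates reuse the first one. Whether one calls this late binding or specialisation is a matter of taste; the point is only that the untyped rule is written once.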
> I'll have a look, but this route is not free of dangers, as I think
> Jeni Tennison documented admirably:
>
> <URL: http://www.idealliance.org/papers/extreme03/html/2003/Tennison01/EML2003Tennison01-toc.html >
>
> (Well, I haven't read the paper; I just saw the presentation, which
> was excellent.)

Very interesting, indeed:

>>> The complexity of the type system is largely a consequence of
>>> XML Schema's datatype specification....

What I find more promising is not (only) to import an external type
system into TMQL/TMCL, but to use the 'types' as provided by an
ontology definition. For instance, if this is defined in a constraint
language:

   # all hillarious things, must be either politicians or lecturers
   forall $t [ in (fun-indicator): hillarious ]
   => exists $t [ * (politician | lecturer) ]

and all maps which are subjected to a query follow this constraint,
then the following query can be optimized:

   file://mafia.atm : *                         # take _anything_ from the mafia map
      [ ./ in (fun-indicator) = 'hillarious' ]  # filter out those which are funny
      / in (bribe-level)                        # get the bribe money necessary

Instead of pulling ALL topics from the map, we can concentrate on
those being an instance of 'politician' or 'lecturer'. So we get

   file://mafia.atm : *
      -> is-instance-of / class [ . = (politician | lecturer) ]
      / in (bribe-level)

If the implementation already has an index on types (which is quite
likely), then this would speed up the query already.

If we additionally have in the ontology

   forall $t [ * (lecturer) ]
   => not exists [ in (bribe-level) : * ]

then the query transforms to

   file://mafia.atm : *
      -> is-instance-of [ ./ class = politician ]
      / in (bribe-level)

Something like this would be nice, as it can transform queries in such
a way that they can use the existing indices and avoid naive
iterations over all possible combinations (simply avoiding the
combinatorial explosions).
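
The two rewriting steps above can be sketched in Python as follows. This is only an illustration under stated assumptions: the toy topic data, the dictionary encoding of the two ontology constraints and all function names are hypothetical, and the code does not parse or evaluate any real TMQL/AsTMa syntax. Note that the sketch keeps the original fun-indicator filter on the candidates, since the first constraint as written only says that hillarious things are politicians or lecturers, not the reverse.

```python
from collections import defaultdict

# Hypothetical toy map: topic id -> (set of types, characteristics)
TOPICS = {
    "don-vito": ({"politician"}, {"fun-indicator": "hillarious", "bribe-level": 5000}),
    "prof-x": ({"lecturer"}, {"fun-indicator": "hillarious"}),
    "sen-grey": ({"politician"}, {"fun-indicator": "dull", "bribe-level": 900}),
    "pizzeria": ({"business"}, {"fun-indicator": "dull"}),
}

# Constraint 1 (assumed encoding): a characteristic value implies a set of types.
IMPLIED_TYPES = {("fun-indicator", "hillarious"): {"politician", "lecturer"}}
# Constraint 2 (assumed encoding): instances of a type never carry these characteristics.
FORBIDDEN = {"lecturer": {"bribe-level"}}


def build_type_index(topics):
    """Index from type to topic ids, as an implementation would likely have."""
    index = defaultdict(set)
    for topic_id, (types, _) in topics.items():
        for t in types:
            index[t].add(topic_id)
    return index


def naive_query(topics, name, value, project):
    """Scan ALL topics, filter on name == value, return the projected values."""
    return sorted(
        chars[project]
        for _, chars in topics.values()
        if chars.get(name) == value and project in chars
    )


def candidate_types(name, value, project):
    """Use the constraints to narrow the types worth looking at (None = no help)."""
    types = IMPLIED_TYPES.get((name, value))
    if types is None:
        return None
    # Drop types that can never have the characteristic we want to project.
    return {t for t in types if project not in FORBIDDEN.get(t, set())}


def optimized_query(topics, type_index, name, value, project):
    """Same result as naive_query, but only visits topics found via the type index."""
    types = candidate_types(name, value, project)
    if types is None:
        return naive_query(topics, name, value, project)
    candidates = set().union(*(type_index.get(t, set()) for t in types))
    return sorted(
        topics[tid][1][project]
        for tid in candidates
        if topics[tid][1].get(name) == value and project in topics[tid][1]
    )


TYPE_INDEX = build_type_index(TOPICS)
```

With this data, `candidate_types("fun-indicator", "hillarious", "bribe-level")` narrows the search to politicians only, and both `naive_query` and `optimized_query` return the same bribe levels while the optimized variant touches two topics instead of four.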

The advantages given by using a "low-level" type system like XML
Schema Data Types are certainly there, but are probably not SO big.

> | I am not sure about the future of TMCL. At the moment it looks like
> | RDFS light.
>
> It does, but we don't want it to wind up that way, nor do I think
> Graham wants that. So I wouldn't get worried just yet. I think the
> final result is much more likely to look like AsTMa!/OSL represented
> in topic maps, with some special syntax. The stuff Dmitry has been
> playing with lately looks promising to me.

As a born Austrian I have the birthright to be worried about everything :-)

\rho