[tmql-wg] Hello and proposal

Robert Barta rho@bigpond.net.au
Sun, 28 Mar 2004 07:50:07 +1000

On Tue, Mar 23, 2004 at 05:59:21PM +0100, Stefan Lischke wrote:
> My Name is Stefan Lischke, i'm interested in TopicMaps, artificial 
> intelligence and XML stuff.

Stefan, welcome to the club!

> Using XPath on this TopicMap is now very fast.
> I have played a little bit with that and also used XSLT's alot ontop of XTM.

Yes, but this all assumes that the XTM document is in a particular
"normalized" form; otherwise one and the same query might give
different results for semantically equivalent maps.
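A minimal sketch of the normalization problem, in Python with only the
standard library. The two XTM 1.0 fragments are hypothetical. They are
meant to say the same thing: XTM 1.0 allows <instanceOf> to point at the
type either with a <topicRef> or with a <subjectIndicatorRef>. The helper
topics_of_type is an illustrative stand-in for the XPath quoted further
down, not part of any TM library.

```python
import xml.etree.ElementTree as ET

NS = {"xtm": "http://www.topicmaps.org/xtm/1.0/"}
HREF = "{http://www.w3.org/1999/xlink}href"

HEADER = ('<topicMap xmlns="http://www.topicmaps.org/xtm/1.0/" '
          'xmlns:xlink="http://www.w3.org/1999/xlink">'
          '<topic id="person"><subjectIdentity>'
          '<subjectIndicatorRef xlink:href="http://example.org/person"/>'
          '</subjectIdentity></topic>')

# Map A names the type by topic id.
MAP_A = (HEADER + '<topic id="stefan"><instanceOf>'
         '<topicRef xlink:href="#person"/>'
         '</instanceOf></topic></topicMap>')

# Map B names the same type by its subject indicator.
MAP_B = (HEADER + '<topic id="stefan"><instanceOf>'
         '<subjectIndicatorRef xlink:href="http://example.org/person"/>'
         '</instanceOf></topic></topicMap>')


def topics_of_type(xml_text, type_ref):
    """Mimic /topicMap/topic[instanceOf/topicRef/@xlink:href=type_ref]."""
    root = ET.fromstring(xml_text)
    hits = []
    for topic in root.findall("xtm:topic", NS):
        for ref in topic.findall("xtm:instanceOf/xtm:topicRef", NS):
            if ref.get(HREF) == type_ref:
                hits.append(topic.get("id"))
    return hits


print(topics_of_type(MAP_A, "#person"))
print(topics_of_type(MAP_B, "#person"))
```

The same query finds the topic in map A and misses it in map B, although
a TM processor would presumably treat the two maps as equivalent.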

The way I understand it, XML documents have a structure implied by
the XML infoset, and TopicMaps have their own structure. This data
structure is _completely_ independent of XML. XTM just happens to be
one particular way to manifest TM content.

If you export a relational database to an arbitrary XML vocabulary,
would you use XPath/XQuery to query the database? Probably not. But
why not?

> My experience shows me, that working with XPath ontop of XTM is pain in 
> the a**, cause you only have the TopicID as reference. so if you want to 
> have a Topic of a special Type, you have to know the XML TopicID of this 
> type.
> /topicMap/topic[instanceOf/topicRef/@xlink:href='#typeID']

I think this is a completely different issue.

There is nothing wrong with topic ids. I can imagine that a TM
processor can be obliged to keep these ids stable during a "query
and access session". Accordingly, you would 'open a TM database',
then somehow retrieve the topic id and work with it for as long as
this session lasts.

How you extract these 'temporary' identifiers is a different issue,
subject indicators might be a way, but certainly not the only way to
identify a topic.

> for $topic1 in $topicmap.topic.type='typeURI'
>    for $topic2 in $topicmap.topic.subjectLocator='somethingURI'
> return
>    <association>
>       <instanceOf>
>          <topicRef @xlink:href="#someAssoc"/>
>        </instanceOf>
>       <member>
>          <topicRef  xlink:href=$topic1.id/>
>        </member>
>       <member>
>          <topicRef  xlink:href=$topic2.id/>
>        </member>
>    </association>

I completely appreciate this structure. It first returns a sequence
and then iterates and creates results. The problem I have with pure
path-oriented sublanguages to detect interesting pieces in a TM is
that these are too cumbersome in a few cases.

I see two ways of detecting interesting information:

  (a) a pattern match approach (this is like being dropped in a parachute)
      not always perfectly precise but a rather good approximation

  (b) a drilling approach (you start somewhere, and walk and walk and
      walk and walk)

(a) is very fast to bring you close to the target, (b) is very precise.
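The two styles can be sketched in a few lines of Python. Everything
here is hypothetical: the toy association list, the role names, and the
functions match and walk are made up for illustration and do not
correspond to any existing TM engine or to TMQL syntax.

```python
# Toy data: each association is (type, {role: topic}).
ASSOCIATIONS = [
    ("works-for", {"employee": "alice", "employer": "acme"}),
    ("works-for", {"employee": "bob", "employer": "initech"}),
    ("located-in", {"thing": "acme", "place": "berlin"}),
    ("located-in", {"thing": "initech", "place": "austin"}),
]


def match(pattern_type, **fixed_roles):
    """(a) Pattern match: return every association fitting the template."""
    return [
        roles
        for a_type, roles in ASSOCIATIONS
        if a_type == pattern_type
        and all(roles.get(r) == t for r, t in fixed_roles.items())
    ]


def walk(start, steps):
    """(b) Drilling: follow (type, from_role, to_role) steps from one topic."""
    current = {start}
    for a_type, from_role, to_role in steps:
        current = {
            roles[to_role]
            for t, roles in ASSOCIATIONS
            if t == a_type and to_role in roles
            and roles.get(from_role) in current
        }
    return current


# (a) lands on all matching associations at once.
print(match("works-for", employer="acme"))
# (b) starts at one topic and walks two hops to where the employer is located.
print(walk("alice", [("works-for", "employee", "employer"),
                     ("located-in", "thing", "place")]))
```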

> Now we have to create XQuery shortcuts to create the Association:
> for $topic1 in $topicmap.topic.type='typeURI'
>    for $topic2 in $topicmap.topic.subjectLocator='somethingURI'
> return
>    association($topicmap.topic.subjectLocator='someAssocURI',  $topic1,  $topic2)

Well, a shortcut here, a shortcut there and we have another language
:-) AsTMa? in fact looks a bit like this.

> With that we could have a TMQL ontop of XPath and XQuery.

The same argument can (and should) be applied to TMQL on top of
SQL. And TMQL on top of RDF-QL. And TMQL on top of DNS. And Whois.

A query language like TMQL is nothing other than a promise to

  (a) look for interesting stuff according to the TM paradigm, and
  (b) return the identified stuff according to the TM paradigm.

How the data is actually stored is a separate issue. Some people like
to do it in XML because they author the TM content manually, others
prefer a lightning-fast native store. Others may only want a TM view
over the global DNS.
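As a rough sketch of this separation, assume a query function that only
talks to an abstract store. The names TopicStore, XmlLikeStore,
DnsViewStore and instances_of are all hypothetical, and the DNS rows are
a static list rather than real lookups.

```python
from typing import Iterable, List, Protocol


class TopicStore(Protocol):
    """Anything that can present its content as topics."""
    def topics(self) -> Iterable[dict]: ...


class XmlLikeStore:
    """Stands in for hand-authored content already parsed into records."""
    def __init__(self, records):
        self._records = records

    def topics(self):
        return iter(self._records)


class DnsViewStore:
    """Stands in for a TM view over DNS: each row becomes a topic."""
    def __init__(self, rows):
        self._rows = rows

    def topics(self):
        for name, rtype, _value in self._rows:
            yield {"id": name, "type": "host" if rtype == "A" else "alias"}


def instances_of(store: TopicStore, type_id: str) -> List[str]:
    """The 'query': it never learns how the store keeps its data."""
    return sorted(t["id"] for t in store.topics() if t["type"] == type_id)


xml_store = XmlLikeStore([
    {"id": "ns1.example.org", "type": "host"},
    {"id": "www.example.org", "type": "alias"},
])
dns_store = DnsViewStore([
    ("ns1.example.org", "A", "192.0.2.1"),
    ("www.example.org", "CNAME", "ns1.example.org"),
])

print(instances_of(xml_store, "host"))
print(instances_of(dns_store, "host"))
```

Both calls go through the same query code; only the store behind it
differs.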

(a) and (b) are described by the ontology, the application is interested



> larsbot@IRC said:

> however, one thing you should consider is whether you can take a
> different approach take TMQL and compile it to XQuery

This is exactly where I see a big need for research. (Heads up:
potential Ph.D. students listening?)

> we do the same with tolog in the sense that we compile it to SQL

Right. Here it may make a difference whether

 (a) the database schema is already TM-biased, or
 (b) the database is just ... the database.

In case (b) the transformation would have to be flexible enough to map
a TMQL query into a set of SQL queries and reverse map the results.
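A very rough sketch of case (b), using Python's sqlite3 module and an
ordinary, non-TM table. The table layout, the TYPE_TO_SQL mapping and
both functions are assumptions made up for this illustration; this is
not how tolog or any real TMQL compiler works.

```python
import sqlite3


def make_db():
    """Build a plain relational table that knows nothing about TMs."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE employee (name TEXT, company TEXT)")
    conn.executemany(
        "INSERT INTO employee VALUES (?, ?)",
        [("alice", "acme"), ("bob", "initech")],
    )
    return conn


# Hypothetical mapping: topic type -> SQL that lists its instances.
TYPE_TO_SQL = {
    "person": "SELECT DISTINCT name FROM employee ORDER BY 1",
    "company": "SELECT DISTINCT company FROM employee ORDER BY 1",
}


def instances_of(conn, topic_type):
    """Forward map a type query to SQL, reverse map rows to topics."""
    rows = conn.execute(TYPE_TO_SQL[topic_type]).fetchall()
    return [{"id": row[0], "type": topic_type} for row in rows]


def works_for(conn):
    """One TM-level request that fans out into several SQL queries."""
    result = []
    for person in instances_of(conn, "person"):
        rows = conn.execute(
            "SELECT company FROM employee WHERE name = ? ORDER BY company",
            (person["id"],),
        ).fetchall()
        for (company,) in rows:
            # Reverse mapping: a row becomes an association with roles.
            result.append({"type": "works-for",
                           "roles": {"employee": person["id"],
                                     "employer": company}})
    return result


conn = make_db()
print(instances_of(conn, "company"))
print(works_for(conn))
```

In case (a) the mapping table would be close to trivial, since the
schema would already speak in topics and associations.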

BTW, this is what I will talk about at XML Europe 2004.

> nobody's done a QL for TMs with update capability yet

Not officially, no. But it does not look as if it were so difficult.