[sc34wg3] TMQL - Comments against draft dtd 2007-07-13

Lars Heuer heuer at semagia.com
Fri Oct 19 07:45:34 EDT 2007


Hi all,

As far as I know, the TMQL editors have nothing special to do this
weekend, so here are some TMQL comments ;)


Topic Maps -- Query Language
============================

Namespaces
----------
- TMQL is silent about a mechanism to define namespaces. It
  uses QNames, but there seems to be no way to declare the prefixes
  they rely on. (We have some predefined prefixes, but how does the
  user define one?)
  TMQL mentions the environment (section 6.3), but wouldn't it
  be good if the user could define prefixes for a query ad hoc?
  (See also my comments on section 6.3.)
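
To illustrate why a declaration mechanism matters, here is a minimal
Python sketch of QName expansion. The prefix table, the IRIs and the
name expand_qname are assumptions made for illustration; none of them
is taken from the TMQL draft.

```python
# Hypothetical prefix bindings; the IRIs are placeholders, not the
# ones TMQL actually predefines.
PREDEFINED = {"tm": "http://example.org/tm/"}


def expand_qname(qname, prefixes):
    """Expand 'prefix:local' to an IRI using the given prefix table."""
    prefix, sep, local = qname.partition(":")
    if not sep:
        raise ValueError(f"not a QName: {qname!r}")
    if prefix not in prefixes:
        # Without a way to declare prefixes, the QName stays unresolvable.
        raise KeyError(f"undeclared prefix: {prefix!r}")
    return prefixes[prefix] + local


# An ad hoc, per-query binding layered over the predefined ones.
query_prefixes = {**PREDEFINED, "ex": "http://example.org/people/"}
print(expand_qname("ex:john", query_prefixes))
```

With only the predefined table, "ex:john" cannot be resolved; an ad hoc
declaration per query would be one way to close that gap.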


4.3. Item References
--------------------
- [A] item-reference ::= *
  CTM uses the wildcard '*' to create a (more or less) anonymous
  topic. If CTM uses '*' for that purpose, it is confusing if
  TMQL uses the same syntax to refer to a specific topic which
  is defined as 'tm:subject'. Either we should blame the CTM
  editors or the TMQL editors for that confusing overlap
  (I can imagine the answer from the TMQL-editors ;)).


4.4. Navigation
---------------

- [B] step ::= >> instances
  'instances' is not part of production [18]. Either this axis should
  be added or the forward direction should be removed, so that
  only '<<' 'types' is a valid axis.

- I wonder if we can reuse the 'isa' (is instance of) and 'ako' 
  (a kind of) keywords somehow instead of introducing 'types'
  and 'supertypes'.

- I wonder if we should move those axes where the 'anchor' is ignored
  to another production. If the 'anchor' is ignored anyhow, why should
  we allow it to be specified? If an 'anchor' is specified where it is
  ignored, an error would be more helpful than ignoring it silently,
  IMO. If I understand the production correctly, the following
  statement::
        
        person << types
  
  is equivalent to this one::
        
        person << types <http://www.something.which/is.ignored.but.allowed>
  
  The 'anchor' seems to be relevant only for the following
  productions:
  - players
  - characteristics

- 'locators' / 'indicators'

  - Any chance that TMQL adopts the CTM syntax for subject
    identifiers / subject locators?
    (TMQL uses '=' and '~' as suffixes; CTM does not use '~' for subject
    identifiers at all and uses '=' as an IRI prefix for subject locators.)
    Is the '~' necessary at all? Why is not every plain old IRI a subject
    identifier?

- 'reifier'
  - Wouldn't one tilde be enough? Instead of '~~>' we could use '~>'
  - Why is the forward shortcut ~~> defined, but not a backward shortcut
    (<~~)?

- The 'characteristics' axis:
  - Is the 'anchor' necessary at all? Isn't it possible to use
    ``tm:occurrence`` and ``tm:name`` (or to be more exact
    ``tm:topic-name``) as axis and ``tm:characteristic`` to retrieve
    both: names and occurrences? Maybe it becomes more complicated if
    the user wants a specific type, but if 'isa' could be reused, the
    following seems to be possible::
    
    a)  john >> tm:occurrence [. isa homepage]

        retrieves all occurrences of type 'homepage'
    
    b)  john >> tm:name [. isa nickname]
        
        retrieves all nicknames from the topic 'john'

    c)
        i.  john >> tm:characteristic [. isa whatever]
        ii. john >> whatever
        
        retrieves all characteristics (names *and* occs) of type 'whatever'

    d)  Retrieving all occurrences by hand:
        
        john >> tm:characteristic [. isa tm:occurrence]

    e)  Retrieving all names of type 'nickname' by hand:
    
        john >> tm:characteristic [. isa tm:name][. isa nickname]

        or:
    
        john >> tm:characteristic [. isa tm:name & . isa nickname]

- While writing this, I wonder if we need these uniform axes at all.
  Why do we need the << >> axes incl. keywords, if we could introduce some
  specific "axes": ako, isa, <-, ->, ~>, <~ (the last four axes are
  already part of TMQL)? Well, I may be mistaken here, but we could
  remove some (lengthy) keywords and introduce a dedicated syntax for
  them.
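
Coming back to the ignored 'anchor' from earlier in this section: the
"error instead of silently ignoring" behaviour could look roughly like
the following Python sketch. The set of axes that use the anchor only
reflects my reading above, and check_step is a hypothetical helper, not
something from the draft.

```python
# Axes for which the anchor seems to matter, per the comment above.
# This set is an assumption, not a statement about the draft.
AXES_USING_ANCHOR = {"players", "characteristics"}


def check_step(axis, anchor=None):
    """Reject an anchor on an axis that would silently ignore it."""
    if anchor is not None and axis not in AXES_USING_ANCHOR:
        raise ValueError(
            f"axis {axis!r} does not take an anchor, got {anchor!r}"
        )
    return (axis, anchor)


print(check_step("types"))                        # accepted
print(check_step("characteristics", "homepage"))  # accepted
```

Under this check, 'person << types <some-iri>' would be reported as an
error rather than being treated like 'person << types'.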


4.7. Composite Content
----------------------
- I just want to mention that I find the '==', '++' and '--' infix
  operators confusing / not very intuitive
  - for intersection ('==') I'd use '&'
  - for union ('++') I'd use '|'
  - for difference ('--') I'd use '-'
  
  Well, it's just syntax and '&' and '|' are already used for 'and' 
  and 'or', but ... hmmm ... anyway ... I'll move on

- Has anyone verified that the condition (if .. then .. else ..) is
  unambiguous (even if conditions are nested)?
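
The kind of ambiguity I have in mind is the classic "dangling else".
The Python sketch below enumerates all parses of a toy grammar in which
the else branch is optional. Whether the TMQL grammar really has this
shape is an assumption on my side; the toy grammar and the names
parse_expr and all_parses are made up for illustration.

```python
KEYWORDS = {"if", "then", "else"}


def parse_expr(tokens, i):
    """Yield (tree, next_index) for every way to parse an expression at i.

    Toy grammar: expr ::= 'if' atom 'then' expr ['else' expr] | atom
    """
    if i >= len(tokens):
        return
    if tokens[i] == "if":
        if i + 2 >= len(tokens) or tokens[i + 2] != "then":
            return
        cond = tokens[i + 1]
        for then_tree, j in parse_expr(tokens, i + 3):
            # Alternative 1: this 'if' has no else branch.
            yield ("if", cond, then_tree, None), j
            # Alternative 2: this 'if' takes the following else.
            if j < len(tokens) and tokens[j] == "else":
                for else_tree, k in parse_expr(tokens, j + 1):
                    yield ("if", cond, then_tree, else_tree), k
    elif tokens[i] not in KEYWORDS:
        yield tokens[i], i + 1


def all_parses(text):
    """Return every parse tree that consumes the whole input."""
    tokens = text.split()
    return [tree for tree, end in parse_expr(tokens, 0) if end == len(tokens)]


for tree in all_parses("if a then if b then x else y"):
    print(tree)
```

In this toy grammar the nested input has two parse trees, one for each
'if' the else could belong to. If the else branch is mandatory in TMQL,
the problem does not arise; that is exactly what should be verified.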


4.10. Topic Map Content
-----------------------
- The topic map content is wrapped inside triple quotes ("""), but
  CTM itself uses triple quotes. Maybe another syntax should be used
  to wrap topic map content; otherwise the implementer has to
  count the number of """ to decide whether the topic map content block
  is closed or whether a CTM string is opened / closed.
  Additionally, this is bad for syntax highlighters etc.
  If TMQL used another syntax as a wrapper for topic map content, it
  would be more obvious that the content is not necessarily a string, but
  a topic map content stream (whatever that is).
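
A small Python sketch of the problem, assuming a naive scanner that
takes the next """ as the end of the block. The CTM-like fragment in
the query is only illustrative and not checked against any CTM draft;
naive_content_block is a hypothetical helper.

```python
def naive_content_block(query):
    """Return the text between the first two triple-quote delimiters."""
    delim = '"""'
    start = query.index(delim) + len(delim)
    end = query.index(delim, start)
    return query[start:end]


# Illustrative CTM-like content with a triple-quoted string inside.
query = 'return """ john description: """A long text""" . """'
print(repr(naive_content_block(query)))
```

The scanner stops at the triple quote that opens the embedded string,
so the block is cut off before the string value. A distinct wrapper
syntax would avoid this without any delimiter counting.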


4.13. Boolean Expressions
-------------------------
- see 4.7. :/


6.3. Environment Clause
-----------------------
- The environment clause seems to be a string. Why does TMQL not adopt
  the directives from CTM? Maybe TMQL could add some directives for
  setting the default topic map etc.
  - If TMQL adopts the CTM directives, we should check if

        '%' directive-name

    violates the syntax for the context / environment map
    (cf. 5.3. Variables)



Best regards,
Lars
-- 
http://www.semagia.com


