[sc34wg3] Editors' drafts of TMDM and XTM 1.1

Robert Barta sc34wg3@isotopicmaps.org
Fri, 13 Jan 2006 08:28:37 +1000


On Wed, Jan 11, 2006 at 02:31:33AM +0000, Murray Altheim wrote:
> >The variant element can no longer be nested. 
> 
> While some people might have found this odd, the hierarchy of
> variants did allow selection of a specific Topic name based on
> a specific set of accumulated variant parameters. If this
> feature (which is decidedly complicated) is being removed,
> that's hardly a 0.x version change. That removes an entire
> feature of a language.

What you describe is an "application feature" which a vendor may want
to build _ON TOP_ of TMs. For those of us for whom TM is not an
"application to do one particular thing" but a base technology for a
very large number of applications, nested variants are not a
canonical feature.

And "Odd" is not the word I used when I commented my Perl code for it,
btw ;-)

> Ugh. So now when I merge in another XTM document I have no
> ability to un-merge it or determine where a Topic comes from?

Yes. In the same way as you do not know whether the number 42 is the
result of 21 + 21 or 23 + 19, merging maps should be just that: a
merge, with no record of where the parts came from.
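
To make the 42 analogy concrete, here is a minimal Python sketch. It
assumes a deliberately simplified, hypothetical model (a map is a dict
from subject identifier to a set of names); it is not the TMDM merging
procedure, and merge_maps and the example.org identifiers are made up
for illustration:

```python
def merge_maps(a, b):
    """Merge two toy topic maps: topics with the same subject
    identifier collapse into one, and their names are unioned."""
    merged = {}
    for tm in (a, b):
        for subject, names in tm.items():
            merged.setdefault(subject, set()).update(names)
    return merged


# Hypothetical input maps.
map_a = {"http://example.org/puccini": {"Puccini"}}
map_b = {
    "http://example.org/puccini": {"Giacomo Puccini"},
    "http://example.org/tosca": {"Tosca"},
}

merged = merge_maps(map_a, map_b)
# Nothing in 'merged' records which input map a name came from.
```

In this toy model the result is the same whichever map comes first, and
it cannot be taken apart again, just like the 42.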

> >The id attribute has been removed from all elements except topic,
> >and the reifies attribute has been added on some elements. 
> 
> While some people may not see the *need* for ID on all elements,
> it never hurt to have it.

It hurts. It hurts A LOT if you do not process TMs with XML tools.
Actually it hurts so much that TMs with IDs all over the place can
only really be processed with XML tools. Processing IDs faithfully
means that one has to keep the XML structure intact. Brrrr.

> >The mergeMap element must now come before all topic and association
> >elements. 
> 
> There should be no ordering requirement of XTM documents. It's not
> a sequence, it's a bag. If applications need to process <mergeMap>
> elements first, they should pull them from the graph and process
> them first.

From an implementor's POV this does not make any difference,
yes. Still, having the 'imports' at the beginning is 'Good Practice',
if only to avoid a <mergeMap> hiding somewhere in the middle of the
document. Not everyone takes the time to wade through all the code ....
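
For what it is worth, checking the ordering is cheap. A minimal Python
sketch, assuming un-namespaced element names and a hypothetical helper
merge_maps_come_first (a real XTM document would carry the XTM
namespace, which is left out here for brevity):

```python
import xml.etree.ElementTree as ET


def merge_maps_come_first(topic_map):
    """Return True if no mergeMap child follows a topic or association."""
    seen_other = False
    for child in topic_map:
        if child.tag == "mergeMap":
            if seen_other:
                return False
        elif child.tag in ("topic", "association"):
            seen_other = True
    return True


# Hypothetical, stripped-down documents.
good = ET.fromstring('<topicMap><mergeMap/><topic id="t1"/></topicMap>')
bad = ET.fromstring('<topicMap><topic id="t1"/><mergeMap/></topicMap>')

good_ok = merge_maps_come_first(good)
bad_ok = merge_maps_come_first(bad)
```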

> >The datatype attribute has been added to resourceData, which also
> >now supports embedded markup.

[...]

> I think you're making an enormous mistake formally tying XTM to
> XML Schema.

Hmm, there is XML Schema (WXS), the W3C XML schema language. And there
are the XML Schema data types (XSD).

XTM is described using WXS (as an alternative to RelaxNG), but nowhere
can I find any ties to XSD except anyURI (and ID):

   datatype = attribute datatype { xsd:anyURI }

   resourceData = element resourceData { datatype?, any-markup }

   The datatype attribute contains an IRI identifying the datatype of
   the resource that is represented by the resourceData element.
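
Since the attribute is just an IRI, nothing stops it from pointing
outside the W3C world. A minimal Python sketch, assuming an
un-namespaced fragment and a made-up datatype IRI under example.org
(both are illustrations, not taken from the draft):

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment; the datatype IRI is invented for illustration.
fragment = (
    '<resourceData datatype="http://example.org/datatypes/isbn">'
    "0-306-40615-2"
    "</resourceData>"
)

elem = ET.fromstring(fragment)
datatype = elem.get("datatype")  # None if the attribute is absent
value = elem.text
```

What an application then does with the IRI is entirely its own
business.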

> There is a vast array of datatyping schemas, and the
> majority of them that have any value have nothing to do with the
> work of the W3C.

Definitely, and I can see no reason why they could not be used.

> If we're going to break validation and allow embedded markup
> (a questionable strategy at best), I would at least highly
> recommend including the 'datatype' attribute but not assigning
> values outside of XTM's namespace, i.e., either leave it blank or
> create XTM's own set for the necessary datatypes, with our own
> definitions for our own purposes, not those tied in with
> Description Logics, which is an entirely different domain than
> Topic Maps, based on an entirely different set of core assumptions.

DL itself is not associated with any particular set of data
types. There are DL variants which allow reasoning about a finite set
of 'fixed' data types. But the formalism does not care which ones
they are.

> Data typing is really an application-level specification, and
> probably shouldn't be included in the core graph syntax, within
> the XTM namespace.

Which, btw, is the perfect argument against having variants in
XTM ;->

> The choice for deserialization to go with the XML Infoset is perhaps
> welcomed by some, but certainly not by me. I don't see it as an
> improvement, just a snazzy reference to a cool-sounding W3C spec that
> seems to confuse a lot of people.

Well, developers mostly get this quite quickly. I had not used
Infoset before, and am quite comfortable with it now....

> I really love statements like this
> one:
> 
>     "Reliance on any particular behaviour in the XML processors
>      used by recipients is strongly discouraged."
> 
> If developers can't rely on any particular behaviour, what can they
> rely upon? I don't get it.

For generating a data model instance we are in a position very
similar to that of, say, XPath:

   http://www.w3.org/TR/xpath20/#id-processing-model

There, too, XML processing (parsing, validation, ...) lies outside the
actual specification, which explains how data is sucked in.

> We suddenly have in front of us a number of sketchy new ideas (e.g.,
> IRIs, Infoset, reification, topic items)

"Suddenly"?

If, as I assume, you have followed the discussions about TMDM
(starting from 200[23]), then an adaptation/streamlining of XTM should
not come as a surprise, but rather as an overdue necessity.

> that are either undefined or point at theoretically functional specs,
> but taken as a whole I don't see the processing model through the
> haze. While Annex F in XTM 1.0 was perhaps only partial, it at least
> left developers with some scrap of an idea of what to build.

1) TMDM, XTM, CTM, CXTM all cannot have a 'processing model'. 'Processing'
   means 'do something with it to produce something new'. None of these
   standards does that. All are about different representations of the same
   thing (a TM structure).

   Processing applies to TMCL and TMQL, though. So there will be processing
   model(s).

2) Have you ever wondered why there are so many RDF stores/engines and
   relatively few TM stores/engines? With RDF you _know_ what to build.
   Most programmers do not have the time to do 'guessing'.

> isn't going to include any behaviours (and is just a data model),
> then this should be clearly stated.

It is, but where it belongs:

   http://www.isotopicmaps.org/sam/sam-model/#scope

> All in all, I don't think I'll be using the newer version of XTM
> for anything I'm working on. There are too many substantive changes,
> and I disagree with many or most of them. This isn't to me just
> sour grapes, it's changed the entire path that XTM was moving to
> harmonize it with a different model than the one it was originally
> created for, it now includes external, underspecified, and unwanted
> semantics.

Hmmm, as an implementor I am very hawkish about under- and
over-specification, but XTM does not seem to suffer from either.

\rho