[sc34wg3] Re: Montreal meeting recommendations

Steven R. Newcomb sc34wg3@isotopicmaps.org
Mon, 17 Sep 2001 15:52:00 -0500


[Lars Marius:]
> As a vendor I'm concerned that this seems to mean
> throwing away everything we've done so far (as
> vendors), and I'm not very keen to do that.

I think you greatly exaggerate the negatives of this
proposal for vendors like Ontopia, and I strongly
disagree with you about its likely effects.

* None of our vendors is so naive as to think that the
  software it has already created will form the basis
  of its business forever.  Everyone knows that the
  maintenance of a technology product line is a
  never-ending effort that frequently involves "eating
  one's own children" in order to remain competitive.

* No existing Topic Maps system vendor will lose the
  value of its name recognition merely because the
  definition of Topic Maps becomes more general.
  On the contrary, the goodwill value of
  existing vendors will remain proportional to the size
  of the competitive arena in which they have a share,
  and to the security of the future of the arena as a
  whole.  And that's what we're proposing: to increase
  the size of the competitive arena, and to protect the
  long term value of the existing stakes in that arena.

* I don't want to draw the conclusion that Ontopia's
  attitude about the standardization process is not
  oriented toward building the largest possible
  competitive arena for public and fairly-distributed
  private benefit, but rather toward freezing the
  dimensions of that arena in order to influence the
  distribution of benefit in a way that will unfairly
  favor Ontopia's existing product line.  Please
  reassure us that this isn't so!  When participating
  in cooperative efforts with your competitors, the
  only correct attitude is to seek ways of making more
  pie for everyone.  It is ultimately self-defeating to
  seek ways to prohibit more pie from being made, or to
  try to make a process that is designed to foster
  cooperation into one that prevents it.  The
  standardization process is the wrong place to fight
  over who gets what share of the pie.  The purpose of
  standardization is to build an arena for that kind of
  fight.  The arena must be built fairly, or it won't
  be built at all.

> We changed our internal model when we went from ISO
> 13250 to XTM 1.0, and although it was a lot of work
> it wasn't so bad, because no customers were using our
> APIs. Things are different now, however.  (Of course,
> this is also an argument _for_ the new core model.)

  !

> As a person interested in topic maps and wishing them
> to succeed I am worried that changing topic maps in a
> radical way (for the third time) is something that
> has great destructive potential for the community.

Not true.  It is not a radical change simply to
acknowledge that what we've been doing is a special
case of something that has a wider scope.  Nobody is
suggesting that we change what we're doing.  We're just
trying to expand the scope of what *can* be done.

> How can we ensure confidence in the standards we make
> when we keep changing them all the time? (SGML has
> made, and kept, a vow never to break valid
> documents.)

How does this proposal break valid documents?  I can't
think of a single respect in which that's true.  On the
contrary, in fact, what we're proposing gives the
possibility of recognizing and integrating the
*intended* meaning of documents produced by
software designed according to *varying*
interpretations of the assertion types described in the
existing standard(s).  Indeed, it occurs to me that
perhaps this is exactly what you fear: the idea that
the existing nuances of Ontopia's interpretation would,
according to the new proposal, be distinctly
identifiable, just like everyone else's, and the
potential for Ontopia's nuances to achieve de-facto
standard status, with concomitant huge advantages for
Ontopia's existing software investments, will be lost.
I'd welcome your reassurance that this isn't your
attitude.  Cooperation between competitors depends on
trust that everyone, at least in the cooperative
context, is working for the *common* benefit.

Like almost everyone else involved in this cooperative
activity, I'm interested in the huge increase in the
size of the market that will result from the potential
to integrate all Topic Maps resources emanating from
software offered by all vendors.  I'm unmoved by
anyone's desire to achieve hegemony in this
marketplace.  From the perspective of the public
interest, such hegemony would be a bad thing, at best.
We would all be irresponsible (and possibly corrupt) if
we allowed the international standardization process to
be used in such an unfair way.

> By all means, let us go forward, but let us do so
> carefully.

No argument here!

> My thinking now is that PMTM4 should have a different
> name from topic maps, since to me it is not the same
> as topic maps, but seems more like something more
> basic than topic maps. (My paper on topic maps and
> RDF that will be presented in Orlando and Copenhagen
> will probably make it more clear what I mean.)

I have been saying for some time that all of our
controversies have been about the meaning of the "Topic
Maps" brand name.  You seem to want to keep the scope
of this brand small and specific.  Many of us want to
make the scope as large as necessary and practical in
order to maximize the total public and private benefit.

Maybe this controversy will ultimately be resolved by
vote.  If so, I know which way I'm going to vote, and I
will also lobby implacably on behalf of our common
interests.

> I think we should stick to the plan agreed to in
> Montréal for the time being, and only revise it when
> we've learned enough about the core model and its
> relationship to topic maps to know what best to do
> with the core model.

If we decide not to act until everyone feels that they
have "learned enough", we will never act at all.  Those
who refuse to learn the core model will always be able
to block it.

> | Consider the difference between saying "SGML" and 
> | "The Reference Concrete Syntax of SGML". 

> This is a very misleading analogy. Syntax is not the
> foundation of entire product suites the way models
> are. If you change the syntax of comma-separated
> files RDBMS vendors won't care much, but if you
> change the relational model it's something else
> completely.

I wasn't trying to say that a syntax is analogous to a
model.  I was trying to make the point that the only
thing we know for sure about the future is the fact
that it will be different from today.  The best and
only way to protect our investments is to make them as
adaptable as may be practical.

Currently, Topic Maps has no standard underlying model.
What it has instead might be considered analogous to an
implicit RDBMS schema, but that schema can't be
rigorously and explicitly expressed because there is no
underlying model (analogous to the "relational model"
that underlies RDBMSs) with which to express it.  The
proposed PMTM4-like "core model" provides that
underlying model.  If the world needs Topic Maps, it
also needs such an underlying model, and the sooner we
have it, the better.

> | Similarly, very few people are going to worry about the distinction
> | between "Topic Maps" and "The Standard Application of Topic Maps".

> Well, I am one of the people worrying about it.

So am I, and I deeply respect your careful approach,
Lars Marius.

> | I would also like to point out that almost nobody uses the Reference
> | Concrete Syntax of SGML, and that SGML's survival, now that it is
> | known as "XML", has *depended* on its ability to accommodate
> | syntaxes OTHER than The Original Reference Concrete Syntax.  Smart
> | as they were, the people who created the ISO SGML standard weren't
> | smart enough to foresee the XML phenomenon, and they knew it.

> That is true, and yet on the other hand they were
> also smart enough to wait for SGML to gain some
> traction before they went and changed it.  Not only
> that, but when they changed it they gave it a new
> name.

Your view of history is idiosyncratic.

My own recollection is that SGML was not changed
because certain persons, with excellent and
public-spirited intentions, did everything possible to
prevent it from being changed (except to rectify some
bugs, all of which were minor).  In the end, I guess
they overdid it, because the only way to make the
changes that were needed in SGML was for W3C to simply
ignore ISO almost entirely.  The W3C effort therefore
necessarily *had* to change the name of the standard.
Nowadays, it's not normally called "SGML"; it's much
more widely known as "XML".

Having retold this story, I think the history of SGML
supports *my* position, not yours.  If ISO cannot make
an *adaptable* Topic Maps standard, then, in order to
have a Topic Maps standard that really works under
future conditions that we cannot predict, we will have
to get (for example) W3C to do an end-run around ISO,
and call the resulting Recommendation (or whatever)
something other than "Topic Maps" -- e.g. "Mapped
Subjects".  In other words, the mass market will not be
responsive to the brand "Topic Maps" (which will be, by
then, analogous to the still-unknown brand "SGML").
Instead, the mass market will respond to the brand
"Mapped Subjects" (analogous to the well-known brand
"XML").  I sincerely hope it doesn't come to that.  We
would all have to abandon all of our investments in the
"Topic Maps" brand name.  What a waste that would be.
Not to mention the fact that "Topic Maps" is simply and
inherently a great brand name (unlike "SGML", which is
an absolutely terrible brand name).

> Now that I have verified my understanding of your
> intentions, I think I can answer some of the
> questions I originally posed. Please note that what I
> write below is just my suggestions.

> * Lars Marius Garshol
> |
> |  - which model should the XTM and ISO 13250 serialization and
> |    deserialization be described in terms of?

> The infoset model.

Do I understand you correctly if I interpret your use
of the term, "the infoset model", to refer to the
style guide that implicitly/explicitly guided those who
created the XML Infoset Recommendation?  Or do you mean
something else?

> |  - how do we agree on (and document!) our common terminology?

> We discuss the issues on this list. SRN & MB maintain
> a terminology document.

Right.  Many standards, including ISO 13250, include a
terminology section.  It's a normal and necessary
aspect of the editorial work.

> |  - how many documents should we maintain as part of this work?

> Infoset-requirements, infoset-model, PMTM4,
> terminology document.

In addition to the above, I think there's much more to
do:

  * rules for formally expressing an Application in
    terms of the core model.

  * the parsing rules for the XTM syntax, in terms of
    the core model.

  * the parsing rules for the existing "HyTime-based"
    13250 syntax, in terms of the core model.

  * natural language expression of the semantics of
    each of the assertion types that are built into
    both the XTM and HyTime-based syntaxes, in terms of
    the core model.

  * One or more formal expressions of the semantics of
    each of the assertion types documented in the above
    item, as UML, Property Set, API, etc.

  * rules for expressing Doctrines for the Expression
    of Scope

  * a Standard (default) Doctrine for the Expression of
    Scope

> |  - what should the mapping between the two models 
> |    look like, and which model document should contain 
> |    it?

> The infoset-model document, perhaps? An argument
> could also be made for the opposite, however.

See above list.

> |  - to what extent should the models go into descriptions of the
> |    meaning of topic map information? which of the models should do
> |    this?

The definition of "assertion" (what we are saying when
we write an <association>, for example) should be in
the core model.  

The semantics of those very few assertion types that
are needed to allow us to define all assertion types
should be defined in the core model.  These include
template-role-RPR, class-instance, and
superclass-subclass.

The definitions of all the other assertion types that
are explicitly or implicitly required by the parsing
models for both syntaxes (XTM and the HyTime-based
syntax) should be in the definition of the Standard
Application.  These include topic-basename,
basename-variantname, and topic-occurrence.
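
To make the proposed layering concrete, here is a minimal
sketch, not a definitive design.  The assertion type names are
the ones listed above; the Assertion record, the role names in
the usage lines, and the defining_layer() helper are all
hypothetical and exist only for illustration.

```python
from dataclasses import dataclass

# Assertion types the message assigns to the core model.
CORE_MODEL_TYPES = frozenset({
    "template-role-RPR",
    "class-instance",
    "superclass-subclass",
})

# Assertion types the message assigns to the Standard Application.
STANDARD_APPLICATION_TYPES = frozenset({
    "topic-basename",
    "basename-variantname",
    "topic-occurrence",
})


@dataclass(frozen=True)
class Assertion:
    """Hypothetical record: one assertion and its (role, player) pairs."""
    type_name: str
    role_players: tuple


def defining_layer(type_name: str) -> str:
    """Report which layer would define the semantics of an assertion type."""
    if type_name in CORE_MODEL_TYPES:
        return "core model"
    if type_name in STANDARD_APPLICATION_TYPES:
        return "Standard Application"
    # Anything else would belong to some other Application
    # expressed in terms of the core model.
    return "other Application"


# Usage: role names here are illustrative assumptions.
name = Assertion("topic-basename", (("topic", "t1"), ("basename", "n1")))
print(defining_layer(name.type_name))
print(defining_layer("class-instance"))
```

The point of the split is that the small first set is enough to
declare every other assertion type, so nothing in the second
set needs to be built into the core model itself.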

> |  - which of the two models should TMCL and TMQL build on top of?

This is a very good question.  For what they are worth,
here are my gut feelings about the answers.

For each of these languages, there are at least two
possibilities, and we need to decide how we want
to proceed.

One possibility for each language is that it contains
primitives based on The Standard Application, and it is
therefore related to and dependent on The Standard
Application.  I think this possibility is the one that
has been in almost everyone's mind, anyway.  My gut
feeling is that this is what we should do, in fact, but
I reserve the right to change my opinion based on
conversations we all have not yet had.

The other, far more "purist" possibility, at least for
TMQL, is to base the essence of TMQL purely on the core
model, and to make TMQL borrow the bulk of its
primitives automatically from formal declarations (TBD)
of whatever assertion templates are in effect for the
current Application.  The HyTime standard takes this
approach with SDQL (Standard Document Query Language);
SDQL borrows the bulk of its primitives from the names
of the properties of the notation(s) used in the
document(s) being queried, as those notations'
properties are formally declared in the relevant
Property Sets.  From an information management
perspective, it is a very beautiful way to proceed, but
it is challenging to implement, and it is challenging
to use.  Realistically speaking, I'm not at all sure
that such beautiful and total generality is worth the
trouble of implementing it, at least for now.  That
high level of generality can be achieved later, when we
know it is really needed.  
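
As a rough illustration of the "purist" idea, the sketch below
derives query primitives from whatever assertion templates are
declared, instead of hard-coding them.  It is only a sketch
under assumptions: the template declarations, the role names,
the "type.role" naming scheme and the dictionary layout are all
hypothetical, and nothing here is taken from TMQL or SDQL.

```python
def build_primitives(templates):
    """Derive one query primitive per declared (assertion type, role).

    templates maps an assertion type name to a tuple of role names.
    Each primitive takes a list of assertions and returns the players
    of its role in assertions of its type.
    """
    primitives = {}
    for type_name, roles in templates.items():
        for role in roles:
            # Bind the loop variables as defaults so each closure
            # keeps its own type and role.
            def players(assertions, _type=type_name, _role=role):
                return [
                    a["roles"][_role]
                    for a in assertions
                    if a["type"] == _type and _role in a["roles"]
                ]
            primitives[f"{type_name}.{role}"] = players
    return primitives


# Hypothetical declarations for the current Application.
templates = {
    "topic-occurrence": ("topic", "occurrence"),
    "class-instance": ("class", "instance"),
}

assertions = [
    {"type": "topic-occurrence",
     "roles": {"topic": "t1", "occurrence": "o1"}},
    {"type": "class-instance",
     "roles": {"class": "c1", "instance": "t1"}},
]

primitives = build_primitives(templates)
print(sorted(primitives))
print(primitives["class-instance.instance"](assertions))
```

Declaring a new assertion template adds new primitives with no
change to the query machinery.  That is the generality being
weighed here against the cost of implementing and using it.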

I'm thinking that the "purist" possibility for TMCL
will be just as hard to implement as the "purist"
possibility for TMQL.  It's probably better, and it's
certainly more practical, at least for now, to create a
Topic Map Constraint Language that provides primitives
based on some particular set of assertion types -- the
assertion types provided in the Standard Application.

> |  - what is the relationship to the XTM 1.0 and ISO 
> |    13250 specifications to be? how much of these 
> |    two specifications should be replaced by the 
> |    new model-based ones?

Second question first: I believe we must keep virtually
*everything* that's currently found in both of these
specifications (except for their bugs, of course).
This may require us to explicitly resolve any existing
vendor-specific differences of interpretation in one
way or another.  (All such differences can always be
resolved.  The worst case scenario is that we are
forced to recognize distinct syntaxes for Vendor A and
for Vendor B.  Even if we come to an impasse, and we
are forced to recognize such distinct syntaxes, it's
still better for the industry and for the public than
the alternative, which is to have incorrect merging of
topic map resources emanating from systems sold by
Vendors A and B.)

Now the first question: I believe that the XTM 1.0
syntax and the existing "HyTime-based" syntax are
really just two distinct syntaxes for one and the same
set of assertion types, which I've been collectively
calling "The Standard Application of Topic Maps."

> |  - where should the models go, once they are 
> |    complete? Are they ISO 13250 2nd edition? 
> |    Should they be a normative technical report?
> |    Or what?

My own preference is to see everything brought together
in a single comprehensive standard that fully validates
and protects existing investments in software and
instances, while at the same time providing for a
future in which we will have to adapt to conditions
that are unforeseeable today.

-Steve

--
Steven R. Newcomb, Consultant
srn@coolheads.com

voice: +1 972 359 8160
fax:   +1 972 359 0270

1527 Northaven Drive
Allen, Texas 75002-1648 USA