[sc34wg3] a new name for the Reference Model
Steven R. Newcomb
sc34wg3@isotopicmaps.org
02 Jan 2003 21:52:07 -0600
Jim Mason asks:
> Do the names we give to standards say anything
> about what conforms to them?
If the answer is "Yes", then I can say what I'd
like to be said:
(1) The RM governs the definitions of Topic Map(s?)
Models whose specifications claim conformance
to the ISO Topic Maps paradigm. The RM imposes
requirements on all TM Model definitions that
claim conformance to it.
(2) Topic Map(s?) Models, including but not limited
to the Standard Model, govern the topic map
documents and enabling software that claim
conformance to them. When a Topic Map(s?)
Model inherits ("borrows", includes) another
Topic Map(s?) Model, the inherited Model *also*
governs the documents and software that claim
conformance to the inheriting Model.
How to say or imply all that in the names is a good
question. I think we're already much closer than
we've ever been.
> What conforms to what is still labelled the RM?
Definitions of TM Models.
> What does conformance to [the RM] mean?
For the definition of any given TM Model, it means
that the RM's shopping list of required
definitions, and aspects of each definition, is
fully satisfied.
> Likewise for what's still called SAM? Where does
> conformance to one of those say about conformance
> to what's currently out there as ISO/IEC 13250?
OK, here's what I think.
13250 is about two syntaxes for interchanging topic
maps: XTM and HyTM.
XTM and HyTM are both inside and outside the SAM,
and that's why both the SAM and the RM are needed
in order to fully and rigorously describe what's
happening in 13250. (Before anybody gets angry,
hear me out. I don't think this formulation
threatens anybody or anything.)
XTM and HyTM are inside the SAM because both
syntaxes invoke SAM-defined semantics, including:
* occurrences
* names
* instance of class
* subject indicator
* addressable subject
* set
* scope
* etc.
The SAM can define merging rules for all of the
subjects that are yielded by the above semantics,
but it can't define merging rules for subjects that
may be yielded by user-defined association types,
since we can't know what their semantics will be.
Now, since both XTM and HyTM allow users to define
their own association types, the question arises:
What does it mean when a subject is specified by
means of an association, and how does anyone
other than ISO standards-makers define the
merging rules for the subjects that may be
conferred upon role players by associations?
I bet some readers are saying, "What in the hell
is Newcomb blathering about here? Since when are
subjects conferred upon role players by
associations?" Well, to make a long story short,
everything boils down to relationships between
subjects, and some relationships confer subjects on
the topics that play certain roles.
For example, there is the kind of relationship that
exists between a topic and its subject indicator.
One of the roles in such a relationship is played
by a subject indicator, which is always a piece of
information. The other role is played by a topic.
The nature of this kind of relationship is such
that the *existence* of the relationship causes the
subject that is the meaning of the piece of
information (that plays the subject indicator role
type) to be conferred upon the topic (that plays
the other role). That's an example of how a
relationship actually specifies the subject of a
topic.
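
A minimal sketch of that idea in Python, for
illustration only. The names `Topic`,
`SubjectIndicator`, `subject_indicated_by`, and
`assert_indication` are hypothetical; none of them
comes from 13250, the SAM, or the RM. The sketch
also assumes that an indicator's address can stand
in for the subject that is its meaning.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SubjectIndicator:
    """A piece of information whose meaning is some subject."""
    address: str  # hypothetical: where the piece of information lives


@dataclass
class Topic:
    name: str
    # Stays None until a relationship confers a subject on the topic.
    subject: Optional[str] = None


def subject_indicated_by(indicator: SubjectIndicator) -> str:
    # Assumption for this sketch: the indicator's address stands in
    # for the subject that the indicator means.
    return f"subject-of:{indicator.address}"


def assert_indication(topic: Topic, indicator: SubjectIndicator) -> None:
    """Record a topic/subject-indicator relationship.

    The existence of the relationship is what confers the subject.
    """
    topic.subject = subject_indicated_by(indicator)


puck = Topic("Puck")
assert_indication(puck, SubjectIndicator("http://example.org/psi/puck"))
print(puck.subject)
```

The topic has no subject of its own here; it
acquires one only because the relationship exists.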
Now, let's look at the XTM syntax. XTM syntax is
designed to make some relationships extremely easy
and intuitive to specify. In fact, the XTM syntax
doesn't even make them look like relationships.
For example, instances of the <subjectIdentity>
element type establish the same kind of
relationship I've been talking about, between
topics and their subject indicators. And the
purpose of a <subjectIdentity> element is to confer
a subject upon the <topic> that contains it. It
does this by specifying that there is a
relationship between the topic and a subject
indicator.
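
To make the hidden relationship visible, here is a
small Python sketch that reads an XTM 1.0 fragment
and lists the (topic, subject indicator) pairs that
its <subjectIdentity> elements imply. The topic id,
the example.org address, and the function name
`implied_relationships` are made up for the
example.

```python
import xml.etree.ElementTree as ET

XTM = "http://www.topicmaps.org/xtm/1.0/"
XLINK = "http://www.w3.org/1999/xlink"

# Illustrative fragment; the id and href are invented.
DOC = f"""
<topicMap xmlns="{XTM}" xmlns:xlink="{XLINK}">
  <topic id="hamlet">
    <subjectIdentity>
      <subjectIndicatorRef xlink:href="http://example.org/psi/hamlet"/>
    </subjectIdentity>
  </topic>
</topicMap>
""".strip()


def implied_relationships(xml_text):
    """Return (topic id, subject indicator address) pairs."""
    root = ET.fromstring(xml_text)
    path = f"{{{XTM}}}subjectIdentity/{{{XTM}}}subjectIndicatorRef"
    pairs = []
    for topic in root.iter(f"{{{XTM}}}topic"):
        for ref in topic.iterfind(path):
            # Each pair is one topic/subject-indicator relationship,
            # even though the syntax never calls it a relationship.
            pairs.append((topic.get("id"), ref.get(f"{{{XLINK}}}href")))
    return pairs


relationships = implied_relationships(DOC)
print(relationships)
```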
If you're still with me, here, you're ready for the
punch line: XTM *also* allows topic map authors to
specify *arbitrary* kinds of relationships, by
means of <association>s. This is where the RM
comes into 13250 (i.e., into HyTM and XTM). 13250
not only has *inherent* kinds of relationships (the
significance of each of which is described in the
SAM), but also it allows and encourages
*user-defined* relationship semantics. In other
words, HyTM/XTM has always expected that the SAM's
relationship semantics would be extended by users
of HyTM/XTM.
What if some of those user-defined relationship
types are supposed to confer subjects on some of
their role players? How do we tell when such
subjects are the same, and therefore must be
merged? Neither 13250 nor the current SAM faces up
to the possibility:
* that a user could define an association type
whose instances determine the subjects of one or
more of their role players, and
* that more than one topic may thus have conferred
upon it the same subject, and
* that therefore such topics need to be merged.
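
The three points above can be sketched in Python.
This is only an illustration under stated
assumptions: the association type "has-isbn", the
ISBN-like strings, the topic ids, and the
`conferring_rules` table are all invented, and
neither 13250 nor the SAM defines any such
mechanism.

```python
from collections import defaultdict

# Each association: (association type, {role: player}). All values invented.
associations = [
    ("has-isbn", {"book": "topic-A", "isbn": "0-00-000000-0"}),
    ("has-isbn", {"book": "topic-B", "isbn": "0-00-000000-0"}),
    ("has-isbn", {"book": "topic-C", "isbn": "1-11-111111-1"}),
    ("reviewed-by", {"book": "topic-A", "reviewer": "topic-D"}),
]

# Hypothetical user-defined rule: for an association type that confers
# a subject, name the role that receives the subject and the role whose
# player identifies it.
conferring_rules = {"has-isbn": ("book", "isbn")}


def merge_groups(associations, rules):
    """Group topics that have had the same subject conferred on them."""
    by_subject = defaultdict(set)
    for assoc_type, players in associations:
        if assoc_type not in rules:
            continue  # this type confers no subject on its role players
        receiver_role, identifying_role = rules[assoc_type]
        subject = (assoc_type, players[identifying_role])
        by_subject[subject].add(players[receiver_role])
    # Only subjects conferred on more than one topic call for a merge.
    return [sorted(topics) for topics in by_subject.values() if len(topics) > 1]


print(merge_groups(associations, conferring_rules))
```

With these inputs, topic-A and topic-B end up in
one group, because the same subject was conferred
on both. Without the rule table, a processor has
no way to know that.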
We should decide what we want to do about this. I
think there are several choices, including:
(1) We do nothing. We don't say anything about it.
We pretend the issue doesn't exist, and we face
it at some later date. (I think this is the
worst possible choice. It weakens both the SAM
and our credibility. It creates a situation in
which weeds will thrive.)
(2) We say that user-defined assertion types in
XTM/HyTM *are not* allowed to have any
semantics such that the subjects of any of
their role players are specified by instances
of such user-defined assertion types.
If we choose this option, whenever a topic map
author wants topics to be merged, he must say
so explicitly and redundantly, either with a
<topicRef> or with two <subjectIndicatorRef>s
to the same subject indicator. Michel likes
this idea. It protects developers of XTM/HyTM
processors from ever having to support
user-extensible merging rules, and it may have
other advantages of which I am not yet aware.
I dislike it because I think it should be
enough to say, in domain-specific terms (i.e.,
via instances of user-defined association
types), that topic A has subject S1, and topic
B has subject S1, and expect that A will merge
with B simply because they both have the same
subject. It shouldn't *also* be necessary
to say explicitly:
(i) that topic A also has subject indicator
SI1, and topic B also has subject
indicator SI1, or
(ii) that topic A has the same subject as
topic B.
If the SAM imposes a requirement to supply such
redundant information in each topic map, we
will eventually have to answer the following
embarrassing question, emanating from possibly
irate users: "Why does XTM/HyTM allow
user-defined association types at all, since
they aren't allowed to mean anything in terms
of subject recognition?"
(3) We say that user-defined assertion types in
XTM/HyTM *are* allowed to have semantics such
that the subjects of their role players are
specified by instances of such user-defined
assertion types. We require that, when such
topic maps are interchanged, they must include
the information necessary to allow such
subjects to be merged automatically, in the
normal course of topic map processing, whenever
such subjects are identical.
Personally, I strongly favor this third choice.
It doesn't require us to delay the
standardization of the SAM, even though we may
wish to add stuff to the SAM, at some future
date, that *standardizes the expression* of the
additional TM Modeling information necessary to
extend the merging rules of XTM/HyTM in support
of domain-specific subjects. It leaves
XTM/HyTM's future indefinitely long and
indefinitely bright -- as long and bright as
the full breadth and depth of the TM paradigm
itself. It won't force XTM users to learn how
to make TM Models; it only leaves open the
possibility that they can use such knowledge if
they want to, without first having to abandon
XTM.
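
To contrast choices (2) and (3), here is a rough
Python sketch. It assumes a topic map can be
represented as a dictionary, and that under choice
(3) the interchanged map carries a
"conferring_rules" entry. That entry, the
"has-part-number" association type, and all ids
and addresses are hypothetical; no standard
expression for such information exists yet.

```python
from collections import defaultdict

associations = [
    ("has-part-number", {"part": "topic-A", "number": "PN-42"}),
    ("has-part-number", {"part": "topic-B", "number": "PN-42"}),
]

# Choice (2): the author must also state identity explicitly.
choice2_map = {
    "associations": associations,
    "subject_indicators": {
        "topic-A": "http://example.org/psi/s1",
        "topic-B": "http://example.org/psi/s1",  # the redundant statement
    },
}

# Choice (3): the map carries what a processor needs to merge by itself.
choice3_map = {
    "associations": associations,
    "conferring_rules": {"has-part-number": ("part", "number")},
}


def merge_groups(topic_map):
    """Return groups of topics that share a subject, from either source."""
    by_subject = defaultdict(set)
    for topic, indicator in topic_map.get("subject_indicators", {}).items():
        by_subject[("indicator", indicator)].add(topic)
    rules = topic_map.get("conferring_rules", {})
    for assoc_type, players in topic_map.get("associations", []):
        if assoc_type in rules:
            receiver_role, identifying_role = rules[assoc_type]
            key = (assoc_type, players[identifying_role])
            by_subject[key].add(players[receiver_role])
    return sorted(sorted(g) for g in by_subject.values() if len(g) > 1)


print(merge_groups(choice2_map))
print(merge_groups(choice3_map))
print(merge_groups({"associations": associations}))
```

Both maps lead to the same merge of topic-A and
topic-B. The third call, with neither the
redundant indicators nor the rules, finds nothing
to merge. That is the situation choice (1) leaves
us in.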
In any case, we really need to face this issue.
> Does [the RM] specify or interpret?
It specifies. If it only interprets, then its
constraints are optional, and we abandon the idea
that "Topic Maps" means reliable, predictable,
ontology-neutral knowledge aggregation.
> And this leads me back to a discussion in
> Baltimore about whether we need a multipart
> standard or multiple standards. If we have a
> multipart standard, I think it's easier to
> justify the RM as the much-needed explanation of
> what the current 13250 means, whether it turns
> out to be standard-like or TR-like.
I don't see the RM as fully answering the question,
"What does 13250 mean?" It provides an essential
part of the answer, but it cannot provide the whole
answer. The same is true of the SAM. I see the RM
and SAM together as fully answering the question,
"What does 13250 mean?".
The RM and SAM are both deeply technical. The
answer to the question, "What does 13250 mean?" is
not light reading. If we pack both the SAM and
the RM into 13250, along with HyTM and XTM, we'll
have a big, heavy standard that nobody will read.
(I hesitate to mention three examples of too-heavy
standards: HyTime, STEP, and XML Schemas. All
three are marketing disasters, despite whatever
technical virtues they may or may not have. The
primary problem with all of them is that they are
TOO BIG.)
We need Topic Maps to be *perceived* as light,
easy, and intuitive. The XTM DTD gives this
impression. Great!
So, if we go the multipart route, I want to be
very, very certain that the default, leading ISO
publication on Topic Maps is very short and sweet,
has the XTM DTD in it, and damn little else. It
should be a "README" for Topic Maps.
I think we will defeat ourselves if we direct
public attention toward "ISO 13250", and people
look at it only to find that it's 100 pages of
mostly unintelligible techno-gibberish. Even if
the first part of it is short, sweet, and
easy to understand, 100 pages is a big turn-off.
(I'm reminded of the guy who asks for a drink of
water, and then receives his drink in the form of a
blast from a firehose.)
> If it's going to be a separate standard, then we
> have to make it a standard and be clear about
> what conforms to it. (If it's an abstract model
> of data or data aggregation or something like
> that and all that really conforms to it is 13250
> itself, then it shouldn't be a separate standard
> or even a separate document but rather a part of
> a revised 13250.)
Any TM Model can conform to the RM, not just the
SAM, and not just XTM or HyTM. 13250 is not the
only thing that will ever conform to the RM, or to
the SAM, even.
13250 is a standard for a pair of syntaxes. Both
of these syntaxes are suitable for interchanging
Topic Maps that conform to the SAM, which is a TM
Model set forth in a separate standard. Both of
these syntaxes inherently provide for the
expression of semantics (user-defined <association>
types) that are outside the scope of the SAM. When
such non-SAM relationship semantics are used, they
must be defined in conformance with the RM, which
is *another* separate standard.
> Another way of asking the question about what
> sorts of documents the RM and SAM are is to ask
> who the audience is. Are we writing these things
> for code writers or end users? That is to say,
> are we specifying the actions of a TM engine and
> the interchange formats for TM data, or are we
> marketing TMs/trying to help people who want to
> create TMs? At the moment, those two audiences
> are sometimes almost the same (that is to say,
> ourselves), but what about 5 years down the road?
> We need the information in both the RM and the
> SAM, but we need to understand how we need it.
Here's my take on audiences:
13250: It's for everybody, but it's especially
appealing to XML/SGML people. (More and
more, that's everybody in any
knowledge-intensive field.)
SAM: Audience is knowledge managers and software
developers: techies.
RM: It's for knowledge managers and software
developers: techies who want to achieve
subject location uniqueness for subjects
that are specified by domain-specific
relationship types.
--
-- Steve
Steven R. Newcomb, Consultant
srn@coolheads.com
Coolheads Consulting
http://www.coolheads.com
voice: +1 972 359 8160
fax: +1 972 359 0270
1527 Northaven Drive
Allen, Texas 75002-1648 USA