SIDPs and limits of interoperability - RE: [sc34wg3] to advance Topic Maps

Bernard Vatant sc34wg3@isotopicmaps.org
Wed, 9 Apr 2003 18:12:26 +0200


Steve

Just a reaction on the SLUO and SIDPs. If I understand you correctly:

	TMM says that a TM application has to define how it (tries to) achieve the
SLUO, in other words, which SIDP(s) the application uses for subject
identification. OK.

	SAM defines which properties should be used as SIDPs in SAM-conformant
applications. OK.

	SAM makes no provision for interoperating with non-SAM-conformant TM
applications that use other kinds of SIDPs. You think it should.

My guess is that TM applications that do not agree on which properties they
use as SIDPs are doomed not to interoperate. You can't agree on identity if
you haven't first agreed on the identification process. These limits are
generic and inherent to the very notion of identification: there is no
absolute identity independent of a specific identification process. As I've
said a number of times, we can't agree on what a person's identity is - let
alone on what a person is - but we can agree (or not) on an identification
process.
If an application (TM or not) uses the welfare number to identify a person,
there is no way to make it interoperable with another one that uses, e.g.,
Birthday + Birthplace + Number on Birth Registry.

Maybe TMM should state something like: "Two TM applications are interoperable
if they use identical SIDPs" ... with of course the issue of what
"identical" means here.

If SAM states explicitly which SIDPs it uses, what else can it do for
interoperability's sake? If other TM applications use the same ones, they
will be interoperable; if not, they will not. Full stop.
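
To make the proposed rule concrete, here is a minimal sketch. It assumes that
"identical SIDPs" can be read as "the same set of property names", which is
exactly the open question raised above. The property names and the
interoperable() function are invented for illustration; nothing here comes
from TMM or SAM.

```python
# Hypothetical SIDP declarations; the property names are illustrative only.
WELFARE_APP = frozenset({"welfare_number"})
REGISTRY_APP = frozenset({"birth_date", "birth_place", "registry_number"})
OTHER_WELFARE_APP = frozenset({"welfare_number"})


def interoperable(sidps_a, sidps_b):
    """Assumed rule: two applications interoperate only if they declare
    identical sets of SIDPs (here: set equality on property names)."""
    return sidps_a == sidps_b


print(interoperable(WELFARE_APP, REGISTRY_APP))       # False
print(interoperable(WELFARE_APP, OTHER_WELFARE_APP))  # True
```

A weaker reading, such as "the applications share at least one SIDP", would
give different answers, so the definition of "identical" matters.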

BTW, I have some comments on the following paragraph of TMM section 3.2.3:

[parid6039] Subject identity discrimination properties (SIDPs)

[parid6419] Every topic has at least one SIDP instance. Each SIDP instance
independently specifies the subject of the topic, for all purposes of
subject identification. SIDP values are the only basis for automatically
recognizing when two topics have the same subject or different subjects,
and should therefore either be merged or left unmerged. No topic can have
more than one SIDP instance whose property class is defined by any single
TM Application.

1. Making SIDPs mandatory seems a very constraining requirement. You can
create a topic without a clear notion of the identity of its subject at
creation time, or at least without knowing the value of its SIDP.

2. Identical SIDPs are the basis for inferring that two subjects (are
declared to) have the same identity. But from different SIDPs no positive
conclusion can be inferred, and certainly not that the subjects are
different. The only possible conclusion is that the subjects are not
*known* to be identical *under the identification process used*. Maybe they
are identical under another process (if that makes sense).

3. I'm not sure about the last sentence. So far I have understood that a
subject indicator is a SIDP, at least in SAM. Does that mean that TMM would
forbid having more than one subject indicator for a topic?

Bernard

Bernard Vatant
Senior Consultant
Knowledge Engineering
Mondeca - www.mondeca.com
bernard.vatant@mondeca.com


-----Original Message-----
From: sc34wg3-admin@isotopicmaps.org
[mailto:sc34wg3-admin@isotopicmaps.org] On Behalf Of Steven R. Newcomb
Sent: Wednesday, 9 April 2003 11:58
To: sc34wg3@isotopicmaps.org
Subject: [sc34wg3] to advance Topic Maps


In 1993, when we began to work on the problem of merging
independently-created indexes, and when we first coined the term
"Topic Maps", the objective was to know how and when to merge two
independently-created index entries, on account of the fact that we
(somehow) deemed them to have the same subject.  The facilitation of
this objective -- in its full generality -- has always been, and, at
least for me, continues to be the purpose of the Topic Maps standard.
The objective has never been to impose a single, one-size-fits-all
world-view on all users of ISO-standard Topic Maps.  In the early
days, we didn't realize how deeply our world-view had been embedded in
Topic Maps.  Now, we know better, and we see much more clearly what
we're doing, here.  We should see the SAM as an instance of a class of
TM Applications, and we should welcome the idea that each member of
that class can represent a different, potentially valid (and
necessarily limited) world-view.  When we promote the SAM, we should
be convinced that, among all existing and future TM Applications, the
SAM represents an outstanding balance of trade-offs -- a balance
worthy of publication by ISO for general use by the public.  We should
be able to articulate those trade-offs, and to defend the design of
the SAM in terms of those trade-offs.

We're not there yet, but, in the recently submitted Topic Maps Model
(TMM, SC34/N0393), we now have a tool for expressing and evaluating TM
Applications, and for discussing and making explicit the design
trade-offs they represent.  We have the opportunity to use this tool
to make the SAM a stellar example of a TM Application, and I hope we
will use it.

In order to advance Topic Maps, it is urgent that we align the SAM
with the requirements for TM Applications prescribed in the TMM.  In
order to do that:

(1) The SAM should be expressed and constrained in such a way that it
    is clear that the SAM can be extended, and that its extensions can
    extend both the rules for merging and the number of relationship
    types that can determine the subjects of their role players.

    Currently, the SAM makes no provision for such extensions.  The
    SAM provides no general doctrine for merging, in terms of which it
    explains both its own merging rules, and those that may be added
    by TM Applications that include (inherit) and extend the SAM.

    Specifically, the SAM does not say how (or even whether) the
    instances of user-defined association types can determine or
    influence whether their role players should merge.

(2) The SAM should be expressed and constrained in such a way that it
    is clear that topic maps that are based on the SAM can be merged
    rigorously and predictably, not only with each other, but also
    with topic maps that are not based on the SAM.

    The current SAM makes no provision for this.

    The TMM shows how the SAM can be expressed in such a way as to
    allow other TM Applications, including but not limited to TM
    Applications that inherit (or "include") the SAM, to be
    independently designed and maintained without sacrificing the
    integrity of the topic maps that are based on them when SAM and
    non-SAM topic maps are merged.

    It's important to maintain the integrity of knowledge even after
    it is merged with other knowledge.  The TMM is designed to meet
    the requirement of preserving the integrity of merged topic maps.

    Any data models that we publish for Topic Maps should be informed
    by sensible doctrines that establish the general rubric under
    which diverse merging rules must co-operate, despite the diversity
    of the knowledge domains and world-views from which they emanate.
    The TMM proposes such a rubric.

(3) The SAM should be expressed and constrained in such a way that it
    is clear that the SAM reflects the WG3's intentions regarding
    which subjects it reifies (which subjects are capable of being
    role players and are subject to merging), vs. which subjects are
    not reifiable in systems that are governed only by the SAM.

    The current SAM document does not clarify this.  In the absence of
    such clarification, there is no basis for any claims we (or
    anybody else) might make about the integrity with which knowledge
    is handled, even under the SAM's own rules.  The TMM requires all
    TM Applications to make explicit the limits of their support for
    the SLUO, and requires their behaviors to be deterministic and
    predictable, even in multi-source, multi-TM-Application
    environments.  (The "Subject Location Uniqueness Objective (SLUO)"
    is the principle that all topics that have the same subject should
    be merged.)  The support of every TM Application for the SLUO is
    necessarily limited.  It's important that users are able to know
    exactly how the SLUO is met by any TM Application(s) they use.

    The SAM, as currently written, doesn't state the limits of its
    support for the SLUO.  At least one of the things that the SAM
    does needs an especially detailed disclosure: the SAM allows the
    reification of subjects to be controlled, not by the inherent
    logic of the SAM, but rather by syntactic constructs that are used
    in a given interchangeable instance.  This makes the merging
    responsibilities of implementations ambiguous.  It becomes
    impossible, in the general case, to preserve the integrity of
    topic maps across merging operations with other topic maps,
    because if a subject is reified in one topic map, and unreified in
    another, the two topic maps cannot be merged into a single topic
    map that preserves the integrity of both originals.  If we decide
    that the SAM really should be designed in such a way that its
    implementations are exempted from respecting the SLUO in this way,
    then we must disclose the fact, and we must say exactly how all
    SAM implementations will uniformly resolve all the ensuing
    ambiguities.  Again, the TMM doesn't care how much or how little a
    TM Application respects the SLUO; it merely demands that the
    limits be disclosed.
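
The SLUO, as limited by whatever SIDPs a TM Application declares, can be
sketched as a merge over shared SIDP values. This is a minimal sketch under
stated assumptions, not the merging algorithm of the TMM or the SAM: topics
are modeled as sets of (property class, value) pairs, and merge_topics(), the
property names and the sample values are all invented for illustration.

```python
def merge_topics(topics):
    """Merge topics that share at least one (property_class, value) pair.

    Each incoming topic absorbs every already-merged group it overlaps
    with, so transitive matches (A~B, B~C) end up in one merged topic.
    """
    merged = []  # invariant: groups in this list are pairwise disjoint
    for topic in topics:
        current = set(topic)
        rest = []
        for group in merged:
            if current & group:
                current |= group      # same subject: merge
            else:
                rest.append(group)    # not known to be the same: leave alone
        merged = rest + [current]
    return merged


# Hypothetical topics drawn from two different sources.
a = {("subject_indicator", "http://example.org/paris")}
b = {("subject_indicator", "http://example.org/paris"), ("geo_code", "FR-75")}
c = {("geo_code", "FR-75")}
d = {("subject_indicator", "http://example.org/rome")}

result = merge_topics([a, c, d, b])
print(len(result))  # 2
```

Topics a and c share no SIDP value and stay apart until b arrives and links
them. Topics that have the same subject but share no SIDP value are never
merged, which is the kind of limit on SLUO support that would have to be
disclosed.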

In N0393, there's a checklist of things that need to be done when
defining a TM Application such as the SAM:
http://www.isotopicmaps.org/tmmm/TMMM-latest.html#parid0781.  There's
also a checklist for Syntax Deserialization Definitions, such as for
XTM and HyTM:
http://www.isotopicmaps.org/tmmm/TMMM-latest.html#parid0775.

It's possible to reconcile the SAM and the TMM.  I hope we will all
read the new SAM document (which is looking cleaner than ever -
http://www.isotopicmaps.org/sam/sam-model/), and the new TMM document,
too, with that goal.

-- Steve

Steven R. Newcomb, Consultant
srn@coolheads.com

Coolheads Consulting
http://www.coolheads.com

voice: +1 972 359 8160
fax:   +1 972 359 0270

1527 Northaven Drive
Allen, Texas 75002-1648 USA
