[sc34wg3] Reference Model to SAM - Mapping Issues and Thoughts

03 Jan 2003 12:53:59 -0600

"Graham Moore" <gdm@empolis.co.uk> writes:

> Hi all, 
> 
> I decided to pick up this issue from Steve's posting
> as I agree with him that it is perhaps the most
> fundamental aspect of a multi-part standard. If the
> parts are incoherent as a whole then semantic
> interchange is doomed and we have a big mess.
> 
> I would just like to lay out what I feel are the issues
> we face in defining this SAM RM mapping. Some of
> these things are done or in place already but it
> feels good to have a complete (maybe) list and even
> better if some things are ticked off.
> 
> 1. PSI definitions for all SAM constructs that can be
> represented in the RM. This allows an up and down
> translate with no loss of information when viewing
> the SAM after migrating it down to the RM then back
> up.
> 
> 2. A definition of how each SAM construct 'looks' as
> an RM subgraph. Including use of PSI structures. With
> this though perhaps a place to do the Mapping is in
> the SAM itself alongside each information item. This
> then sets a standard way that any models defined in
> terms of the RM metamodel must for each 'item' define
> its relationship to the RM?
> 
> 3. This issue I think is the biggy and is the issue
> that has troubled me for many months. The RM, simply
> put, has more nodes than the SAM. It has more things
> although fewer types. The SAM actively hides some
> nodes that an RM view exposes. Thus taking an RM model
> and viewing it as the SAM means that some things are
> NOT addressable. My feeling is that this is exactly
> what should be happening. We build levels of
> abstractions for different purposes - in this case to
> make the most prominent and important part of our
> intellectual thinking available in an easy to
> understand and useful form. The open question and the
> one to which I seek comment is - 'Is it ok for the
> SAM to lose some nodes such that some RM parts that
> were addressable are no longer so - even if the SAM
> is translated back into an RM representation?'
> 
> Here is an example - around the area of
> subjectIndicators the RM has many more nodes to
> express subjectIndicatorness. If someone at the RM
> level makes an assertion about one of these nodes and
> then translates that into a SAM the SAM IS NOT able
> to maintain that information and it will be lost. It's
> because there are nodes in the RM that have no
> identifiable equivalent in the SAM.
> 
> This means that :
> 
> 3.1 SAM -> RM -> SAM (is loss-less)
>
> 3.2 RM -> SAM -> RM (is loss-less in cases where
> assertions aren't made about items that have no
> identifiable node in the SAM.)
> 
> I think this is ok and if it's ok with everyone else
> then I don't really see that we have a problem.
> 
> The implication, if the SAM must have a mechanism for
> accessing the underlying RM with ALL nodes present,
> is that ALL implementations of all Topic Map Models
> (SAM or otherwise) must maintain all relevant RM
> nodes and thus be implemented in terms of RM
> structures. Yikes!
> 
> I hope this helps the mapping process as it is a
> critical part of this activity.

I have been troubled by the same problem, but I'm
feeling much better now.  The solution turned out to be
to minimize the constraints that the RM places on TM
Model design.  Your solution assumes that the RM has
already decided, on behalf of the SAM, what's involved
in subject indicatorness.  That used to be true, but
it's not true any more!  We expelled all that stuff
from the RM so as to maximize the SAM's freedom to be
exactly what it is, within the constraint that the SAM
must make explicit exactly which subjects it honors, so
that the SLUO will be achieved with respect to those
subjects.

So, when we express the SAM in RM terms, we get to say
what subjects the SAM regards as subjects -- what
becomes a node (a topic), and what doesn't.

No matter how many ways we describe it, the SAM really
must say exactly which subjects it honors.  I don't
think we can afford to have *any* discrepancies in our
different descriptions of the SAM.  Lossy round-trips
between SAM-land and RM-land are not acceptable; if
that's what happens, something is broken.  In fact, I'd
prefer that there be only a single description of the
SAM.  I think we can achieve that, but you SAM guys
really have to be explicit about exactly which subjects
the SAM recognizes as subjects, which means that the
Subject Location Uniqueness Objective will be honored
by all SAM implementations with respect to those
subjects.  Everything that's not a subject isn't
subject to merging; it's just a property value.
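
As a rough illustration of that last distinction (a sketch only;
nothing here comes from the RM or SAM drafts, and the names Node,
subject_id, merge_by_subject and the "psi:..." identifiers are all
made up), merging could look like this: nodes that stand for the
same subject collapse into one, while property values are simply
collected and never become merge candidates themselves.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical node: one subject plus its property values."""
    subject_id: str                      # assumed stand-in for subject identity
    properties: dict = field(default_factory=dict)  # name -> set of values


def merge_by_subject(nodes):
    """Return one node per subject, pooling the property values.

    Only subjects are merged; property values are plain data that
    ride along with the node they belong to.
    """
    merged = {}
    for node in nodes:
        target = merged.setdefault(node.subject_id, Node(node.subject_id))
        for name, values in node.properties.items():
            target.properties.setdefault(name, set()).update(values)
    return merged


nodes = [
    Node("psi:puccini", {"name": {"Puccini"}}),
    Node("psi:puccini", {"name": {"Giacomo Puccini"}}),
    Node("psi:tosca", {"name": {"Tosca"}}),
]
merged = merge_by_subject(nodes)
```

After the call, the two "psi:puccini" nodes have become a single
node carrying both name values, and "psi:tosca" is untouched.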

The problem we're talking about is not unique to the
SAM.  *Every* TM Model has to reflect design decisions
about what's a subject and what isn't.  If nothing
else, there's a bootstrapping issue that (I think,
anyway) *unavoidably* requires some things to be
property values rather than subjects.  The RM does not
constrain these design decisions.  The RM only insists
that the decisions be explicit.
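
To make "explicit" concrete, here is a minimal sketch, assuming a
hypothetical TM Model whose declaration is just a set of item
kinds that count as subjects (SUBJECT_KINDS, to_rm and from_rm are
invented for this example and are not part of any draft). Because
both directions of the translation consult the same declaration,
the round trip loses nothing.

```python
# Hypothetical declaration: the only item kinds this model treats
# as subjects. Everything else stays a property value on a node.
SUBJECT_KINDS = frozenset({"topic", "association"})


def to_rm(items):
    """Translate model items down to a dict of nodes keyed by id."""
    nodes = {}
    for item in items:
        if item["kind"] not in SUBJECT_KINDS:
            raise ValueError(f"{item['kind']!r} is not a declared subject kind")
        # Every field other than kind and id is kept as a property value.
        props = {k: v for k, v in item.items() if k not in ("kind", "id")}
        nodes[item["id"]] = {"kind": item["kind"], "properties": props}
    return nodes


def from_rm(nodes):
    """Translate nodes back up into model items."""
    return [
        {"kind": node["kind"], "id": node_id, **node["properties"]}
        for node_id, node in nodes.items()
    ]


items = [
    {"kind": "topic", "id": "t1", "name": "Tosca"},
    {"kind": "association", "id": "a1", "role": "composer"},
]
round_tripped = from_rm(to_rm(items))
```

Here round_tripped equals items. An item of an undeclared kind is
rejected outright rather than silently dropped, so nothing can go
missing on the way down or back up.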

The biggest problem with the SAM today is that it
doesn't say which information items are properties of
subjects, and which are subjects.  I think this
distinction goes to the heart of what Topic Maps are
all about, and we really must be explicit about it.
Yes, it either involves changing the way the SAM is
expressed, or it involves having two alternative ways
of expressing the SAM (which is much harder to do
without discrepancies).  We can do it either way.

-- Steve

Steven R. Newcomb, Consultant
srn@coolheads.com

Coolheads Consulting
http://www.coolheads.com

voice: +1 972 359 8160
fax:   +1 972 359 0270

1527 Northaven Drive
Allen, Texas 75002-1648 USA