[sc34wg3] SAM issue: sam-conformance

Lars Marius Garshol sc34wg3@isotopicmaps.org
23 Apr 2003 00:49:00 +0200


I created an issue for this, since I think we have to discuss this
properly. I was going to do it anyway, but now is the perfect time
since you've taken me up on this discussion.

Note, I say SAM throughout to keep the discussion simple. I know
there's some dissent about whether there should be separate SAM and RM
specifications, but let's ignore that while discussing this. The
conformance issue would play out in exactly the same way whether we
were talking about RM, SAM, or RM+SAM.

* Patrick Durusau
|
| Well, perhaps not a technical issue but the problem of the SAM not
| really requiring any sort of conformance seems like a problem to
| me. Difficult to see first of all how one can have a standard data
| model that does not require some level of conformance.
|
| Even harder to see how a TMCL or TMQL could be based on a data model
| that requires no conformance. 

I started out with the same view as you on this but gradually came to
think myself mistaken in that. I'll try to explain why below.

In my view we have to look at the conformance of each piece of the
puzzle separately. That is, they'll all have their own definition of
what it means to conform, but that definition may well tie in with
what is specified in the other documents.

Now, I should probably start off with what I believe the value of a
conformance clause is. In my view a standard (as opposed to a
technical report) exists in order to ensure interoperability, possibly
of data, possibly of software. Interoperability means ensuring that
different implementations will produce the same end-user results given
the same inputs. (Inputs here being both data and application logic in
whatever form.) 

I also think that restrictions on applications and implementations are
not inherently good; they are only means to achieve an end. So there
shouldn't be any more restrictions than there need to be. Only
restrictions that improve interoperability are of any value. And to
improve interoperability a restriction has to be very, very precise.


The conformance clause of XTM I think should say that a document
conforms when it is valid according to the normative schema and can be
deserialized into a SAM instance without triggering errors
(XTM-specific errors or general SAM errors). Note that here the
syntaxes normatively reference the SAM, but it is only the conformance
clauses in the XTM documents that come into play.

As for implementations, I think they should be said to conform when
they can do an XTM1 -> SAM -> XTM2 roundtrip where XTM1 and XTM2
deserialize to logically equivalent SAM instances. SAM will have to
specify exactly what this means by describing a comparison procedure.
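
To make the roundtrip rule concrete, here is a minimal sketch in
Python. Everything in it is an assumption made for illustration: the
"SAM instance" is reduced to a set of (topic, property, value)
statements, the toy line-based format merely stands in for XTM, and
the comparison is plain set equality rather than the comparison
procedure SAM would actually have to define.

```python
from typing import Callable, FrozenSet, Tuple

# Hypothetical stand-in for a SAM instance: an unordered set of
# (topic, property, value) statements. The real SAM is far richer.
SamInstance = FrozenSet[Tuple[str, ...]]


def toy_deserialize(doc: str) -> SamInstance:
    """Toy parser (NOT XTM): one 'topic|property|value' statement per line."""
    rows = (line.strip() for line in doc.splitlines())
    return frozenset(tuple(row.split("|")) for row in rows if row)


def toy_serialize(sam: SamInstance) -> str:
    """Toy writer: emits statements in a different order than the input."""
    return "\n".join("|".join(stmt) for stmt in sorted(sam, reverse=True))


def equivalent(a: SamInstance, b: SamInstance) -> bool:
    """Placeholder for the comparison procedure SAM would specify."""
    return a == b


def roundtrip_conforms(
    doc1: str,
    deserialize: Callable[[str], SamInstance],
    serialize: Callable[[SamInstance], str],
) -> bool:
    """doc1 -> model -> doc2; both documents must yield equivalent models."""
    sam1 = deserialize(doc1)
    doc2 = serialize(sam1)
    sam2 = deserialize(doc2)
    return equivalent(sam1, sam2)


def lossy_serialize(sam: SamInstance) -> str:
    """An exporter that silently drops every 'name' statement."""
    return toy_serialize(frozenset(s for s in sam if s[1] != "name"))


DOC = "t1|name|Alpha\nt1|occurrence|http://example.org/alpha"
```

With these definitions, roundtrip_conforms(DOC, toy_deserialize,
toy_serialize) holds even though the second document lists the
statements in a different order, while the same call with
lossy_serialize fails. The check never compares the two documents
directly, only the models they deserialize to.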

This means that if your tool can import XTM without being able to
roundtrip it, that tool does not conform to XTM. Similarly, if your
tool can export conformant XTM without being able to do general
roundtripping, it does not conform. (This may be perfectly OK. The tool
may be a genealogical research tool. All XTM will be saying is that it
is not a conforming XTM implementation.)

(BTW: I assume you'll agree that SAM does not need a conformance
section in order for us to do what I describe here?)


If you replace XTM with HyTM in the three paragraphs above you'll have
the conformance rules for HyTM.


CXTM is a little more tricky. An implementation conforms to CXTM if,
given a SAM instance, it always produces *exactly* the output required
by the CXTM specification. The problem here is that the input is a SAM
instance in CXTM, but in real life it will be an object structure, a
file, a database instance, or something similar. If the files are XTM
or HyTM we have iron-clad rules giving the whole file -> SAM -> CXTM
path, but for the others we won't have that. I think having it for XTM
and HyTM is enough. (I'll deal with the other cases below when I get
to SAM.)
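
The difference from the roundtrip rule is that here the output itself
is compared, and compared exactly. A minimal Python sketch of that
idea follows. The canonical form used (deduplicate, sort, one
statement per line, UTF-8, trailing newline) is an assumption invented
for the example and is not the CXTM algorithm; the statement tuples
are likewise only a stand-in for a SAM instance.

```python
from typing import Callable, Iterable, Tuple

# Hypothetical stand-in for the statements of a SAM instance.
Statement = Tuple[str, str, str]


def toy_canonicalize(sam: Iterable[Statement]) -> bytes:
    """Toy canonical form (NOT CXTM): dedupe, sort, one line each, UTF-8."""
    lines = sorted("|".join(stmt) for stmt in set(sam))
    return ("\n".join(lines) + "\n").encode("utf-8")


def sloppy_canonicalize(sam: Iterable[Statement]) -> bytes:
    """Same as above but forgets the trailing newline."""
    return toy_canonicalize(sam).rstrip(b"\n")


def exact_output_conforms(
    canonicalize: Callable[[Iterable[Statement]], bytes],
    sam: Iterable[Statement],
    expected: bytes,
) -> bool:
    """Conformance here means byte-for-byte identity with the expected file."""
    return canonicalize(sam) == expected


STATEMENTS = [
    ("t1", "occurrence", "http://example.org/alpha"),
    ("t1", "name", "Alpha"),
]
EXPECTED = b"t1|name|Alpha\nt1|occurrence|http://example.org/alpha\n"
```

Here exact_output_conforms(toy_canonicalize, STATEMENTS, EXPECTED)
holds regardless of the order in which the statements are supplied,
whereas sloppy_canonicalize fails over a single missing byte. That is
the level of precision that lets a test suite decide conformance
mechanically.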


TMQL is very similar, except here the input is an environment (with
bindings of various sorts), a SAM instance, and a query, and the
output is a TMQL result set (what form that will take is still
unknown). Given a sort of CXTM for TMQL result sets we will have all
we need to create a test suite for TMQL, which again means the
conformance clause is tight enough.
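
As a rough sketch of what such a test suite could look like, in
Python: each case bundles an environment, a model, a query, and the
expected result in canonical form. All of this is hypothetical. The
toy "query language" (a property name, optionally given as a $variable
bound in the environment) and the canonical result format are invented
for the example, since neither TMQL nor its result-set form is defined
yet.

```python
from typing import Callable, Dict, FrozenSet, List, NamedTuple, Tuple

# Hypothetical stand-ins; none of these shapes come from a specification.
Statement = Tuple[str, str, str]
SamInstance = FrozenSet[Statement]
Environment = Dict[str, str]
Row = Tuple[str, ...]
Engine = Callable[[Environment, SamInstance, str], List[Row]]


class Case(NamedTuple):
    env: Environment
    sam: SamInstance
    query: str
    expected: str  # canonical serialization of the expected result set


def canonical_result(rows: List[Row]) -> str:
    """Toy canonical form for result sets: sorted 'a|b' lines."""
    return "\n".join(sorted("|".join(row) for row in rows))


def toy_engine(env: Environment, sam: SamInstance, query: str) -> List[Row]:
    """Toy query: a property name, or '$var' resolved through env."""
    prop = env[query[1:]] if query.startswith("$") else query
    return [(topic, value) for topic, p, value in sam if p == prop]


def run_suite(engine: Engine, cases: List[Case]) -> List[bool]:
    """One verdict per case: canonical output must equal the expectation."""
    return [
        canonical_result(engine(c.env, c.sam, c.query)) == c.expected
        for c in cases
    ]


SAM = frozenset({
    ("t1", "name", "Alpha"),
    ("t2", "name", "Beta"),
    ("t1", "employer", "Example Corp"),
})
CASES = [
    Case({}, SAM, "name", "t1|Alpha\nt2|Beta"),
    Case({"p": "employer"}, SAM, "$p", "t1|Example Corp"),
]
```

Calling run_suite(toy_engine, CASES) gives one pass/fail verdict per
case. The canonical result form plays the same role here that CXTM
plays for topic maps: without it, two correct engines could return the
same rows in different orders and the comparison would be ambiguous.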


TMCL is very similar, except there is no environment and the output is
a yes/no. Conceivably one could also create a test suite where TMCL is
used as a filter. That doesn't affect the main idea, though.


Finally we come to the SAM itself. Here the question, in my opinion,
is what having a separate notion of conformance or non-conformance to
the SAM would give us. That is, in what scenarios would it give us
additional interoperability of some sort? I can't think of any. TM4J,
OKS, K42, and TMAPI would all conform to the SAM under any reasonable
conformance clause that I can think of, but they still wouldn't have
API-level interoperability. Interoperability for TMQL, TMCL, ..., I
have accounted for already, so SAM can't help us there more than it
already does.

So where would it help? What piece of data or description of
application logic would a user be able to move from implementation A
to implementation B and see working in the same way if we put a
conformance clause in the SAM? I honestly can't think of a single one,
and therefore I think it would be wrong to put one in.

And even if we put one in, what would it look like? The one we have
now says that you conform if you provide documentation of the
correspondence between the SAM and your external API. Well, how would
we be able to claim that people failed to conform here if they did
provide such a document? What sort of correspondences would be
acceptable and which ones would not? We could spend years trying to
get that definition right, but I don't see any way we could
actually come up with a thin red line that goes exactly where it needs
to go.

In short, I think it's wrong to even try.

Note that if you look at the Japanese NB comments on N0396 (sent to
Sara late last night) they say that

  "Conformance per se is not usually part of a formal data model and
  should be removed from this part of the standard. A conformance
  clause is required for ISO 13250, but it needs to be included as
  either a separate part or in some other part to be decided by the
  committee."

This makes perfect sense to me, for the reasons given above, and it
would indeed surprise me to find a data model that has a conformance
section. The XML Infoset does not have one, and if you look at the SQL
standard you'll see that it discusses conformance to SQL, not
conformance of the internal model, because the only way you can access
the internal model (from the point of view of the SQL standard, and
indeed in many implementations) is through SQL.


I should perhaps add that what led me to start thinking in this
direction was my criticism of the RM that one couldn't determine what
conformance to it really meant. I gradually came to realize that the
problem was that it was a data model, and that although the SAM was
better off it was only marginally so. You can consider this a case of
boomerang criticism, I suppose. :)

-- 
Lars Marius Garshol, Ontopian         <URL: http://www.ontopia.net >
GSM: +47 98 21 55 50                  <URL: http://www.garshol.priv.no >