[sc34wg3] RE: How Two Syntaxes Can Make One Standard

Wed, 25 Jul 2001 09:47:47 -0400

Greetings,

This document will be number N235 and posted to the SC 34 site as soon as it
comes up.  JTC 1 web site has been a little temperamental this week.

Thanks,

Sara Hafele
American National Standards Institute
25 West 43rd St.
New York, NY 10036

phn (212) 642-4937
fax  (212) 840-2298
email:   shafele@ansi.org

Visit ANSI Online at www.ansi.org

-----Original Message-----
From: Steven R. Newcomb [mailto:srn@coolheads.com]
Sent: Tuesday, July 24, 2001 8:13 PM
To: sc34wg3@isotopicmaps.org
Cc: shafele@ansi.org; mxm@ornl.gov
Subject: How Two Syntaxes Can Make One Standard

We had intended to file a Defect Report for 13250, but we are
submitting this paper to the committee for discussion, instead.  We're
hoping that the questions we raise in this paper will provoke
discussion that will provide guidance for the development of a full
Defect Report, or perhaps for some other kind of approach to the
fulfillment of the community's goals for 13250.

We feel that if we're going to file a Defect Report, we should do it
in the context of consensus that that's the right thing to do, and
that the approach that such a Defect Report (if any) will outline will
be the right way to do it.

We feel that, even though this is more of a discussion kickoff than a
formal agenda-driver, it's an important enough paper to deserve an
SC34 number.  (So, James Mason and/or Sara Hafele, can you please
number this paper?  Thanks.)

Michel and Steve

Michel Biezunski, InfoLoom
Tel +33 1 44 59 84 29 Cell +33 6 03 99 25 29
Email: mb@infoloom.com  Web: www.infoloom.com

Steven R. Newcomb, Consultant
Coolheads Consulting
srn@coolheads.com
voice: +1 972 359 8160
fax:   +1 972 359 0270
1527 Northaven Drive
Allen, Texas 75002-1648 USA

************************************************************************

ISO/IEC 13250:2000: How Two Syntaxes Can Make One Standard

July 24, 2001

Michel Biezunski and Steven R. Newcomb

Introduction
------------------------------------------

The ISO/IEC 13250:2000 "Topic Maps" International
Standard, which seems about to integrate a second
interchange syntax, the XTM DTD, does not explain to
what degree, and exactly how, the two syntaxes are
functionally equivalent.  The standard should explain
this.

How to describe the semantic commonalities of the syntaxes?
------------------------------------------

One might think that there are two ways to formalize
the semantic commonalities of the two syntaxes:

  (1) Describe a rigorous syntactic transformation
      process that will show how instances of one
      syntax can be transformed into instances of the
      other syntax, or

  (2) Describe how instances of each syntax can be
      transformed into instances of the common
      underlying model (which could be, but need not
      be, a syntactic model), and describe how
      instances of the underlying model can be
      transformed into instances of each syntax.

The first approach might seem easier, at least
superficially.  However, if we select this solution, we
are focusing on just two syntaxes, instead of
recognizing the fact that information that has the
character of topic map information may be expressed in
many different notations.  It is highly desirable to be
able to federate all kinds of "finding information",
not just the finding information that happens to be
expressed in one of only two syntaxes.  For example, it
would be inappropriate to exclude instances of RDF or
NewsML from the possibility of being understood as
interchangeable topic map documents, with their
information becoming directly available to topic map
application software.  If we adopt the first approach,
RDF and NewsML instances would be only indirectly
available, by means of some sort of syntactic
transformation into the form of a syntactic topic map,
which would then, in turn, be parsable as a topic map
and made available to topic map applications.  The
extra overhead and inconvenience of this transformation
would be a barrier for RDF and NewsML instances.

Unlike the first approach, the second approach will be
applicable to any number of notations, although the ISO
13250 standard would only actually apply the approach
to the two syntaxes.  The second approach is more
ambitious in the sense that it requires that the
underlying foundational model be made explicit, and it
will make topic map applications far more ubiquitous
and omnivorous over the long term.
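
The trade-off between the two approaches can be
illustrated with a minimal Python sketch.  Everything
in it is an assumption made for illustration: the
Topic class, the two toy notations, and the function
names are hypothetical, and none of them is taken from
13250, XTM, or either proposed model.  The sketch only
shows that readers for different notations can target
one shared model, and how the number of converters
grows under each approach.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    """One topic in a hypothetical syntax-neutral model."""
    subject: str                          # subject identifier, e.g. a URI
    names: set = field(default_factory=set)


def read_notation_a(text):
    """Reader for a made-up notation: one 'subject|name' pair per line."""
    topics = {}
    for line in text.strip().splitlines():
        subject, name = line.split("|", 1)
        topics.setdefault(subject, Topic(subject)).names.add(name)
    return topics


def read_notation_b(pairs):
    """Reader for a second made-up notation: (name, subject) tuples."""
    topics = {}
    for name, subject in pairs:
        topics.setdefault(subject, Topic(subject)).names.add(name)
    return topics


def write_notation_a(topics):
    """Writer from the shared model back to the first made-up notation."""
    lines = [f"{t.subject}|{name}"
             for t in topics.values() for name in t.names]
    return "\n".join(sorted(lines))


def converters_pairwise(n):
    """Approach (1): one directed converter per ordered pair of notations."""
    return n * (n - 1)


def converters_via_model(n):
    """Approach (2): one reader and one writer per notation."""
    return 2 * n


if __name__ == "__main__":
    a = read_notation_a("urn:x:paris|Paris")
    b = read_notation_b([("Paris", "urn:x:paris")])
    print(a == b)                      # both notations yield the same model
    for n in (2, 4):
        print(n, converters_pairwise(n), converters_via_model(n))
```

With only two notations the pairwise approach needs
fewer converters (2 against 4), which is why it looks
easier at first.  With four notations, for instance
the two syntaxes plus RDF and NewsML, the counts are
12 against 8, and the gap widens with each notation
added.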

The difference between topic map syntax and topic map
information
------------------------------------------

The structure of the topic maps that are represented
for interchange in either the existing HyTime-based
syntax of 13250, or in the newly-contributed XTM
syntax, is *not* identical to the syntactic structures
of the documents used to interchange them.  Therefore,
neither 13250-based nor XTM-based topic map documents
are "ready-to-use" by application-specific logic.  In
other words, a syntactically represented topic map
doesn't reflect exactly what a topic map software
application would be expected to understand from it.
Before a topic map software application can be expected
to perform its application-specific functions, generic
processing -- processing that must be performed in
order to understand the topic map that an
interchangeable instance of that topic map is designed
to represent -- is required to make the topic map
"ready-to-use".

From an economic standpoint, there are significant
advantages in using a distinct software module that
implements this generic processing, commonly called a
"topic map engine" or a "topic map parser".  We urge
that the term "topic map parsing" be reserved to mean
all of the aspects of "topic map processing" that are
required to be done by all topic map software that
takes, as input, interchangeable topic maps that
conform to either the HyTime-based or XTM-based
syntaxes.  We urge that the term "topic map processing"
be used generically, so that it can be used to refer to
any kind of processing, including both topic map
parsing (as just defined) and application-specific
processing of ready-to-use topic maps.

Four rules must be applied by all topic map parsers:

-- the subject-based merging rule
-- the name-based merging rule
-- the node-demander rule
-- the no-redundancy rule

These rules are already implicit in 13250.  We propose
that 13250 should emphasize their definitions and
explain their ramifications.  These explanations will
be invaluable to users of the standard who need to
create conventions for the understanding of instances
of various (both ISO and non-ISO) notations as sources
of topic map information.
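
As a rough illustration of what the two merging rules
could look like inside a parser, here is a minimal
Python sketch.  It rests on assumptions that this paper
does not state: that "subject-based merging" means
topics sharing a subject identifier become one topic,
and that "name-based merging" means topics sharing the
same name string in the same scope become one topic.
The Topic class and the merge_topics function are
hypothetical.  The node-demander and no-redundancy
rules are not sketched here, since their definitions
are given in the referenced processing model rather
than in this paper.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    """Hypothetical parsed topic (not a structure defined by 13250)."""
    subjects: set = field(default_factory=set)   # subject identifiers
    names: set = field(default_factory=set)      # (scope, name string) pairs


def should_merge(a, b):
    """Assumed merging conditions: shared subject or shared scoped name."""
    return bool(a.subjects & b.subjects) or bool(a.names & b.names)


def merge_topics(topics):
    """Return topics merged so that no two of them satisfy should_merge."""
    merged = []
    for topic in topics:
        current = Topic(set(topic.subjects), set(topic.names))
        keep = []
        # Members of `merged` never overlap each other, so one pass over
        # them finds everything that `current` has to absorb.
        for other in merged:
            if should_merge(current, other):
                current.subjects |= other.subjects
                current.names |= other.names
            else:
                keep.append(other)
        keep.append(current)
        merged = keep
    return merged


if __name__ == "__main__":
    result = merge_topics([
        Topic({"urn:x:paris"}, {("", "Paris")}),
        Topic({"urn:x:paris"}, {("", "City of Light")}),    # same subject
        Topic({"urn:x:lutetia"}, {("", "City of Light")}),  # same scoped name
        Topic({"urn:x:paris-texas"}, {("us", "Paris")}),    # different scope
    ])
    for topic in result:
        print(sorted(topic.subjects), sorted(topic.names))
```

In this example the first three topics collapse into
one, while the fourth stays separate because its name
"Paris" is in a different scope.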

We urge that 13250 should fully explain and constrain
the topic maps parsing process, but only to the extent
of describing the rules and goals of the parsing
process, and how these rules and goals are to be
applied in the case of each of the two syntaxes.  For
the Topic Maps software industry, this is the
least-constraining approach that is consistent with
13250's goal of facilitating universal and accurate
understanding of Topic Maps information.  This approach
allows software vendors to compete on the grounds of
product differentiation, without unduly increasing the
cost of merging disparate topic maps emanating from
multiple, differently-specialized software
applications.

Two Underlying Models Have Been Proposed
------------------------------------------

Two different underlying models, both expressed in
terms of how XTM instances should be understood by
topic map parsers, have been contributed to the
discussion.  Both deserve serious attention.

 - An "XML Infoset"-like model, called "A Topic Map
   Data Model", has been proposed by Lars Marius
   Garshol.

 - A "Processing Model for XTM 1.0" has been proposed
   by Michel Biezunski and Steven R. Newcomb.

The two proposals do not necessarily contradict each
other, and the advantages and drawbacks of each of them
should be studied.

The underlying model that will be adopted by ISO must
clarify how specific applications of Topic Maps can be
defined and identified.

The documents that are available for study include:

 - Lars Marius Garshol, "A Topic Map Data Model -- An
   infoset-based proposal",
   http://www.ontopia.net/topicmaps/materials/proc-model.html

 - Michel Biezunski and Steven R. Newcomb,
   "Topicmaps.net's Processing Model for XTM 1.0,
   version 1.0.1" [now sometimes called "PMTM4"],
   http://www.topicmaps.net/pmtm4.htm

   Other materials offer help in understanding PMTM4:

   - Biezunski/Newcomb, "The Structure of Topic Maps
     Foundations," http://www.topicmaps.net/struct.htm

   - Biezunski/Newcomb, "A Topic Maps Graph in XML",
     http://www.topicmaps.net/simpleTMGraph3.htm and
     http://www.topicmaps.net/simpleTMGraph3.dtd

   - Biezunski/Newcomb, "An API to a Topic Maps Graphs
     in XML", http://www.topicmaps.net/TMGraphAPI3.htm
     and http://www.topicmaps.net/TMGraphAPI3.dtd

The decisions that will be taken on these issues will
influence the work that needs to be done to complete the
work in progress on a topic map query language as well
as on a topic map constraint language.

We encourage the members of the ISO working group WG3
to read these documents and to send questions and
comments to the newly created mailing list for
discussion.  (The subscription server is
http://www.isotopicmaps.org/mailman/listinfo/sc34wg3 )
