[sc34wg3] RE: How Two Syntaxes Can Make One Standard
Sara Hafele
sc34wg3@isotopicmaps.org
Wed, 25 Jul 2001 09:47:47 -0400
Greetings,
This document will be number N235 and posted to the SC 34 site as soon as it
comes up. JTC 1 web site has been a little temperamental this week.
Thanks,
Sara Hafele
American National Standards Institute
25 West 43rd St.
New York, NY 10036
phn (212) 642-4937
fax (212) 840-2298
email: shafele@ansi.org
Visit ANSI Online at www.ansi.org
-----Original Message-----
From: Steven R. Newcomb [mailto:srn@coolheads.com]
Sent: Tuesday, July 24, 2001 8:13 PM
To: sc34wg3@isotopicmaps.org
Cc: shafele@ansi.org; mxm@ornl.gov
Subject: How Two Syntaxes Can Make One Standard
We had intended to file a Defect Report for 13250, but we are
submitting this paper to the committee for discussion, instead. We're
hoping that the questions we raise in this paper will provoke
discussion that will provide guidance for the development of a full
Defect Report, or perhaps for some other kind of approach to the
fulfillment of the community's goals for 13250.
We feel that if we're going to file a Defect Report, we should do it
in the context of consensus that that's the right thing to do, and
that the approach that such a Defect Report (if any) will outline will
be the right way to do it.
We feel that, even though this is more of a discussion kickoff than a
formal agenda-driver, it's an important enough paper to deserve an
SC34 number. (So, James Mason and/or Sara Hafele, can you please
number this paper? Thanks.)
Michel and Steve
Michel Biezunski, InfoLoom
Tel +33 1 44 59 84 29 Cell +33 6 03 99 25 29
Email: mb@infoloom.com Web: www.infoloom.com
Steven R. Newcomb, Consultant
Coolheads Consulting
srn@coolheads.com
voice: +1 972 359 8160
fax: +1 972 359 0270
1527 Northaven Drive
Allen, Texas 75002-1648 USA
************************************************************************
ISO/IEC 13250:2000: How Two Syntaxes Can Make One Standard
July 24, 2001
Michel Biezunski and Steven R. Newcomb
Introduction
------------------------------------------
The ISO/IEC 13250:2000 "Topic Maps" International
Standard, which seems about to integrate a second
interchange syntax, the XTM DTD, does not explain to
what degree, and exactly how, the two syntaxes are
functionally equivalent. The standard should explain
this.
How to describe the semantic commonalities of the syntaxes?
------------------------------------------
One might think that there are two ways to formalize
the semantic commonalities of the two syntaxes:
(1) Describe a rigorous syntactic transformation
process that will show how instances of one
syntax can be transformed into instances of the
other syntax, or
(2) Describe how instances of each syntax can be
transformed into instances of the common
underlying model (which could be, but need not
be, a syntactic model), and describe how
instances of the underlying model can be
transformed into instances of each syntax.
The first approach might seem easier, at least
superficially. However, if we select this solution, we
are focusing on just two syntaxes, instead of
recognizing the fact that information that has the
character of topic map information may be expressed in
many different notations. It is highly desirable to be
able to federate all kinds of "finding information",
not just the finding information that happens to be
expressed in one of only two syntaxes. For example, it
would be inappropriate to exclude instances of RDF or
NewsML from the possibility of being understood as
interchangeable topic map documents, with their
information becoming directly available to topic map
application software. If we adopt the first approach,
RDF and NewsML instances would be only indirectly
available, by means of some sort of syntactic
transformation into the form of a syntactic topic map,
which would then, in turn, be parsable as a topic map
and made available to topic map applications. The
extra overhead and inconvenience of this transformation
would be a barrier for RDF and NewsML instances.
Unlike the first approach, the second approach will be
applicable to any number of notations, although the ISO
13250 standard would only actually apply the approach
to the two syntaxes. The second approach is more
ambitious in the sense that it requires that the
underlying foundational model be made explicit, and it
will make topic map applications far more ubiquitous
and omnivorous over the long term.
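The trade-off between the two approaches can be sketched
in a few lines of Python. This is only an illustration
under assumptions of our own: the triple-based "model",
the two toy notations ("lines" and "pairs"), and every
function name below are hypothetical, and none of them
comes from 13250, XTM, RDF, or NewsML.

```python
from typing import Callable, Dict, FrozenSet, Tuple

# Hypothetical common model: a set of (topic, property, value) triples.
Model = FrozenSet[Tuple[str, ...]]


def read_lines(text: str) -> Model:
    """Toy notation 'lines': one 'topic|property|value' triple per line."""
    return frozenset(tuple(line.split("|")) for line in text.splitlines() if line)


def write_lines(model: Model) -> str:
    return "\n".join("|".join(triple) for triple in sorted(model))


def read_pairs(text: str) -> Model:
    """Toy notation 'pairs': 'topic: property=value; property=value' per line."""
    triples = set()
    for line in text.splitlines():
        if not line.strip():
            continue
        topic, _, props = line.partition(":")
        for prop in props.split(";"):
            if not prop.strip():
                continue
            name, _, value = prop.partition("=")
            triples.add((topic.strip(), name.strip(), value.strip()))
    return frozenset(triples)


def write_pairs(model: Model) -> str:
    by_topic: Dict[str, list] = {}
    for topic, name, value in sorted(model):
        by_topic.setdefault(topic, []).append(f"{name}={value}")
    return "\n".join(f"{t}: {'; '.join(p)}" for t, p in by_topic.items())


# Approach (2): each notation supplies one reader and one writer.
READERS: Dict[str, Callable[[str], Model]] = {"lines": read_lines, "pairs": read_pairs}
WRITERS: Dict[str, Callable[[Model], str]] = {"lines": write_lines, "pairs": write_pairs}


def convert(text: str, source: str, target: str) -> str:
    """Any-to-any conversion falls out of going through the model."""
    return WRITERS[target](READERS[source](text))


def converters_needed(notations: int) -> Tuple[int, int]:
    """Return (direct pairwise converters, readers plus writers via a model)."""
    return notations * (notations - 1), 2 * notations
```

With only two notations the pairwise route needs fewer
pieces (2 against 4), which is why the first approach
seems easier at first. With four notations, for example
the two syntaxes plus RDF and NewsML, the counts become
12 against 8, and every further notation costs only one
more reader and one more writer.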
The difference between topic map syntax and topic map
information
------------------------------------------
The structure of the topic maps that are represented
for interchange in either the existing HyTime-based
syntax of 13250, or in the newly-contributed XTM
syntax, is *not* identical to the syntactic structures
of the documents used to interchange them. Therefore,
neither 13250-based nor XTM-based topic map documents
are "ready-to-use" by application-specific logic. In
other words, a syntactically represented topic map
doesn't reflect exactly what a topic map software
application would be expected to understand from it.
Before a topic map software application can be expected
to perform its application-specific functions, generic
processing must first be performed: the processing
needed to understand the topic map that an
interchangeable instance of that topic map is designed
to represent. Only then is the topic map "ready-to-use".
From an economic standpoint, there are significant
advantages in using a distinct software module that
implements this generic processing, commonly called a
"topic map engine" or a "topic map parser". We urge
that the term "topic map parsing" be reserved to mean
all of the aspects of "topic map processing" that are
required to be done by all topic map software that
takes, as input, interchangeable topic maps that
conform to either the HyTime-based or XTM-based
syntaxes. We urge that the term "topic map processing"
be used generically, so that it can be used to refer to
any kind of processing, including both topic map
parsing (as just defined) and application-specific
processing of ready-to-use topic maps.
Four rules must be applied by all topic map parsers:
-- the subject-based merging rule
-- the name-based merging rule
-- the node-demander rule
-- the no-redundancy rule
These rules are already implicit in 13250. We propose
that 13250 should emphasize their definitions and
explain their ramifications. These explanations will
be invaluable to users of the standard who need to
create conventions for the understanding of instances
of various (both ISO and non-ISO) notations as sources
of topic map information.
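The two merging rules lend themselves to a small
illustration. The sketch below is one possible reading,
not a normative definition: it assumes that the
subject-based rule merges topics that share a subject
identifier, and that the name-based rule merges topics
that share a base name in the same scope. The Topic
class and the merge_topics function are hypothetical
names. The node-demander and no-redundancy rules are not
modeled here.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, List, Set, Tuple

# Assumed representation of a name: (base name string, scope).
Name = Tuple[str, FrozenSet[str]]


@dataclass
class Topic:
    subjects: Set[str] = field(default_factory=set)  # subject identifiers
    names: Set[Name] = field(default_factory=set)


def merge_topics(topics: List[Topic]) -> List[Topic]:
    """Merge topics that share a subject identifier or a scoped name."""
    parent = list(range(len(topics)))

    def find(i: int) -> int:
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def union(i: int, j: int) -> None:
        ri, rj = find(i), find(j)
        if ri != rj:
            # Keep the smaller index as root so output order is stable.
            parent[max(ri, rj)] = min(ri, rj)

    # The first topic seen with a given key becomes the merge target
    # for every later topic carrying the same key.
    seen: Dict[tuple, int] = {}
    for i, topic in enumerate(topics):
        keys = [("subject", s) for s in topic.subjects]
        keys += [("name", n) for n in topic.names]
        for key in keys:
            if key in seen:
                union(seen[key], i)
            else:
                seen[key] = i

    # Collect the characteristics of each group into one topic.
    merged: Dict[int, Topic] = {}
    for i, topic in enumerate(topics):
        target = merged.setdefault(find(i), Topic())
        target.subjects |= topic.subjects
        target.names |= topic.names
    return list(merged.values())
```

Because merging is transitive, a topic that shares a
subject with one topic and a name with another pulls all
three together. That is why the sketch groups topics with
a union-find structure rather than comparing them only
pairwise.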
We urge that 13250 should fully explain and constrain
the topic maps parsing process, but only to the extent
of describing the rules and goals of the parsing
process, and how these rules and goals are to be
applied in the case of each of the two syntaxes. For
the Topic Maps software industry, this is the
least-constraining approach that is consistent with
13250's goal of facilitating universal and accurate
understanding of Topic Maps information. This approach
allows software vendors to compete on the grounds of
product differentiation, without unduly increasing the
cost of merging disparate topic maps emanating from
multiple, differently-specialized software
applications.
Two Underlying Models Have Been Proposed
------------------------------------------
Two different underlying models, both expressed in
terms of how XTM instances should be understood by
topic map parsers, have been contributed to the
discussion. Both deserve serious attention.
- An "XML Infoset"-like model, called "A Topic Map
Data Model", has been proposed by Lars Marius
Garshol.
- A "Processing Model for XTM 1.0" has been proposed
by Michel Biezunski and Steven R. Newcomb.
The two proposals do not necessarily contradict each
other, and the advantages and drawbacks of each of them
should be studied.
The underlying model that will be adopted by ISO must
clarify how specific applications of Topic Maps can be
defined and identified.
The documents that are available for study include:
- Lars Marius Garshol, "A Topic Map Data Model -- An
infoset-based proposal",
http://www.ontopia.net/topicmaps/materials/proc-model.html
- Michel Biezunski and Steven R. Newcomb,
"Topicmaps.net's Processing Model for XTM 1.0,
version 1.0.1" [now sometimes called "PMTM4"],
http://www.topicmaps.net/pmtm4.htm
Other materials offer help in understanding PMTM4:
- Biezunski/Newcomb, "The Structure of Topic Maps
Foundations,"
http://www.topicmaps.net/struct.htm
- Biezunski/Newcomb, "A Topic Maps Graph in XML",
http://www.topicmaps.net/simpleTMGraph3.htm and
http://www.topicmaps.net/simpleTMGraph3.dtd
- Biezunski/Newcomb, "An API to a Topic Maps Graphs
in XML", http://www.topicmaps.net/TMGraphAPI3.htm
and http://www.topicmaps.net/TMGraphAPI3.dtd
The decisions taken on these issues will influence the
work that needs to be done to complete the work in
progress on a topic map query language, as well as the
work on a topic map constraint language.
We encourage the members of the ISO working group WG3
to read these documents and to send questions and
comments to the newly created mailing list for
discussion. (The subscription server is
http://www.isotopicmaps.org/mailman/listinfo/sc34wg3 )