[sc34wg3] Individual contribution on the U.S. N.B. position on the
progression of Topic Map standards
Patrick Durusau
sc34wg3@isotopicmaps.org
Sat, 03 Apr 2004 07:52:45 -0500
Robert,
Robert Barta wrote:
> On Thu, Apr 01, 2004 at 03:48:14PM +0200, Bernard Vatant wrote:
>
>>>I believe that the four parts of ISO 13250 in progress at the moment
>>>address all four of your points, but as you noted, there is currently no
>>>way for an application to specify merging rules declaratively.
>>
>
>>I'm not sure to understand what you mean by "specify merging rules
>>declaratively", but it sounds to me a sort of paradox. From the
>>recent thread about merging rules, what I understood was that the
>>debate was about having or not merging rules *at all* in the core
>>standard, since they are procedural specifications.
>
>
> Bernard,
>
> I happily disagree. :-)
>
> I think Dmitry's assessment of the situation that you can capture all
> merging rules with 'additional statements' is quite correct.
>
Dmitry is only correct if you think, like Humpty Dumpty, that subject
identifier can mean whatever you wish for it to mean at the moment,
undisclosed to any author or user of a topic map.
In the context of the TMDM, which is, after all, a data model for XTM
syntax, subject identifier has a specific meaning, to-wit:
3.24 subject identifier
a locator that refers to a subject indicator
[What is a locator?]
3.11 locator
a string conforming to some locator notation that references one or more
information resources
[What is a subject indicator?]
3.25 subject indicator
an information resource that is referred to from a topic map in an
attempt to unambiguously identify the subject of a topic to a human
being. Any information resource can become a subject indicator by being
referred to as such from within some topic map, whether or not it was
intended by its publisher to be a subject indicator.
Dmitry's case of baseName is a good one, but not for the reason posed.
How do I determine, based upon an examination of the syntax of a topic
map in front of me, that Dmitry has used baseName as the basis for
subject identity? Certainly not reflected in the syntax of the topic
map. Not in the TMDM.
Can I apply some TMCL rule as you suggest to reach that result? Sure,
but that is determining subject identity on an ad hoc basis and not in
terms of specifying the rules for subject identity prior to processing
the topic map.
What is the meaning of the data to which I am applying the TMCL rule? As
far as I can tell, the syntax and TMDM don't say and there is no
mechanism for that to be made known. I have seen examples that make
resourceData determinative of subject identity. How am I going to
distinguish those cases from cases where that is not happening?
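To make the point concrete, here is a minimal Python sketch. The Topic
class, its fields, and the two rule functions are hypothetical
illustrations and are not defined by the TMDM, XTM, or TMCL. The same two
topics are "the same subject" under one rule and different subjects under
the other, and nothing in the data itself records which rule the author
had in mind.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Topic:
    """Hypothetical, simplified topic; not the TMDM topic item."""
    base_names: frozenset = field(default_factory=frozenset)
    subject_identifiers: frozenset = field(default_factory=frozenset)


def same_by_subject_identifier(a: Topic, b: Topic) -> bool:
    # Rule A (assumed): topics share at least one subject identifier.
    return bool(a.subject_identifiers & b.subject_identifiers)


def same_by_base_name(a: Topic, b: Topic) -> bool:
    # Rule B (assumed): topics share at least one base name.
    return bool(a.base_names & b.base_names)


# Two topics with the same name but different (made-up) subject identifiers.
t1 = Topic(frozenset({"Paris"}), frozenset({"http://example.org/paris-france"}))
t2 = Topic(frozenset({"Paris"}), frozenset({"http://example.org/paris-texas"}))

print(same_by_subject_identifier(t1, t2))  # False
print(same_by_base_name(t1, t2))           # True
```

Both rules run happily against the same data. Which one reflects the
author's intent has to be disclosed somewhere outside the data.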
This is really the struggle between developers who want to make software
"do" something and the need to have documentation for why it works the
way it does. Private knowledge of the meaning of various bits of syntax
in a particular context is a really poor way to achieve interchange of
information, and a poor basis for a standard.
> At least my view is that merging is _always_ application specific, it
> just depends how you identify two things. In this sense a 'merging
> rule' is nothing else than an additional constraint on a map: "It
> SHALL never be that two topics are in one map where .... <and here
> comes some condition involving the two topics>".
>
> If one accepts that a merging rule is nothing else than a constraint
> then one may also consequently think that this is something which
> should belong in a TMCL statement. This makes sense to me as a TMCL
> document is supposed to constrain the form of a topic map.
>
Correct on the question being "how you identify two things." The TMDM
does not say, nor does it provide a way to say it. In order to apply some
other rule for that purpose, you have to know what you are applying the
rule to.
For example, you limit the content of a database field to integers.
That rule does not confer a notion of integers on that field. You had to
have a notion of integers before you could even state the rule, much
less have it make any sense.
I would submit the same is true for TMCL/TMQL. If I don't know what
meaning (in terms of subject identity) was attached to particular parts
of topic map syntax/data model, how is that going to be supplied by a
TMCL/TMQL statement?
Oh, that is not to say that I could not use TMCL/TMQL to impose
arbitrary subject identities on a particular topic map, without regard
to its original authoring, etc., but that is a different case from the
one under discussion. Even in that case, I think we need to have more
than ad hoc notions of subject identity to underlie disclosure.
> And that can and....
>
>
>>And seems to me that Jim's point is to ask for a RM which would
>>contain only declarative semantics, and not procedural
>>specification.
>
>
> ....should be declarative, yes.
>
> ==
>
> My impression - and here I speak with the hat of a computer scientist
> on - is that the TM community tries to burden the "data model" with
> all sorts of 'semantical' constraints. I do not think this is a clever
> move and it will bite us later when we have to integrate TM?L.
>
Note that the TMRM is not trying to burden the data model with semantic
constraints. It is designed to enable disclosure of the basis for
subject identity and nothing more.
Quite honestly, I don't know why the TMRM is viewed as competing with the
data model. It does something quite different and necessary in order to
talk about subject identity.
> Please note, that this is NOT like building a house, starting from
> ground up and then making the roof.
>
> \rho
>
> PS: If someone wants to follow my thought experiments:
>
> http://topicmaps.it.bond.edu.au/docs/23/toc
Interesting. I will read in detail over the weekend but one quick
comment from a "lite" scan.
Note that you presume that merging rules of the TMDM result in a new
topic. Actually I have been told in person, and I assume this exists in
the email archives somewhere, that whether one follows the rules of
merging found in the TMDM is an arbitrary thing.
Second, and more importantly, note that "actual" merging involves the
loss of information, that is, that two (or more) topics existed where now
there is just one. It may in fact be desirable, for auditing purposes for
example, to have the "appearance" of merging and not the "actual"
merging that you posit in your thought experiment. In other words, your
TMQL statements provide a view "as though" all the information were
located at a single point and conduct further operations as though that
were the case.
Which one you would want to follow depends on the situation and the
demands of the project, if not some combination of the two, differing
depending upon your information needs for particular parts of the topic map.
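A rough Python sketch of the two options follows. The dictionary layout,
the function names, and the choice of a shared subject identifier as the
grouping key are all assumptions made for illustration; none of them come
from the TMDM or from any TMQL draft.

```python
from collections import defaultdict


def merge_destructively(topics, key):
    """'Actual' merging: one merged record per key; originals are discarded."""
    merged = {}
    for topic in topics:
        target = merged.setdefault(key(topic), {"names": set(), "occurrences": set()})
        target["names"] |= topic["names"]
        target["occurrences"] |= topic["occurrences"]
    return list(merged.values())


def merged_view(topics, key):
    """'Apparent' merging: group the untouched originals under each key."""
    groups = defaultdict(list)
    for topic in topics:
        groups[key(topic)].append(topic)
    return dict(groups)


# Two hypothetical topics that share a (made-up) subject identifier.
topics = [
    {"id": "t1", "sid": "http://example.org/x", "names": {"X"}, "occurrences": {"doc-a"}},
    {"id": "t2", "sid": "http://example.org/x", "names": {"Ex"}, "occurrences": {"doc-b"}},
]


def by_sid(topic):
    return topic["sid"]


actual = merge_destructively(topics, by_sid)
view = merged_view(topics, by_sid)

print(len(actual))                                      # 1
print([t["id"] for t in view["http://example.org/x"]])  # ['t1', 't2']
```

After the destructive merge there is no record that t1 and t2 ever
existed separately. The view keeps both originals and still lets later
operations treat them as one, which is the property an audit trail needs.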
Hope you are having a great day!
Patrick
--
Patrick Durusau
Director of Research and Development
Society of Biblical Literature
Patrick.Durusau@sbl-site.org
Chair, V1 - Text Processing: Office and Publishing Systems Interface
Co-Editor, ISO 13250, Topic Maps -- Reference Model
Topic Maps: Human, not artificial, intelligence at work!