[sc34wg3] Illustrating SIDPs

Fri, 07 May 2004 12:36:25 -0400

Robert,

Robert Barta wrote:
> On Tue, May 04, 2004 at 02:30:03PM -0400, Patrick Durusau wrote:
> 
>>"Oh, I can distinguish between people who have the same name by means
>>of their spouses.  If they have different spouses, even if they have
>>the same name, they must be different persons."
> 
> ...
> 
>>    * Property name:              "personID"
>>
>>    * Value type:                 complex:
>>                                     "name"   : string
>>                                     "spouse" : topic
>>
>>    * SIDP or OP?:                SIDP
> 
> 
>>There is only one SIDP per topic (per TMA). (note that personName was
>>replaced by personID)
>>
>>SIDPs (and, for that matter, OPs) can be arbitrarily complex.
> 
> 
> Patrick,
> 
> I agree with (almost) everything you say here, but I think my
> conclusions are quite different from yours.
> 
> As I read the TMRM document(s), I understand the objective to provide
> a framework to define these TMAs. What I am missing, though, is a
> notation to formalize this properly.
> 
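
[Illustration: the "personID" property quoted above, written down as a
minimal Python sketch. This is only one of many ways a TMA author might
express it; nothing here is prescribed by the TMRM. The class and
function names, the use of a plain string as a stand-in for the spouse
topic, and the sample people are all assumptions made for the example.]

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PersonID:
    """Complex SIDP value: a name plus the person's spouse."""
    name: str
    spouse: str  # assumption: a string standing in for a reference to the spouse topic


@dataclass
class Topic:
    label: str
    person_id: PersonID  # the single SIDP of a topic in this hypothetical TMA


def same_subject(a: Topic, b: Topic) -> bool:
    """Topics represent the same person only if their whole SIDP values are equal."""
    return a.person_id == b.person_id


# Invented sample data: three topics sharing a name, two sharing a spouse.
t1 = Topic("t1", PersonID("John Smith", "mary-jones"))
t2 = Topic("t2", PersonID("John Smith", "ann-lee"))
t3 = Topic("t3", PersonID("John Smith", "mary-jones"))
```

Under these assumptions t1 and t2 stay distinct despite the shared name,
while t1 and t3 would be treated as the same subject.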

First, I was only trying to explain the concept of SIDPs and not the 
entire reference model. We have been plagued with (partly self-inflicted) 
problems over terminology, and this was the first in a series of posts 
meant to clear up any remaining confusion.

Second, the TMRM is NOT trying to define a formalism for a TMA. (full stop)

Third, the TMRM does not constrain the formalism, or any other aspect of 
writing, specifying or implementing a TMA.

What the TMRM is trying to do is create an inventory and checklist for a 
disclosure statement that you can use to construct a TMA however you like.

> I think that the whole concept of an 'identifier' is rather misleading
> (not only in TM universe, I mean in general): as long as you have such
> an identifier at hand for your topic maps everything is neat and nice.
> In general, though, the concept of identity is an inferred one, coming
> - as you also say - from a combination of values, or - more
> abstractly - being the result of an application-specific rule.
> 
> If we assume a rule like "two persons should be regarded the same if
> they have the same email address", then this rule is actually nothing
> other than a function which computes an "identity value" out of the email
> address and whether the object is of class 'person' or not.
> If two (or more) objects have the same function result, then they are
> identical _for this very application_. Technically speaking, the
> objects are all in the same equivalence class as induced by the
> function.
> 
> [ This is how deductive databases work in general: They have explicit
> knowledge (facts) together with identity and they have implicit
> knowledge as provided by the inference rules. ]
> 
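
[Illustration: the email rule quoted above, sketched in Python as a
function that computes an identity value, with objects grouped into the
equivalence classes it induces. The dictionary layout, the function
names and the sample addresses are assumptions made for the example,
not part of any TMA or of the TMRM.]

```python
from collections import defaultdict
from typing import Any, Dict, Hashable, List, Optional


def identity_value(obj: Dict[str, Any]) -> Optional[Hashable]:
    """Identity value under the rule, or None when the rule does not apply."""
    if obj.get("class") != "person" or "email" not in obj:
        return None
    return ("person", obj["email"])


def equivalence_classes(objects: List[Dict[str, Any]]) -> List[List[Dict[str, Any]]]:
    """Group objects with equal identity values; objects the rule skips are left out."""
    classes = defaultdict(list)
    for obj in objects:
        key = identity_value(obj)
        if key is not None:
            classes[key].append(obj)
    return list(classes.values())


# Invented sample data: "a" and "b" share an address; "d" is not a person.
objects = [
    {"id": "a", "class": "person", "email": "rho@example.org"},
    {"id": "b", "class": "person", "email": "rho@example.org"},
    {"id": "c", "class": "person", "email": "pd@example.org"},
    {"id": "d", "class": "mailing-list", "email": "rho@example.org"},
]
groups = equivalence_classes(objects)
ids = [[o["id"] for o in group] for group in groups]
```

With this data, "a" and "b" land in one class, "c" in another, and "d"
is ignored because it is not of class 'person'. The result holds only
for this one rule; a different application could induce different
classes over the same objects.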

Part of the confusion is that you seem to be using "identifier" and 
"identity" to mean the same thing. They are different levels that should 
not be confused. More on that issue next week.

> My conclusion now for TMs is that
> 
>   - such rules are simply part of the ontology which characterizes an
>     application. One (or more) of such identity-inducing rules might
>     exist in a single application.
> 
>   - it would be an overkill to put such a formalism into TMRM to
>     express this.  Why not burden TMCL with the ability to express
>     such rules?
> 
>   - TMRM could then be simplified in that it merely captures the "nature
>     of TMs", making the association concept explicit as it already does.
> 

Granted, the language of the TMRM can and should be simplified, but 
identity is a complex question and to some degree the TMRM reflects that 
fact.

Note that the first Fortran compiler took 18 staff-years to write (Aho, 
Sethi, Ullman in the Dragon book). They also report that a substantial 
compiler can now be completed in a one-semester course. With all due 
deference to the programmers on the list, I really don't think current 
CS students are that much brighter than the original Fortran team. (FYI: 
John W. Backus (lead), Sheldon F. Best, Harlan Herrick, Peter Sheridan,
Roy Nutt, Robert Nelson, Irving Ziller, Richard Goldberg, Lois Haibt
and David Sayre.)

The difference is that we now understand, and have techniques to deal 
with, many of the issues that were resolved on a trial-and-error basis in 
the first Fortran compiler, and we have models to follow, whatever 
compiler you actually want to write.

The TMRM is an attempt to establish the same sort of inventory of issues 
and checklist for topic maps.

Hope you are having a great day!

Patrick

> This is roughly the line of argument I made in Amsterdam.
> 
> \rho
> 

-- 
Patrick Durusau
Director of Research and Development
Society of Biblical Literature
Patrick.Durusau@sbl-site.org
Chair, V1 - Text Processing: Office and Publishing Systems Interface
Co-Editor, ISO 13250, Topic Maps -- Reference Model

Topic Maps: Human, not artificial, intelligence at work!