[sc34wg3] Editorial structure of N0396
Patrick Durusau
sc34wg3@isotopicmaps.org
Tue, 22 Apr 2003 10:01:33 -0400
Martin,
Martin Bryan wrote:
>Patrick wrote
>
>
>
>>On a more technical issue, you might want to note that the definition of
>>String in the SAM:
>>
>>
>>
>>>String
>>>
>>> Strings are sequences of abstract Unicode characters conforming to
>>> Unicode Normalization Form C [unicode]
>>> <http://www.isotopicmaps.org/sam/sam-model/#unicode>
>>>
>>>
>>>
>>While this follows the W3C for XML 1.1 (see details at:
>>http://www.w3.org/TR/charmod/), it does exclude (unless this is one of those
>>optional things) other normalization forms that may be required in
>>non-Web based topic map contexts. This may be of particular significance
>>for systems using Chinese/Japanese texts in non-web based topic maps.
>>
>>
>
>Coming from someone who I seem to remember criticised me for suggesting that
>something other than the concrete abstract syntax should be applicable in
>SGML I find this somewhat rich ;-)
>
>
Glad to have made your day! (I recall my point being somewhat different
but do appreciate the irony.)
>W3C have, after much arguing and many years of wrangling, finally got around
>to agreeing a single preferred normalization form for Unicode within XML
>documents and Patrick wants us to allow topic map users to be able to adopt
>an alternative normalization scheme!!! This is supposed to make integration
>of topic maps easier in some way. Two topic maps, using different encodings,
>both in XTM cannot be merged safely if they adopt different normalization
>methods.
>
>
The question is: does one normalization scheme, particularly one designed
for the web, fit all topic maps? I really don't think that limiting topic
maps to the WWW is a good idea; I have not in the past, do not now, and am
unlikely to in the future. I saw a note yesterday that 75% of all business
data is held in COBOL. I am glad that the SAM allows any notation to be used
for locators.
I am not saying that two topic maps based on XTM should use different
normalizations, nor that doing so would make "integration" easier. That
characterization is both unfair and inaccurate. The SAM, as I understand it,
is supposed to be "the" data model for all topic maps. Are you suggesting
that it is only the data model for XTM-based topic maps?
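To make the merging concern concrete, here is a minimal sketch in Python
using the standard unicodedata module. It is not from the SAM or any topic
map tool; the helper name equal_under is hypothetical, and the sample
strings are my own illustration. It shows that two strings which render
identically compare unequal until both are brought to the same
normalization form, and that Form C replaces a CJK compatibility ideograph
with its unified counterpart, which may or may not be acceptable outside a
Web context.

```python
import unicodedata


def equal_under(form: str, a: str, b: str) -> bool:
    """Hypothetical helper: compare two strings after normalizing both to `form`."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)


# The same visible character "é" in two encodings of the same text.
composed = "\u00e9"      # single code point, already in Form C (NFC)
decomposed = "e\u0301"   # "e" plus combining acute accent (Form D, NFD)

# A raw code point comparison treats them as different names.
raw_match = composed == decomposed

# After normalizing both sides to Form C they compare equal.
nfc_match = equal_under("NFC", composed, decomposed)

# CJK compatibility ideograph U+F900 has a canonical mapping to U+8C48,
# so Form C does not preserve the original code point.
compat = "\uf900"
compat_nfc = unicodedata.normalize("NFC", compat)

print(raw_match, nfc_match, compat_nfc == compat)
```

Whether that last substitution matters is exactly the kind of question a
non-Web application might answer differently from the W3C.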
>Having said that, I do believe that this statement should not be part of the
>SAM model, but should be part of the XTM serialization of the model. As HyTM
>is based on SGML rather than XML we can expect user-defined character sets
>to be defined as part of HyTM. We can, of course, agree to differ as to
>whether or not topic maps based on different character sets need to be
>normalized to conform to Form C before being interchanged/merged. This
>subject should be added to one of the discussion lists for London, but I'd
>hate to suggest which one.
>
All jousting aside, I think this comes close to my original point.
Patrick
--
Patrick Durusau
Director of Research and Development
Society of Biblical Literature
pdurusau@emory.edu
Co-Editor, ISO Reference Model for Topic Maps