TITLE: | Topic Maps -- Reference Model |
SOURCE: | Patrick Durusau and Steven R. Newcomb |
PROJECT: | WD 13250-5: Information Technology -- Topic Maps -- Reference Model |
PROJECT EDITORS: | Mr. Patrick Durusau; Dr. Steven R. Newcomb |
STATUS: | Informational (4.6) |
ACTION: | For review and comment |
DATE: | 2004-11-07 |
DISTRIBUTION: | SC34 and Liaisons |
REPLY TO: |
Dr. James David Mason (ISO/IEC JTC 1/SC 34 Secretariat - Standards Council of Canada) Crane Softwrights Ltd. Box 266, Kars, ON K0A-2E0 CANADA Telephone: +1 613 489-0999 Facsimile: +1 613 489-0995 Network: jtc1sc34@scc.ca http://www.jtc1sc34.org |
CHANGE HISTORY:[parid9004] Version 4 is yet another complete redraft, with the goal of achieving both clarity and brevity. [parid9003] Version 3 is a complete redraft that narrows the focus to nomenclature and Disclosure. [parid9002] Version 2 is a major revision of Version 1. The ideas of Version 1 are preserved, but they are now all explained in terms of the properties of topics. There are also some terminological changes; for example, what in Version 1 was called a "node" is called a "topic" in Version 2. [parid9001] Made IS13250::t-roles{ } an SIDP instead of an OP. According to the TMRM paradigm, it is inescapable that the subject of a t-node is its roles, at least for all purposes of merging topics. |
0 | [parid0001] Introduction (Informative) |
[parid0010] Topic maps are sets of subject proxies, each of which is a surrogate for a single subject. In the language of ISO 13250:2002, the general notion of subject proxies was divided into "topics," "associations," and "occurrences," despite all three being surrogates for subjects in any given topic map. The Topic Maps Reference Model (TMRM) concerns itself with surrogates for all kinds of subjects, including but not limited to subjects that fall strictly into one of the three traditional categories; it defines the notion of "subject proxy".
[parid0011] The TMRM also specifies requirements for disclosing how to determine whether two or more subject proxies are proxies for the same subject, and how to view all of them as a single proxy. In a topic map, if a subject proxy is the only proxy for its subject, then users can access all information about that subject through that proxy's single (virtual) "location".
[parid0020] Topic maps are useful ways of viewing information because, as in all kinds of maps (including geographic maps), every specific location is unique, and all information that is relevant to a given geographic location is available at the corresponding location in the map. For example, in a map of Russia that depicts cities and elevations, the elevation of St. Petersburg can be found at the place where St. Petersburg is depicted, but one would not expect to find this information at the place where Moscow is depicted. The idea of a unique location within the framework of a geographic map corresponds to the use of "subject proxy" within the TMRM.
[parid0030] All maps are written within specific frameworks of expression. In the case of geographic maps, the nature of the correspondence between locations on the map and actual geographic locations is determined by the framework within which the map is expressed. Often a projection technique, such as the Mercator projection technique, is an important aspect of the frameworks of geographic maps. Other aspects of framework design include decisions about what to depict and what not to depict, the symbols that will be used, and so on; the framework of a map can have many aspects and dimensions. No map can be understood in the absence of at least some understanding of its framework of expression. When people share the same understanding of the framework of a given map, they can understand that map in the same way.
[parid0031] Any map, regardless of its framework, can be seen in terms of any other mapping framework, but only if both frameworks are understood. Maps can be combined ("merged") if their frameworks are known, and if their contents can be understood in terms of a framework capable of encompassing the combination of their contents.
[parid0060] In the TMRM, the framework of expression of a topic map is called a "Topic Map Application (TMA)". The TMRM requires TMA disclosures to disclose, among other things:
[parid0070] the definitions of the kinds of properties ("property classes") that subject proxies can have,
[parid0080] the rules for determining when multiple proxies are surrogates for the same subject, and
[parid0090] the rules for merging the values of the properties of proxies, when it has been determined that the proxies are surrogates for the same subject and they need to be viewable as a single proxy.
[parid0061] The TMRM's disclosure requirements are designed to facilitate the uniform understanding and merging of diverse topic maps, so that all of the diverse, independently expressed information about each subject can be viewed as if it were available at that subject's unique virtual "location" -- its unique subject proxy.
1 | [parid0110] Scope |
[parid0120] This International Standard specifies:
[parid0130] The abstract definition of subject proxy.
[parid0131] The abstract definition of merging subject proxies.
[parid0140] The abstract definition of Topic Map Application (TMA), and requirements to be met by disclosures of TMAs.
[parid0141] Other definitions and specifications in support of the above.
[parid0150] This International Standard does not specify:
[parid0160] The subjects of subject proxies nor constraints on such subjects.
[parid0161] The classes of the properties of subject proxies nor constraints on such classes.
[parid0162] The values of the properties of subject proxies nor constraints on such values.
[parid0170] The supporting algorithms and data models that may be used to represent subjects, to detect when two or more subject proxies are proxies for the same subject, or to merge the values of Subject Identity Properties or Other Properties.
2 | [parid0190] Glossary |
2.1 | [parid0230] built-in |
[parid0240] (Said of the value of a property of a subject proxy:) Unsupported by a disclosed conferral rule; given; axiomatic. The opposite of conferred (the values of properties of subject proxies can be either built-in or conferred).
2.2 | [parid0250] conferred |
[parid0260] (Said of the value of a property of a subject proxy:) Existing because of the operation of a conferral rule that is defined by the Topic Map Application (TMA) that defines the class of the property. The opposite of built-in (the values of properties of subject proxies can be either built-in or conferred).
2.3 | [parid0263] disclosure |
[parid0264] Information that, within itself and/or by reference to other information, comprehensively defines (or is part of a comprehensive definition of) a Topic Map Application (TMA). Optionally, a disclosure may also define one or more interchange syntaxes, data models, implementations, and/or implementation strategies, along with unambiguous and comprehensive instructions as to how instances of each such syntax, data model, etc. are intended to be interpreted in terms of the defined Topic Map Application.
2.4 | [parid0270] merging |
[parid0280] The process whereby two subject proxies that are surrogates for the same subject become viewable as a single resulting subject proxy. The single proxy that results from such a merger has instances of all of the property classes of which either or both of the original two proxies have instances. If an instance of a given property class appears in only one of the original two proxies, then that property instance appears unchanged in the resulting single proxy. If both of the original subject proxies have instances of a given property class, the values of those instances are combined in conformance with the rule for combining the values of instances of that class that is disclosed by the TMA that governs the class, and the resulting value becomes the value of the instance of that property class that appears in the resulting single proxy.
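The definition of merging given here can be illustrated with a minimal sketch. It is not part of this International Standard, which does not specify algorithms or data models. The sketch assumes, purely for illustration, that a subject proxy is represented as a mapping from property class names to values, and that a TMA's disclosed value-combining rules are available as functions keyed by property class name. The names `merge_proxies`, `rules`, `id`, `labels`, and `note` are hypothetical.

```python
from typing import Any, Callable, Dict

# Hypothetical representation: a proxy maps property class names to values.
Proxy = Dict[str, Any]
# Hypothetical representation: value-combining rules keyed by property class name.
MergeRules = Dict[str, Callable[[Any, Any], Any]]


def merge_proxies(a: Proxy, b: Proxy, rules: MergeRules) -> Proxy:
    """Return one proxy viewable in place of a and b (assumed to share a subject)."""
    merged: Proxy = {}
    for name in a.keys() | b.keys():
        if name in a and name in b:
            # Both proxies have an instance of this class: apply the disclosed rule.
            merged[name] = rules[name](a[name], b[name])
        else:
            # Only one proxy has an instance: it appears unchanged in the result.
            merged[name] = a[name] if name in a else b[name]
    return merged


rules: MergeRules = {
    "id": lambda x, y: x,        # assumed rule: equal identifiers collapse to one
    "labels": lambda x, y: x | y,  # assumed rule: set union
}
p1 = {"id": "subject-1", "labels": {"A"}}
p2 = {"id": "subject-1", "labels": {"B"}, "note": "only in p2"}
merged = merge_proxies(p1, p2, rules)
```

In this sketch the `note` instance, which appears in only one of the two proxies, is carried over unchanged, while `labels` is combined by the assumed union rule.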
2.5 | [parid0310] Other Property (OP) |
[parid0320] A property class or property instance that is not a Subject Identity Property (SIP).
2.6 | [parid0330] property class |
[parid340] A named type of Subject Identity Property (SIP) or Other Property (OP), instances of which may appear in subject proxies. Within a TMA, the names of all property classes are unique.
2.7 | [parid0350] property instance |
[parid0360] One of the named values that comprise a subject proxy: an instance of a Subject Identity Property (SIP) or Other Property (OP). Its name is the same as the name of the property class of which it is an instance. In the subject proxy in which it appears, it is the only instance of its class.
2.8 | [parid0365] reification |
[parid0366] The representation of a subject by a subject proxy.
2.9 | [parid0370] subject |
[parid0380] Any thing whatsoever, regardless of whether it exists or has any other specific characteristics, about which anything whatsoever may be asserted by any means whatsoever. A potential or actual subject of conversation.
2.10 | [parid0390] Subject Identity Property (SIP) |
[parid0400]
[parid0391] A property class whose instances specify the subjects of the subject proxies in which they appear.
[parid0392] A property instance that specifies the subject of the subject proxy in which it appears.
2.11 | [parid0410] subject proxy |
[parid0420] A surrogate for a subject. Subject proxies consist of property instances, at least one of which must be a Subject Identity Property (SIP). In a subject proxy, there cannot be more than one SIP whose class is defined by the same Topic Map Application (TMA).
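The two constraints in this definition (at least one SIP, and no more than one SIP per defining TMA) can be checked mechanically. The following is an illustrative sketch only; the representation of a property instance as a record with `tma`, `cls`, `value`, and `is_sip` fields, and the function name `check_proxy`, are assumptions made for this example and are not defined by the TMRM.

```python
from dataclasses import dataclass
from typing import Any, List


@dataclass(frozen=True)
class PropertyInstance:
    tma: str      # name of the TMA that defines the property class (assumed field)
    cls: str      # property class name, unique within its TMA (assumed field)
    value: Any
    is_sip: bool  # True for a SIP instance, False for an OP instance


def check_proxy(props: List[PropertyInstance]) -> List[str]:
    """Return the constraint violations found (empty if the proxy is well-formed)."""
    problems: List[str] = []
    sip_tmas = [p.tma for p in props if p.is_sip]
    if not sip_tmas:
        problems.append("no SIP instance")
    for tma in sorted(set(sip_tmas)):
        if sip_tmas.count(tma) > 1:
            problems.append(f"more than one SIP from TMA {tma!r}")
    return problems


good = [
    PropertyInstance("example-tma", "identifier", "subject-1", True),
    PropertyInstance("example-tma", "label", "A", False),
]
no_sip = [PropertyInstance("example-tma", "label", "A", False)]
two_sips = good + [PropertyInstance("example-tma", "other-id", "s1", True)]
```

Here `check_proxy(good)` reports no problems, while `no_sip` and `two_sips` each violate one of the two constraints.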
2.12 | [parid0430] topic map |
[parid0440]
[parid0450] A set of subject proxies that is treated as a unit. The classes of the properties (both SIPs and OPs) that comprise the subject proxies, and the rules for recognizing when multiple proxies are surrogates for the same subject, are both disclosed by one or more TMAs (called the "governing TMAs"). The governing TMAs also disclose the rules for viewing the values of multiple instances of each property class as a single value of that class; these rules enable multiple proxies for the same subject to be viewed as if they were a single proxy with a single instance of each property class.
[parid0460] A document written in a topic map syntax; an interchangeable expression of a set of subject proxies.
2.13 | [parid0480] Topic Map Application (TMA) |
[parid0490] A Disclosure of rules that govern properties of subject proxies.
2.14 | [parid0560] topic map view |
[parid0570] A subject-oriented view of a corpus of information, via a set of subject proxies that are governed by a TMA. By conforming to the TMA's rules for recognizing subject sameness, for merging the properties of proxies, etc., a topic map view can provide the convenience of subject-oriented access to the contents of the corpus.
2.15 | [parid0580] Topic Maps Reference Model (TMRM) |
[parid0590] This International Standard.
3 | [parid0600] Subjects and Subject Proxies |
[parid0610] In a topic map view, all subjects for which a single virtual "location" can be established have a surrogate, that is, a subject proxy. Every subject proxy is a surrogate for a single unique subject.
[parid0620] The constraint that every subject proxy must have, among its properties, one SIP per governing TMA, fulfills two related but independent functions:
[parid0630] the SIP instances collectively identify the subjects that the author of a particular topic map view chose to reify; and
[parid0640] the classes of the SIP instances reveal how the author chose to distinguish those subjects from one another, or, in other words, how the author chose to recognize when two or more proxies have the same subject.
[parid0650] The function of SIP instances (that is, specifying the subjects of the subject proxies created by an author) enables interchange of topic map views with the expectation that the view as authored can be the same view seen by a user. The user is not constrained to take that view, but, since the ability to interchange topic map views is a requirement, this International Standard must enable authors to give users of topic map views the ability to replicate the views as authored.
[parid0660] The function of SIP classes (that is, revealing how the topic map view's author chose to distinguish the subjects of subject proxies) facilitates the merging of independently authored topic map views, not only when the views are expressed in terms defined by the same TMAs, but also when the TMAs are different. In the latter case, each independent TMA's Disclosure of its means for distinguishing the subjects of subject proxies can provide a useful basis for designing rules for combining the proxies governed by the independent TMAs.
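To make the role of SIP classes in recognizing subject sameness more concrete, the following sketch collapses a set of proxies so that each subject is left with a single proxy. It is illustrative only: this International Standard does not specify such algorithms. The sketch assumes a single governing TMA, proxies represented as mappings from property class names to values, and hypothetical `sameness` and `combine` tables standing in for the TMA's disclosed subject sameness detection rules and merging rules. The case of independent TMAs discussed above is not covered.

```python
from typing import Any, Callable, Dict, List

Proxy = Dict[str, Any]  # hypothetical: property class name -> value


def same_subject(a: Proxy, b: Proxy,
                 sameness: Dict[str, Callable[[Any, Any], bool]]) -> bool:
    """True if some SIP class present in both proxies reports the same subject."""
    return any(name in a and name in b and rule(a[name], b[name])
               for name, rule in sameness.items())


def merge(a: Proxy, b: Proxy,
          combine: Dict[str, Callable[[Any, Any], Any]]) -> Proxy:
    """Combine two proxies using the per-class value-combining rules."""
    out = dict(a)
    for name, value in b.items():
        out[name] = combine[name](out[name], value) if name in out else value
    return out


def collapse(proxies: List[Proxy], sameness, combine) -> List[Proxy]:
    """Merge proxies for the same subject until no two remaining proxies match."""
    result: List[Proxy] = []
    pending = list(proxies)
    while pending:
        current = pending.pop()
        for i, other in enumerate(result):
            if same_subject(current, other, sameness):
                # Re-queue the merged proxy: merging may reveal further matches.
                pending.append(merge(result.pop(i), current, combine))
                break
        else:
            result.append(current)
    return result


# Assumed rules: two "email" values specify the same subject if they overlap.
sameness = {"email": lambda x, y: bool(x & y)}
combine = {"email": lambda x, y: x | y, "names": lambda x, y: x | y}
a = {"email": {"a@example.org"}, "names": {"A"}}
b = {"email": {"b@example.org"}, "names": {"B"}}
c = {"email": {"a@example.org", "b@example.org"}}
view = collapse([a, b, c], sameness, combine)
```

Proxies `a` and `b` do not match each other directly, but each matches `c`, so under the assumed rules all three collapse into one proxy. A different choice of SIP class or sameness rule would yield a different view of the same input, which is why the choice needs to be disclosed.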
Editor's Note 1: |
[parid0670] PD Question: Should the notion of assertions be treated here? Reasoning that assertion models are definitions of the SIP of the subject proxy known as the 'a-node' in earlier versions of the TMRM. There is no requirement that the values that make up the SIP of the 'a-node' of an assertion be other subject proxies, although there are obvious benefits to a TMA making and enforcing such requirements. [parid0671] SRN: I think it's the camel's nose under the tent. There is no limit to the amount of advice we could provide here for TMA designers. The design of an assertion model, and even the question of whether a TMA must or should provide a property value type of "subject proxy", are things I think we'd better exclude from this standard. |
4 | [parid0700] Disclosures of Topic Map Applications (TMAs) |
[parid0710] Each Disclosure of a Topic Map Application (TMA) must disclose the following things:
[parid0720] Topic Map Application Name. The name of the TMA.
[parid0730] Property Class Names. The names of all of the TMA's Subject Identity Property (SIP) and Other Property (OP) classes. Every TMA must define at least one SIP class, but OP classes are optional. Within a given TMA, the names of all SIP and OP classes must be unique; they all share the same namespace.
[parid0731] Property Value Constraints. Definitions of the constraints on the values of instances of each SIP class and OP class. These constraints may include, for example, their value types.
[parid0732] Property Instance Merging Rules. For each SIP class and OP class, a definition of the rule for combining the values of multiple instances of it into the value of a single instance.
[parid0740] Subject Sameness Detection Rules. For each SIP class, the rule for comparing any two instances of the class in order to determine whether they specify the same subject, i.e., whether the subject proxies in which two instances of the same SIP class appear can be viewed as a single subject proxy in which their respective property instances have been merged.
[parid0780] Conferral Rules. If the TMA includes rules (called "conferral rules") that require values or value components to be conferred upon property instances, then, for each conferral rule, the property/value conditions that trigger the operation of the rule, and the effects of its operation on property values, must be disclosed. Conferral rules can require the conferral of values or value components on property instances that do not have any built-in value components, and therefore would not exist without their conferred values; such property instances are said to be "conferred into existence". Property instances that are conferred into existence may appear in subject proxies that would not otherwise have any property instances, and therefore would not otherwise exist; such subject proxies are also said to be "conferred into existence".
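As an illustration of what a conferral rule might look like, the following sketch shows one hypothetical rule with its trigger condition and its effect. The property class names `id`, `member_of`, and `members`, the function name `confer_members`, and the representation of a view as a mapping from identifier values to proxies are all assumptions made for this example. All three classes are assumed to be defined by the same TMA that defines the rule. Nothing here is prescribed by the TMRM.

```python
from typing import Any, Dict

# Hypothetical store: proxies keyed by the value of an assumed "id" SIP class.
View = Dict[str, Dict[str, Any]]


def confer_members(view: View) -> View:
    """Hypothetical conferral rule.

    Trigger: a proxy has a "member_of" value G.
    Effect: the proxy's id is conferred as a value component of the
    "members" property of the proxy whose id is G.
    """
    result = {pid: dict(props) for pid, props in view.items()}
    for pid, props in view.items():
        group = props.get("member_of")
        if group is None:
            continue
        # If no proxy for G exists yet, it is conferred into existence here.
        target = result.setdefault(group, {"id": group})
        # Likewise, the "members" instance may be conferred into existence.
        target["members"] = target.get("members", frozenset()) | {pid}
    return result


view = {
    "alice": {"id": "alice", "member_of": "team"},
    "bob": {"id": "bob", "member_of": "team"},
}
conferred = confer_members(view)
```

In the input there is no proxy for `team`. After the rule operates, such a proxy exists solely because of conferred values, which corresponds to being "conferred into existence" as described above.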
[parid0790] The following grammatical productions are expressed in a notation similar to that of Clause 5 of ISO 8879. They summarize what the TMRM requires TMA Disclosures to disclose, and how the property instances that appear in the subject proxies of topic map views invoke the disclosures that apply to them by means of the names of TMAs, SIP classes, and OP classes.
[parid0800]
(1) topicMapApplicationDisclosure = (topicMapApplicationName, propertyClassDefinition+, conferralRuleDefinition*) |
[parid0810]
(2) propertyClassDefinition = ( SIPClassDefinition | OPClassDefinition) |
[parid0820]
(3) SIPClassDefinition = (propertyClassName, propertyValueConstraints, propertyInstanceMergingRule, propertyInstanceSubjectSamenessDetectionRule) |
[parid0830]
(4) OPClassDefinition = (propertyClassName, propertyValueConstraints, propertyInstanceMergingRule) |
[parid0840]
(5) subjectProxy = (SIPInstance+, OPInstance*) |
[parid0850]
(6) (SIPInstance | OPInstance) = (topicMapApplicationName, propertyClassName, propertyValue) |
[parid0860]
(7) (topicMapApplicationName | propertyClassName | propertyValueConstraints | propertyInstanceMergingRule | propertyInstanceSubjectSamenessDetectionRule | propertyValue) = [unconstrained by the TMRM] |
[parid0870]
(8) conferralRuleDefinition = [unconstrained by the TMRM, except that only instances of property classes defined by the same TMA that defines the conferral rule can have values conferred upon them by that conferral rule.] |
[parid0880]
(9) topicMapView = (topicMapApplicationName+, subjectProxy+) |
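One possible reading of these productions as data structures is sketched below. It is illustrative only. The type names `PropertyClassDefinition` and `TMADisclosure` and the function `resolve` are assumptions made for this example; conferral rule definitions are omitted for brevity; and the representation of constraints and rules as functions is one choice among many, since the TMRM leaves them unconstrained. The sketch shows how a property instance, by carrying a TMA name and a property class name, invokes the disclosures that apply to it.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Optional, Tuple


@dataclass
class PropertyClassDefinition:
    name: str                              # propertyClassName
    constraint: Callable[[Any], bool]      # propertyValueConstraints
    merge: Callable[[Any, Any], Any]       # propertyInstanceMergingRule
    # propertyInstanceSubjectSamenessDetectionRule; present only for SIP classes.
    same_subject: Optional[Callable[[Any, Any], bool]] = None

    @property
    def is_sip(self) -> bool:
        return self.same_subject is not None


@dataclass
class TMADisclosure:
    name: str                              # topicMapApplicationName
    classes: Dict[str, PropertyClassDefinition] = field(default_factory=dict)

    def add(self, definition: PropertyClassDefinition) -> None:
        # SIP and OP class names share one namespace within a TMA.
        if definition.name in self.classes:
            raise ValueError(f"duplicate property class name: {definition.name}")
        self.classes[definition.name] = definition


# A property instance: (topicMapApplicationName, propertyClassName, propertyValue).
PropertyInstance = Tuple[str, str, Any]


def resolve(instance: PropertyInstance,
            disclosures: Dict[str, TMADisclosure]) -> PropertyClassDefinition:
    """Look up the class definition that governs a property instance."""
    tma_name, class_name, value = instance
    definition = disclosures[tma_name].classes[class_name]
    if not definition.constraint(value):
        raise ValueError(f"value violates constraints of {class_name}")
    return definition


tma = TMADisclosure("example-tma")
tma.add(PropertyClassDefinition(
    "identifier", lambda v: isinstance(v, str),
    merge=lambda a, b: a, same_subject=lambda a, b: a == b))
tma.add(PropertyClassDefinition(
    "labels", lambda v: isinstance(v, frozenset), merge=lambda a, b: a | b))
disclosures = {tma.name: tma}
sip_def = resolve(("example-tma", "identifier", "subject-1"), disclosures)
op_def = resolve(("example-tma", "labels", frozenset({"A"})), disclosures)
```

In this sketch a class is treated as a SIP class exactly when a subject sameness detection rule is supplied for it, mirroring the one difference between the SIP and OP class definitions in the productions.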
Editor's Note 2: |
[parid0970] The items to be defined by a TMA deserve better names. The longer forms are used herein for clarity in discussion within the committee and should not necessarily be taken as suggested final forms. |
Editor's Note 3: |
[parid0980] The TMA concept answers the question raised by the often-heard requirement that merging be possible on any basis in addition to or instead of the rules articulated in the TMDM (ISO 13250-2): If a topic map view is to be constructed with merging based on such an additional or replacement set of merging rules, how is that to be disclosed for interchange purposes? It also answers the question of how to merge two or more topic map views that follow rules that depart from the TMDM. |
Editor's Note 4: |
[parid0990] While not proposed herein, it is noted that some notation should be required of TMAs in order to facilitate interchange of topic map views based upon TMAs. At a minimum, it is suggested that names for the various components of a TMA be defined, even though the content to follow those names is of necessity unconstrained. (For example, the definition of an SIP for resources held in COBOL may need to use a different syntax and terminology than for resources that are held in XML.) For interchange purposes, however, it would be helpful for users to know that an SIP is being defined, even if the user must accept the burden of understanding whatever syntax is used to express the definition. |
5 | [parid1000] Formal Model (remarks on Barta's Tau) |
[parid1010] While extremely interesting as an implementation strategy or model, the Tau model has no concept of identity (by design), preferring to leave that issue to other parts of the topic maps standard.
[parid1020] For the TMRM, on the other hand, the question of identity lies at the core of having a topic map view. That is, in order to have interchangeable, mergeable, or even useful topic map views, one must know what was authored as a subject proxy (in TMRM terms, what possesses an SIP), on what basis that SIP was to be compared to others for purposes of detecting subject sameness, and what to do when subject sameness is detected. Knowledge of how such processing will be performed is of vital interest in promoting the use of topic maps, but is not the same as the formulation of the requirements for disclosing those choices.