ISO/IEC JTC1/SC34

Information Technology —

Document Description and Processing Languages

Title: Topic Maps — Data Model
Source: Lars Marius Garshol, Graham Moore, JTC1 / SC34
Project: ISO 13250: Topic Maps
Project editor: Lars Marius Garshol, Graham Moore
Status: Final Draft International Standard
Action: For review
Date: 2005-10-28
Summary:
Distribution: SC34 and Liaisons
Refer to:
Supersedes:
Reply to: Dr. James David Mason
(ISO/IEC JTC1/SC34 Chairman)
Y-12 National Security Complex
Information Technology Services
Bldg. 9113 M.S. 8208
Oak Ridge, TN 37831-8208 U.S.A.
Telephone: +1 865 574-6973
Facsimile: +1 865 574-1896
E-mail: mxm@y12.doe.gov
http://www.y12.doe.gov/sgml/sc34/sc34oldhome.htm

Mr. G. Ken Holman
(ISO/IEC JTC 1/SC 34 Secretariat - Standards Council of Canada)
Crane Softwrights Ltd.
Box 266,
Kars, ON K0A-2E0 CANADA
Telephone: +1 613 489-0999
Facsimile: +1 613 489-0995
Network: jtc1sc34@scc.ca

Topic Maps — Data Model

Contents

1 Scope
2 Normative references
3 Terms and definitions
4   The metamodel
4.1   Introduction
4.2   Locators
4.3   The fundamental types
4.4   Datatypes
4.5   Constraints
5   The data model
5.1   General
5.2   The topic map item
5.3   Topic items
5.3.1   Subjects and topics
5.3.2   Identifying subjects
5.3.3   Scope
5.3.4   Reification
5.3.5   Properties
5.4   Topic name items
5.5   Variant items
5.6   Occurrence items
5.7   Association items
5.8   Association role items
6   Merging
6.1   General
6.2   Merging topic items
6.3   Merging topic name items
6.4   Merging variant items
6.5   Merging occurrence items
6.6   Merging association items
6.7   Merging association role items
7   Core subject identifiers
7.1   General
7.2   The type-instance relationship
7.3   The supertype-subtype relationship
7.4   Sort names
7.5   Subjects for defined terms

Foreword

ISO (the International Organization for Standardization) and IEC (the International Electrotechnical Commission) form the specialized system for worldwide standardization. National bodies that are members of ISO or IEC participate in the development of International Standards through technical committees established by the respective organization to deal with particular fields of technical activity. ISO and IEC technical committees collaborate in fields of mutual interest. Other international organizations, governmental and non-governmental, in liaison with ISO and IEC, also take part in the work. In the field of information technology, ISO and IEC have established a joint technical committee, ISO/IEC JTC 1.

International Standards are drafted in accordance with the rules given in the ISO/IEC Directives, Part 2.

ISO/IEC 13250-2 was prepared by Joint Technical Committee ISO/IEC JTC 1, Information Technology, Subcommittee SC 34, Document Description and Processing Languages.

ISO/IEC 13250 consists of the following parts, under the general title Topic Maps:

Introduction

Topic Maps is a technology for encoding knowledge and connecting this encoded knowledge to relevant information resources. Topic maps are organized around topics, which represent subjects of discourse; associations, representing relationships between the subjects; and occurrences, which connect the subjects to pertinent information resources.

Topic maps may be represented in many ways: using Topic Maps syntaxes in files, inside databases, as internal data structures in running programs, and even mentally in the minds of humans. All these forms are different ways of representing the same abstract structure. It is that structure which this part of ISO/IEC 13250 defines, in the form of a data model.

NOTE:

The phrase "topic maps" is used in two ways in this part of ISO/IEC 13250: as a (capitalized) proper noun, "Topic Maps", denoting the name of this part of ISO/IEC 13250; and as the plural of a common noun "topic map". Both terms are defined under 3.

Topic Maps — Data Model

1 Scope

NOTE:

This clause defines the scope of this part of ISO/IEC 13250. It should not be confused with the concept of "scope" defined in 5.3.3, which only applies in the context of Topic Maps.

This part of ISO/IEC 13250 specifies a data model of Topic Maps. It defines the abstract structure of Topic Maps, using the information set formalism, and to some extent their interpretation, using prose. The rules for merging in Topic Maps are also defined, as are some fundamental subject identifiers.

The purpose of the data model is to define the interpretation of the Topic Maps interchange syntaxes, and to serve as a foundation for the definition of supporting standards for canonicalization, querying, constraints, and so on. All of these standards fall outside the scope of this part of ISO/IEC 13250, however.

NOTE:

This part of ISO/IEC 13250 does not have a conformance section since it is only a data model, and as such it has no boundary with the outside world in terms of which conformance can be specified.

2 Normative references

The following referenced documents are indispensable for the application of this document. For dated references, only the edition cited applies. For undated references, the latest edition of the referenced document (including any amendments) applies.

NOTE:

Each of the following documents has a unique identifier that is used to cite the document in the text. The unique identifier consists of the part of the reference up to the first comma, and referenced thus: [Identifier].

Unicode, The Unicode Standard, Version 4.0, The Unicode Consortium, Reading, Massachusetts, USA, Addison-Wesley, 2003, ISBN 0-321-18578-1

RFC 3986, Uniform Resource Identifiers (URI): Generic Syntax, Internet Standards Track Specification, January 2005, http://www.ietf.org/rfc/rfc3986.txt

RFC 3987, Internationalized Resource Identifiers (IRIs), Internet Standards Track Specification, January 2005, http://www.ietf.org/rfc/rfc3987.txt

XML Infoset, XML Information Set (Second Edition), World Wide Web Consortium, 4 February 2004, http://www.w3.org/TR/2004/REC-xml-infoset-20040204

ISO 10646, ISO 10646:2003: Information technology — Universal Multiple-Octet Coded Character Set (UCS), ISO, 2003

XML, Extensible Markup Language (XML) 1.0 (Third Edition), W3C Recommendation, 4 February 2004, http://www.w3.org/TR/2004/REC-xml-20040204

XML Schema-2, XML Schema Part 2: Datatypes Second Edition, W3C Recommendation, 28 October 2004, http://www.w3.org/TR/2004/REC-xmlschema-2-20041028/

3 Terms and definitions

For the purposes of this part of ISO/IEC 13250, the following terms and definitions apply.

NOTE:

These definitions are reproduced from the body of this document; for those unfamiliar with the terminology the definitions are best read in context. They are repeated here for reference.

3.1
association

representation of a relationship between one or more subjects

3.2
association role

representation of the involvement of a subject in a relationship represented by an association

3.3
association role type

subject describing the nature of the participation of an association role player in an association

3.4
association type

subject describing the nature of the relationship represented by associations of that type

3.5
information resource

resource that can be represented as a sequence of bytes and thus could potentially be retrieved over a network

3.6
item identifier

locator assigned to an information item in order to allow it to be referred to

3.7
locator

string conforming to some locator notation that references one or more information resources

3.8
merging

process applied to a topic map in order to eliminate redundant topic map constructs in that topic map

3.9
occurrence

representation of a relationship between a subject and an information resource

3.10
occurrence type

subject describing the nature of the relationship between the subjects and information resources linked by the occurrences of that type

3.11
reification

making a topic represent the subject of another topic map construct in the same topic map

3.12
scope

context within which a statement is valid

3.13
statement

claim or assertion about a subject (where the subject may be a topic map construct)

3.14
subject

anything whatsoever, regardless of whether it exists or has any other specific characteristics, about which anything whatsoever may be asserted by any means whatsoever

3.15
subject identifier

locator that refers to a subject indicator

3.16
subject indicator

information resource that is referred to from a topic map in an attempt to unambiguously identify the subject represented by a topic to a human being

3.17
subject locator

locator that refers to the information resource that is the subject of a topic

3.18
topic

symbol used within a topic map to represent one, and only one, subject, in order to allow statements to be made about the subject

3.19
topic map

set of topics and associations

3.20
topic map construct

component of a topic map; that is, a topic map, a topic, a topic name, a variant name, an occurrence, an association, or an association role.

3.21
Topic Maps

technology for encoding knowledge and connecting this encoded knowledge to relevant information resources

3.22
topic name

name for a topic, consisting of the base form, known as the base name, and variants of that base form, known as variant names

3.23
topic name type

subject describing the nature of the topic names of that type

3.24
topic type

subject that captures some commonality in a set of subjects

3.25
unconstrained scope

scope used to indicate that a statement is considered to have unlimited validity

3.26
variant name

alternative form of a topic name that may be more suitable in a certain context than the corresponding base name

4 The metamodel

4.1 Introduction

The metamodel used in this document is the same as that used by the XML Information Set [XML Infoset]. An instance of this data model consists of a number of information items, each one of which is an abstract representation of a topic map construct. Every information item is an instance of some information item type, which specifies a number of named properties which the information item shall have. Throughout this part of ISO/IEC 13250 the term "information item" refers to the information item types defined in this model, while information items of particular types are referred to as "topic items", "topic name items", and so on.

The names of these properties are written in square brackets: [property name], following the convention used in [XML Infoset]. Every property has an associated type that constrains what values it may have. Properties are not allowed to have null as their value unless this is explicitly stated in the definition of the property.

Certain properties in the model are specified as computed properties, which means that they are specified in terms of how their values may be produced from other properties in the model. These properties are specified for reasons of convenience or to better reflect the semantics of the data model but are strictly speaking redundant.
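
EXAMPLE:

As an informal illustration of a computed property, the following minimal Python sketch derives the roles played by a topic from the stored [player] values instead of storing them a second time. The class names and the flat `roles` list are hypothetical simplifications chosen for this sketch; they are not defined by the model.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # eq=False: items keep their identity, independent of values
class Topic:
    label: str


@dataclass(eq=False)
class AssociationRole:
    player: Topic  # stored property, corresponding to [player]


@dataclass(eq=False)
class TopicMap:
    topics: list = field(default_factory=list)
    roles: list = field(default_factory=list)  # simplification: roles kept in one flat list

    def roles_played(self, topic):
        # Computed on demand from the stored [player] values, so the
        # result can never disagree with them.
        return [role for role in self.roles if role.player is topic]
```

Because the value is recomputed from other properties, it is redundant in the sense described above: removing the method would lose convenience but no information.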

All information item types and fundamental types defined in this part of ISO/IEC 13250 have a well-defined test of equality. This equality test is used to avoid duplicate values in properties whose values are of type set. Information items have identity, independent of their values, so items can be compared both by identity and by value. Equality throughout this part of ISO/IEC 13250 should be taken to mean equality according to the rules defined for the types of the values being compared.

UML diagrams [UML] are used in addition to the infoset formalism for purposes of illustration. These diagrams are purely informative, and in cases of discrepancy between the diagrams and normative prose, the prose is definitive.

Figure 1 — The class hierarchy

NOTE:

TopicMapConstruct is the abstract superclass of all classes used in these UML diagrams. It is used here to simplify the UML diagrams using inheritance, and because the value of the [reified] property of topic items can be any topic map construct.

4.2 Locators

An information resource is a resource that can be represented as a sequence of bytes and thus could potentially be retrieved over a network. Topic maps can refer to information resources external to themselves in order to make statements about them. These information resources are not part of the topic map; they are only referenced from it.

A locator is a string conforming to some locator notation that references one or more information resources. Locators are always expressed in some locator notation, which is a definition of the formal syntax and interpretation of a class of locators. The definition of locator notations is outside the scope of this part of ISO/IEC 13250. All locators in this model use the notation defined by [RFC 3986] and [RFC 3987].

4.3 The fundamental types

The values of information item properties may be either other information items, or values of one of the types defined below:

String

Strings are sequences of Unicode scalar values (see [Unicode] and [ISO 10646]).

Strings are equal if they consist of the exact same sequence of Unicode scalar values.

NOTE:

This part of ISO/IEC 13250 does not require Unicode normalization to be applied to strings in order to detect that syntactically different but logically equivalent strings are in fact equivalent. The application of such logic is encouraged, however. As it cannot be guaranteed that normalization will be performed, reliance on normalization is strongly discouraged.
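
EXAMPLE:

The difference between the string equality rule and equality after normalization can be sketched in Python as follows. This is an informal illustration only; the function names are hypothetical, and it assumes well-formed text, for which Python's code-point comparison of `str` values coincides with comparison of Unicode scalar values.

```python
import unicodedata

precomposed = "\u00e9"   # LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"   # "e" followed by COMBINING ACUTE ACCENT


def strings_equal(a: str, b: str) -> bool:
    # Exact comparison of the two sequences; no normalization is applied.
    return a == b


def equal_after_nfc(a: str, b: str) -> bool:
    # Optional extra logic: compare after normalizing both strings to NFC.
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)
```

Under the equality rule the two strings are different, although they are equal after NFC normalization. An application that relies on the second comparison cannot assume that other implementations perform it.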

Set

Sets are collections of zero or more unordered elements that contain no elements that are equal to each other. In this data model the elements of a set are always information items or strings.

Two sets are equal unless there exists an element in one set for which no equal element can be found in the other.

Null

Null is a type of exactly one value, used to indicate that a property has no value; it does not necessarily indicate that the value of the property is unknown. Specifically, null has the same semantics as No Value in [XML Infoset]. In this model null can never be contained in a set.

Null is distinct from all other values (including the empty set and the empty string); it is only equal to itself.

Locator

Locators are strings conforming to some locator notation.

Locators are equal if they consist of the exact same sequence of Unicode scalar values.

NOTE:

This part of ISO/IEC 13250 does not require normalization to be applied to the syntactical expressions of locators in order to detect that syntactically different but logically equivalent locators are in fact equivalent. The application of such logic is encouraged, however. As it cannot be guaranteed that normalization will be performed, reliance on normalization is strongly discouraged.

4.4 Datatypes

The only atomic fundamental types defined in this part of ISO/IEC 13250 are strings and null. Through the concept of datatypes, data of any type can be represented in this model. All datatypes used shall have a string representation of their value space and this string representation is what is stored in the topic map. The information about which datatype the value belongs to is stored separately, in the form of a locator identifying the datatype.

For each datatype there is an IRI which identifies the datatype. This IRI is to be considered a subject identifier for the datatype, so that a topic having this IRI as its subject identifier represents the datatype. Any such topics, if present, do not affect the processing of the topic map.

This part of ISO/IEC 13250 defines only the following three datatypes, but other datatypes may also be used. These datatypes are all defined by [XML Schema-2]; the syntax of the XML datatype is defined by [XML].

String

This is the string datatype, as defined in 4.3. The identifier of this datatype is http://www.w3.org/2001/XMLSchema#string.

IRI

This is the datatype of locators using the IRI notation; the IRIs shall be absolute. The identifier of this datatype is http://www.w3.org/2001/XMLSchema#anyURI.

XML

This is the XML datatype, which represents XML document fragments. The identifier of this datatype is http://www.w3.org/2001/XMLSchema#anyType.

NOTE:

The datatype of a string value may affect its interpretation. For example, the string value "AT&amp;T" means precisely what it says if the datatype is string, but means "AT&T" if the datatype is XML.
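
EXAMPLE:

A value stored together with its datatype locator might be sketched in Python as below. The `TypedValue` class and the `text_content` helper are hypothetical names introduced for this illustration; only the three datatype identifiers come from this clause. The sketch assumes that an XML value is a well-formed fragment that can be parsed once it is wrapped in a dummy element.

```python
from dataclasses import dataclass
import xml.etree.ElementTree as ET

XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"
XSD_ANYURI = "http://www.w3.org/2001/XMLSchema#anyURI"
XSD_ANYTYPE = "http://www.w3.org/2001/XMLSchema#anyType"


@dataclass(frozen=True)
class TypedValue:
    value: str      # string representation, as stored in the topic map
    datatype: str   # locator identifying the datatype


def text_content(v: TypedValue) -> str:
    """Return the character data conveyed by the value (hypothetical helper)."""
    if v.datatype == XSD_ANYTYPE:
        # Wrap the fragment so that it parses as a single XML document.
        root = ET.fromstring("<wrapper>" + v.value + "</wrapper>")
        return "".join(root.itertext())
    # For string and IRI values the stored string is taken literally.
    return v.value
```

With this sketch the stored string "AT&amp;T" yields "AT&amp;T" when its datatype is string and "AT&T" when its datatype is XML, because only in the second case is the entity reference interpreted.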

4.5 Constraints

The model defined in this part of ISO/IEC 13250 contains not only fundamental types and information item types with named properties, but also constraints on the allowed instances of the model. The purpose of these constraints is to prevent inconsistencies in instances of the data model.

5 The data model

5.1 General

This clause defines the data model through the definition of a number of information item types together with their meaning.

A topic map construct is a component of a topic map; that is, a topic map, a topic, a topic name, a variant name, an occurrence, an association, or an association role.

An item identifier is a locator assigned to an information item in order to allow it to be referred to. This part of ISO/IEC 13250 does not constrain how item identifiers are assigned to information items.

NOTE:

In a sense item identifiers are identifiers for topic map constructs, but, unlike subject locators and subject identifiers, they are devoid of any specified semantics. Item identifiers may be freely assigned to topic map constructs.

One specific use of item identifiers is in the deserialization from the XML syntax where item identifiers are created that point back to the syntactical constructs that gave rise to the information items in the data model instance. In this case the item identifier will point to the minimal syntactical construct of origin, which means that for topic items created from the XML syntax, for example, the item identifier will point to the originating topic element, rather than the containing topicMap element.

Topic map constructs may have any number of item identifiers since when duplicate information items are merged the resulting information item inherits all the item identifiers of the original information items.

Constraint: Duplicate item identifiers

It is an error for two different information items to have strings that are equal in their [item identifiers] properties, unless they are topic items. If they are topic items they shall be merged according to the procedure in 6.2.
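
EXAMPLE:

One way to check this constraint is sketched below in Python. The representation of information items as `(kind, item_identifiers)` pairs, the function name, and the exception class are assumptions made for this illustration; the merging procedure itself is not shown.

```python
from collections import defaultdict


class DuplicateItemIdentifierError(Exception):
    """Raised when a non-topic item shares an item identifier with another item."""


def check_item_identifiers(items):
    """items: list of (kind, item_identifiers) pairs, e.g. ("topic", {"http://..."}).

    Returns lists of indexes of topic items that share an item identifier
    and therefore have to be merged; raises if a clash involves any item
    that is not a topic item.
    """
    seen = defaultdict(list)
    for index, (kind, identifiers) in enumerate(items):
        for locator in identifiers:
            seen[locator].append(index)

    to_merge = []
    for locator, indexes in seen.items():
        if len(indexes) < 2:
            continue  # identifier used by a single item: no clash
        if all(items[i][0] == "topic" for i in indexes):
            to_merge.append(indexes)
        else:
            raise DuplicateItemIdentifierError(locator)
    return to_merge
```

The sketch separates the two outcomes of the constraint: a clash between topic items is reported as work for the merging procedure, while any other clash is treated as an error.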

5.2 The topic map item

A topic map is a set of topics and associations. Its purpose is to convey information about subjects through statements about topics representing those subjects. The topic map itself has no meaning or significance beyond its use as a container for the information about those subjects.

NOTE:

Although the topic map does not represent anything, it may be reified in order to make statements about the topic map (that is, the collection of topics and associations) as a whole. These statements may for example provide traditional metadata such as author, version, copyright, or they may reference system metadata such as a schema for the topic map, external documentation of it, and so on.

Figure 2 — The topic map item

The topic map item represents the topic map. Topic map items have the following properties:

  1. [topics]: A set of topic items. All the topics in the topic map.

  2. [associations]: A set of association items. All the associations in the topic map.

  3. [reifier]: A topic item, or null. If present, the topic that reifies the topic map.

  4. [item identifiers]: A set of locators. The item identifiers of the topic map.

5.3 Topic items

5.3.1 Subjects and topics

A subject can be anything whatsoever, regardless of whether it exists or has any other specific characteristics, about which anything whatsoever may be asserted by any means whatsoever. In particular, it is anything about which the creator of a topic map chooses to discourse.

EXAMPLE:

Examples of subjects for which topics may be created are:

A topic is a symbol used within a topic map to represent one, and only one, subject, in order to allow statements to be made about the subject. A statement is a claim or assertion about a subject (where the subject may be a topic map construct). Topic names, variant names, occurrences, and associations are statements, whereas the assignment of identifying locators to topics is not considered a statement.

NOTE:

The process of merging ensures that whenever two topics are known to represent the same subject they are merged. It may well be, however, that two topics represent the same subject without this being detectable by the rules of this part of ISO/IEC 13250. Merging beyond the minimal merging required by the rules of Clause 6 is freely allowed, although such merging is not required or described by this part of ISO/IEC 13250. Most commonly this will be done by inferring the subject of the topics from statements made about them.

5.3.2 Identifying subjects

Formal identification of subjects with locators allows topic maps to be merged safely and precisely, and also allows the definition of subjects with semantics that can be implemented in Topic Maps systems. Examples of such subjects can be found in Clause 7.

A subject indicator is an information resource that is referred to from a topic map in an attempt to unambiguously identify the subject represented by a topic to a human being. Any information resource can become a subject indicator by being referred to as such from within some topic map, whether or not it was intended by its publisher to be a subject indicator.

A subject identifier is a locator that refers to a subject indicator. Topic maps contain only subject identifiers (and not the corresponding subject indicators), and consequently it is the subject identifier that is the basis for merging.

NOTE:

This part of ISO/IEC 13250 does not require implementations to dereference subject identifiers, and so it is not an error if the subject indicator does not exist. It is, however, recommended to always create a subject indicator when defining a subject identifier.

A subject locator is a locator that refers to the information resource that is the subject of a topic. The topic thus represents that particular information resource; i.e., the information resource is the subject of the topic.

NOTE:

If a topic has multiple subject locators these all refer to the same information resource. This of course raises the question of when two information resources can be considered to be the same. This part of ISO/IEC 13250 makes no attempt to clarify this and leaves it for individual locator notations to define.

EXAMPLE:

Consider the IRI http://www.iso.org. If given as the subject locator of topic A this would mean that topic A represents the information resource identified by this IRI. However, using it as the subject identifier of topic B would mean that B represents what is described in that information resource. At the time of writing this would appear to be the organization known as the International Organization for Standardization. (Note: the organization; the real-world institution known by that name. This is different from the subject of A, which is the web page itself.)

Note the uncertainty in the preceding paragraph ("would appear to be"): the information resource in question is a subject indicator for topic B, but it was not created to be a subject indicator, and so it does not entirely unambiguously indicate a single subject. This is not a criticism of the content; the content simply does not describe one single subject, nor was it ever meant to. Neither is it guaranteed to be stable: when it is dereferenced at some time in the future, it may indicate some other subject, or it may no longer exist.

5.3.3 Scope

All statements have a scope. The scope represents the context within which a statement is valid. Outside the context represented by the scope the statement is not known to be valid. Formally, a scope is composed of a set of topics that together define the context. That is, the statement is known to be valid only in contexts where all the subjects in the scope apply.

NOTE:

[ISO 13250:2003] did not explicitly define scope as being "all subjects", hence older topic maps may use scope more loosely.

The unconstrained scope is the scope used to indicate that a statement is considered to have unlimited validity. In the model this is represented by the empty set.

Precisely how a subject, or a set of subjects, defines a context is not defined by this part of ISO/IEC 13250, but left for those creating topic maps to define as part of the definition of their subjects.
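
EXAMPLE:

The "all subjects" reading of scope can be sketched in Python as a subset test. The function name and the use of plain strings to stand in for topics are assumptions made for this illustration; how an application determines the subjects that apply in a given context is left open, as stated above.

```python
def known_valid(scope: frozenset, context: frozenset) -> bool:
    # The statement is known to be valid only where every topic in its
    # scope applies. The empty set (unconstrained scope) is a subset of
    # every context, so such statements are valid everywhere.
    return scope <= context


UNCONSTRAINED = frozenset()
```

For a name scoped by the hypothetical topics "norwegian" and "formal", the test succeeds in a context where both apply and fails where only one of them applies.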

EXAMPLE:

Examples of the use of scope are given below:

5.3.4 Reification

The act of reification is the act of making a topic represent the subject of another topic map construct in the same topic map. For example, creating a topic that represents the relationship represented by an association is reification.

NOTE:

The use of the term 'reification' in this part of ISO/IEC 13250 is not to be confused with its use in philosophy.

In many cases it is desirable to be able to attach additional information to topic map constructs such as topic names or associations. One may want to give an association occurrences, or to give an occurrence a name. The basic Topic Maps model does not allow this, except through reification; that is, creating a topic that reifies the topic map construct. The necessary information can then be attached to the reifying topic, and the reification relationship is present in a structured form that can reliably be detected by implementations.

NOTE:

One topic cannot reify another. A topic reifying a topic map construct in reality represents the real-world thing represented by that topic map construct. A topic reifying an association really represents the relationship represented by that association, and so if one topic were to reify another that would mean that the topic represents the subject of the other, and so the two would have to merge, since they would have the same subject.

5.3.5 Properties

Figure 3 — The topic item

Topic items represent topics. Topic items have the following properties:

  1. [topic names]: A set of topic name items. This is the set of topic names assigned to this topic.

  2. [occurrences]: A set of occurrence items. This is the set of occurrences assigned to this topic.

  3. [roles played]: A set of association role items. This is the set of association roles played by this topic.

    Computed value: the set of all association role items whose [player] property value is this topic item.

  4. [subject identifiers]: A set of locators. The locators referring to the subject indicators of this topic.

  5. [subject locators]: A set of locators. The locators referring to the information resource that is the subject of this topic.

  6. [reified]: An information item, or null. If given, the topic map construct that is reified by this topic.

    Computed value: the information item whose [reifier] property contains this topic item.

  7. [item identifiers]: A set of locators. The item identifiers of the topic.

  8. [parent]: An information item. The topic map containing the topic.

    Computed value: the topic map item whose [topics] property contains this topic item.

Equality rule: Two topic items are equal if they have:

Constraint: Topic identity required

All topic items shall have a value for at least one of the [subject identifiers], [subject locators], and [item identifiers] properties that is not the empty set.
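
EXAMPLE:

The three identity properties and this constraint might be sketched in Python as follows. The `TopicItem` class and the function name are hypothetical; the other topic item properties are omitted for brevity.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # topic items keep their own identity
class TopicItem:
    subject_identifiers: set = field(default_factory=set)
    subject_locators: set = field(default_factory=set)
    item_identifiers: set = field(default_factory=set)


def satisfies_identity_constraint(topic: TopicItem) -> bool:
    # At least one of the three sets of locators has to be non-empty.
    return bool(
        topic.subject_identifiers
        or topic.subject_locators
        or topic.item_identifiers
    )


# Topic A represents the information resource itself (subject locator);
# topic B represents what that resource describes (subject identifier).
topic_a = TopicItem(subject_locators={"http://www.iso.org"})
topic_b = TopicItem(subject_identifiers={"http://www.iso.org"})
```

Both `topic_a` and `topic_b` satisfy the constraint, although they represent different subjects; a topic item created with three empty sets would violate it.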

NOTE:

Locators which refer directly to subjects which are not information resources should be used with caution. They should not be used in the [subject locators] property, as this is intended only for references to information resources. Rather, they should be placed in the [subject identifiers] property.

The isbn URN scheme used to identify books ([RFC 2288]), for example, does not reference information resources, and so should not be put in the [subject locators] property, but instead in the [subject identifiers] property.

NOTE:

Topics may in addition to the properties defined above also have types, instances, supertypes, and subtypes, represented by means of associations using the subject identifiers defined in 7.2 and 7.3.

5.4 Topic name items

A topic name is a name for a topic, consisting of the base form, known as the base name, and variants of that base form, known as variant names. A topic name type is a subject describing the nature of the topic names of that type.

Topic names always have a scope, which defines in what context the topic name is an appropriate label for the subject. A topic may have any number of topic names.

A base name is a name or label for a subject, expressed as a string. That is, it is something that identifies the subject (though not necessarily uniquely) and can be used as a label for the subject in user interfaces. The notion of a base name corresponds closely to the common sense notion of a name.

NOTE:

Suitable base names for people, countries, and organizations are their names, while base names for documents, musical works, and movies might be their titles. Base names may have variant names, which are alternative forms of the base name that may be more appropriate in specific contexts. Essentially, a base name is a specialized kind of occurrence.

Figure 4 — The topic name item

Topic name items represent topic names. Topic name items have the following properties:

  1. [value]: A string. The base name; the base form of the topic name.

  2. [type]: A topic item. The topic defining the nature of this topic name.

  3. [scope]: A set of topic items. The scope that represents the context in which the topic name is considered to be a valid label for the topic.

  4. [variants]: A set of variant name items. The variant names that are alternative forms of the topic name.

  5. [reifier]: A topic item, or null. If present, the topic that reifies the topic name.

  6. [item identifiers]: A set of locators. The item identifiers of this topic name.

  7. [parent]: An information item. The topic to which the topic name belongs.

    Computed value: the topic item whose [topic names] property contains this topic name item.

Equality rule: Topic name items are equal if the values of their [value], [type], [scope], and [parent] properties are equal.
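
EXAMPLE:

The equality rule can be sketched in Python with a data class whose generated comparison covers only the four properties named in the rule. The class is a hypothetical illustration; it assumes that equal topics have already been merged, so that the topics in [type], [scope], and [parent] can be represented by plain string tokens.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class TopicName:
    value: str
    type: str          # token standing in for the topic item in [type]
    scope: frozenset   # tokens standing in for the topic items in [scope]
    parent: str        # token standing in for the topic item in [parent]
    # Excluded from the comparison: not part of the equality rule.
    item_identifiers: frozenset = field(default=frozenset(), compare=False)
```

Two `TopicName` instances that differ only in their item identifiers compare as equal and collapse to one element when placed in a Python set, which mirrors how equality is used to avoid duplicates in set-valued properties.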

NOTE:

Topic name items have a [value] property, but no [datatype] property, because the datatype of the [value] is always string.

5.5 Variant items

A variant name is an alternative form of a topic name that may be more suitable in a certain context than the corresponding base name. The scope of the variant name is the only basis for establishing what variant name is most suitable in any given situation. A variant name may be a string, but it may also be any other kind of information resource.

NOTE:

When choosing a label for a topic, the topic name considered most appropriate should be chosen; thereafter the form of the topic name best suited for display in that particular context should be chosen, which may be the base name or one of its variants.

Figure 5 — The variant name item

Variant items represent variant names. Variant items have the following properties:

  1. [value]: A string. If the datatype is IRI, a locator referring to the information resource that is the variant name; otherwise the string is the variant name.

  2. [datatype]: A locator. A locator identifying the datatype of the variant name value.

  3. [scope]: A non-empty set of topic items. The scope that represents the context in which the variant name is preferred as a label for the topic.

  4. [reifier]: A topic item, or null. If present, the topic that reifies the variant name.

  5. [item identifiers]: A set of locators. The item identifiers of the variant name.

  6. [parent]: An information item. The topic name to which the variant belongs.

    Computed value: the topic name item whose [variants] property contains this variant item.

Equality rule: Variant items are equal if the values of their [value], [datatype], [scope], and [parent] properties are equal.

Constraint: Variant scope

The value of the [scope] property of each variant item shall be a true superset of the value of the [scope] property of the topic name item in its [parent] property.
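
EXAMPLE:

The variant scope constraint can be illustrated with a minimal Python sketch. It is informative only. The function name is hypothetical, topic items are stood in for by plain strings, and 'true superset' is read here as 'proper superset'.

```python
def variant_scope_is_valid(variant_scope, name_scope):
    """Return True if variant_scope is a true (proper) superset of name_scope."""
    # On frozensets the '>' operator tests for a proper superset.
    return frozenset(variant_scope) > frozenset(name_scope)


# Topic items are represented by plain strings (an assumption of this sketch).
name_scope = {"english"}
print(variant_scope_is_valid({"english", "sort"}, name_scope))  # True
print(variant_scope_is_valid({"english"}, name_scope))          # False: equal scopes
print(variant_scope_is_valid({"sort"}, set()))                  # True: name in the unconstrained scope
```

Under this reading, a variant item always has at least one scoping topic that its topic name lacks, which is consistent with the [scope] property of a variant item being a non-empty set.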

5.6 Occurrence items

An occurrence is a representation of a relationship between a subject and an information resource. The subject in question is that represented by the topic which contains the occurrence. The information resource may either be a value inside the topic map or an external information resource. Occurrences are essentially a specialized kind of association, where one participant in the association shall be an information resource. An occurrence type is a subject describing the nature of the relationship between the subjects and information resources linked by the occurrences of that type.

All occurrences have a scope, which defines the context in which the occurrence relationship between the information resource and the subject is valid.

Figure 6 — The occurrence item

Occurrence items represent occurrences. Occurrence items have the following properties:

  1. [value]: A string. If the datatype is IRI, a locator referring to the information resource the occurrence connects with the subject; otherwise the string is the information resource.

  2. [datatype]: A locator. A locator identifying the datatype of the occurrence value.

  3. [scope]: A set of topic items. The scope that represents the context in which the occurrence relationship is considered valid.

  4. [type]: A topic item. The topic that defines the nature of the occurrence relationship.

  5. [reifier]: A topic item, or null. If present, the topic that reifies the occurrence.

  6. [item identifiers]: A set of locators. The item identifiers of the occurrence.

  7. [parent]: An information item. The topic to which the occurrence belongs.

    Computed value: the topic item whose [occurrences] property contains this occurrence item.

Equality rule: Occurrence items are equal if the values of their [value], [datatype], [scope], [type], and [parent] properties are equal.
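
EXAMPLE:

The equality rule for occurrence items could be modelled as in the following Python sketch, which is informative only. The class layout, the use of strings in place of topic items, and the locator chosen for the IRI datatype are all assumptions of the sketch.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, Optional

# Assumed locator for the IRI datatype; it is not stated in this subclause.
IRI = "http://www.w3.org/2001/XMLSchema#anyURI"


@dataclass(frozen=True)
class Occurrence:
    # Properties that take part in the equality rule.
    value: str
    datatype: str
    scope: FrozenSet[str]
    type: str
    parent: str
    # Properties that are ignored when testing equality.
    reifier: Optional[str] = field(default=None, compare=False)
    item_identifiers: FrozenSet[str] = field(default=frozenset(), compare=False)

    def refers_to_resource(self) -> bool:
        """True if [value] is a locator rather than the resource itself."""
        return self.datatype == IRI


a = Occurrence("http://example.org/ibsen", IRI, frozenset(), "homepage", "ibsen",
               item_identifiers=frozenset({"#occ1"}))
b = Occurrence("http://example.org/ibsen", IRI, frozenset(), "homepage", "ibsen",
               item_identifiers=frozenset({"#occ2"}))
c = Occurrence("http://example.org/ibsen", IRI, frozenset({"norwegian"}), "homepage", "ibsen")

print(a == b)       # True: item identifiers are not compared
print(a == c)       # False: the scopes differ
print(len({a, b}))  # 1
```

Because a and b are equal, a set cannot hold both of them as distinct members; this is the situation in which the merging rules of clause 6 apply.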

5.7 Association items

An association is a representation of a relationship between one or more subjects. Associations have an association type, a subject describing the nature of the relationship represented by associations of that type.

An association role is a representation of the involvement of a subject in a relationship represented by an association. An association role connects two pieces of information within an association: the association role player, that is, the topic participating in the association, and the association role type, that is, a subject describing the nature of the participation of an association role player in an association.

EXAMPLE:

An example of an association might be the 'authorship' relationship between Henrik Ibsen and the play 'Peer Gynt'. In this relationship there are two roles: Ibsen plays the role of 'author', while 'Peer Gynt' plays the role of 'work'.

Another example might be the 'parenthood' relationship between Hamlet, King Hamlet, and Queen Gertrude. This relationship has three roles: Hamlet plays the role of 'child', the King that of 'father', and the Queen that of 'mother'.

All associations have a scope, which defines the context in which the relationship represented by the association is considered valid. The scope also applies to the assignment of the roles to the topics playing them; that is, the scope defines the context in which the topics can be said to play the roles in the association.

Figure 7 — The association item

Association items represent associations. Association items have the following properties:

  1. [type]: A topic item. The topic that defines the nature of the relationship represented by the association.

  2. [scope]: A set of topic items. The scope that represents the context in which the association is considered valid.

  3. [roles]: A non-empty set of association role items. The association roles for all the topics that participate in this relationship.

  4. [reifier]: A topic item, or null. If present, the topic that reifies the association.

  5. [item identifiers]: A set of locators. The item identifiers of the association.

  6. [parent]: An information item. The topic map containing the association.

    Computed value: the topic map item whose [associations] property contains this association item.

Equality rule: Association items are equal if the values of their [scope], [type], and [roles] properties are equal.
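
EXAMPLE:

The two example associations above can be written down in a small Python sketch, given here for illustration only. Each association role is simplified to a (role type, player) pair and topic items to strings; both simplifications are assumptions of the sketch and leave out the [parent] property of association role items.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple


@dataclass(frozen=True)
class Association:
    type: str
    scope: FrozenSet[str]
    # Each role is simplified to a (role type, player) pair.
    roles: FrozenSet[Tuple[str, str]]


authorship = Association(
    type="authorship",
    scope=frozenset(),
    roles=frozenset({("author", "henrik-ibsen"), ("work", "peer-gynt")}),
)
duplicate = Association(
    type="authorship",
    scope=frozenset(),
    roles=frozenset({("work", "peer-gynt"), ("author", "henrik-ibsen")}),
)
parenthood = Association(
    type="parenthood",
    scope=frozenset(),
    roles=frozenset({("child", "hamlet"), ("father", "king-hamlet"), ("mother", "queen-gertrude")}),
)

associations = {authorship, duplicate, parenthood}
print(authorship == duplicate)  # True: same scope, type and roles
print(len(associations))        # 2
```

The order in which roles are listed has no effect, since [roles] is a set.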

5.8 Association role items

NOTE:

See 5.7 for the definition of the term 'association role'.

Figure 8 — The association role item

Association role items represent association roles. Association role items have the following properties:

  1. [player]: A topic item. The topic that plays this role in the association.

  2. [type]: A topic item. The topic that represents the nature of the involvement of the association role player in the association.

  3. [reifier]: A topic item, or null. If present, the topic that reifies the association role.

  4. [item identifiers]: A set of locators. The item identifiers of this association role.

  5. [parent]: An information item. The association to which the association role belongs.

    Computed value: the association item whose [roles] property contains this association role item.

Equality rule: Association role items are equal if the values of their [type], [player], and [parent] properties are equal.

6 Merging

6.1 General

A central operation in Topic Maps is that of merging, a process applied to a topic map in order to eliminate redundant topic map constructs in that topic map. This clause specifies in which situations merging shall occur, but the rules given here are insufficient to ensure that all redundant information is removed from a topic map.

Any change to a topic map that causes a set to contain two information items equal to each other shall be followed by the merging of those two information items according to the rules given below for the type of information item to which the two equal information items belong.
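
EXAMPLE:

One way to detect that a set contains equal information items is to group the items by the properties named in their equality rule. The Python sketch below is informative only; the function names and the dictionary representation of association role items are hypothetical.

```python
from collections import defaultdict


def groups_to_merge(items, equality_key):
    """Group items whose equality keys match; return groups with more than one member."""
    groups = defaultdict(list)
    for item in items:
        groups[equality_key(item)].append(item)
    return [group for group in groups.values() if len(group) > 1]


def role_key(role):
    # Equality rule for association role items: [type], [player], [parent].
    return (role["type"], role["player"], role["parent"])


roles = [
    {"type": "author", "player": "ibsen", "parent": "assoc-1", "item_identifiers": {"#r1"}},
    {"type": "work", "player": "peer-gynt", "parent": "assoc-1", "item_identifiers": {"#r2"}},
    {"type": "author", "player": "ibsen", "parent": "assoc-1", "item_identifiers": {"#r3"}},
]

for group in groups_to_merge(roles, role_key):
    print([sorted(role["item_identifiers"]) for role in group])  # [['#r1'], ['#r3']]
```

Each group returned would then be merged according to the procedure for its type of information item.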

6.2 Merging topic items

The procedure for merging two topic items A and B (whose [parent] properties shall contain the same topic map item) is given below. It is an error if A and B both have non-null values in their [reified] properties and those values are different.

  1. Create a new topic item C.

  2. Replace A by C wherever it appears in one of the following properties of an information item: [topics], [scope], [type], [player], and [reifier].

  3. Repeat for B.

  4. Set C's [topic names] property to the union of the values of A and B's [topic names] properties.

  5. Set C's [occurrences] property to the union of the values of A and B's [occurrences] properties.

  6. Set C's [subject identifiers] property to the union of the values of A and B's [subject identifiers] properties.

  7. Set C's [subject locators] property to the union of the values of A and B's [subject locators] properties.

  8. Set C's [item identifiers] property to the union of the values of A and B's [item identifiers] properties.
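
EXAMPLE:

The steps above can be sketched in Python as follows. The sketch is informative only: the class and function names are hypothetical, topic names and occurrences are reduced to strings, and the [reified] value is computed by searching the supplied items rather than being stored.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass(eq=False)  # identity-based equality keeps items hashable
class Topic:
    topic_names: Set[str] = field(default_factory=set)
    occurrences: Set[str] = field(default_factory=set)
    subject_identifiers: Set[str] = field(default_factory=set)
    subject_locators: Set[str] = field(default_factory=set)
    item_identifiers: Set[str] = field(default_factory=set)


@dataclass(eq=False)
class Item:
    """Stand-in for any information item that refers to topic items."""
    type: Optional[Topic] = None
    player: Optional[Topic] = None
    reifier: Optional[Topic] = None
    scope: Set[Topic] = field(default_factory=set)


def reified_by(topic, items):
    """Compute [reified]: the item whose [reifier] is the given topic, or None."""
    return next((item for item in items if item.reifier is topic), None)


def merge_topics(a, b, topics, items):
    ra, rb = reified_by(a, items), reified_by(b, items)
    if ra is not None and rb is not None and ra is not rb:
        raise ValueError("A and B reify different information items")
    c = Topic()                                    # step 1
    for old in (a, b):                             # steps 2 and 3
        if old in topics:
            topics.discard(old)
            topics.add(c)
        for item in items:
            if old in item.scope:
                item.scope.discard(old)
                item.scope.add(c)
            for prop in ("type", "player", "reifier"):
                if getattr(item, prop) is old:
                    setattr(item, prop, c)
    c.topic_names = a.topic_names | b.topic_names                          # step 4
    c.occurrences = a.occurrences | b.occurrences                          # step 5
    c.subject_identifiers = a.subject_identifiers | b.subject_identifiers  # step 6
    c.subject_locators = a.subject_locators | b.subject_locators           # step 7
    c.item_identifiers = a.item_identifiers | b.item_identifiers           # step 8
    return c


a = Topic(topic_names={"Henrik Ibsen"}, subject_identifiers={"http://example.org/psi/ibsen"})
b = Topic(topic_names={"Ibsen"}, item_identifiers={"#ibsen"})
role = Item(player=a)
occurrence = Item(type=b, scope={a, b})
topics = {a, b}

c = merge_topics(a, b, topics, [role, occurrence])
print(topics == {c}, role.player is c, occurrence.scope == {c})  # True True True
```

In a fuller model the union in steps 4 and 5 may bring together topic name items or occurrence items that are equal to each other, in which case those items would in turn be merged as required by 6.1.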

6.3 Merging topic name items

The procedure for merging two topic name items A and B is given below.

  1. Create a new topic name item C.

  2. Set C's [value] property to the value of the [value] property of A. B's value is equal to that of A and need not be taken into account.

  3. Set C's [type] property to the value of the [type] property of A. B's value is equal to that of A and need not be taken into account.

  4. Set C's [scope] property to the value of the [scope] property of A. B's value is equal to that of A and need not be taken into account.

  5. Set C's [variants] property to the union of the [variants] properties of A and B.

  6. Set C's [reifier] property to the value of A's [reifier] property if it is not null, and to the value of B's [reifier] property if A's property is null. If both A and B have non-null values, the topic items shall be merged, and the topic item resulting from the merge set as the value of C's [reifier] property.

  7. Set C's [item identifiers] property to the union of the values of the [item identifiers] properties of A and B.

  8. Remove A and B from the [topic names] property of the topic item in their [parent] properties, and add C.
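
EXAMPLE:

The procedure above, including the rule for the [reifier] property that recurs in 6.4 to 6.7, might be sketched in Python as follows. The sketch is informative only. The class names are hypothetical, variants and reifiers are reduced to strings, [parent] is stored rather than computed, and merge_topics is an assumed callback standing in for the procedure of 6.2.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, Optional, Set


@dataclass(eq=False)  # identity-based equality keeps items hashable
class Topic:
    topic_names: set = field(default_factory=set)


@dataclass(eq=False)
class TopicName:
    value: str
    type: str
    scope: FrozenSet[str]
    parent: Topic
    variants: Set[str] = field(default_factory=set)
    reifier: Optional[str] = None
    item_identifiers: Set[str] = field(default_factory=set)


def merge_topic_names(a, b, merge_topics):
    c = TopicName(a.value, a.type, a.scope, a.parent)             # steps 1 to 4
    c.variants = a.variants | b.variants                          # step 5
    if a.reifier is not None and b.reifier is not None:           # step 6
        c.reifier = merge_topics(a.reifier, b.reifier)
    else:
        c.reifier = a.reifier if a.reifier is not None else b.reifier
    c.item_identifiers = a.item_identifiers | b.item_identifiers  # step 7
    names = a.parent.topic_names                                  # step 8
    names.discard(a)
    names.discard(b)
    names.add(c)
    return c


ibsen = Topic()
a = TopicName("Henrik Ibsen", "name", frozenset(), ibsen,
              variants={"ibsen, henrik"}, item_identifiers={"#n1"})
b = TopicName("Henrik Ibsen", "name", frozenset(), ibsen,
              variants={"IBSEN"}, reifier="reifier-b", item_identifiers={"#n2"})
ibsen.topic_names = {a, b}

# The lambda is a placeholder for merging two reifying topic items.
c = merge_topic_names(a, b, merge_topics=lambda x, y: x + "+" + y)
print(c.reifier)                 # reifier-b
print(ibsen.topic_names == {c})  # True
```

The same pattern applies to variant, occurrence, association and association role items, with the compared properties and the containing property changed accordingly.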

6.4 Merging variant items

The procedure for merging two variant items A and B is given below.

  1. Create a new variant item, C.

  2. Set C's [value] property to the value of A's [value] property. B's value is equal to that of A and need not be taken into account.

  3. Set C's [datatype] property to the value of A's [datatype] property. B's value is equal to that of A and need not be taken into account.

  4. Set C's [scope] property to the value of A's [scope] property. B's value is equal to that of A and need not be taken into account.

  5. Set C's [reifier] property to the value of A's [reifier] property if it is not null, and to the value of B's [reifier] property if A's property is null. If both A and B have non-null values, the topic items shall be merged, and the topic item resulting from the merge set as the value of C's [reifier] property.

  6. Set C's [item identifiers] property to the union of the values of A's and B's [item identifiers] properties.

  7. Remove A and B from the [variants] property of the topic name item in their [parent] properties, and add C.

6.5 Merging occurrence items

The procedure for merging two occurrence items A and B is given below.

  1. Create a new occurrence item, C.

  2. Set C's [value] property to the value of A's [value] property. B's value is equal to that of A and need not be taken into account.

  3. Set C's [datatype] property to the value of A's [datatype] property. B's value is equal to that of A and need not be taken into account.

  4. Set C's [scope] property to the value of A's [scope] property. B's value is equal to that of A and need not be taken into account.

  5. Set C's [type] property to the value of A's [type] property. B's value is equal to that of A and need not be taken into account.

  6. Set C's [reifier] property to the value of A's [reifier] property if it is not null, and to the value of B's [reifier] property if A's property is null. If both A and B have non-null values, the topic items shall be merged, and the topic item resulting from the merge set as the value of C's [reifier] property.

  7. Set C's [item identifiers] property to the union of the values of A's and B's [item identifiers] properties.

  8. Remove A and B from the [occurrences] property of the topic item in their [parent] properties, and add C.

6.6 Merging association items

The procedure for merging two association items A and B is given below.

  1. Create a new association item, C.

  2. Set C's [type] property to the value of A's [type] property. B's value is equal to that of A and need not be taken into account.

  3. Set C's [scope] property to the value of A's [scope] property. B's value is equal to that of A and need not be taken into account.

  4. Set C's [roles] property to the value of A's [roles] property. B's value is equal to that of A and need not be taken into account.

  5. Set C's [reifier] property to the value of A's [reifier] property if it is not null, and to the value of B's [reifier] property if A's property is null. If both A and B have non-null values, the topic items shall be merged, and the topic item resulting from the merge set as the value of C's [reifier] property.

  6. Set C's [item identifiers] property to the union of the values of A's and B's [item identifiers] properties.

  7. Remove A and B from the [associations] property of the topic map item in their [parent] properties, and add C.

6.7 Merging association role items

The procedure for merging two association role items A and B is given below.

  1. Create a new association role item, C.

  2. Set C's [player] property to the value of A's [player] property. B's value is equal to that of A and need not be taken into account.

  3. Set C's [type] property to the value of A's [type] property. B's value is equal to that of A and need not be taken into account.

  4. Set C's [item identifiers] property to the union of the values of A's and B's [item identifiers] properties.

  5. Set C's [reifier] property to the value of A's [reifier] property if it is not null, and to the value of B's [reifier] property if A's property is null. If both A and B have non-null values, the topic items shall be merged, and the topic item resulting from the merge set as the value of C's [reifier] property.

  6. Remove A and B from the [roles] property of the association item in their [parent] properties, and add C.

7 Core subject identifiers

7.1 General

This clause defines a number of core subject identifiers in order to achieve interoperability through consistent behaviour. These subject identifiers are central to this part of ISO/IEC 13250, yet there is no requirement that they be used, and alternative subject identifiers for the same functionality may be defined and used instead.

All core subject identifiers defined by this part of ISO/IEC 13250 are distinct, that is, topics representing these subjects cannot be merged with one another.

7.2 The type-instance relationship

A topic type is a subject that captures some commonality in a set of subjects. Any subject that belongs to the extension of a particular topic type is known as an instance of that topic type. A topic type may itself be an instance of another topic type, and there is no limit to the number of topic types a subject may be an instance of.

The type-instance relationship is not transitive. That is, if B is an instance of the type A, and C is an instance of the type B, it does not follow that C is an instance of A.

The type-instance relationship between two topics can be asserted using an association item that conforms to the following rules:

Association items that use one or more of the subject identifiers defined in this clause, but which do not conform to these structural rules, are not considered to represent type-instance relationships.

Scope applies to this association type in the same way as it does to any other.

7.3 The supertype-subtype relationship

The supertype-subtype relationship is the relationship between a more general type (the supertype) and a specialization of that type (the subtype). If B is the subtype of A, it follows that every instance of B is also an instance of A. The converse is not necessarily true. A type may have any number of subtypes and supertypes.

The supertype-subtype relationship is transitive, which means that if B is a subtype of A, and C a subtype of B, C is also a subtype of A.

NOTE:

Loops in this relationship are allowed, and should be interpreted to mean that the sets of instances for all types in the loop are the same. This does not, however, necessarily imply that the types are the same.

NOTE:

The semantics of the supertype-subtype relationship imply the existence of type-instance and supertype-subtype relationships in addition to those explicitly represented by associations in the topic map. This part of ISO/IEC 13250 does not require associations to be created for inferred relationships.
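
EXAMPLE:

The inferred relationships can be computed rather than stored. The Python sketch below is informative only; the function names and the dictionary representation of the asserted relationships are hypothetical, and scope is ignored. It follows supertype-subtype relationships transitively but does not chain type-instance relationships, since the latter are not transitive.

```python
def supertypes_of(t, direct_supertypes):
    """All supertypes of t, following the relationship transitively; tolerates loops."""
    seen = set()
    stack = [t]
    while stack:
        current = stack.pop()
        for sup in direct_supertypes.get(current, ()):
            if sup not in seen:
                seen.add(sup)
                stack.append(sup)
    return seen


def types_of(x, direct_types, direct_supertypes):
    """Asserted types of x plus all of their supertypes."""
    result = set()
    for t in direct_types.get(x, ()):
        result.add(t)
        result |= supertypes_of(t, direct_supertypes)
    return result


direct_supertypes = {"playwright": {"author"}, "author": {"person"}}
direct_types = {"ibsen": {"playwright"}, "playwright": {"profession"}}

print(sorted(supertypes_of("playwright", direct_supertypes)))
# ['author', 'person']
print(sorted(types_of("ibsen", direct_types, direct_supertypes)))
# ['author', 'person', 'playwright']
```

Here 'ibsen' is not an instance of 'profession', although 'playwright' is. Where the asserted relationships form a loop, every type in the loop is reported as a supertype of every other, and of itself.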

The supertype-subtype relationship between two types can be asserted using an association item that conforms to the following rules:

Association items that use one or more of the subject identifiers defined in this clause, but which do not conform to these structural rules, are not considered to represent supertype-subtype relationships.

Scope applies to this association type in the same way as it does to any other.

EXAMPLE:

Scope makes the interpretation of transitivity more complex. If A is an instance of B in scope Y and X, and B is a subtype of C in scope Y and Z, then A is an instance of C only in the context where all three topics X, Y, and Z apply. This is because both relationships are needed in order to conclude that A is an instance of C, and the failure of any one of these topics to apply will make at least one of the relationships invalid.
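
The reasoning in this example amounts to taking the union of the two scopes. A minimal Python sketch, given for illustration only and using hypothetical function names with strings in place of topic items, is:

```python
def inferred_scope(instance_scope, subtype_scope):
    """Scope of the inferred type-instance relationship: both scopes must apply."""
    return frozenset(instance_scope) | frozenset(subtype_scope)


def applies_in(scope, context):
    """A statement is taken to be valid when every topic in its scope applies."""
    return frozenset(scope) <= frozenset(context)


a_instance_of_b = {"X", "Y"}
b_subtype_of_c = {"Y", "Z"}
scope = inferred_scope(a_instance_of_b, b_subtype_of_c)

print(sorted(scope))                       # ['X', 'Y', 'Z']
print(applies_in(scope, {"X", "Y", "Z"}))  # True
print(applies_in(scope, {"X", "Y"}))       # False: Z does not apply
```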

7.4 Sort names

Sort names are a particular form of variant name used to sort topics. Sort names will be sorted on the value in the [value] property in Unicode code point order. Implementations may use other sort orders for datatypes other than those defined in this part of ISO/IEC 13250. To get a particular sort order, use sort names that, when sorted with this algorithm, result in the desired order.

Sort names are represented by variant items whose [scope] property contains a topic item whose [subject identifiers] property contains the string "http://psi.topicmaps.com/iso13250/sort".
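
EXAMPLE:

A possible way of applying sort names is sketched below in Python, for illustration only. Python compares strings by Unicode code point, which matches the order described above. The dictionary representation, the reduction of each scoping topic to one of its subject identifiers, and the fallback to the base name when no sort name exists are assumptions of the sketch.

```python
SORT_PSI = "http://psi.topicmaps.com/iso13250/sort"


def sort_key(name):
    """Return the sort name of a topic name, or its base name if there is none."""
    for variant in name["variants"]:
        # Each scope is given as the set of subject identifiers of its topics.
        if SORT_PSI in variant["scope"]:
            return variant["value"]
    return name["value"]  # fallback chosen for this sketch only


names = [
    {"value": "Henrik Ibsen",
     "variants": [{"value": "ibsen, henrik", "scope": {SORT_PSI}}]},
    {"value": "Émile Zola",
     "variants": [{"value": "zola, emile", "scope": {SORT_PSI}}]},
    {"value": "Aristotle", "variants": []},
]

ordered = sorted(names, key=sort_key)
print([name["value"] for name in ordered])
# ['Aristotle', 'Henrik Ibsen', 'Émile Zola']
```

'Aristotle' comes first because the code point of 'A' is lower than that of any lower-case letter; choosing sort names in a single case avoids such effects.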

7.5 Subjects for defined terms

This clause defines a subject for each formally defined term in this part of ISO/IEC 13250.

http://psi.topicmaps.com/iso13250/association

Definition: representation of a relationship between one or more subjects.

Usage: the type of all associations.

http://psi.topicmaps.com/iso13250/association-role

Definition: representation of the involvement of a subject in a relationship represented by an association.

Usage: the type of all association roles.

http://psi.topicmaps.com/iso13250/association-role-type

Definition: subject describing the nature of the participation of an association role player in an association.

Usage: the type of all association role types.

http://psi.topicmaps.com/iso13250/association-type

Definition: subject describing the nature of the relationship represented by associations of that type.

Usage: the type of all association types.

http://psi.topicmaps.com/iso13250/information-resource

Definition: resource that can be represented as a sequence of bytes and thus could potentially be retrieved over a network.

Usage: the type of all topics representing information resources.

http://psi.topicmaps.com/iso13250/item-identifier

Definition: locator assigned to an information item in order to allow it to be referred to.

http://psi.topicmaps.com/iso13250/locator

Definition: string conforming to some locator notation that references one or more information resources.

http://psi.topicmaps.com/iso13250/merging

Definition: process applied to a topic map in order to eliminate redundant topic map constructs in that topic map.

http://psi.topicmaps.com/iso13250/occurrence

Definition: representation of a relationship between a subject and an information resource.

Usage: the type of all occurrences.

http://psi.topicmaps.com/iso13250/occurrence-type

Definition: subject describing the nature of the relationship between the subjects and information resources linked by the occurrences of that type.

Usage: the type of all occurrence types.

http://psi.topicmaps.com/iso13250/reification

Definition: making a topic represent the subject of another topic map construct in the same topic map.

http://psi.topicmaps.com/iso13250/scope

Definition: context within which a statement is valid.

http://psi.topicmaps.com/iso13250/statement

Definition: claim or assertion about a subject (where the subject may be a topic map construct).

http://psi.topicmaps.com/iso13250/subject

Definition: anything whatsoever, regardless of whether it exists or has any other specific characteristics, about which anything whatsoever may be asserted by any means whatsoever.

Usage: the type of all topics.

http://psi.topicmaps.com/iso13250/subject-identifier

Definition: locator that refers to a subject indicator.

http://psi.topicmaps.com/iso13250/subject-indicator

Definition: information resource that is referred to from a topic map in an attempt to unambiguously identify the subject represented by a topic to a human being.

Usage: the type of all information resources used as subject indicators.

http://psi.topicmaps.com/iso13250/subject-locator

Definition: locator that refers to the information resource that is the subject of a topic.

http://psi.topicmaps.com/iso13250/topic

Definition: symbol used within a topic map to represent one, and only one, subject, in order to allow statements to be made about the subject.

http://psi.topicmaps.com/iso13250/topic-map

Definition: set of topics and associations.

Usage: the type of all topics representing topic maps.

http://psi.topicmaps.com/iso13250/topic-map-construct

Definition: component of a topic map; that is, a topic map, a topic, a topic name, a variant name, an occurrence, an association, or an association role.

Usage: common supertype for topic, topic name, variant name, occurrence, association, association role, and topic map.

http://psi.topicmaps.com/iso13250/Topic-Maps

Definition: technology for encoding knowledge and connecting this encoded knowledge to relevant information resources.

http://psi.topicmaps.com/iso13250/topic-name

Definition: name for a topic, consisting of the base form, known as the base name, and variants of that base form, known as variant names.

Usage: the type of all topic names.

http://psi.topicmaps.com/iso13250/topic-name-type

Definition: subject describing the nature of the topic names of that type.

Usage: the type of all topic name types.

http://psi.topicmaps.com/iso13250/topic-type

Definition: subject that captures some commonality in a set of subjects.

Usage: the type of all topic types.

http://psi.topicmaps.com/iso13250/unconstrained-scope

Definition: scope used to indicate that a statement is considered to have unlimited validity.

Usage: should not be used in a scope.

http://psi.topicmaps.com/iso13250/variant-name

Definition: alternative form of a topic name that may be more suitable in a certain context than the corresponding base name.

Bibliography

Davies, Europe: A History, Norman Davies, Oxford University Press, 1996, ISBN 0-19-820171-0

WWS, The World's Writing Systems, Peter T. Daniels, William Bright, Oxford University Press, 1996, ISBN 0-19-507993-0

UML, Unified Modeling Language (UML), Version 1.5, Object Management Group, http://www.omg.org/technology/documents/formal/uml.htm

RFC 2288, Using Existing Bibliographic Identifiers as Uniform Resource Names, Informational Memo, February 1998, http://www.ietf.org/rfc/rfc2288.txt

ISO 13250:2003, ISO 13250:2003: Information technology — Document Description and Processing Languages — Topic Maps, ISO, 2003