[sc34wg3] Towards TMDM 3.0
Lars Marius Garshol
larsga at garshol.priv.no
Wed Feb 25 08:10:29 EST 2009
* Rani Pinchuk
>
> Throwing the item identifiers after a merge is exactly what I suggest:
> I suggest to simplify things by having a mandatory one item identifier
> per item. The item identifiers are not used for merging, only for
> identifying items. And no collection of item identifiers is done after
> merging.
Why throw them away only after merging?
> For (1) - with your suggestion, [...]
This is not my suggestion, Rani, but what's been the most common
internal model for Topic Maps engines for the past decade. Actually,
since the first Topic Maps engine was written. And it's been the
standard for a number of years now. All parts of ISO 13250, except -1
and -5, build on this model. As do TMCL and TMQL. And TMAPI 1.0 and
2.0. Plus a number of non-ISO specifications (LTM, JTM, tolog, ...).
So changing this property is going to involve a *lot* of updates. If
we're going to change it, we need a very good reason. Of course, we
could decide that changing it would be better, but not worth it. So
far, though, I haven't seen anything to persuade me that any of your
proposed changes would be for the better.
> indeed all topics have a "kind of identifier" but it is actually a
> subject identifier, as it is not item identifier (although it is
> called that way).
Actually, no. TMDM defines subject identifier as "locator that refers
to a subject indicator" and subject indicator as "information resource
that is referred to from a topic map in an attempt to unambiguously
identify the subject represented by a topic to a human being".
Item identifiers don't refer to subject indicators, so they really are
not subject identifiers.
> The reason it is not item identifier, is that it does not help you
> to identify one item but a group of items (because we collect the
> item identifiers, we do not have any more one item identifier - one
> topic relationship).
It's true that it only identifies an item uniquely within a single
topic map. I don't consider that a flaw. In fact, I consider that
unavoidable. So far, the only person I'm aware of who disagrees is you.
I really don't see any problem here. And if I *did* think this was a
problem, your suggestion would not solve it.
Imagine this CTM topic map:
topic.
Now load it into two different TopicMap objects in the same engine.
That gives you two different topic items with the same item
identifier, even if we make the change you suggest.
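A rough sketch of that situation in Python, for illustration only. The
Topic and TopicMap classes, the load() helper, and the base locator
http://example.org/map.ctm are all made up here; this is not TMAPI or
any real engine's API. It only assumes that an item identifier is the
source locator plus the local id as a fragment.

```python
# Hypothetical minimal model, not a real Topic Maps engine.
class Topic:
    def __init__(self, item_identifiers):
        self.item_identifiers = set(item_identifiers)


class TopicMap:
    def __init__(self):
        self._by_item_identifier = {}

    def create_topic(self, item_identifier):
        topic = Topic([item_identifier])
        self._by_item_identifier[item_identifier] = topic
        return topic

    def get_by_item_identifier(self, item_identifier):
        return self._by_item_identifier.get(item_identifier)


def load(base_locator, local_ids):
    """Stand-in for a CTM reader: one topic per local id (assumption)."""
    topic_map = TopicMap()
    for local_id in local_ids:
        topic_map.create_topic(base_locator + "#" + local_id)
    return topic_map


BASE = "http://example.org/map.ctm"  # assumed source locator
IID = BASE + "#topic"

# Load the same source into two separate TopicMap objects.
tm1 = load(BASE, ["topic"])
tm2 = load(BASE, ["topic"])

t1 = tm1.get_by_item_identifier(IID)
t2 = tm2.get_by_item_identifier(IID)

same_identifier = t1.item_identifiers == t2.item_identifiers
same_item = t1 is t2
```

Here same_identifier is True and same_item is False: the identifier is
only unique within one topic map, whether the property holds one value
or many.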
> For (2) - Let's examine a concrete example:
> Suppose we have a topic map with a topic with id "person" and item
> identifier http://one/person.
> We have a query to show all persons (pseudo code of course):
> show all topics of type "person".
>
> I assume here that we do not use the full item identifier in the
> queries we write.
No, you'd typically write something like
person << types
> Now we merge with another topic map, that has other persons, and are
> typed with a topic with item identifier http://two/human
>
> If we still use topic map http://one we still can use our query. If we
> now use topic map http://two, we cannot, because we use an ID in our
> query that does not match to "human" and cannot be extracted from the
> other item identifier.
That's true. In fact, querying in the Omnigator with tolog and
opera.xtm used to (and might still) show this problem.
> So I cannot see any gain here.
You mean, any gain relative to having only one item identifier? There
would be a gain if the two topics that merged came from the same topic
map, but I agree that's a marginal case.
> A much simpler way to achieve the same is to simply keep the local
> item identifiers when merging with external topic maps. The collection
> of item identifiers does not help.
A collection versus a single-value property? No, it doesn't help much.
> For (3) - This is indeed a rare situation. If A and C are merged, it
> means that we had a reason to merge them. The same with B and C.
> Only if those merges were done without PSIs, merging A and B using
> item identifiers will actually make any sense. Merging without PSIs
> seems to me at least as difficult as assigning PSIs to the topics that
> should be merged.
And yet merging without PSIs is the common case. It's what usually
happens.
In any case, I think you should turn your argument around. If we agree
that item identifiers are going to exist at all, I think you should
consider carefully what the benefits of making it a single-value
property are.
As a general rule, when two topics merge, no information is lost
(except the distinction between the two topics). All properties are
preserved. Why should item identifiers be an exception to this rule?
Remember also that XTM 1.0, XTM 2.0, and CTM all allow you to assign
more than one item identifier to the same topic, so this is not only
about merging.
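To make the merging rule concrete, here is a minimal Python sketch. It
is a simplification under stated assumptions, not the TMDM merging
procedure itself: the Topic dataclass and merge() function are
hypothetical, and only three set-valued properties are modelled. The
locators are the ones from Rani's example; the PSI and the names are
invented.

```python
from dataclasses import dataclass, field


# Hypothetical, heavily reduced topic item: three set-valued properties.
@dataclass
class Topic:
    item_identifiers: set = field(default_factory=set)
    subject_identifiers: set = field(default_factory=set)
    names: set = field(default_factory=set)


def merge(a, b):
    """Return a new topic carrying the union of every property.

    Nothing is dropped; only the distinction between a and b is lost.
    """
    return Topic(
        item_identifiers=a.item_identifiers | b.item_identifiers,
        subject_identifiers=a.subject_identifiers | b.subject_identifiers,
        names=a.names | b.names,
    )


person = Topic(
    item_identifiers={"http://one/person"},
    subject_identifiers={"http://psi.example.org/person"},  # assumed PSI
    names={"Person"},
)
human = Topic(
    item_identifiers={"http://two/human"},
    subject_identifiers={"http://psi.example.org/person"},  # assumed PSI
    names={"Human"},
)

merged = merge(person, human)
```

After the merge, merged.item_identifiers holds both http://one/person
and http://two/human, just as merged.names holds both names. With a
single-value property the merge would have to pick one and discard the
other, which is the exception to the rule being questioned here.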
Given all this, why should it be a single-value property? What are the
benefits of that? Yes, a single-value property requires less overhead
than one that's multi-value, but that's all. There are no other
benefits that I know of, or that you have explained. So I really don't
understand why you consider this change so important.
--Lars M.
http://www.garshol.priv.no/blog/
http://www.garshol.priv.no/tmphoto/