[sc34wg3] Merging/Viewing subject proxies
Patrick Durusau
sc34wg3@isotopicmaps.org
Tue, 26 Jul 2005 14:14:52 -0400
Jan,
Jan Algermissen wrote:
>
> On Jul 26, 2005, at 6:03 PM, Patrick Durusau wrote:
>
>> Jan,
>>
>> Does seeing it done count as proof of doable?
>
>
> If Versavant (haven't looked at it yet) works for the arbitrary case
> and for realistic data set sizes (thousands, better millions, of
> proxies with dozens of properties), then yes, that would be
> sufficient, IMHO.
>
> The question remains though, what 'seeing it done' really means :o)
>
Versavant as it exists doesn't have great storage, so I expect it would
die rather quickly with millions of subject proxies. But that is not
really a test of the paradigm, simply a known weakness in the
application.
As I said in my post to Sam on metrics and scalability, it really is a
question of what you want to do. What disclosures do you want to support?
I can quite easily imagine rather limited disclosures that would be
appropriate for MARC records, for example; in fact, both the disclosures
and the applications could be optimized to work only in that
environment. There would simply be no requirement to allow any other
disclosures.
So, would an application with a limited (they all are in some sense)
disclosure and highly optimized application for MARC records prove
feasibility?
Depends on what you are looking for. If you are looking for an
application that accepts unbounded disclosure statements and millions of
proxies with dozens of properties, probably not. If, on the other hand,
you want the potential that the TMRM offers for your particular
disclosure, then I would say yes.
I have a data set on hand right now with approximately 700,000
bibliographic records that I was toying with prior to the last TMRM
drafting cycle. The number of subject proxies will depend on the view
taken by the disclosure statement.
Assume I am a university tenure committee, so all I am really
interested in is person, author/editor/reviewer, date of publication,
and the journal. The title, description of content, and subject treated
by the publication are irrelevant to the task at hand. ;-) The resulting
view would be far different from the view that a researcher who is
searching for literature on a particular subject would want. In fact,
that researcher might want to be able to add disclosures that are
optimized for other data sources.
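As a rough, hypothetical sketch (this is not TMRM or disclosure syntax; the record fields and the view() helper are made up for illustration), the tenure committee's view and the researcher's view might project the same bibliographic record onto different subject proxies:

```python
# Hypothetical sketch: two "disclosure views" over one bibliographic record.
# Field names and the view() helper are illustrative only.

record = {
    "author": "Smith, J.",
    "title": "On Rabbits",
    "journal": "J. Lagomorph Studies",
    "date": "2004",
    "subjects": ["Oryctolagus cuniculus", "ecology"],
    "abstract": "...",
}

def view(record, properties):
    """Project a record onto only the properties a disclosure cares about."""
    return {k: v for k, v in record.items() if k in properties}

# Tenure committee's disclosure: who published, where, and when.
tenure_view = view(record, {"author", "journal", "date"})

# Researcher's disclosure: who wrote about what subjects.
researcher_view = view(record, {"author", "title", "subjects"})

print(tenure_view)
print(researcher_view)
```

Same record, two disclosures, two quite different proxy populations.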
The point I am trying to make is that there is no one "right" disclosure
and therefore there is no one "right" application. It depends on what
you want to do.
For what it's worth, I think Versavant demonstrates the soundness of the
paradigm. How difficult some disclosures are going to be to actually
implement and how well those will scale is an open question.
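A minimal sketch of the merge rule discussed earlier in the thread, assuming (per Jan's suggestion) that values come as sets in the first place, so no scalar-to-set conversion is needed; the property names and the merge trigger are illustrative, not anything prescribed by the TMRM:

```python
# Hypothetical sketch: proxies with set-valued properties, merging whenever
# their name sets share a member. Property names are illustrative only.

def should_merge(p1, p2):
    # Disclosure-specific rule: any shared name triggers a merge.
    return bool(p1["names"] & p2["names"])

def merge(p1, p2):
    # Union every set-valued property. A real implementation would then have
    # to re-examine the whole map, since merged values can demand further
    # merges (the cascade Jan describes).
    return {k: p1.get(k, set()) | p2.get(k, set())
            for k in p1.keys() | p2.keys()}

hhh = {"names": {"rabbit", "coney"},
       "classification": {"Oryctolagus cuniculus"}}
other = {"names": {"coney", "cony"},
         "webresource": {"en.wikipedia.org/wiki/Rabbit"}}

if should_merge(hhh, other):       # "coney" is in both name sets
    merged = merge(hhh, other)
    print(merged["names"])         # all three names survive the merge
```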
Hope you are having a great day!
Patrick
> Jan
>
>
>
>>
>> If so, see www.versavant.com.
>>
>> Patrick
>>
>> Jan Algermissen wrote:
>>
>>
>>> Patrick,
>>>
>>> On Jul 26, 2005, at 4:22 PM, Patrick Durusau wrote:
>>>
>>>
>>>> hhh = { < name = "rabbit, coney" >, < webresource =
>>>> "www.rabbitnetwork.net, en.wikipedia.org/wiki/Rabbit" >, <
>>>> classification = "Oryctolagus cuniculus" > }
>>>>
>>>> Of course I am presuming that the disclosure for "name" allows
>>>> the creation of a list of names and provides that if any of the
>>>> "names" in the list match, further viewing with other subject
>>>> proxies that have either "rabbit" or "coney" for the name
>>>> property will occur.
>>>>
>>>>
>>>
>>> Having spent about a year implementing what happens when
>>> proxies merge and how the merged values demand further merges etc.,
>>> and having especially tried to trim the algorithm for this stuff
>>> down to O(log N), I must say that the datatype magic you describe
>>> (here, converting scalar to set as needed) is unlikely to be
>>> doable. The consequence, IMHO, is that most value types should come
>>> as sets in the first place (e.g. 'names' as opposed to 'name' in
>>> the example).
>>>
>>> All this becomes really, really nasty when it comes to proxies
>>> being (parts of) values...
>>>
>>> This is not to say that the RM is not brilliant....I just think
>>> there is serious stuff in there that would need to be made
>>> explicit and proven as doable. (There might well be problems
>>> lurking in there that are not computable at all in finite time,
>>> dunno)
>>>
>>> Jan
>>>
>>> ________________________________________________________________________
>>> Jan Algermissen, Consultant & Programmer
>>> http://jalgermissen.com
>>> Tugboat Consulting, 'Applying Web technology to enterprise IT'
>>> http://www.tugboat.de
>>>
>>>
>>>
>>>
>>> _______________________________________________
>>> sc34wg3 mailing list
>>> sc34wg3@isotopicmaps.org
>>> http://www.isotopicmaps.org/mailman/listinfo/sc34wg3
>>>
>>>
>>>
>>
>
--
Patrick Durusau
Patrick@Durusau.net
Chair, V1 - Text Processing: Office and Publishing Systems Interface
Co-Editor, ISO 13250, Topic Maps -- Reference Model
Member, Text Encoding Initiative Board of Directors, 2003-2005
Topic Maps: Human, not artificial, intelligence at work!