From adriana_pfernandez@superig.com.br Sun Feb 8 20:09:34 2004
From: adriana_pfernandez@superig.com.br (Adriana)
Date: Sun, 8 Feb 2004 17:09:34 -0300
Subject: [tmql-wg] tmapi
Message-ID: <006701c3ee7f$79ae9e10$76a595c8@copacabana>
This is a multi-part message in MIME format.
------=_NextPart_000_0064_01C3EE66.535FD490
Content-Type: text/plain;
charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable
Hi,
I'm search about tmql query engine. It's part of api tmapi, but I don't
find this api for download (it isn't available in sourceforge). Someone
know where can I to get tmapi?
Thanks,
Adriana.
------=_NextPart_000_0064_01C3EE66.535FD490--
From Rani.Pinchuk@spaceapplications.com Mon Feb 16 14:04:06 2004
From: Rani.Pinchuk@spaceapplications.com (Rani Pinchuk)
Date: Mon, 16 Feb 2004 15:04:06 +0100
Subject: [tmql-wg] Toma - A suggestion for TMQL
Message-ID: <1076940246.5602.83.camel@mikush2>
--=-lQFFCVwI4y3Q74wS0NA2
Content-Type: text/plain
Content-Transfer-Encoding: 7bit
Dear all,
Attached a description of the Toma language. Toma is a suggestion for
a topic map query language. It is inspired by Tolog, AsTMa* and SQL,
but uses also some notations that are used in OO languages.
In the past I published in the tmql-wg mailing list an even more
un-ready description of the language. Currently the language is a bit
more mature - and a prototype that partly implements it was
written.
In fact, all the examples in the attached document are actually run
using that prototype.
As the language is still very young, there are some issues that should
be changed or fixed. I would appreciate any feedback about it.
Thanks
Rani
--
Rani Pinchuk
Software Engineer
Space Applications Services
Leuvensesteenweg, 325
B-1932 Zaventem
Belgium
Tel.: + 32 2 721 54 84
Fax.: + 32 2 721 54 44
http://www.spaceapplications.com
--=-lQFFCVwI4y3Q74wS0NA2
Content-Disposition: attachment; filename=SAS-Computers-TM.xml
Content-Type: text/xml; name=SAS-Computers-TM.xml; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
computer
A personal desktop computer.

case
The case of the computer is the metal box that contains all internal devices.

mouse
The mouse is the small external device with a ball or a LED that the user move on the desk.
The mouse is the device used by the user to determine the position of the pointer on the screen.

keyboard
The keyboard is the external device with keys on it that the user strokes with his fingers.
The keyboard is the device used by the user to enter letters and commands.

screen
The screen is the external device on the desk that the user looks at.
The screen is the external device that displays information and that the user looks at.

motherboard
The motherboard is the internal device on which certain internal devices are installed and to which some external devices are connected
The motherboard is the device that collects and organizes all data flows.

processor
The processor is an internal device mounted on the motherboard.
The processor is the device that processes all instructions.
The processor is the internal device that produces the most heat by Joule effect.

video-card
The video-card is an internal device mounted on the motherboard.
The video-card is the device that manage the displays.
The video-card is an internal device that produces heat by Joule effect.

network-card
The network-card is an internal device mounted on the motherboard.
The network-card is the device that manage the network connection.

processor ventilator
The processor ventilator is an internal device mounted on the processor.
The processor ventilator dissipates heat produces by the processor.

video-card ventilator
The video-card ventilator is an internal device mounted on the video-card.
The video-card ventilator dissipates heat produces by the video-card.

heats
is heated by
sends data to
receives data from
is next to
contains
is in
sender
receiver
source
sink
device
container
contained

internal
Internal components of the computer.

external
External components of the computer.

thermal
Thermal scope type; used to filter thermal aspect.

location
Location scope type; used to filter location aspect.

logical
Logical scope type; used to filter logical aspect.

graphic
Graphic occurrence type; used to provide link to external graphic resource.

text
Text occurrence type; used to provide internal short information about a topic.
--=-lQFFCVwI4y3Q74wS0NA2
Content-Disposition: attachment; filename=Toma.html
Content-Type: text/html; name=Toma.html; charset=UTF-8
Content-Transfer-Encoding: 7bit
Toma - Topic Map Query Language
The following is the language description of Toma which is a Topic
Map query language.
A prototype that implements most of the features described in this
document was written, and the examples below were run using that
prototype. This suggests that the language as described in this
document is quite easy to implement.
Whitespace characters are used as token separators; apart from
that, they are ignored by the parser. The one exception to that rule is
whitespace within quotes.
Any line starting with the hash character '#' is ignored by the parser and
can be used to comment the code.
A string of arbitrary characters surrounded by single quotes is
a quoted string. If the string is supposed to contain a single quote, it
is escaped with another single quote. Example:
'this is a quoted string'
'and that''s also a quoted string'
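The escaping rule above can be illustrated with a short Python sketch. The function name unquote_toma_string is hypothetical and is not part of the prototype; the sketch only assumes that a doubled single quote inside the surrounding quotes stands for one literal quote.

```python
def unquote_toma_string(token: str) -> str:
    """Return the value of a single-quoted string token.

    Hypothetical helper: assumes '' inside the quotes means one literal quote.
    """
    if len(token) < 2 or token[0] != "'" or token[-1] != "'":
        raise ValueError("not a quoted string: %r" % token)
    body = token[1:-1]
    # Once the doubled quotes are removed, no lone quote may remain in the body.
    if "'" in body.replace("''", ""):
        raise ValueError("unescaped single quote in: %r" % token)
    return body.replace("''", "'")
```

Applied to the two examples above, the helper strips the surrounding quotes and turns that''s into that's.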
Labels are used to indicate topic ids, topic map ids, association
types, role types and scope topics. They can come in two formats - as
an alphanumeric + underscore string, or as an alphanumeric + underscore +
hyphen string surrounded by square brackets. Examples:
container
[logical-exchange]
When the label comes with square brackets, only the string within the
brackets is taken as the label.
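As a rough illustration of the two label formats, the following Python sketch recognizes both and returns the label itself. The names PLAIN_LABEL, BRACKETED_LABEL and parse_label are hypothetical, and the sketch assumes that "alphanumeric" means ASCII letters and digits.

```python
import re

# Plain label: letters, digits and underscores only (ASCII assumed).
PLAIN_LABEL = re.compile(r"[A-Za-z0-9_]+")
# Bracketed label: hyphens are also allowed between the square brackets.
BRACKETED_LABEL = re.compile(r"\[([A-Za-z0-9_-]+)\]")


def parse_label(token: str) -> str:
    """Return the label named by a token, without any square brackets."""
    match = BRACKETED_LABEL.fullmatch(token)
    if match:
        # Only the string within the brackets is taken as the label.
        return match.group(1)
    if PLAIN_LABEL.fullmatch(token):
        return token
    raise ValueError("not a label: %r" % token)
```

A label that contains a hyphen is therefore only accepted in its bracketed form.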
Dollar followed by any alphanumeric character or underscore is a
topic variable and it represents a topic.
$<variable_name>
Examples:
$a
$person
Throughout this document we will refer to a topic variable as
<topic_variable>
However, we will refer to an expression that evaluates to a topic
as
<topic>
Note that the second can be, but is not always, equivalent to the first -
any topic variable evaluates to a topic, but there are other
expressions that are not topic variables and still evaluate to topics.
.si.tr, .si.sir and .si.rr give the topic subjectIdentity
as a string (xlink:href), and refer respectively to
topicRef, subjectIndicatorRef or resourceRef:
.oc().rd and .oc().rr give the occurrences of the topic as a
string. The occurrences are respectively resourceData or
resourceRef. The type (instanceOf) of the occurrence is also taken
into account: if there is a label within the brackets, it is
taken as the topic id of the type of the occurrence. If there is a
topic variable within the brackets, it will be set to the type of
the occurrence.
Where topic_variable_or_topic_id can be a topic variable or a label
which is a topic id. The topic_variable_or_topic_id on the left
represents the type of the association, while the
topic_variable_or_topic_id on the right represents the roleSpec.
The whole argument gives a topic which is the topic that plays the
role in that association.
For example,
The location role of the association 'city':
city->location
The location role of certain association:
$association->location
A certain role of certain association:
$association->$role
The base name of the topic which plays the certain role of the certain
association is:
Usually when querying about associations, the aim is to find a
connection between topics. For example -
[logical-exchange]->sender.id = 'motherboard'
and [logical-exchange]->receiver = $receiver;
The aim of the above query is to get all the topics that
receive something from the 'motherboard'. However, it is not clear how
the first condition is linked to the second. The first condition
states that 'motherboard' is the topic id of the topic that plays the
role 'sender' in an association of type 'logical-exchange'. The
second condition states that the topic $receiver is supposed to play the
'receiver' role in an association of type 'logical-exchange'. But
nowhere is it stated that this is the very same association - only
the type of the association is the same. So $receiver might be
'video-card', which really receives something directly from the
'motherboard', but it could also be 'screen', a topic that plays
the 'receiver' role in another association of type 'logical-exchange'.
This confusion is resolved by the chaining rule: all association
expressions in a statement that have the same topic id or topic
variable as the association type share the same association
object. So in the above example 'screen' cannot play the 'receiver'
role, because both lines must refer to the same association.
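The effect of the chaining rule can be sketched in Python over a tiny in-memory model. This is not how the prototype works (it generates SQL); the Association tuple, the two sample associations and both function names are assumptions made only for this illustration.

```python
from collections import namedtuple

# Hypothetical model: an association is a type plus a mapping role -> player.
Association = namedtuple("Association", ["type", "roles"])

# Assumed sample data, chosen to match the 'motherboard' discussion above.
ASSOCIATIONS = [
    Association("logical-exchange",
                {"sender": "motherboard", "receiver": "video-card"}),
    Association("logical-exchange",
                {"sender": "video-card", "receiver": "screen"}),
]


def receivers_chained(associations, assoc_type, sender_id):
    """Both conditions must hold for the same association object."""
    return sorted(
        a.roles["receiver"]
        for a in associations
        if a.type == assoc_type
        and a.roles.get("sender") == sender_id
        and "receiver" in a.roles
    )


def receivers_unchained(associations, assoc_type, sender_id):
    """Each condition may be satisfied by a different association."""
    sender_found = any(
        a.type == assoc_type and a.roles.get("sender") == sender_id
        for a in associations
    )
    if not sender_found:
        return []
    return sorted({
        a.roles["receiver"]
        for a in associations
        if a.type == assoc_type and "receiver" in a.roles
    })
```

With the sample data, the chained reading yields only 'video-card' for the sender 'motherboard', while the unchained reading also yields 'screen'.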
If the two association expressions should not be forced to share
the same association object, the query above can be rewritten as:
$le->sender.id = 'motherboard'
and [logical-exchange]->receiver = $receiver
and $le.id = 'logical-exchange';
This looks equivalent to the former example. Here, however, the two
association expressions do not have the same topic id or topic
variable, so they do not have to share the same association
object, and 'screen' can therefore play the 'receiver' role in the
association.
This rule is an obvious flaw in the language syntax. However, the
author could not find a more elegant way to symbolize the association
objects and their interactions - any other way seems to be very
unreadable. If the reader has a suggestion for a more elegant
solution, any feedback is welcome.
@ is the scope operator, and it defines the scope of the object to
its left, by a topic id or topic variable on its right:
<scoped_object>@<topic_id>
<scoped_object>@<topic_variable>
The possible scoped_objects are base names, occurrences and
associations. The scope comes immediately after bn and oc().
In an association, the scope comes just after the topic that
represents the type of the association and before the ->
operator.
topic_id is a string that defines the topic id of the topic that
represents the scope. topic_variable is a topic variable which
represents the scope.
Examples:
The base name of a topic in English
$city.bn@en
The base name of a topic in a variable scope
$city.bn@$scope
The resourceData of an English occurrence of a topic
$person.oc($oc)@en.rd
The type of the occurrence of a topic in certain scope $scope
$person.oc@$scope.type(0)
The 'whole' role of the association 'part-of' in the scope 'detailed'
part-of@detailed->whole
A certain role $role1 of a certain association $association in a
certain scope $scope
The USE statement is used in order to define which Topic Maps are
queried when no specific Topic Map is defined within the query (using
the FROM clause).
use <tm_id_list>;
where tm_id_list is a comma separated list of labels which are Topic Map IDs.
Example:
use computers;
or
use topicmap1, topicmap2;
The definition made by the USE statement stays in effect until
another USE statement replaces it.
The SELECT clause of the SELECT statement is used to define which
topics are expected to be retrieved. It contains a comma separated list
of topic variables that are present in the search condition clause.
The FROM clause of the SELECT statement is used to
define which Topic Maps are queried for that specific query.
FROM <tm_id_list>
where tm_id_list is a comma separated list of labels which are Topic Map IDs.
Example:
from topicmap1, topicmap2
When the FROM clause is present, the definition made by a preceding
USE statement is ignored. When the FROM clause is not present, the
query will be made over the topic maps that were defined in the last
USE statement.
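The interplay of USE and FROM described above can be sketched as follows. The class QuerySession and its methods are hypothetical; in particular, raising an error when neither a FROM clause nor a preceding USE statement exists is an assumption, since the text does not say what should happen in that case.

```python
class QuerySession:
    """Hypothetical holder for the topic maps named by the last USE statement."""

    def __init__(self):
        self.default_maps = []

    def use(self, tm_ids):
        # A new USE statement replaces the previous definition.
        self.default_maps = list(tm_ids)

    def maps_for_query(self, from_clause=None):
        # A FROM clause, when present, overrides the last USE statement.
        if from_clause:
            return list(from_clause)
        if not self.default_maps:
            # Assumed behaviour: the text leaves this case undefined.
            raise ValueError("no FROM clause and no preceding USE statement")
        return list(self.default_maps)
```

After use(['computers']), a query without a FROM clause runs against 'computers', while a query with "from topicmap1, topicmap2" ignores that definition for that one query.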
Comparison clause consists of two expressions and a comparison sign
between them:
<expression1> <comparison_sign> <expression2>
An expression might be an expression that evaluates to a topic
(like a topic variable, type, association etc.), or an expression that
evaluates to a string (like a quoted string, topic id, base name,
occurrence resourceData or resourceRef, etc.).
It is crucial that the two expressions are both evaluated to a topic
or are both evaluated to a string. When using the regular expressions
comparison signs, the two expressions must be evaluated to a string.
If the two expressions around the non-equal sign are different from
each other, the clause is evaluated to be true.
Examples:
# the id of the topic that plays the 'container' role
# in the association of type 'in-location' is not 'case'
[in-location]->container.id != 'case'
# the topic that plays the 'sender' role in the
# association of type 'logical-exchange' is not $sender.
[logical-exchange]->sender != $sender
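The typing rule for comparison clauses can be sketched as a small check. The function comparison_is_valid and the kind names 'topic' and 'string' are hypothetical, and the is_regex flag stands in for the regular expression comparison signs, whose exact symbols are not assumed here.

```python
def comparison_is_valid(left_kind, right_kind, is_regex=False):
    """Check the typing rule for a comparison clause.

    Each kind is 'topic' or 'string' (hypothetical names for what an
    expression evaluates to).
    """
    allowed = ("topic", "string")
    if left_kind not in allowed or right_kind not in allowed:
        raise ValueError("unknown expression kind")
    # Both sides must evaluate to the same kind of value.
    if left_kind != right_kind:
        return False
    # Regular expression comparisons are only defined on strings.
    if is_regex and left_kind != "string":
        return False
    return True
```

Under this check, the first example above compares a string with a string and the second compares a topic with a topic, so both are valid; comparing a topic with a quoted string would be rejected.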
The following examples were all run on the prototype implementation of
the language. The XTM source is parsed using the XML::Parser Perl module
and loaded into a relational database (Postgres). The Toma queries are
parsed by Lex & Yacc (Flex & Bison), and the resulting syntax tree
is then converted to a Perl data structure. A Perl module
walks that syntax tree, generates SQL code, runs
that SQL code and generates the Postgres-like result tables.
As this is a prototype, many features are not yet implemented. Among them
is the choice between Topic Maps (and therefore the USE statement). That
means that the database holds only one Topic Map.
However, the prototype helped to clarify what the language
needs, and contributed some features to the language. Apart
from that, the prototype demonstrates that the language might be useful
(hopefully this becomes clear from the examples below), and that it is
quite easy to implement such a language.
In order to make the examples clear, a very simple, and far from
complete, topic map about computers was created. The XTM of this
topic map can be found in SAS-Computers-TM.xml.
select $topic, $oc_type where exists $topic.oc($oc_type).rd;
topic | oc_type
--------------+--------
case | graphic
case | text
external | text
graphic | text
internal | text
keyboard | graphic
keyboard | text
location | text
logical | text
motherboard | graphic
motherboard | text
mouse | graphic
mouse | text
network-card | graphic
network-card | text
pc | graphic
pc | text
processor | graphic
processor | text
screen | graphic
screen | text
thermal | text
ventilator1 | graphic
ventilator1 | text
ventilator2 | graphic
ventilator2 | text
video-card | graphic
video-card | text
(28 rows)
Although the language might be already useful in many cases, there are
many missing features that should be added to it in the future. Among
those features are:
--=-lQFFCVwI4y3Q74wS0NA2--
From larsga@ontopia.net Mon Feb 16 15:18:26 2004
From: larsga@ontopia.net (Lars Marius Garshol)
Date: Mon, 16 Feb 2004 16:18:26 +0100
Subject: [tmql-wg] tmapi
In-Reply-To: <006701c3ee7f$79ae9e10$76a595c8@copacabana>
References: <006701c3ee7f$79ae9e10$76a595c8@copacabana>
Message-ID:
Hi Adriana,
Apologies for the long wait for a reply. We (the admins) get so much
spam to this list that your email drowned among them, and so we didn't
see it before. Sorry about that.
* adriana_pfernandez@superig.com.br
|
| I'm search about tmql query engine. It's part of api tmapi, but I
| don't find this api for download (it isn't available in
| sourceforge). Someone know where can I to get tmapi?
Actually, the TMQL language has not been defined yet. We are still
working on that. What exists at this moment is a set of proposed
languages like AsTMa?, tolog, TMPath, and now also Toma. You can get
implementations of some of these, but not TMQL, because it doesn't
exist yet.
So getting TMAPI will not help you with this, I'm afraid. If you *do*
want TMAPI you can get it through SourceForge CVS. However, TMAPI is
just a standard API; the TMAPI code does not contain an implementation
of that API. If that's what you want you should probably download
TM4J (see http://www.tm4j.org).
I hope this helps.
--
Lars Marius Garshol, Ontopian
GSM: +47 98 21 55 50
From adriana_pfernandez@superig.com.br Mon Feb 16 15:37:52 2004
From: adriana_pfernandez@superig.com.br (Adriana)
Date: Mon, 16 Feb 2004 12:37:52 -0300
Subject: [tmql-wg] tmql
Message-ID: <005101c3f4a2$d7020920$16a495c8@copacabana>
This is a multi-part message in MIME format.
------=_NextPart_000_004E_01C3F489.B16885E0
Content-Type: text/plain;
charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable
Hi all,
I'm search about tmql query engine. It's part of tmapi, but I don't find
this api for download (it isn't available in sourceforge). Someone know
where can I to get tmapi?
Thanks,
Adriana.
------=_NextPart_000_004E_01C3F489.B16885E0--
From rho@bigpond.net.au Tue Feb 17 01:31:40 2004
From: rho@bigpond.net.au (Robert Barta)
Date: Tue, 17 Feb 2004 11:31:40 +1000
Subject: [tmql-wg] Toma - A suggestion for TMQL
In-Reply-To: <1076940246.5602.83.camel@mikush2>
References: <1076940246.5602.83.camel@mikush2>
Message-ID: <20040217013140.GE8410@namod.qld.bigpond.net.au>
On Mon, Feb 16, 2004 at 03:04:06PM +0100, Rani Pinchuk wrote:
> In the past I published in the tmql-wg mailing list an even more
> un-ready description of the language.
Rani, et. al.
Would it be possible to post this onto a web site so that
we can link to it?
Lars and /me are currently considering to reorganize
http://www.isotopicmaps.org/tmql/
and links are easier to incorporate there.
Any ideas concerning the new content there are appreciated.
\rho
From Rani.Pinchuk@spaceapplications.com Tue Feb 17 11:43:04 2004
From: Rani.Pinchuk@spaceapplications.com (Rani Pinchuk)
Date: Tue, 17 Feb 2004 12:43:04 +0100
Subject: [tmql-wg] Toma - A suggestion for TMQL
In-Reply-To: <20040217013140.GE8410@namod.qld.bigpond.net.au>
References: <1076940246.5602.83.camel@mikush2>
<20040217013140.GE8410@namod.qld.bigpond.net.au>
Message-ID: <1077018184.5616.41.camel@mikush2>
Hello Robert,
The document about Toma is now available from
http://www.spaceapplications.com/toma/
Kind regards,
Rani.
On Tue, 2004-02-17 at 02:31, Robert Barta wrote:
> On Mon, Feb 16, 2004 at 03:04:06PM +0100, Rani Pinchuk wrote:
> > In the past I published in the tmql-wg mailing list an even more
> > un-ready description of the language.
>
> Rani, et. al.
>
> Would it be possible to post this onto a web site so that
> we can link to it?
>
> Lars and /me are currently considering to reorganize
>
> http://www.isotopicmaps.org/tmql/
>
> and links are easier to incorporate there.
>
> Any ideas concerning the new content there are appreciated.
>
> \rho
--
Rani Pinchuk
Software Engineer
Space Applications Services
Leuvensesteenweg, 325
B-1932 Zaventem
Belgium
Tel.: + 32 2 721 54 84
Fax.: + 32 2 721 54 44
http://www.spaceapplications.com
From algermissen@acm.org Tue Feb 17 15:10:45 2004
From: algermissen@acm.org (Jan Algermissen)
Date: Tue, 17 Feb 2004 16:10:45 +0100
Subject: [tmql-wg] Toma - A suggestion for TMQL
References: <1076940246.5602.83.camel@mikush2>
<20040217013140.GE8410@namod.qld.bigpond.net.au> <1077018184.5616.41.camel@mikush2>
Message-ID: <40322EF5.19515820@topicmapping.com>
Rani Pinchuk wrote:
>
> Hello Robert,
>
> The document about Toma is now available from
> http://www.spaceapplications.com/toma/
Hi Rani,
I like this! Especially the concept of viewing associations as ER relations:
select
where
Maybe you are interested in this, I used a similar approach
http://www.gooseworks.org/stmql.html
Jan
BTW: Your demo map has a @ in line 17 and at least expat complains about this.
>
> Kind regards,
>
> Rani.
>
> On Tue, 2004-02-17 at 02:31, Robert Barta wrote:
> > On Mon, Feb 16, 2004 at 03:04:06PM +0100, Rani Pinchuk wrote:
> > > In the past I published in the tmql-wg mailing list an even more
> > > un-ready description of the language.
> >
> > Rani, et. al.
> >
> > Would it be possible to post this onto a web site so that
> > we can link to it?
> >
> > Lars and /me are currently considering to reorganize
> >
> > http://www.isotopicmaps.org/tmql/
> >
> > and links are easier to incorporate there.
> >
> > Any ideas concerning the new content there are appreciated.
> >
> > \rho
> --
> Rani Pinchuk
> Software Engineer
> Space Applications Services
> Leuvensesteenweg, 325
> B-1932 Zaventem
> Belgium
>
> Tel.: + 32 2 721 54 84
> Fax.: + 32 2 721 54 44
>
> http://www.spaceapplications.com
>
> _______________________________________________
> tmql-wg mailing list
> tmql-wg@isotopicmaps.org
> http://www.isotopicmaps.org/mailman/listinfo/tmql-wg
--
Jan Algermissen http://www.topicmapping.com
Consultant & Programmer http://www.gooseworks.org
From Rani.Pinchuk@spaceapplications.com Wed Feb 18 16:18:34 2004
From: Rani.Pinchuk@spaceapplications.com (Rani Pinchuk)
Date: Wed, 18 Feb 2004 17:18:34 +0100
Subject: [tmql-wg] Updates in the document about Toma
Message-ID: <1077121113.5652.12.camel@mikush2>
Dear all,
The document about Toma
(http://www.spaceapplications.com/toma/Toma.html) and the example topic
map were updated.
The document about Toma contains now a section that answers section 3.6
("Specific query capabilities") of the TMQL requirements (1.0.0).
Besides, the "missing/future features" section got bigger.
I would really appreciate any feedback, especially about the chaining
rule (of associations), about the returned values from the query
(currently Toma returns only topics) and about the "missing/future
features" section.
Thanks
Rani
--
Rani Pinchuk
Software Engineer
Space Applications Services
Leuvensesteenweg, 325
B-1932 Zaventem
Belgium
Tel.: + 32 2 721 54 84
Fax.: + 32 2 721 54 44
http://www.spaceapplications.com
From larsga@ontopia.net Wed Feb 18 16:34:19 2004
From: larsga@ontopia.net (Lars Marius Garshol)
Date: Wed, 18 Feb 2004 17:34:19 +0100
Subject: [tmql-wg] Updates in the document about Toma
In-Reply-To: <1077121113.5652.12.camel@mikush2>
References: <1077121113.5652.12.camel@mikush2>
Message-ID:
* Rani Pinchuk
|
| The document about Toma contains now a section that answers section 3.6
| ("Specific query capabilities") of the TMQL requirements (1.0.0).
| Besides, the "missing/future features" section got bigger.
Actually, if you want to go into detail on this you can look at trying
to implement the TMQL use cases with Toma. That would give Toma a real
workout, and also give us a yardstick for comparing it with the other
languages.
(We should have pointed more clearly at this document. I have a
half-written update to the TMQL home page that does this. Hoping to
finish that one of these days.)
| I would really appreciate any feedback, especially about the
| chaining rule (of associations), about the returned values from the
| query (currently Toma returns only topics) and about the
| "missing/future features" section.
Still trying to find time to read this. Sorry. :-(
--
Lars Marius Garshol, Ontopian
GSM: +47 98 21 55 50
From Rani.Pinchuk@spaceapplications.com Thu Feb 19 09:50:28 2004
From: Rani.Pinchuk@spaceapplications.com (Rani Pinchuk)
Date: Thu, 19 Feb 2004 10:50:28 +0100
Subject: [tmql-wg] Result set requirements
Message-ID: <1077184228.5605.4.camel@mikush2>
Dear all,
It is obvious that there is a trade off between having the result set
containing more complex structures and between the simplicity of TMQL.
When the syntax of the language provides the ability to get for example
the base name of a topic, next to a topic object, it brings up some
confusing problems:
1. Different columns in the result set have different type - base name
column probably will be of type 'string', while topic columns will be
of type 'topic object'.
2. If there is more than one base name for that topic, and the scope
was not specified it probably means that a list of base names
should be supplied. However, it is not clear in which order such a
list should be provided. Finally that list of base names should
come next to one topic object (in the same 'row'), so we end up
with column of type 'list of strings'.
3. Should the scopes, and the variants of the base name be retrieved
with the base names, so actually a structure is retrieved, or
should the base names be retrieved by themselves?
Should we then define different structures for the different
sub-structures we have in a topic? For example, when retrieving
base name of a topic we get a structure containing base names,
their scopes, and their variants.
On the other hand, the TMQL could leave the fetching of the actual
primitives from a topic object to the programming languages or tools
that might use TMQL. In that case the TMQL returns only topic
objects (or their ids), and the APIs of different languages will
define a way to access the different primitives inside the objects.
However, also when TMQL supports retrieving of different primitives
such APIs will be needed, unless we restrict TMQL to return only
simple textual rows (like in SQL) which is for sure not the
intention.
So I would like to ask what are the advantages for supplying the
result sets in "different types".
Thanks
Rani
--
Rani Pinchuk
Software Engineer
Space Applications Services
Leuvensesteenweg, 325
B-1932 Zaventem
Belgium
Tel.: + 32 2 721 54 84
Fax.: + 32 2 721 54 44
http://www.spaceapplications.com
From larsga@ontopia.net Thu Feb 19 12:13:52 2004
From: larsga@ontopia.net (Lars Marius Garshol)
Date: Thu, 19 Feb 2004 13:13:52 +0100
Subject: [tmql-wg] Result set requirements
In-Reply-To: <1077184228.5605.4.camel@mikush2>
References: <1077184228.5605.4.camel@mikush2>
Message-ID:
* Rani Pinchuk
|
| It is obvious that there is a trade off between having the result set
| containing more complex structures and between the simplicity of TMQL.
Absolutely. To me it seems that we probably need to support the
following kinds of result sets:
- single value,
- list/set of values,
- list/set of variable bindings (effectively same as a table), and
- topic map fragment.
There are situations where all of these are needed, I've found when
developing topic map-based applications.
| When the syntax of the language provides the ability to get for
| example the base name of a topic, next to a topic object, it brings
| up some confusing problems:
|
| 1. Different columns in the result set have different type - base
| name column probably will be of type 'string', while topic columns
| will be of type 'topic object'.
This need not be a problem. tolog already does this, and it works
fine. Our query engine even infers the type(s) of each column, and it
all works just fine.
| 2. If there is more than one base name for that topic, and the scope
| was not specified it probably means that a list of base names
| should be supplied. However, it is not clear in which order such a
| list should be provided.
I think it's perfectly OK for the query language to say that the order
is undefined. If the user asks for ordering there should be an
ordering rule for each type. This should actually be quite
straightforward.
| Finally that list of base names should come next to one topic
| object (in the same 'row'), so we end up with a column of type
| 'list of strings'.
There you have an interesting question: should you get one row for
each base name, or should the QL support collections as values? I can
see arguments both ways.
| 3. Should the scopes, and the variants of the base name be retrieved
| with the base names, so actually a structure is retrieved, or should
| the base names be retrieved by themselves?
Good question. I think the user should be allowed to choose. (This is
how tolog does it at present.)
| Should we then define different structures for the different
| sub-structures we have in a topic? For example, when retrieving
| base name of a topic we get a structure containing base names,
| their scopes, and their variants.
Yep. The data model handles this for you, though.
| On the other hand, the TMQL could leave the fetching of the actual
| primitives from a topic object to the programming languages or tools
| that might use TMQL. In that case the TMQL returns only topic
| objects (or their ids), and the APIs of different languages will
| define a way to access the different primitives inside the objects.
That's how I think it ought to be. TMQL can say what the result set is
abstractly, and then different APIs can choose different ways to
represent it.
| However, also when TMQL supports retrieving of different primitives
| such APIs will be needed, unless we restrict TMQL to return only
| simple textual rows (like in SQL) which is for sure not the
| intention.
Yep.
| So I would like to ask what are the advantages for supplying the
| result sets in "different types".
It depends on the usage situation, really. Quite often when developing
a web page, for example, you want the person who last modified this
topic. That's a single value.
Other times you want all the versions of this topic. That's a list.
And sometimes you want all the versions of this topic with their
version numbers, the person creating them, and the creation times.
That's a table.
And, finally, sometimes you want to create a fragment of a topic map
in order to export it (so user X can take his personal data
elsewhere), to send it across the network (so the graphical visualizer
can show a part of the topic map, or so application X can access TM
server Y), and so on.
So I see important use cases for all of these. (Hope I understood your
question correctly.)
--
Lars Marius Garshol, Ontopian
GSM: +47 98 21 55 50
From Rani.Pinchuk@spaceapplications.com Thu Feb 19 14:15:40 2004
From: Rani.Pinchuk@spaceapplications.com (Rani Pinchuk)
Date: Thu, 19 Feb 2004 15:15:40 +0100
Subject: [tmql-wg] Result set requirements
In-Reply-To:
References: <1077184228.5605.4.camel@mikush2>
Message-ID: <1077200140.5610.51.camel@mikush2>
On Thu, 2004-02-19 at 13:13, Lars Marius Garshol wrote:
> | On the other hand, the TMQL could leave the fetching of the actual
> | primitives from a topic object to the programming languages or tools
> | that might use TMQL. In that case the TMQL returns only topic
> | objects (or their ids), and the APIs of different languages will
> | define a way to access the different primitives inside the objects.
>
> That's how I think it ought to be. TMQL can say what the result set is
> abstractly, and then different APIs can choose different ways to
> represent it.
>
> | However, also when TMQL supports retrieving of different primitives
> | such APIs will be needed, unless we restrict TMQL to return only
> | simple textual rows (like in SQL) which is for sure not the
> | intention.
>
> Yep.
What I don't understand is why we would create two APIs here: one API in
TMQL which lets us access the primitives of the topics and the structures
within the topics, or generate XML out of the result set, and yet
another API in the environment we work in which interfaces with the
results of that first API.
I think that it is a must to have this second API (like JDBC for SQL).
I also think that when we generate anything other than simple text rows,
that API becomes as complex as the API of TMQL.
Another issue is the generation of XML (or another textual presentation
of the results): different environments have different ways to
generate XML out of data structures. Why should TMQL add its own way of
generating XML? Does that mean that the TMQL way is better than the
other ways (so people should spend time learning how to use that part of
TMQL to generate their XML results)?
So I am not objecting to the claim that different representations of the
results are needed. But I think that if the ability to create those
different representations is included in TMQL, there will be unnecessary
duplication (two APIs, two XML generators), so we gain nothing but
lose the simplicity of TMQL.
Rani
--
Rani Pinchuk
From larsga@ontopia.net Thu Feb 19 16:38:12 2004
From: larsga@ontopia.net (Lars Marius Garshol)
Date: Thu, 19 Feb 2004 17:38:12 +0100
Subject: [tmql-wg] Result set requirements
In-Reply-To: <1077200140.5610.51.camel@mikush2>
References: <1077184228.5605.4.camel@mikush2>
<1077200140.5610.51.camel@mikush2>
Message-ID:
* Rani Pinchuk
|
| What I don't understand is why to create double APIs here: One API
| in TMQL which let us access to the primitives of the topics, to
| structures within the topics or to generate XML out of the result
| set, and yet another API in the environment we work in which
| interfaces with the results of that first API.
I guess by the latter you mean something like TMAPI? If so, I have to
say I'm sympathetic to what you say. I've wondered myself whether a
JDBC-like API might be all that's needed, with no ability to retrieve
TMAPI-like structures, or whether something like TMAPI is indeed
needed.
Personally I don't have an answer yet, but I don't think this
necessarily affects the form of the query language.
| I think that it is a must to have this second API (like JDBC for
| SQL).
It will be, though I think the first priority is to get TMQL itself.
Once we have that, updates are probably the second priority. And once we
have updates, we could start thinking about an API. I think it would
be a good idea, but at the moment I think it's best to focus on the
TMQL itself.
| I also think that when we generate anything else than simple text
| rows, that API becomes as complex as the API of TMQL.
Quite possibly it does, yes. There are a few choices there, but it
could easily wind up this way, I think.
| Another issue is the generation of XML (or other textual
| presentation of the results) - there are in different environments
| different ways to generate XML out of data structures. Why TMQL
| should add its own way of generating XML - does that mean that the
| TMQL way is better than the other ways (so people should spend time
| learning how to use that part of TMQL to generate their XML
| results)?
The reason to include this in TMQL is basically that this will be a
very common usage of TMQL, and that if we don't standardize this what
will happen is that everyone will create non-standard mechanisms for
turning TMQL query results into XML/text/HTML. If that happens those
building applications with TMQL will not really be much closer to
interoperability between implementations than they are today.
Also, it's not that the TMQL way of generating XML will necessarily be
any better than other ways of generating XML, it's just that it will
be tailored towards dealing with TMQL result sets. Other ways of
generating XML can't be, simply because they won't be designed
specifically for TMQL.
| So I am not objecting that different representations of the results
| are needed. But I think that if the ability to create those
| different representations is included in TMQL, there will be
| unnecessary duplications (two APIs, two XML generators) - so we gain
| nothing but lose the simplicity of TMQL.
I'm not sure this is a problem, actually. We'll have to see, basically.
We still have quite a few options left open.
--
Lars Marius Garshol, Ontopian
GSM: +47 98 21 55 50
From Rani.Pinchuk@spaceapplications.com Thu Feb 19 19:57:49 2004
From: Rani.Pinchuk@spaceapplications.com (Rani Pinchuk)
Date: Thu, 19 Feb 2004 20:57:49 +0100
Subject: [tmql-wg] Result set requirements
In-Reply-To: <4034D2F3.5C622B17@topicmapping.com>
References: <1077184228.5605.4.camel@mikush2>
<1077200140.5610.51.camel@mikush2> <4034D2F3.5C622B17@topicmapping.com>
Message-ID: <1077220669.5526.28.camel@mikush2>
OK, I understand now the reason for having result sets specific to
certain primitives inside the topics. This indeed saves sub-queries to
the engine.
My plan was actually to return all the topics that are in the result
set, but this is also not that good, because when someone looks only
for the base names in a certain scope, it might be far too heavy to
retrieve all the rest of the topic data (like resourceData that might be
in those topics)...
Thanks for the explanation. I will have to think how to incorporate
this functionality into Toma. One difficulty is that when we allow
results that are not scalars, we cannot present the results in a proper
way (like the nice output tables I have now) - so if I want to
implement it I should already start to think about an API in the
environment I code in. Any ideas?
> What I currently think is that TMQL should return topic maps and that
> the client will be given a lightweight topic map object that can be
> accessed with the same API as 'big' (normal) topic maps. The query
> process would be something like this:
>
> - information need
> - describe desired result set as query and send off to server
> - server extracts result set (essentially a subset of the queried
> map) and sends it off to client
> - client TMQL library turns (however serialized) result set into
> a topic map object
> - client uses API as desired
I am not sure that I understood this - Do you mean that you return
light topics (so let's say, for a base names query you return the
resulting topic objects where only their base names are populated), or
a light topic map (so you return full topic objects, but only the ones
that result from the query)?
> > there are in different environments different ways to
> > generate XML out of data structures. Why TMQL should add its own way
> > of generating XML - does that mean that the TMQL way is better than
> > the other ways (so people should spend time learning how to use that
> > part of TMQL to generate their XML results)?
>
> I don't understand what you mean, can you explain more?
Assuming that TMQL always returns the same XML, then the above is
not relevant. I actually think that XML might be a good candidate as a
format for the result set.
However, if the intention is to generate _different_ XMLs and
other formats, so custom XMLs for a query, or for an application,
I would suggest to create those XML (or other textual outputs)
outside of the TMQL engine: The TMQL will generate the result sets,
the API will get them, and then in the environment that the user is
used to, a generator of XML with that data will be written.
What I try to avoid is that TMQL should generate custom formats like
the one described in
http://www.y12.doe.gov/sgml/sc34/document/0449.htm#id2612104.
The reason is that I think that the scope of TMQL should be retrieving
data from Topic maps and not generation of XML from data.
But after reading the reply of Lars to my last email, I am not totally
sure about this point. See my reply to it.
Rani
From Rani.Pinchuk@spaceapplications.com Thu Feb 19 20:11:19 2004
From: Rani.Pinchuk@spaceapplications.com (Rani Pinchuk)
Date: Thu, 19 Feb 2004 21:11:19 +0100
Subject: [tmql-wg] Result set requirements
In-Reply-To:
References: <1077184228.5605.4.camel@mikush2>
<1077200140.5610.51.camel@mikush2>
Message-ID: <1077221479.5526.55.camel@mikush2>
> | Another issue is the generation of XML (or other textual
> | presentation of the results) - there are in different environments
> | different ways to generate XML out of data structures. Why TMQL
> | should add its own way of generating XML - does that mean that the
> | TMQL way is better than the other ways (so people should spend time
> | learning how to use that part of TMQL to generate their XML
> | results)?
>
> The reason to include this in TMQL is basically that this will be a
> very common usage of TMQL, and that if we don't standardize this what
> will happen is that everyone will create non-standard mechanisms for
> turning TMQL query results into XML/text/HTML. If that happens those
> building applications with TMQL will not really be much closer to
> interoperability between implementations than they are today.
>
> Also, it's not that the TMQL way of generating XML will necessarily be
> any better than other ways of generating XML, it's just that it will
> be tailored towards dealing with TMQL result sets. Other ways of
> generating XML can't be, simply because they won't be designed
> specifically for TMQL.
I am not so sure about this point - it might be true that the standard
should guide users on how to create those formats, so as to avoid a
situation where, as you write, "everyone will create non-standard
mechanisms for turning TMQL query results into XML/text/HTML".
I would prefer, though, that this be outside of TMQL - results from
SQL can also be turned into XML by non-standard mechanisms. So maybe
there should be another standard for generating textual formats like
XML, text or HTML from data.
I am also not sure about the special case of TMQL result sets - those
will be after all scalars and structures - and there are many
applications that might have scalars and structures that should be
turned into XML/text/HTML. As long as this XML/text/HTML is not special
for TMQL, I don't see the advantage of having the definition of
generating that format within TMQL.
So actually I still think that the advantage of having TMQL simpler is
more beneficial here than introducing the ability to generate textual
formats from the results.
Rani
--
Rani Pinchuk
From larsga@ontopia.net Fri Feb 20 10:30:17 2004
From: larsga@ontopia.net (Lars Marius Garshol)
Date: Fri, 20 Feb 2004 11:30:17 +0100
Subject: [tmql-wg] Result set requirements
In-Reply-To: <1077221479.5526.55.camel@mikush2>
References: <1077184228.5605.4.camel@mikush2>
<1077200140.5610.51.camel@mikush2>
<1077221479.5526.55.camel@mikush2>
Message-ID:
* Rani Pinchuk
|
| I am not so sure about this point - it might be true that the
| standard should guide the users how to create those formats - so to
| avoid a situation where as you write "everyone will create
| non-standard mechanisms for turning TMQL query results into
| XML/text/HTML".
I think there needs to be a standard way of doing that, because
otherwise we will see a proliferation of such mechanisms. Indeed we
have them already.
| I would prefer, though, that this be outside of TMQL - results
| from SQL can also be turned into XML by non-standard mechanisms. So
| maybe there should be another standard for generating textual formats
| like XML, text or HTML from data.
This is something we can still choose how to do. Robert has advocated
incorporating it directly in the language, whereas for tolog I did
TMTL[1] which keeps it on the outside. I can see this argument both
ways and personally haven't made up my mind yet.
The jury is still out on this one, and I think Robert and I have more
or less tacitly agreed to wait until we *have* a TMQL, and only then
do we plan to deal with this.
| I am also not sure about the special case of TMQL result sets -
| those will be after all scalars and structures - and there are many
| applications that might have scalars and structures that should be
| turned into XML/text/HTML.
Oh, absolutely! We can't require all results to be turned into an
output format.
| As long as this XML/text/HTML is not special for TMQL, I don't see
| the advantage of having the definition of generating that format
| within TMQL.
If you can demonstrate that the use of TMQL within some other
framework does this well I can assure you that we'll listen to you.
It's not as if we write standards for the fun of it. :-)
| So actually I still think that the advantage of having TMQL simpler
| is more beneficial here than introducing the ability to generate
| textual formats from the results.
That's a trade-off, really. Personally I'd like to see this generation
capability layered on top of the QL itself, whether it's external to
the QL (like in TMTL) or internal to it (like in AsTMa?), so that
people can choose whether to implement it or not. The requirements
document is also formulated this way. But we'll do the basic QL first
and then look at this.
[1]
--
Lars Marius Garshol, Ontopian
GSM: +47 98 21 55 50
From rho@bigpond.net.au Fri Feb 20 10:55:19 2004
From: rho@bigpond.net.au (Robert Barta)
Date: Fri, 20 Feb 2004 20:55:19 +1000
Subject: [tmql-wg] Result set requirements
In-Reply-To: <1077184228.5605.4.camel@mikush2>
References: <1077184228.5605.4.camel@mikush2>
Message-ID: <20040220105519.GA21106@namod.qld.bigpond.net.au>
On Thu, Feb 19, 2004 at 10:50:28AM +0100, Rani Pinchuk wrote:
> When the syntax of the language provides the ability to get for example
> the base name of a topic, next to a topic object, it brings up some
> confusing problems:
>
> 1. Different columns in the result set have different type - base name
> column probably will be of type 'string', while topic columns will be
> of type 'topic object'.
I would have no problem to get as result a "list of tuples", where
every tuple value has a different type, e.g.
  1. Match: [ 23, "Rumsti", ]
  2. Match: [ 42, "Ramsti", ]
  (columns, in order: number, string, map-fragment)
We have this in SQL, XQuery, so why not in TMQL?
> 2. If there is more than one base name for that topic, and the scope
> was not specified it probably means that a list of base names
> should be supplied. However, it is not clear in which order such a
> list should be provided. Finally that list of base names should
> come next to one topic object (in the same 'row'), so we end up
> with column of type 'list of strings'.
Not necessarily. Exactly in the same way as other languages we would
repeat the values:
1. Match: [ "basename XYZ", ]
2. Match: [ "basename ABC", ]
In this __relational__ interpretation of results we always "multiply"
results. That lies in the nature of a relation.
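The two shapes under discussion can be put side by side in a short
Python sketch: one row per base name (the relational "multiplying"),
or one row per topic with a collection-valued column. The dictionary
layout and function names are assumptions made for illustration only.

```python
def one_row_per_name(topics):
    """Relational shape: the topic id repeats once per base name."""
    return [
        (topic_id, name)
        for topic_id, names in topics.items()
        for name in names
    ]


def names_as_collection(topics):
    """Collection shape: one row per topic, names kept together."""
    return [(topic_id, list(names)) for topic_id, names in topics.items()]
```

For a topic with two base names the first function gives two rows and
the second gives one row whose second column is a list of two strings.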
> 3. Should the scopes, and the variants of the base name be retrieved
> with the base names, so actually a structure is retrieved, or
> should the base names retrieved by themselves?
I would say the user specifies what he wants and _exactly_ this should
be returned. No hidden magic, no tricks and hidden intelligence.
> Should we then define different structures for the different
> sub-structures we have in a topic? For example, when retrieving
> base name of a topic we get a structure containing base names,
> their scopes, and their variants.
We could, but this is already done in an abstract way in the TMDM,
or not?
I am rather sceptical about "query languages" which return "nodes" or
other internal data structures. This is like XPath and has the problem
that you have to use XPath _always_ with something else: XSLT, XQuery,
or a programming language.
> On the other hand, the TMQL could leave the fetching of the actual
> primitives from a topic object to the programming languages or tools
> that might use TMQL. In that case the TMQL returns only topic
> objects (or their ids), and the APIs of different languages will
> define a way to access the different primitives inside the objects.
I agree that NOT INCLUDING output data generation makes the language
smaller, but I think we should learn from the experiences in the
XPath/XSLT/XQuery universe and we SHOULD take care that the following
principal content models are supported on the output side:
- lists for relational data
- XML for tree-like data
- TMs for ... voila! TM data
I actually had an earlier version of AsTMa? doing exactly what you say
and found myself repeating the same programming patterns over and over
again.
\rho
From larsga@ontopia.net Fri Feb 20 11:00:11 2004
From: larsga@ontopia.net (Lars Marius Garshol)
Date: Fri, 20 Feb 2004 12:00:11 +0100
Subject: [tmql-wg] Result set requirements
In-Reply-To: <1077220669.5526.28.camel@mikush2>
References: <1077184228.5605.4.camel@mikush2>
<1077200140.5610.51.camel@mikush2>
<4034D2F3.5C622B17@topicmapping.com>
<1077220669.5526.28.camel@mikush2>
Message-ID:
* Rani Pinchuk
|
| My plan was actually to return all the topics that are in the result
| set, but this is also not that good, because when someone looks only
| for the base names in certain scope, it might be far too heavy to
| retrieve all rest of the topic data (like resourceData that might be
| in those topics)...
Exactly. Not only that, but sometimes what is wanted is precisely to
return a specific topic name. (We do this kind of thing to bind it
into a context so that it can be edited in our web editor framework.
No doubt there are many other uses for it.)
| Thanks for the explanation. I will have to think how to incorporate
| this functionality into Toma. One difficulty is that when we allow
| results that are not scalars, we cannot present in proper way the
| results (like the nice output tables I have now) - so if I want to
| implement it I should already start to think about API in the
| environment I code. Any ideas?
I think you shouldn't worry about it. Just say that the result
contains information items from the TMDM, and then implementors can
worry about how to represent it. You can still do a standard API if
you want, but that then becomes a separate layer.
--
Lars Marius Garshol, Ontopian
GSM: +47 98 21 55 50
From larsga@ontopia.net Fri Feb 20 11:02:24 2004
From: larsga@ontopia.net (Lars Marius Garshol)
Date: Fri, 20 Feb 2004 12:02:24 +0100
Subject: [tmql-wg] Result set requirements
In-Reply-To: <4034D2F3.5C622B17@topicmapping.com>
References: <1077184228.5605.4.camel@mikush2>
<1077200140.5610.51.camel@mikush2>
<4034D2F3.5C622B17@topicmapping.com>
Message-ID:
* Jan Algermissen
|
| Right! The simplicity of SQL query results is IMHO the key thing
| that makes it possible to use it with reasonable effort within
| applications.
I strongly agree, and the same applies to XPath as well. I think we
need this for TMQL as well, but without giving up the ability to
produce result sets that are topic maps.
(Actually, I agree with the rest of your email as well, so I'll stop
here. :-)
--
Lars Marius Garshol, Ontopian
GSM: +47 98 21 55 50
From rho@bigpond.net.au Fri Feb 20 11:08:59 2004
From: rho@bigpond.net.au (Robert Barta)
Date: Fri, 20 Feb 2004 21:08:59 +1000
Subject: [tmql-wg] Result set requirements
In-Reply-To: <1077200140.5610.51.camel@mikush2>
References: <1077184228.5605.4.camel@mikush2> <1077200140.5610.51.camel@mikush2>
Message-ID: <20040220110859.GB21106@namod.qld.bigpond.net.au>
On Thu, Feb 19, 2004 at 03:15:40PM +0100, Rani Pinchuk wrote:
> What I don't understand is why to create double APIs here: One API in
> TMQL which let us access to the primitives of the topics, to structures
> within the topics or to generate XML out of the result set, and yet
> another API in the environment we work in which interfaces with the
> results of that first API.
The APIs I see here have no overlap:
TMQL API:
- TMQL API: compile a query -> get a query handle
- TMQL API: execute a query -> get a result handle
- if the result is a list then
    iterate over the list (lists are native to most programming languages)
- elsif the result is an XML structure
    do something with that, maybe it is DOM, maybe the
    implementation offers SAX or pull
- elsif the result is a TM
    use the TMAPI to walk over this data structure
So there is no overlap I can see.
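The branching above can be sketched in Python as follows. This is only
an illustration under assumptions: TopicMapFragment is a hypothetical
stand-in for whatever a TM API would hand back, and consume() is an
invented name, not part of any TMQL or TMAPI specification. The XML
branch uses the standard library DOM.

```python
from xml.dom.minidom import Document, parseString


class TopicMapFragment:
    """Hypothetical stand-in for a topic map handed back by a query."""

    def __init__(self, topics):
        self.topics = topics  # topic id -> base names


def consume(result):
    """Pick the interface that already exists for each kind of result."""
    if isinstance(result, list):
        # Lists are native: plain iteration is enough.
        return [row for row in result]
    elif isinstance(result, Document):
        # XML arrives as a DOM, so the DOM API applies.
        return result.documentElement.tagName
    elif isinstance(result, TopicMapFragment):
        # A topic map would be walked with a TM API; here, a dict.
        return sorted(result.topics)
    raise TypeError(f"unsupported result: {type(result).__name__}")
```

Each branch relies on an interface that exists independently of the
query language, which is the point being made about there being no overlap.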
> I think that it is a must to have this second API (like JDBC for SQL).
> I also think that when we generate anything else than simple text rows,
> that API becomes as complex as the API of TMQL.
As above, if we stick to the (list/XML/TM) output then there are already
interfaces for dealing with them. No further complexity is introduced.
> Another issue is the generation of XML (or other textual presentation of
> the results)
!!! Hold on !!!
This is _NOT_ a textual generation. That would be rather stupid, because
someone would have to parse this. If I ask for XML data to be returned
{
forall $t [ $a (album) ] in $m
return
{$t/bn}
}
then it is NOT XML text which is generated. My implementation generates a DOM.
> - there are in different environments different ways to
> generate XML out of data structures. Why TMQL should add its own way of
> generating XML - does that mean that the TMQL way is better than the
> other ways (so people should spend time learning how to use that part of
> TMQL to generate their XML results)?
No, it is not a "better" way, but it is probably the fastest. You, of
course can also ask for the results in list forms:
forall $t [ $a (album) ] in $m
return
( $a, $t/bn )
and then generate your XML by walking over the list, but TMQL already can
prebuild parts of the result XML as template which can be quickly cloned
whereas your program code would have to generate the nodes from scratch.
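The idea of prebuilding part of the result as a template and cloning it
per match can be sketched with the Python standard library DOM. The
element names (albums, album, title) are assumptions chosen for the
example, and the sketch only shows the mechanism; it makes no claim
about relative speed.

```python
from xml.dom.minidom import parseString

# Parsed once; plays the role of the prebuilt template.
TEMPLATE = parseString("<album><title/></album>").documentElement


def build_albums(titles):
    """Clone the template once per title and fill in the text."""
    doc = parseString("<albums/>")
    root = doc.documentElement
    for title in titles:
        # importNode with deep=True copies the whole template subtree
        # into the target document and leaves the template untouched.
        node = doc.importNode(TEMPLATE, True)
        title_element = node.getElementsByTagName("title")[0]
        title_element.appendChild(doc.createTextNode(title))
        root.appendChild(node)
    return doc
```

The caller receives a DOM document, not text, and can serialize it only
if and when a textual form is actually needed.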
> So I am not objecting that different representations of the results
> are needed. But I think that if the ability to create those
> different representations is included in TMQL, there will be
> unnecessary duplications (two APIs, two XML generators) - so we gain
> nothing but lose the simplicity of TMQL.
I cannot see any duplication yet, maybe you can elaborate with an
example?
\rho
From rho@bigpond.net.au Fri Feb 20 11:29:42 2004
From: rho@bigpond.net.au (Robert Barta)
Date: Fri, 20 Feb 2004 21:29:42 +1000
Subject: [tmql-wg] Result set requirements
In-Reply-To: <1077220669.5526.28.camel@mikush2>
References: <1077184228.5605.4.camel@mikush2> <1077200140.5610.51.camel@mikush2> <4034D2F3.5C622B17@topicmapping.com> <1077220669.5526.28.camel@mikush2>
Message-ID: <20040220112942.GC21106@namod.qld.bigpond.net.au>
On Thu, Feb 19, 2004 at 08:57:49PM +0100, Rani Pinchuk wrote:
> > From Lars:
> > - information need
> > - describe desired result set as query and send off to server
> > - server extracts result set (essentially a subset of the queried
> > map) and sends it off to client
> > - client TMQL library turns (however serialized) result set into
> > a topic map object
> > - client uses API as desired
>
> I am not sure that I understood this - Do you mean that you return
> light topics (so let's say, for a base names query you return the
> resulted topic objects where only their base names are populated), or
> light topic map (so you return full topic objects, but only the ones
> that are resulted from the query)?
Or both.
Consider the symmetry and its consequences if we allow TMQL also
to generate TMs out of a TM-server:
- in the simplest case the query simply extracts the requested
information from the backend and creates a small sub-map with
all the required information. That is returned to the application
and that will use a TM API for processing.
- in the general case the query may create a completely new TM. This
means that TMQL is a TM-to-TM transformer. The new TM can follow
a completely different ontology than the original TM.
[ I once tried an experiment at
http://astma.it.bond.edu.au/junk/bibtex-use-case.dbk?section=4 ]
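The difference between "light topics" and a "light topic map" in the
simplest case can be sketched in Python. The topic map is modelled as a
plain dictionary here, which is an assumption for illustration; the
property names and both function names are hypothetical.

```python
def light_topics(topic_map, matched_ids, keep=("base_names",)):
    """Matched topics, each reduced to the requested properties."""
    return {
        topic_id: {key: topic_map[topic_id][key] for key in keep}
        for topic_id in matched_ids
    }


def light_map(topic_map, matched_ids):
    """Matched topics in full, everything else left out."""
    return {topic_id: topic_map[topic_id] for topic_id in matched_ids}
```

Either result is still a (small) topic map, so the application can
process it with the same API it uses for the full map.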
> However, if the intention is to generate _different_ XMLs and
> other formats, so custom XMLs for a query, or for an application,
> I would suggest to create those XML (or other textual outputs)
Again, this is NOT textual!
> outside of the TMQL engine: The TMQL will generate the result sets,
> the API will get them, and then in the environment that the user is
> used to, a generator of XML with that data will be written.
>
> What I try to avoid is that TMQL should generate custom formats like
> the one described in
> http://www.y12.doe.gov/sgml/sc34/document/0449.htm#id2612104.
>
> The reason is that I think that the scope of TMQL should be retrieving
> data from Topic maps and not generation of XML from data.
But _generating data_ is one of the main tasks of a query language.
SQL does this:
    SELECT name, age FROM ....
    ^^^^^^^^^^^^^^^^
    generates a new table
XQuery does this:
    let $authors := /book/author
    return
    {
      $authors
    }
    generates new XML
So why should not TMQL generate something which the application
can directly post-process?
And pretty much every major DB vendor now allows the generation of XML
__DIRECTLY__ from the data. Exactly because it is such a frequent
operation and creating XML 'manually' is booooooooooooring. :-)
\rho
From Rani.Pinchuk@spaceapplications.com Fri Feb 20 13:26:34 2004
From: Rani.Pinchuk@spaceapplications.com (Rani Pinchuk)
Date: Fri, 20 Feb 2004 14:26:34 +0100
Subject: [tmql-wg] Result set requirements
In-Reply-To:
References: <1077184228.5605.4.camel@mikush2>
<1077200140.5610.51.camel@mikush2> <4034D2F3.5C622B17@topicmapping.com>
<1077220669.5526.28.camel@mikush2>
Message-ID: <1077283594.7142.68.camel@mikush2>
> | Thanks for the explanation. I will have to think how to incorporate
> | this functionality into Toma. One difficulty is that when we allow
> | results that are not scalars, we cannot present in proper way the
> | results (like the nice output tables I have now) - so if I want to
> | implement it I should already start to think about API in the
> | environment I code. Any ideas?
>
> I think you shouldn't worry about it. Just say that the result
> contains information items from the TMDM, and then implementors can
> worry about how to represent it. You can still do a standard API if
> you want, but that then becomes a separate layer.
Meanwhile, I try to have the new features of Toma actually implemented
in the simple prototype I wrote. This way I am sure that I don't have a
big flaw in the definition of the language (so something that cannot be
coded or syntax that doesn't make sense)... So I guess I am one of those
implementors - when I deal with a query that returns a node I need to
get a representation of that node that is clear enough to be able to
debug it :-)
BTW, is there any elegant full presentation of TMDM like the one for XTM
found in http://www.geocities.com/xtopicmaps/topic_maps_color.html ?
Rani
From Rani.Pinchuk@spaceapplications.com Fri Feb 20 13:31:08 2004
From: Rani.Pinchuk@spaceapplications.com (Rani Pinchuk)
Date: Fri, 20 Feb 2004 14:31:08 +0100
Subject: [tmql-wg] Result set requirements
In-Reply-To:
References: <1077184228.5605.4.camel@mikush2>
<1077200140.5610.51.camel@mikush2>
<1077221479.5526.55.camel@mikush2>
Message-ID: <1077283868.7142.74.camel@mikush2>
> | As long as this XML/text/HTML is not special for TMQL, I don't see
> | the advantage of having the definition of generating that format
> | within TMQL.
>
> If you can demonstrate that the use of TMQL within some other
> framework does this well I can assure you that we'll listen to you.
I am not sure what needs to be demonstrated. I see TMQL being used like
SQL. When there are result sets, the data from them can be used by an
application to generate different formats.
Take for example an application that generates HTML from SQL queries.
I would like to separate the generation of the SQL from the rest of the
program so the relational database expert can create, profile and
maintain those queries without touching the rest of the application. The
programmer who programs in a certain environment should not have to take
the SQL into account when making calculations over the query results.
The graphics person should create/maintain the presentation (so the
HTML) without the need to go into the SQL or the rest of the code...
Well, this is the vision I find the best (see [1] and [2] for how it can
be done). What I am afraid of is that the moment we incorporate the
presentation of the results within the TMQL we practically mix the three
roles described above, and for sure mix TMQL with XML or HTML.
[1]http://jerry.cs.uiuc.edu/~plop/plop2k/proceedings/Pinchuk/Pinchuk.pdf
[2]http://jerry.cs.uiuc.edu/~plop/plop2k/proceedings/Sharon/Sharon.pdf
--
Rani Pinchuk
Software Engineer
Space Applications Services
Leuvensesteenweg, 325
B-1932 Zaventem
Belgium
Tel.: + 32 2 721 54 84
Fax.: + 32 2 721 54 44
http://www.spaceapplications.com
From Rani.Pinchuk@spaceapplications.com Fri Feb 20 13:32:06 2004
From: Rani.Pinchuk@spaceapplications.com (Rani Pinchuk)
Date: Fri, 20 Feb 2004 14:32:06 +0100
Subject: [tmql-wg] Result set requirements
In-Reply-To: <20040220105519.GA21106@namod.qld.bigpond.net.au>
References: <1077184228.5605.4.camel@mikush2>
<20040220105519.GA21106@namod.qld.bigpond.net.au>
Message-ID: <1077283925.7142.76.camel@mikush2>
> I would have no problem to get as result a "list of tuples", where
> every tuple value has a different type, e.g.
>
> 1. Match: [ 23, "Rumsti", ]
> 2. Match: [ 42, "Ramsti", ]
> .. ^^ ^^ ^^
> || || ||
> number|| ||
> || ||
> string map-fragment
>
> We have this in SQL, XQuery, so why not in TMQL?
What bothers me is the mix between scalar types and non scalar types
(topic node in this case).
> > 3. Should the scopes, and the variants of the base name be retrieved
> > with the base names, so actually a structure is retrieved, or
> > should the base names retrieved by themselves?
>
> I would say the user specifies what he wants and _exactly_ this should
> be returned. No hidden magic, no tricks and hidden intelligence.
>
I support this. I would think that one query could be:
select $topic.bn as string ....
which returns a list of baseNameString for each topic $topic (in
whatever scopes), and:
select $topic.bn as node
which returns the baseName node (including all the nodes inside it) of
the topic $topic.
I see (now) the need for both selects.
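The difference between the two selects can be sketched in Python. This is
only an illustration under assumptions: the Topic, BaseName and Variant
classes and the select_bn function are hypothetical stand-ins, not part of
Toma or of any existing prototype.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Variant:
    resource_data: str


@dataclass
class BaseName:
    base_name_string: str
    scope: Tuple[str, ...] = ()
    variants: List[Variant] = field(default_factory=list)


@dataclass
class Topic:
    id: str
    base_names: List[BaseName] = field(default_factory=list)


def select_bn(topics, as_type):
    """Hypothetical analogue of 'select $topic.bn as string|node'."""
    if as_type not in ("string", "node"):
        raise ValueError("as_type must be 'string' or 'node'")
    rows = []
    for topic in topics:
        for bn in topic.base_names:
            # 'string' projects the scalar baseNameString only;
            # 'node' hands back the whole baseName structure.
            value = bn.base_name_string if as_type == "string" else bn
            rows.append((topic.id, value))
    return rows


topics = [Topic("cpu", [BaseName("processor", ("en",), [Variant("CPU")])])]
strings = select_bn(topics, "string")  # scalar values only
nodes = select_bn(topics, "node")      # full nodes, variants included
```

With "as node" the caller can still drill down to scopes and variants; with
"as string" the result stays a flat list of scalars.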
Rani
From rho@bigpond.net.au Fri Feb 20 20:56:08 2004
From: rho@bigpond.net.au (Robert Barta)
Date: Sat, 21 Feb 2004 06:56:08 +1000
Subject: [tmql-wg] Result set requirements
In-Reply-To: <1077283157.7133.59.camel@mikush2>
References: <1077184228.5605.4.camel@mikush2> <20040220105519.GA21106@namod.qld.bigpond.net.au> <1077283157.7133.59.camel@mikush2>
Message-ID: <20040220205608.GD21106@namod.qld.bigpond.net.au>
On Fri, Feb 20, 2004 at 02:19:18PM +0100, Rani Pinchuk wrote:
> > 2. Match: [ 42, "Ramsti", ]
> > .. ^^ ^^ ^^
> > || || ||
> > number|| ||
> > || ||
> > string map-fragment
> >
> What bothers me is the mix between scalar types and non scalar types
> (topic node in this case).
Yes, we have to be careful not to go overboard with this.
> > I would say the user specifies what he wants and _exactly_ this should
> > be returned. No hidden magic, no tricks and hidden intelligence.
> >
>
> I support this. I would think that one query could be:
>
> select $topic.bn as string ....
>
> which returns a list of baseNameString for each topic $topic (in
> whatever scopes), and:
>
> select $topic.bn as node
>
> which returns the baseName node (including all the nodes inside it) of
> the topic $topic.
I would assume that a
select $topic.bn ...
could be reasonably defaulted to 'string' output.
--
What I am a bit sceptical about is to allow _any_ kind of node type in
the result. So, to allow basename and occurrence and this and that
items as described in the TM DM as possible parts of a list
result. Going down that path would link TMQL standard-wise VERY STRONGLY
to TMDM and I am not a friend of strong couplings.
One option would be to allow only maps or fragments thereof and let
TMQL be completely opaque about this. Another (mumble, mumble) would
be to allow topics and associations only, but this has some
problems....
\rho
From rho@bigpond.net.au Fri Feb 20 21:12:33 2004
From: rho@bigpond.net.au (Robert Barta)
Date: Sat, 21 Feb 2004 07:12:33 +1000
Subject: [tmql-wg] Result set requirements
In-Reply-To: <1077283868.7142.74.camel@mikush2>
References: <1077184228.5605.4.camel@mikush2> <1077200140.5610.51.camel@mikush2> <1077221479.5526.55.camel@mikush2> <1077283868.7142.74.camel@mikush2>
Message-ID: <20040220211233.GE21106@namod.qld.bigpond.net.au>
On Fri, Feb 20, 2004 at 02:31:08PM +0100, Rani Pinchuk wrote:
> > | As long as this XML/text/HTML is not special for TMQL, I don't see
> > | the advantage of having the definition of generating that format
> > | within TMQL.
> >
> > If you can demonstrate that the use of TMQL within some other
> > framework does this well I can assure you that we'll listen to you.
>
> I am not sure what is needed to be demonstrated. I see TMQL used as SQL.
> When there are results set the data from them can be used to generate
> different formats by an application.
>
> Take for example an application that generate HTMLs from SQL queries.
> I would like to separate the generation of the SQL from the rest of the
> program so the relational database expert can create, profile and
> maintain those queries without touching the rest of the application. The
> programmer who program in certain environment, should not take into
> account the SQL when making calculations over the queries results.
> The graphics person should create/maintain the presentation (so the
> HTMLs) without the need to go into the SQL or the rest of the code...
Ah! I think I sense a possible source of some misunderstanding:
I completely agree with you that the separation of concerns
is a pivotal requirement. And I definitely agree that having the
generation of __SOME__ (=arbitrary) output in the language is not a good
idea.
As I understand Lars' TMTL solution, it allows the use of _ARBITRARY_
text, and that arbitrary text is returned (to the application or to the
environment).
In AsTMa? we do not allow to "generate arbitrary text". We only allow
- lists
- XML as an internal representation, and
- TM data (again in internal representation)
The only way to get text directly is to have it as part of the list.
I still think that these 3 formats (at least the first two) are widely
used and allow for flexible postprocessing. The inclusion of a "TM
Data Type" simply is added for symmetry.
The application designer will choose what structure suits the results
best. In all 3 cases he can add abstraction layers as you describe
in
> [1]http://jerry.cs.uiuc.edu/~plop/plop2k/proceedings/Pinchuk/Pinchuk.pdf
Hope this clears up a bit what I mean with "generating output". It
is, again, not text.
\rho
From Rani.Pinchuk@spaceapplications.com Fri Feb 20 22:38:48 2004
From: Rani.Pinchuk@spaceapplications.com (Rani Pinchuk)
Date: Fri, 20 Feb 2004 23:38:48 +0100
Subject: [tmql-wg] Result set requirements
In-Reply-To: <20040220211233.GE21106@namod.qld.bigpond.net.au>
References: <1077184228.5605.4.camel@mikush2>
<1077200140.5610.51.camel@mikush2>
<1077221479.5526.55.camel@mikush2>
<1077283868.7142.74.camel@mikush2>
<20040220211233.GE21106@namod.qld.bigpond.net.au>
Message-ID: <1077316727.5528.51.camel@mikush2>
>
> In AsTMa? we do not allow to "generate arbitrary text". We only allow
>
> - lists
> - XML as an internal representation, and
> - TM data (again in internal representation)
>
> The only way to get text directly is to have it as part of the list.
Indeed I was thinking that the idea is to create any textual format and
not only the three formats you pointed out.
But when looking at the example in the tutorial of AsTMa?:
function test (map $m) as xml return
{
forall [ $a (album)
bn: $bn ] in $m
return
{$bn}
}
I cannot avoid thinking about a similar example, like generating HTML:
function test2 (map $m) as text return
{
forall [ $a (album)
bn: $bn ] in $m
return
{$a}
{$bn}
}
or generating comma separated records:
function test3 (map $m) as text return
{
forall [ $a (album)
bn: $bn ] in $m
return
{$a},{$bn}
}
I mean - unless I missed something, the _syntax_ of AsTMa? does not limit
the output only to XML (apart from the "as xml" declaration) and if there
is such a limitation, I find it maybe a bit artificial.
If, for example, XML is generated by defining a DTD and tying the result
set to different elements, the language is really built to generate only
XML and, in my opinion, is also simpler, because the fact that we are
narrowing down to XML only is used to make it simpler for the user.
Example (a totally imaginary extension to Toma which probably is not that
consistent, but hopefully will make sense):
# define dtd for the xml to be produced:
define dtd 'albums':
]>
;
# for clarity: without the presentation, the query looks like:
# select $a.id, $a.bn where exists $a;
#
select 'garbage' as attribute(albums, group),
$a.id as attribute(album, id),
$a.bn as element_data(album)
with dtd 'albums'
where exists $a;
Here the fact that we use a DTD means that, built into the syntax, we
limit the language to producing XML. Hopefully this leads to simpler code
(mmm... suppose you have different queries that create the same XML, you
can use the same DTD, etc)?
I wonder if you find the above reasonable...
Rani
From rho@bigpond.net.au Sat Feb 21 00:37:48 2004
From: rho@bigpond.net.au (Robert Barta)
Date: Sat, 21 Feb 2004 10:37:48 +1000
Subject: [tmql-wg] Result set requirements
In-Reply-To: <1077316727.5528.51.camel@mikush2>
References: <1077184228.5605.4.camel@mikush2> <1077200140.5610.51.camel@mikush2> <1077221479.5526.55.camel@mikush2> <1077283868.7142.74.camel@mikush2> <20040220211233.GE21106@namod.qld.bigpond.net.au> <1077316727.5528.51.camel@mikush2>
Message-ID: <20040221003748.GG21106@namod.qld.bigpond.net.au>
On Fri, Feb 20, 2004 at 11:38:48PM +0100, Rani Pinchuk wrote:
> > In AsTMa? we do not allow to "generate arbitrary text". We only allow
> >
> > - lists
> > - XML as an internal representation, and
> > - TM data (again in internal representation)
> Indeed I was thinking that the idea is to create any textual format and
> not only the three formats you pointed out.
>
> But when looking at the example in the the tutorial of AsTMa?:
>
> function test (map $m) as xml return
> {
> forall [ $a (album)
> bn: $bn ] in $m
> return
> {$bn}
> }
>
> I cannot avoid from thinking about similar example like generating html:
>
> function test2 (map $m) as text return
>
Could be that you are lured into thinking this. :-) The fact is that
it is XML in both cases and that the AsTMa? parser will, for instance,
refuse to accept this:
function test2 (map $m) as text return
{
forall [ $a (album)
bn: $bn ] in $m
return
{$a}
{$bn}
# superfluous here and
not closed
}
as this will never be able to generate well-formed XML.
It goes without saying (I like that phrase) that adopting particular
XML vocabularies does not lead to anywhere.
> or generating comma separated records:
>
> function test3 (map $m) as text return
> {
> forall [ $a (album)
> bn: $bn ] in $m
> return
> {$a},{$bn}
> }
Right, but AsTMa? does NOT allow this. You will have to open a tuple
like this:
function test3 (map $m) as list return {
forall [ $a (album)
bn: $bn ] in $m
return
({$a},{$bn})
}
So it is indicated by () that you plan to return a list of tuples.
If you start with
> I mean - unless I missed something, the _syntax_ of AsTMa? does not
> limit the output only to XML (apart from "as xml" declaration) and
> if there is such a limitation, I find it maybe a bit artificial.
Indeed AsTMa? DOES limit you to use these content types:
http://astma.it.bond.edu.au/astma%3F-spec.dbk?section=7
The reasoning is: To (a) work with TMs which encourage/force you to
structure your content and (b) work with XML which forces you to
structure your content and then (c) to allow to generate arbitrary
text seems rather hypocritical to me :-)
> If, for example, XML is generated by the defining DTD and tying the result
> set to different elements, the language is really built to generate only
> XML and in my opinion, is also simpler because the fact that we narrowing
> down to XML only is used to make it simpler for the user. Example (totally
> imaginary extension to Toma which probably is not that consistent, but
> hopefully will make sense):
>
> # define dtd for the xml to be produced:
> define dtd 'albums':
>
>
>
>
>
> ]>
> ;
>
> # for clarity: without the presentation, the query looks like:
> # select $a.id, $a.bn where exists $a;
> #
> select 'garbage' as attribute(albums, group),
> $a.id as attribute(album, id),
> $a.bn as element_data(album)
> with dtd 'albums'
> where exists $a;
This is exactly, what AsTMa? does except that it works without a
necessary commitment to DTD/WXS/Relax/Schematron/... and it has the
content inlined. I found that convenient at that time.
I could imagine that Toma uses XPath:
select 'garbage' into /albums/@group,
$a.id into /albums/album/@id,
$a.bn into /albums/album/text()
returns # type information here
and the returns clause could be optional.
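The same mapping can also live in the application layer, which is the
separation of concerns discussed earlier in this thread. A minimal Python
sketch, assuming the query has already returned plain (id, basename)
tuples; the function name rows_to_albums_xml and the sample row are made
up for illustration, and the element names follow the paths above:

```python
import xml.etree.ElementTree as ET


def rows_to_albums_xml(rows, group):
    """Map (id, basename) result rows onto /albums/album elements."""
    # 'garbage' into /albums/@group
    root = ET.Element("albums", {"group": group})
    for album_id, base_name in rows:
        # $a.id into /albums/album/@id
        album = ET.SubElement(root, "album", {"id": album_id})
        # $a.bn into /albums/album/text()
        album.text = base_name
    return ET.tostring(root, encoding="unicode")


# Hypothetical result rows, as a list of tuples.
rows = [("a1", "Abbey Road")]
print(rows_to_albums_xml(rows, "garbage"))
```

Because the tree is built through an XML API rather than by string
concatenation, the output is well-formed by construction, which is the
property the "into" clauses would guarantee inside the language.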
> Here there the fact that we use DTD, means that build in the syntax
> we limit the language to produce XML. Hopefully this lead to simpler
> code ....
That was the idea. But as Lars pointed out: "It depends". In some
cases I'd rather prefer to have a list. Maybe I hate working with XML,
or do not like the overhead or do not need the structure, because my
content is just a simple string. Then forcing XML onto a user would be
rather impolite.
In some cases I _definitely_ want a TM as output. I am into content
syndication and without that feature all my solutions and ideas would
not work. If I cannot express a TM -> TM transformation which converts
content between two different ontologies, I would need dedicated
software for each of these transformations. A nightmare.
\rho
From Rani.Pinchuk@spaceapplications.com Sat Feb 21 11:46:23 2004
From: Rani.Pinchuk@spaceapplications.com (Rani Pinchuk)
Date: Sat, 21 Feb 2004 12:46:23 +0100
Subject: [tmql-wg] aliases
Message-ID: <1077363982.5529.225.camel@mikush2>
Hi all,
I would like to ask your opinion on an operator called
"alias" or "al" for short. The idea is that many topics might have
different basenames, some in different scopes, but some are just
aliases.
For example - "processor" is also "central processing unit" or
"cpu".
Probably, all those strings will be located in the baseName node -
some as baseNameString and some as resourceData of a variant.
When looking for the processor topic, one might search for it using
terms other than 'processor'. If the query has to define a search
in the baseName + the resourceData of the variants (of maybe more
than one level), the query can become quite big. Instead the alias
operator can be used as follows:
select $topic where $topic.alias = 'cpu';
This query will look for all the topics that have baseNameString or
variant resourceData equal to 'cpu'.
Of course the alias can be used with scope (over the basename node):
select $topic where $topic.alias@foo = 'bar';
which will look for all the baseNameString or variant resourceData
equal to 'bar' in the scope 'foo'.
or
select $topic, $scope where $topic.alias@$scope = 'cpu';
which will look for all the baseNameString or variant resourceData
equal to 'cpu' in the different scopes and will return also the
scopes.
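To make the intended semantics concrete, here is a rough Python sketch of
what the alias operator would have to do. The classes and the function
select_by_alias are hypothetical and only mirror the description above;
the real prototype may work differently.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Variant:
    resource_data: str
    variants: List["Variant"] = field(default_factory=list)  # nested levels


@dataclass
class BaseName:
    base_name_string: str
    scope: Tuple[str, ...] = ()
    variants: List[Variant] = field(default_factory=list)


@dataclass
class Topic:
    id: str
    base_names: List[BaseName] = field(default_factory=list)


def _variant_strings(variants):
    """Yield the resourceData of variants at every nesting level."""
    for v in variants:
        yield v.resource_data
        yield from _variant_strings(v.variants)


def aliases(topic):
    """Yield (string, scope) for each baseNameString and variant string."""
    for bn in topic.base_names:
        yield bn.base_name_string, bn.scope
        for s in _variant_strings(bn.variants):
            yield s, bn.scope


def select_by_alias(topics, value, scope=None):
    """Analogue of: select $topic, $scope where $topic.alias@$scope = value."""
    hits = []
    for t in topics:
        for s, sc in aliases(t):
            if s == value and (scope is None or scope in sc):
                hits.append((t.id, sc))
    return hits


processor = Topic("processor", [
    BaseName("processor", ("en",),
             [Variant("cpu", [Variant("central processing unit")])]),
])
found = select_by_alias([processor], "cpu")
```

Passing scope="foo" corresponds to the $topic.alias@foo form; leaving it
out matches in any scope and returns the scopes alongside the topics.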
Do you find that this feature is needed in TMQL, or is there maybe
another way to deal with aliases in Topic Maps?
Thanks,
Rani
From dmitryv@cogeco.ca Sat Feb 21 19:33:10 2004
From: dmitryv@cogeco.ca (Dmitry)
Date: Sat, 21 Feb 2004 14:33:10 -0500
Subject: [tmql-wg] Result set requirements
References: <1077184228.5605.4.camel@mikush2> <1077200140.5610.51.camel@mikush2> <1077221479.5526.55.camel@mikush2> <1077283868.7142.74.camel@mikush2> <20040220211233.GE21106@namod.qld.bigpond.net.au>
Message-ID: <003b01c3f8b1$8a3f4d10$7d01a8c0@dbn>
----- Original Message -----
From: "Robert Barta"
> In AsTMa? we do not allow to "generate arbitrary text". We only allow
>
> - lists
> - XML as an internal representation, and
> - TM data (again in internal representation)
>
> The only way to get text directly is to have it as part of the list.
>
> I still think that these 3 formats (at least the first two) are widely
> used and allow for flexible postprocessing. The inclusion of a "TM
> Data Type" simply is added for symmetry.
>
> The application designer will choose what structure suits the results
> best.
I think that it is important to have a part of TMQL responsible for
generating results in the formats which Robert described in his email.
It helps to minimize data model mismatch when we move results between
topic map engine(s) and external systems.
Let's say that we limit ourselves to tables as the TMQL result set. We
have TMDM on the topic map engine side, and we have something like a
TMAPI-based program on the external system side. If we have only tables
as the result set, we have to deconstruct rich TM structures into tuple
streams and construct object-oriented structures again on the client
side.
If we support XML as a result set we can have very smooth transformations
between data models, for example:
TMDM - TMQL -> XML (with any vocabulary)
XML - XQuery,XSLT -> XML
XML - XSLT -> HTML
All these transformations are very natural and simple.
If we support TM data as a result set we can move TM data between
systems / layers without deconstructing/constructing TM data structures.
For example:
server TMDM - TMQL -> client TMDM
client TMDM -> - GUI binding + TMQL UPDATE -> client TMDM
client TMDM -> - TMQL UPDATE -> server TMDM
When we do GUI binding we can transform client TMDM into lists or XML, for
example.
Again, we have very natural transformations.
I believe that in the future, programming languages will have native
support for XML (more precisely, XML-like data models). XML APIs will be
transformed into rich language data models and syntax. If we have XML and
TM data as result sets we can have an efficient and easy-to-use
environment for building TM-based systems. Otherwise we ask programmers
to spend time on bridging gaps between different data models.
Dmitry
From Rani.Pinchuk@spaceapplications.com Sat Feb 21 20:42:35 2004
From: Rani.Pinchuk@spaceapplications.com (Rani Pinchuk)
Date: Sat, 21 Feb 2004 21:42:35 +0100
Subject: [tmql-wg] Result set requirements
In-Reply-To: <003b01c3f8b1$8a3f4d10$7d01a8c0@dbn>
References: <1077184228.5605.4.camel@mikush2>
<1077200140.5610.51.camel@mikush2>
<1077221479.5526.55.camel@mikush2>
<1077283868.7142.74.camel@mikush2>
<20040220211233.GE21106@namod.qld.bigpond.net.au>
<003b01c3f8b1$8a3f4d10$7d01a8c0@dbn>
Message-ID: <1077396155.5533.249.camel@mikush2>
Dear all,
As you convinced me of the need to have different ways to generate
results from TMQL, I wrote some of the ideas I had or got from the
discussion here into a draft located at
http://www.spaceapplications.com/toma/draft_result_sets_and_associations.html
I warn you that this is a VERY drafty document. Unlike
http://www.spaceapplications.com/toma/Toma.html where the examples are
actually run with the prototype, the queries and their output in
the draft were written manually.
You can also ignore the last part of the draft (about associations).
I will be happy if you find a moment to comment on that draft (especially
on whether it really answers the needs for the output formats of TMQL).
Thanks
Rani
From larsga@ontopia.net Mon Feb 23 08:46:34 2004
From: larsga@ontopia.net (Lars Marius Garshol)
Date: Mon, 23 Feb 2004 09:46:34 +0100
Subject: [tmql-wg] Result set requirements
In-Reply-To: <20040220205608.GD21106@namod.qld.bigpond.net.au>
References: <1077184228.5605.4.camel@mikush2>
<20040220105519.GA21106@namod.qld.bigpond.net.au>
<1077283157.7133.59.camel@mikush2>
<20040220205608.GD21106@namod.qld.bigpond.net.au>
Message-ID:
* Robert Barta
|
| I would assume that a
|
| select $topic.bn ...
|
| could be reasonably defaulted to 'string' output.
Actually, I'm not so sure. We currently have queries that do return
topic name items, and that need to do so. So I think there needs to be
a way to distinguish between the item and its string value.
| What I am a bit sceptical about is to allow _any_ kind of node type in
| the result. So, to allow basename and occurrence and this and that
| items as described in the TM DM as possible parts of a list result.
| Going that path would link TMQL standard-wise VERY STRONGLY to TMDM
| and I am not a friend of strong couplings.
Don't you think that the very design of the TMQL would bind us very
strongly to TMDM anyway? If not, how do you think we can avoid that?
| One option would be to allow only maps or fragments thereof and let
| TMQL be completely opaque about this.
How does that work?
| Another (mumble, mumble) would be to allow topics and associations
| only, but this has some problems....
Yes... :-)
--
Lars Marius Garshol, Ontopian
GSM: +47 98 21 55 50
From larsga@ontopia.net Mon Feb 23 09:17:17 2004
From: larsga@ontopia.net (Lars Marius Garshol)
Date: Mon, 23 Feb 2004 10:17:17 +0100
Subject: [tmql-wg] Result set requirements
In-Reply-To: <20040220105519.GA21106@namod.qld.bigpond.net.au>
References: <1077184228.5605.4.camel@mikush2>
<20040220105519.GA21106@namod.qld.bigpond.net.au>
Message-ID:
* Robert Barta
|
| I am rather sceptical about "query languages" which return "nodes"
| or other internal data structures. This is like XPath and has the
| problem that you have to use XPath _always_ with something else:
| XSLT, XQuery, or a programming language.
I'm surprised to hear you describe this as a problem. I've seen this
as one of the great strengths of XPath, and a major part of why it's
been such a resounding success. XML development without XPath would be
horribly much harder.
--
Lars Marius Garshol, Ontopian
GSM: +47 98 21 55 50
From larsga@ontopia.net Mon Feb 23 09:19:26 2004
From: larsga@ontopia.net (Lars Marius Garshol)
Date: Mon, 23 Feb 2004 10:19:26 +0100
Subject: [tmql-wg] Result set requirements
In-Reply-To: <20040220211233.GE21106@namod.qld.bigpond.net.au>
References: <1077184228.5605.4.camel@mikush2>
<1077200140.5610.51.camel@mikush2>
<1077221479.5526.55.camel@mikush2>
<1077283868.7142.74.camel@mikush2>
<20040220211233.GE21106@namod.qld.bigpond.net.au>
Message-ID:
* Robert Barta
|
| As I understand Lars' TMTL solution, that one allows to use
| _ARBITRARY_ text and that arbitrary text is returned (to the
| application or to the environment).
That's a goal, but unfortunately one that has not yet been met.
--
Lars Marius Garshol, Ontopian
GSM: +47 98 21 55 50
From rho@bigpond.net.au Mon Feb 23 21:20:25 2004
From: rho@bigpond.net.au (Robert Barta)
Date: Tue, 24 Feb 2004 07:20:25 +1000
Subject: [tmql-wg] Result set requirements
In-Reply-To:
References: <1077184228.5605.4.camel@mikush2> <20040220105519.GA21106@namod.qld.bigpond.net.au> <1077283157.7133.59.camel@mikush2> <20040220205608.GD21106@namod.qld.bigpond.net.au>
Message-ID: <20040223212025.GA2817@namod.qld.bigpond.net.au>
On Mon, Feb 23, 2004 at 09:46:34AM +0100, Lars Marius Garshol wrote:
>
> * Robert Barta
> |
> | I would assume that a
> |
> | select $topic.bn ...
> |
> | could be reasonably defaulted to 'string' output.
>
> Actually, I'm not so sure. We currently have queries that do return
> topic name items, and that need to do so. So I think there needs to be
> a way to distinguish between the item and its string value.
Yes, that's true, but...
> | What I am a bit sceptical about is to allow _any_ kind of node type in
> | the result. So, to allow basename and occurrence and this and that
> | items as described in the TM DM as possible parts of a list result.
> | Going that path would link TMQL standard-wise VERY STRONGLY to TMDM
> | and I am not a friend of strong couplings.
>
> Don't you think that the very design of the TMQL would bind us very
> strongly to TMDM anyway?
Yes, we will probably do it this way, but it does not need to be this
way. I could, for instance bind the semantics of AsTMa? onto TMDM
underneath and could choose NOT to return any TMDM items.
Maybe there are dependencies I do not see yet...
* Robert Barta
|
| I am rather sceptical about "query languages" which return "nodes"
| or other internal data structures. This is like XPath and has the
| problem that you have to use XPath _always_ with something else:
| XSLT, XQuery, or a programming language.
> Lars Marius Garshol wrote:
> I'm surprised to hear you describe this as a problem. I've seen this
> as one of the great strengths of XPath, and a major part of why it's
> been such a resounding success. XML development without XPath would be
> horribly much harder.
Sorry, too fast and sloppy writing.
It is not a problem that "you have to use XPath _always_ with
something else". That you have the same sublanguage XPath in
XQuery/XSLT/Schematron is actually a nice thing to have.
It is a problem for the language designers to find the right level of
abstraction how XPath interacts with these languages. It is actually
only now that XPath 2.0/XSLT 2.0 is aligned with XQuery 1.0 and the
alignment with the DOM2 came also rather late.
[
XPath is harmless in the sense that it can return nodes and those
are basically DOM nodes. And XPath _in many_ cases silently converts
nodes into strings:
/..../....[some-element = "Rumsti"]
So that is an _easy_ model for a developer.
]
If we would allow TMQL to spit out everything the TMDM offers us, that
would, I think, drive the complexity for developers up. They would
have to learn TMDM AND TMQL at the same time. It would be a rather
high entry price for me as a developer.
--
> If not, how do you think we can avoid that?
> | One option would be to allow only maps or fragments thereof and let
> | TMQL be completely opaque about this.
>
> How does that work?
I had this problem with AsTMa? as well. Here I decided to allow only
_one_ data structure to be passed back, that of a maplet (it is
nothing else than an association _including_ all necessary topic
data). So, if you say
foreach $a [ (is-associated-with) # I like that one, it comes from a student map
] return $a
you get a list of these associations together with the topics
they include (type, scope, roles, players). If you say
foreach [ $p (person) ] return $p
you also get a list of maplets (a single topic is a trivial maplet).
--
With that I simplified things for me as language designer and
hopefully also the developer. If he/she chooses only to see strings,
then the AsTMaPath expression gives this:
foreach [ $p (person) ]
return $p/bn # returns only the basename as string
If the developer then likes climbing through the data structure, then
he asks for $p only and drills down himself.
> | Another (mumble, mumble) would be to allow topics and associations
> | only, but this has some problems....
>
> Yes... :-)
I think we _should_ allow to pass out structures. Again, whether we
can simply inject all of TMDM into the language is less clear to me.
If I say in Toma
select $t.basename, .....
and I get a basename item where in 99% of the cases I am interested in
the string, I would not be too happy to do the XPath-like conversion
select string($t.basename), ...
Argh, I do not like typing :-)
--
Another consideration in this context is 'typing'. XQuery has done
that to quite some extent and I know that many people in the database
community take that quite seriously for an excellent means to arrive
at some performance.
Introducing all TMDM data structures would probably add another
type dimension into TMQL.
--
One option we have is to treat this as "language pragma"
pragma INTERFACE_MODEL (TMDM)
return $t/bn # returns basename item
pragma INTERFACE_MODEL (simple)
return $t/bn # return string
pragma INTERFACE_MODEL (maplet)
return $t # returns topic as part of a maplet
only allowed on the outermost level so that it does not affect language
semantics, but only interface semantics.
Or, to choose a reasonable default and use 'ATTRIBUTE'
(meta-information on data):
return $t/bn'TMDM # returns basename as item
return $t/bn # default is stringification
return $t/bn'VALUE # same
--
\rho
From dmitryv@cogeco.ca Tue Feb 24 02:26:21 2004
From: dmitryv@cogeco.ca (Dmitry)
Date: Mon, 23 Feb 2004 21:26:21 -0500
Subject: [tmql-wg] Result set requirements
In-Reply-To: <20040223212025.GA2817@namod.qld.bigpond.net.au>
References: <1077184228.5605.4.camel@mikush2> <20040220105519.GA21106@namod.qld.bigpond.net.au> <1077283157.7133.59.camel@mikush2> <20040220205608.GD21106@namod.qld.bigpond.net.au> <20040223212025.GA2817@namod.qld.bigpond.net.au>
Message-ID:
>
> Introducing all TMDM data structures would probably add another
> type dimension into TMQL.
>
> --
>
> One option we have is to treat this as "language pragma"
>
> pragma INTERFACE_MODEL (TMDM)
> return $t/bn # returns basename item
>
> pragma INTERFACE_MODEL (simple)
> return $t/bn # return string
>
> pragma INTERFACE_MODEL (maplet)
> return $t # returns topic as part of a maplet
>
> only allowed on the outest level so that it does not affect language
> semantics, but only interface semantics.
>
> Or, to choose a reasonable default and use 'ATTRIBUTE'
> (meta-information on data):
>
> return $t/bn'TMDM # returns basename as item
> return $t/bn # default is stringification
> return $t/bn'VALUE # same
>
>
I am not sure about Toma and AsTMa, but in TMPath I have special
'shortcuts' which help to get and return values.
If I need the base name node I use:
$topic/bn[] or just $topic/bn
If I need the value I use $topic/bn:: or $topic/bn::*
TMPath provides access to TMDM constructs and allows returning them.
For example, long path:
$topic/roleOf[who]/association[born-in]/role[where]/topic[city]
passes through all the TMDM nodes.
But there are also 'shortcuts' which help to move in 'hyper space'.
$topic/roleOf::who[born-in]/role::where[city]
or $topic/who[born-in]/where[city]
or (with binary projection bornIn for association born-in)
$topic/bornIn[city]
Detailed paths are typically used in questions with introspection.
Shortcuts are used when I know the types of names, occurrences, roles and
associations and I am interested in "values".
I think that "full" result sets such as XML, TopicMaps/fragments and
tables can be integrated reasonably easily with the host environment.
XML->DOM or SAX
Topicmap -> TMAPI based set of classes
tuple streams -> lists or streams or cursors
I am not so sure about partial results such as a list of base names.
Partial results are useful between TMQL layers (TOLOG->TMTL),
but I do not see a "standard" way of integrating partial results with
hosting languages.
Dmitry
From dmitryv@cogeco.ca Tue Feb 24 13:20:33 2004
From: dmitryv@cogeco.ca (Dmitry)
Date: Tue, 24 Feb 2004 08:20:33 -0500
Subject: [tmql-wg] Result set requirements
In-Reply-To: <20040223212025.GA2817@namod.qld.bigpond.net.au>
References: <1077184228.5605.4.camel@mikush2> <20040220105519.GA21106@namod.qld.bigpond.net.au> <1077283157.7133.59.camel@mikush2> <20040220205608.GD21106@namod.qld.bigpond.net.au> <20040223212025.GA2817@namod.qld.bigpond.net.au>
Message-ID: <39F93B06-66CC-11D8-92ED-000A957183D4@cogeco.ca>
>
> Introducing all TMDM data structures would probably add another
> type dimension into TMQL.
>
> --
>
> One option we have is to treat this as "language pragma"
>
> pragma INTERFACE_MODEL (TMDM)
> return $t/bn # returns basename item
>
> pragma INTERFACE_MODEL (simple)
> return $t/bn # return string
>
> pragma INTERFACE_MODEL (maplet)
> return $t # returns topic as part of a maplet
>
> only allowed on the outest level so that it does not affect language
> semantics, but only interface semantics.
>
> Or, to choose a reasonable default and use 'ATTRIBUTE'
> (meta-information on data):
>
> return $t/bn'TMDM # returns basename as item
> return $t/bn # default is stringification
> return $t/bn'VALUE # same
>
> \rho
>
I personally prefer explicit XQuery-like constructors.
We can have constructors for XML, lists and topic maps as the default set,
and we can allow additional sets of constructors to be supplied (for wiki
pages, for example).
Constructors, I think, can be mixed with any "select" query language.
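One way to picture a default set of constructors plus user-supplied ones is a small registry. The following Python sketch is only an illustration: the registry, the decorator and the "wiki" output format are assumptions made for the example, not part of any TMQL proposal.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Dict, Iterable, Tuple

# A binding is (topic id, base name); a stand-in for what a "select" returns.
Binding = Tuple[str, str]
Constructor = Callable[[Iterable[Binding]], object]

CONSTRUCTORS: Dict[str, Constructor] = {}


def register(name: str):
    """Decorator that adds a constructor to the registry under `name`."""
    def wrap(fn: Constructor) -> Constructor:
        CONSTRUCTORS[name] = fn
        return fn
    return wrap


@register("list")
def make_list(bindings):
    return [name for _, name in bindings]


@register("xml")
def make_xml(bindings):
    root = ET.Element("topics")
    for topic_id, name in bindings:
        ET.SubElement(root, "topic", id=topic_id).text = name
    return ET.tostring(root, encoding="unicode")


# An "additional" constructor supplied by the user (hypothetical wiki markup).
@register("wiki")
def make_wiki(bindings):
    return "\n".join(f"* [[{topic_id}|{name}]]" for topic_id, name in bindings)


def construct(kind: str, bindings: Iterable[Binding]):
    """Apply the named constructor to the bindings of any select query."""
    return CONSTRUCTORS[kind](bindings)
```

The select part only has to produce bindings; which constructor consumes them is a separate choice.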
Dmitry
From larsga@ontopia.net Wed Feb 25 22:03:38 2004
From: larsga@ontopia.net (Lars Marius Garshol)
Date: Wed, 25 Feb 2004 23:03:38 +0100
Subject: [tmql-wg] aliases
In-Reply-To: <1077363982.5529.225.camel@mikush2>
References: <1077363982.5529.225.camel@mikush2>
Message-ID:
* Rani Pinchuk
|
| I would like to ask you about your opinion over an operator called
| "alias" or "al" for short. The idea is that many topics might have
| different basenames some in different scopes, but some are just
| aliases.
Are they not all aliases, in the sense that they are different names
for the same thing?
| Probably, all those strings will be located in the baseName node -
| some as baseNameString and some as resourceData of a variant.
Yep.
| When looking for the processor topic, one might search it using
| other terms then 'processor'. If the query should define a search in
| the baseName + the resourceData of the variants (of maybe more then
| one level), the query can become quite big. Instead the alias
| operator can be used as follows:
|
| select $topic where $topic.alias = 'cpu';
Hmmmm. We actually support this in tolog as follows:
  select $TOPIC from
    { value($TN, "cpu") |
      value($VN, "cpu"), variant-name($TN, $VN) },
    topic-name($TOPIC, $TN)?
As you say the alias bit is kind of awkward, but you could solve that
by making an inference rule as follows:
  alias($TOPIC, $NAME) :- {
    value($TN, $NAME) |
    value($VN, $NAME), variant-name($TN, $VN)
  }, topic-name($TOPIC, $TN).
Now doing this sort of query is suddenly trivial:
  alias($TOPIC, "cpu")?
In my opinion this kind of operation is not sufficiently fundamental
that it belongs in the QL itself, but that we should have some kind of
modularization feature that allows this to be encapsulated and reused
by the language user. AsTMa? has this in the form of function
definitions, tolog in the form of inference rules, and it's our goal
that TMQL will also have this in some form.
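The same encapsulation can be sketched outside tolog. The Python below is a toy model, not Ontopia's implementation: the TopicName and Topic classes are assumptions, and variants are kept as a flat list of strings on each topic name.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Set


@dataclass
class TopicName:
    value: str
    variants: List[str] = field(default_factory=list)


@dataclass
class Topic:
    id: str
    names: List[TopicName] = field(default_factory=list)


def alias(topics: Iterable[Topic], name: str) -> Set[str]:
    """Ids of topics that have a topic name or a variant equal to `name`.

    Mirrors the rule: the name value matches, OR a variant of that
    topic name matches.
    """
    return {
        t.id
        for t in topics
        for tn in t.names
        if tn.value == name or name in tn.variants
    }
```

Once the rule is wrapped in a function, the caller writes alias(topics, "cpu") and never sees the disjunction.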
--
Lars Marius Garshol, Ontopian
GSM: +47 98 21 55 50
From larsga@ontopia.net Wed Feb 25 22:06:49 2004
From: larsga@ontopia.net (Lars Marius Garshol)
Date: Wed, 25 Feb 2004 23:06:49 +0100
Subject: [tmql-wg] Result set requirements
In-Reply-To: <39F93B06-66CC-11D8-92ED-000A957183D4@cogeco.ca>
References: <1077184228.5605.4.camel@mikush2>
<20040220105519.GA21106@namod.qld.bigpond.net.au>
<1077283157.7133.59.camel@mikush2>
<20040220205608.GD21106@namod.qld.bigpond.net.au>
<20040223212025.GA2817@namod.qld.bigpond.net.au>
<39F93B06-66CC-11D8-92ED-000A957183D4@cogeco.ca>
Message-ID:
* dmitryv@cogeco.ca
|
| I personally prefer explicit XQuery-like constructors.
|
| We can have constructors for XML, lists and Topic Maps as default set.
|
| And we can allow to provide additional set of constructors (wiki
| pages, for example)
|
| Constructors, I think, can be mixed with any "select" query language.
That's an interesting idea, I think. Personally, I'm not clear on what
I prefer, but it would be interesting to see the constructor approach
and compare it with what AsTMa? does.
I have to admit I hadn't thought of this myself, but it does seem that
you can do TM -> TM transformations even with something as un-tm-ish
(to use Robert's terminology :) as tolog. I already had an idea for TM
constructors for tolog, but didn't think of them being used in this
way.
Hmmmmmmmm.
--
Lars Marius Garshol, Ontopian
GSM: +47 98 21 55 50
From dmitryv@cogeco.ca Thu Feb 26 02:06:48 2004
From: dmitryv@cogeco.ca (Dmitry)
Date: Wed, 25 Feb 2004 21:06:48 -0500
Subject: [tmql-wg] Result set requirements
References: <1077184228.5605.4.camel@mikush2><20040220105519.GA21106@namod.qld.bigpond.net.au><1077283157.7133.59.camel@mikush2><20040220205608.GD21106@namod.qld.bigpond.net.au><20040223212025.GA2817@namod.qld.bigpond.net.au><39F93B06-66CC-11D8-92ED-000A957183D4@cogeco.ca>
Message-ID: <000b01c3fc0d$3635dd60$7d01a8c0@dbn>
From: "Lars Marius Garshol"
> |
> | I personally prefer explicit XQuery-like constructors.
> |
> | We can have constructors for XML, lists and Topic Maps as default set.
> |
> | And we can allow to provide additional set of constructors (wiki
> | pages, for example)
> |
> | Constructors, I think, can be mixed with any "select" query language.
>
> That's an interesting idea, I think. Personally, I'm not clear on what
> I prefer, but it would be interesting to see the constructor approach
> and compare it with what AsTMa? does.
>
> I have to admit I hadn't thought of this myself, but it does seem that
> you can do TM -> TM transformations even with something as un-tm-ish
> (to use Robert's terminology :) as tolog. I already had an idea for TM
> constructors for tolog, but didn't think of them being used in this
> way.
I am thinking about something like this (I just rewrote my TMPath
implementation using tolog for the "select" queries):
5.4.1
Select only journal papers and deliver them together with the relevant
author associations as a topic map (fragments [xtmfragments]). There is no
need to include the author topics as well.
Replace all occurrences of type publication-date with associations of type
was-published-in.
For the date-role, the topic players should use ids of the form x-dates-yyyy.
declare default subjectIndicator
    namespace="http://psi........com/defaultpsi/#"
topicMap{
  for $JournalPaper in instanceOf($JournalPaper,journal-paper)
  for $IsAuthor in
      association($IsAuthor,is-author_of),role($IsAuthor,opus,$JournalPaper)
  return(
    ## construct a new topic based on the existing one; $NewTopic binds to it
    topic($JournalPaper,$NewTopic){
      ## delete some assertions from $NewTopic
      retract publication-date($Self,_)
    }
    ## copy and modify the association; the association constructor takes
    ## care of referencing "hidden" topics
    association($IsAuthor,_){
      replace role opus{$NewTopic}
    }
    ## create a new association
    association was-published-in{
      role opus {$NewTopic}
      role date-role {
        topic{
          id{
            select $PubDate from
                publication-date($JournalPaper,$PubDate)
            select $NewValue from
                concat("x-dates-",$PubDate,$NewValue)
            return $NewValue
          }
        }
      }
    }
  )
}
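For readers who want to trace what the constructor sample is meant to produce, here is a rough Python equivalent over a toy data model. The dict layout and the function name are assumptions made for illustration; only the type names (journal-paper, is-author_of, publication-date, was-published-in, opus, date-role) come from the sample.

```python
def transform(topics, associations):
    """Return (new_topics, new_associations) for the journal papers only.

    topics:       list of {"id", "type", "occurrences": {type: value}}
    associations: list of {"type", "roles": {role type: topic id}}
    """
    new_topics, new_assocs = [], []
    for topic in topics:
        if topic["type"] != "journal-paper":
            continue
        # Copy the topic and "retract" its publication-date occurrence.
        occurrences = dict(topic["occurrences"])
        pub_date = occurrences.pop("publication-date", None)
        new_topics.append(
            {"id": topic["id"], "type": topic["type"], "occurrences": occurrences}
        )
        # Copy the author associations in which this paper plays "opus".
        for assoc in associations:
            if assoc["type"] == "is-author_of" and assoc["roles"].get("opus") == topic["id"]:
                new_assocs.append({"type": assoc["type"], "roles": dict(assoc["roles"])})
        # Replace the occurrence by a was-published-in association.
        if pub_date is not None:
            new_assocs.append({
                "type": "was-published-in",
                "roles": {"opus": topic["id"], "date-role": "x-dates-" + pub_date},
            })
    return new_topics, new_assocs
```

The author topics are never copied; they are referenced only by id from the copied associations, which matches the requirement in the use case.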
Dmitry
From dmitryv@cogeco.ca Thu Feb 26 02:29:19 2004
From: dmitryv@cogeco.ca (Dmitry)
Date: Wed, 25 Feb 2004 21:29:19 -0500
Subject: [tmql-wg] Result set requirements
References: <1077184228.5605.4.camel@mikush2><20040220105519.GA21106@namod.qld.bigpond.net.au><1077283157.7133.59.camel@mikush2><20040220205608.GD21106@namod.qld.bigpond.net.au><20040223212025.GA2817@namod.qld.bigpond.net.au><39F93B06-66CC-11D8-92ED-000A957183D4@cogeco.ca> <000b01c3fc0d$3635dd60$7d01a8c0@dbn>
Message-ID: <002601c3fc10$564821f0$7d01a8c0@dbn>
From: "Lars Marius Garshol"
> |
> | I personally prefer explicit XQuery-like constructors.
> |
> | We can have constructors for XML, lists and Topic Maps as default set.
> |
> | And we can allow to provide additional set of constructors (wiki
> | pages, for example)
> |
> | Constructors, I think, can be mixed with any "select" query language.
>
> That's an interesting idea, I think. Personally, I'm not clear on what
> I prefer, but it would be interesting to see the constructor approach
> and compare it with what AsTMa? does.
>
> I have to admit I hadn't thought of this myself, but it does seem that
> you can do TM -> TM transformations even with something as un-tm-ish
> (to use Robert's terminology :) as tolog. I already had an idea for TM
> constructors for tolog, but didn't think of them being used in this
> way.
I have put a copy of the sample with "good formatting" here:
http://homepage.mac.com/dmitryv/TopicMaps/TMPath/TologBasedCostructorsSample.htm
Dmitry
From Rani.Pinchuk@spaceapplications.com Thu Feb 26 09:10:24 2004
From: Rani.Pinchuk@spaceapplications.com (Rani Pinchuk)
Date: Thu, 26 Feb 2004 10:10:24 +0100
Subject: [tmql-wg] Result set requirements
In-Reply-To: <000b01c3fc0d$3635dd60$7d01a8c0@dbn>
References: <1077184228.5605.4.camel@mikush2>
<20040220105519.GA21106@namod.qld.bigpond.net.au>
<1077283157.7133.59.camel@mikush2>
<20040220205608.GD21106@namod.qld.bigpond.net.au>
<20040223212025.GA2817@namod.qld.bigpond.net.au>
<39F93B06-66CC-11D8-92ED-000A957183D4@cogeco.ca>
<000b01c3fc0d$3635dd60$7d01a8c0@dbn>
Message-ID: <1077786624.5732.71.camel@mikush2>
Hi Dmitry,
I am not sure about this approach: in my opinion it is not a good idea to
mix two languages in such a way. It leads to a very unreadable
result. The near "proof" of that is that you had to indent your
example very carefully to keep it readable.
Another problem with mixing two languages is that you have to know both
in order to use them.
As I see it, the constructors control the way the data is presented,
while the data itself is controlled by the query language. I would try
to separate those two (that is, separate the logic from the presentation).
Arguments for such a separation can be found in the Skin and
Phrasebook design patterns
(http://jerry.cs.uiuc.edu/~plop/plop2k/proceedings/proceedings.html)
Rani
--
Rani Pinchuk tel: +32 3 326 79 97
fax: +32 2 612 43 08
rani@cpan.org
http://pinchuk.homeip.net
From Rani.Pinchuk@spaceapplications.com Thu Feb 26 09:46:33 2004
From: Rani.Pinchuk@spaceapplications.com (Rani Pinchuk)
Date: Thu, 26 Feb 2004 10:46:33 +0100
Subject: [tmql-wg] aliases
In-Reply-To:
References: <1077363982.5529.225.camel@mikush2>
Message-ID: <1077788793.5747.83.camel@mikush2>
Hi Lars,
Two comments/questions:
1. Variants can be placed one inside the other. I am not sure whether
  select $TOPIC from
    { value($TN, "cpu") |
      value($VN, "cpu"), variant-name($TN, $VN) },
    topic-name($TOPIC, $TN)?
covers that, but I might be wrong (?).
2. It seems to me that many applications need to find a
topic only by its name (for example question-answering systems), and if
so, the alias command could become quite popular.
On the other hand, macros/functions/inference rules usually perform
worse than a built-in feature (especially here, if you take
into account the recursive nature of the variant).
I am not against having a mechanism to extend the language, like
inference rules, but I think that because of performance issues, the
popular functions should be built in. Isn't alias popular enough?
Rani
From larsga@ontopia.net Thu Feb 26 10:15:10 2004
From: larsga@ontopia.net (Lars Marius Garshol)
Date: Thu, 26 Feb 2004 11:15:10 +0100
Subject: [tmql-wg] aliases
In-Reply-To: <1077788793.5747.83.camel@mikush2>
References: <1077363982.5529.225.camel@mikush2>
<1077788793.5747.83.camel@mikush2>
Message-ID:
* Rani Pinchuk
|
| 1. Variants can be placed one inside the other.
Nope. They can in the XTM syntax, but that's flattened in the data
model, so this isn't an issue. (Look at the variant item and you'll
see.)
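A small Python sketch of what "flattened" means here, under the assumption that each nested variant ends up as its own item whose scope is its own parameters plus those inherited from the enclosing variants. The XtmVariant class and the tuple output are hypothetical, not the data model's actual API.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, List, Tuple


@dataclass
class XtmVariant:
    """A variant as written in XTM syntax, possibly with nested variants."""
    value: str
    parameters: FrozenSet[str]
    children: List["XtmVariant"] = field(default_factory=list)


def flatten(
    variants: List[XtmVariant], inherited: FrozenSet[str] = frozenset()
) -> List[Tuple[str, FrozenSet[str]]]:
    """Turn a tree of variants into a flat list of (value, scope) pairs."""
    flat = []
    for variant in variants:
        scope = inherited | variant.parameters
        flat.append((variant.value, scope))
        flat.extend(flatten(variant.children, scope))
    return flat
```

After flattening, a query needs only one variant-name step to reach every variant, which is why the tolog query does not have to recurse.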
| I am not sure if
|
| select $TOPIC from
| { value($TN, "cpu") |
| value($VN, "cpu"), variant-name($TN, $VN) },
| topic-name($TOPIC, $TN)?
|
| covers that, but I might be wrong (?).
It doesn't cover it, but that's because it doesn't need to. :-)
| 2. It seems to me that in many applications it is necessary to find
| a topic only by its name (for example question/answering systems),
| and if so, the alias command can become quite popular.
Oh, certainly. The trouble is that there is a *lot* of this stuff that
probably will be wanted often. We can put some shortcuts into the
language, but it's very important not to add too many. So erring on
the side of caution to begin with is, I think, crucial.
| On the other hand, macros/functions/inference rules usually suffer
| in performance compared to a built in feature [...]
That is true.
| I am not against having a mechanism to extend the language like
| inference rules, but I think that because of performance issues, the
| popular functions should be built in. Isn't alias popular enough?
I don't think so, but experience could prove me wrong. Note that what
we could do is do the first version without it, then add it in a later
version. We could also define it as a function/inference rule in the
standard, and then let implementations decide how they want to
implement it.
--
Lars Marius Garshol, Ontopian
GSM: +47 98 21 55 50
From dmitryv@cogeco.ca Thu Feb 26 13:25:02 2004
From: dmitryv@cogeco.ca (Dmitry)
Date: Thu, 26 Feb 2004 08:25:02 -0500
Subject: [tmql-wg] Result set requirements
In-Reply-To: <1077786624.5732.71.camel@mikush2>
References: <1077184228.5605.4.camel@mikush2> <20040220105519.GA21106@namod.qld.bigpond.net.au> <1077283157.7133.59.camel@mikush2> <20040220205608.GD21106@namod.qld.bigpond.net.au> <20040223212025.GA2817@namod.qld.bigpond.net.au> <39F93B06-66CC-11D8-92ED-000A957183D4@cogeco.ca> <000b01c3fc0d$3635dd60$7d01a8c0@dbn> <1077786624.5732.71.camel@mikush2>
Message-ID: <2EED965E-685F-11D8-9C18-000A957183D4@cogeco.ca>
On Feb 26, 2004, at 4:10 AM, Rani Pinchuk wrote:
>
> As I see it, the constructors controls the way the data is presented,
> while the data itself is controlled by the query language. I would try
> to separate those two (so separating the logic from the presentation).
> Arguments for such separation can be found in the Skin and the
> Phrasebook design patterns
> (http://jerry.cs.uiuc.edu/~plop/plop2k/proceedings/proceedings.html)
>
>
Constructors have nothing to do with data presentation. They allow new
objects to be created.
These objects can be based on TMDM, XML DOM or other data models. They
are data structures.
Pure "select" query languages do not allow the creation of new objects. I
see that as a problem for TMQL.
My proposal, I think, has a good separation between the "select" and
"construct" parts. That is why it was easy to replace the
XPath-like select language with a Prolog-like select language.
I also think that tolog could gain additional power from constructs
such as "for", "if" and "every, some ... satisfies".
Actually, TMTL does some of that for tolog; I am just trying to generalize
the idea of constructors.
Dmitry
From Rani.Pinchuk@spaceapplications.com Thu Feb 26 14:47:11 2004
From: Rani.Pinchuk@spaceapplications.com (Rani Pinchuk)
Date: Thu, 26 Feb 2004 15:47:11 +0100
Subject: [tmql-wg] Result set requirements
In-Reply-To: <2EED965E-685F-11D8-9C18-000A957183D4@cogeco.ca>
References: <1077184228.5605.4.camel@mikush2>
<20040220105519.GA21106@namod.qld.bigpond.net.au>
<1077283157.7133.59.camel@mikush2>
<20040220205608.GD21106@namod.qld.bigpond.net.au>
<20040223212025.GA2817@namod.qld.bigpond.net.au>
<39F93B06-66CC-11D8-92ED-000A957183D4@cogeco.ca>
<000b01c3fc0d$3635dd60$7d01a8c0@dbn>
<1077786624.5732.71.camel@mikush2>
<2EED965E-685F-11D8-9C18-000A957183D4@cogeco.ca>
Message-ID: <1077806830.5732.145.camel@mikush2>
> >
> Constructors have nothing to do with data presentation. They allow to
> create new objects.
> These objects can be based on TMDM, XML DOM or other data models. They
> are data structures.
>
> Pure "select" query languages do not allow creation of new objects. I
> see it as a problem for TMQL.
Sorry for my mistake. However, the constructors play a different role from
querying: their role is to construct data models from whatever data is
available, including data from TMQL queries.
If you look at the example you gave, you see queries, which are
responsible for WHAT data is delivered, and you see the constructor
parts, which are responsible for HOW the data is delivered.
Now look at the queries only - this is what I would call TMQL. All the
rest, in my opinion, should not be included in TMQL.
It is not that I find it wrong to write queries within
presentation/construction templates - I think that it is SOMETIMES wrong
(usually when the project is big, and you need separation of
responsibilities between the people who work on different domains).
But if this mixture is built into one language, it means that you cannot
avoid it under any circumstances.
You write that pure "select" query languages do not allow creation of
new objects. I also find this a problem - strings are not enough as
result sets.
But on the other hand I am not sure that TMQL should be able to return
ANY object. And I think that the solution should be within what I call
TMQL, and not outside of it. So I think that the selects themselves
should be able to return something other than just strings - not any
object, but maybe topic nodes and association nodes.
My initial approach to the whole problem was to return nothing but topic
objects and association objects from the selects, and to have, in the API
that uses TMQL (this API is to TMQL what JDBC is to SQL), methods that
let us access the different elements of the topics. This way, the
language becomes much simpler, and we can avoid the above
questions. However, it is true that there are some performance penalties
with this approach - some topics might hold many big resourceData
elements, and it might be annoying to pay that price just to see a simple
list of topic ids...
But this performance problem might be solved in other ways - for example
using the idea, which I think came from Jan Algermissen, of using
lightweight objects - that is, a topic object that contains only the base
name element and id if we asked to see the base names.
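The lightweight-object idea can be sketched as a proxy that carries only the fields the query asked for and fetches anything else on demand. The class, the loader callback and the field names below are assumptions made for the example, not an existing API.

```python
class LightTopic:
    """A topic handle that holds only the requested fields up front."""

    def __init__(self, topic_id, loader, **preloaded):
        self.id = topic_id
        self._loader = loader            # called as loader(topic_id, field_name)
        self._fields = dict(preloaded)   # fields delivered with the result set
        self.loads = 0                   # number of on-demand fetches so far

    def get(self, field_name):
        """Return a field, fetching it from the store only when first needed."""
        if field_name not in self._fields:
            self._fields[field_name] = self._loader(self.id, field_name)
            self.loads += 1
        return self._fields[field_name]


# Hypothetical backing store with one cheap and one expensive field.
STORE = {"cpu": {"base_names": ["processor"], "resource_data": "long text"}}


def load(topic_id, field_name):
    return STORE[topic_id][field_name]
```

A query for base names would return LightTopic("cpu", load, base_names=["processor"]); the big resourceData is only paid for if the caller actually asks for it.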
I wrote up some of those ideas for Toma in
http://www.spaceapplications.com/toma/Toma.html#select_clause__nodes
but note that the idea there of constructing XML from a DTD is
conceptually similar to the constructors (so it is again a mixture of
data querying and the way that data is delivered).
Rani
From larsga@ontopia.net Thu Feb 26 22:08:50 2004
From: larsga@ontopia.net (Lars Marius Garshol)
Date: Thu, 26 Feb 2004 23:08:50 +0100
Subject: [tmql-wg] Result set requirements
In-Reply-To: <000b01c3fc0d$3635dd60$7d01a8c0@dbn>
References: <1077184228.5605.4.camel@mikush2>
<20040220105519.GA21106@namod.qld.bigpond.net.au>
<1077283157.7133.59.camel@mikush2>
<20040220205608.GD21106@namod.qld.bigpond.net.au>
<20040223212025.GA2817@namod.qld.bigpond.net.au>
<39F93B06-66CC-11D8-92ED-000A957183D4@cogeco.ca>
<000b01c3fc0d$3635dd60$7d01a8c0@dbn>
Message-ID:
* dmitryv@cogeco.ca
|
| declare default subjectIndicator
| namespace="http://psi........com/defaultpsi/#"
| [...]
Hmmmm. Interesting. This is similar to what I thought of, though I was
thinking of recycling the LTM syntax, something like this:
  evaluate
    now-associated-in-new-way-that-we-just-found-out($A : role1, $B : role2)
  for
    were-already-associated-in-way-one($A : role1, $C : role3),
    were-already-associated-in-way-two($C : role2, $B : role14)?
It's of course also possible to do things like 5.4.1, but I suspect it
would be rather cumbersome. It's too late at night for me to give it a
real try right now.
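Read as a join, the evaluate/for idea derives a new association for every pair of existing associations that share the middle topic $C. A Python sketch under that reading, with associations reduced to plain pairs (an assumption; role types are left out):

```python
def derive(way_one, way_two):
    """Derive (A, B) pairs from way_one (A, C) and way_two (C, B).

    The shared topic C is the join key; duplicates are removed and the
    result is sorted so that the output is deterministic.
    """
    players_by_c = {}
    for a, c in way_one:
        players_by_c.setdefault(c, []).append(a)
    return sorted(
        {(a, b) for c, b in way_two for a in players_by_c.get(c, [])}
    )
```

Each derived pair would then be handed to an association constructor, which is the part the LTM-style syntax expresses on the evaluate line.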
--
Lars Marius Garshol, Ontopian
GSM: +47 98 21 55 50
From larsga@ontopia.net Thu Feb 26 22:18:16 2004
From: larsga@ontopia.net (Lars Marius Garshol)
Date: Thu, 26 Feb 2004 23:18:16 +0100
Subject: [tmql-wg] Result set requirements
In-Reply-To: <1077786624.5732.71.camel@mikush2>
References: <1077184228.5605.4.camel@mikush2>
<20040220105519.GA21106@namod.qld.bigpond.net.au>
<1077283157.7133.59.camel@mikush2>
<20040220205608.GD21106@namod.qld.bigpond.net.au>
<20040223212025.GA2817@namod.qld.bigpond.net.au>
<39F93B06-66CC-11D8-92ED-000A957183D4@cogeco.ca>
<000b01c3fc0d$3635dd60$7d01a8c0@dbn>
<1077786624.5732.71.camel@mikush2>
Message-ID:
* Rani Pinchuk
|
| I am not sure about this attitude: in my opinion it is not that good
| to mix two languages in such a way. It brings into a very
| un-readable result. The almost "prove" to that is that you had to
| indent your example very carefully so it will be readable.
I'm not sure whether the problem here is the approach in general or
the particular syntax chosen. It could be that it is the approach, but
I think it's worth a try.
| Other problem with mixing two languages is that you have to know the
| two in order to use them.
True.
--
Lars Marius Garshol, Ontopian
GSM: +47 98 21 55 50
From larsga@ontopia.net Thu Feb 26 22:22:05 2004
From: larsga@ontopia.net (Lars Marius Garshol)
Date: Thu, 26 Feb 2004 23:22:05 +0100
Subject: [tmql-wg] Result set requirements
In-Reply-To: <2EED965E-685F-11D8-9C18-000A957183D4@cogeco.ca>
References: <1077184228.5605.4.camel@mikush2>
<20040220105519.GA21106@namod.qld.bigpond.net.au>
<1077283157.7133.59.camel@mikush2>
<20040220205608.GD21106@namod.qld.bigpond.net.au>
<20040223212025.GA2817@namod.qld.bigpond.net.au>
<39F93B06-66CC-11D8-92ED-000A957183D4@cogeco.ca>
<000b01c3fc0d$3635dd60$7d01a8c0@dbn>
<1077786624.5732.71.camel@mikush2>
<2EED965E-685F-11D8-9C18-000A957183D4@cogeco.ca>
Message-ID:
* dmitryv@cogeco.ca
|
| Pure "select" query languages do not allow creation of new objects.
| I see it as a problem for TMQL.
It's clear that a pure "select" won't do this.
| I think also that tolog can get additional power with such
| constructs as "for", "if", "every, some ... satisfies".
It would certainly need grouping in the constructors, but how much
more we need I don't know. It already has "if", and you can do "every"
and "some" as well.
| Actually, TMTL does some of that for tolog, I just try to generalize
| the idea of constructors.
Yep.
--
Lars Marius Garshol, Ontopian
GSM: +47 98 21 55 50
From larsga@ontopia.net Thu Feb 26 22:33:44 2004
From: larsga@ontopia.net (Lars Marius Garshol)
Date: Thu, 26 Feb 2004 23:33:44 +0100
Subject: [tmql-wg] Result set requirements
In-Reply-To: <1077806830.5732.145.camel@mikush2>
References: <1077184228.5605.4.camel@mikush2>
<20040220105519.GA21106@namod.qld.bigpond.net.au>
<1077283157.7133.59.camel@mikush2>
<20040220205608.GD21106@namod.qld.bigpond.net.au>
<20040223212025.GA2817@namod.qld.bigpond.net.au>
<39F93B06-66CC-11D8-92ED-000A957183D4@cogeco.ca>
<000b01c3fc0d$3635dd60$7d01a8c0@dbn>
<1077786624.5732.71.camel@mikush2>
<2EED965E-685F-11D8-9C18-000A957183D4@cogeco.ca>
<1077806830.5732.145.camel@mikush2>
Message-ID:
* Rani Pinchuk
|
| But on the other hand I am not sure that TMQL should be able to
| return ANY object. And I think that the solution should be within
| what I call TMQL, and not outside of it. So I think that the selects
| themselves should be able to return something else then just strings
| - not any object, but maybe topic nodes and association nodes.
I think we'll need access to all objects, actually, and topics and
associations are the heavy objects, so it's difficult to see that
there's any real cost to adding in the other ones as well.
| My initial attitude to the whole problem was to return nothing but
| topic objects and association objects from the selects. And have in
| the API that uses the TMQL (this API for TMQL is like JDBC for SQL)
| methods that let us access the different elements in the topics.
I think you'll find that in quite a few cases you want to find
finer-grained objects as well. We've certainly needed this.
| This way, the language becomes much simpler, [...]
I'm not sure it does. In the case of tolog we'd have to add
restrictions rather than be able to simplify the language in any way.
| However, it is true that there are some performance penalties for
| this attitude - some topics might hold many big resourceData
| elements, and it might be annoying to pay that price for seeing a
| simple list of topic ids...
|
| But this performance problem might be solved in other ways - for
| example using the idea that came I think from Jan Algermissen of
| using light weight objects - so a topic object that contains only
| the base name element and id if we asked to see the base names.
True, but doesn't this undermine your argument that retrieving the
objects themselves is too costly? (This, by the way, is precisely what
we do today.)
--
Lars Marius Garshol, Ontopian
GSM: +47 98 21 55 50
From rho@bigpond.net.au Thu Feb 26 23:23:48 2004
From: rho@bigpond.net.au (Robert Barta)
Date: Fri, 27 Feb 2004 09:23:48 +1000
Subject: [tmql-wg] Result set requirements
In-Reply-To:
References: <1077184228.5605.4.camel@mikush2> <20040220105519.GA21106@namod.qld.bigpond.net.au> <1077283157.7133.59.camel@mikush2> <20040220205608.GD21106@namod.qld.bigpond.net.au> <20040223212025.GA2817@namod.qld.bigpond.net.au>
Message-ID: <20040226232348.GA13873@namod.qld.bigpond.net.au>
On Mon, Feb 23, 2004 at 09:26:21PM -0500, Dmitry wrote:
> I am not sure about Toma and AsTMa but in TMPath I have special
> 'shortcuts' which help to get and return values.
You mean 'string' values (in contrast to node values), I guess?
> If I need base name node I use:
>
> $topic/bn[] or just $topic/bn
>
> If I need value I use $topic/bn:: or $topic/bn::*
Aren't we talking here about 'stringification' of nodes? If so,
then this could be solved, as above, with a postfix. It could be
solved with a function:
string($topic/bn) # this returns a string
It could be solved contextually, so that in a node context I get nodes:
  $topic/roleOf[who]/association[born-in]/role[where]/topic[city]
         ^           ^
         |           |   ....
         all nodes
  my string $s = $topic/roleOf[who] # get a string here, because I wanted one
> Shortcuts are used when I know types of names, occurrences, roles
> and associations and I am interested in "values".
But you have to "know" it. The language could know that for you as
it knows in which context you are using something. This is like XPath
//chapter[title = "Hell Freezes Over"]
where the 'title' node is converted to string for the comparison to
happen.
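Context-driven stringification of this kind can be imitated in a host language by letting the node decide how it compares with a string. The NameNode class below is a hypothetical illustration, not part of any proposal in this thread.

```python
class NameNode:
    """A base name node that behaves like its string value when compared to one."""

    def __init__(self, value, scope=()):
        self.value = value
        self.scope = tuple(scope)

    def __str__(self):
        # Explicit stringification, like string($topic/bn).
        return self.value

    def __eq__(self, other):
        if isinstance(other, str):
            # String context: compare by value only.
            return self.value == other
        if isinstance(other, NameNode):
            # Node context: the scope is part of the identity as well.
            return (self.value, self.scope) == (other.value, other.scope)
        return NotImplemented

    def __hash__(self):
        return hash((self.value, self.scope))
```

A comparison such as title == "Hell Freezes Over" then works without the caller asking for a string, while the same object still carries its scope when it is used as a node.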
> I think that "full" result sets such as XML, TopicMaps/fragments and
> tables can be reasonably easy integrated with host environment.
> XML->DOM or SAX
> Topicmap -> TMAPI based set of classes
> tuple streams -> lists or streams or cursors
Yes, ...
> I am not so sure about partial results such as list of base names.
...do you now have problems with 'base name strings' or 'base name
nodes'?
\rho
From rho@bigpond.net.au Thu Feb 26 23:30:46 2004
From: rho@bigpond.net.au (Robert Barta)
Date: Fri, 27 Feb 2004 09:30:46 +1000
Subject: [tmql-wg] Result set requirements
In-Reply-To: <002601c3fc10$564821f0$7d01a8c0@dbn>
References: <1077184228.5605.4.camel@mikush2> <20040220105519.GA21106@namod.qld.bigpond.net.au> <1077283157.7133.59.camel@mikush2> <20040220205608.GD21106@namod.qld.bigpond.net.au> <20040223212025.GA2817@namod.qld.bigpond.net.au> <39F93B06-66CC-11D8-92ED-000A957183D4@cogeco.ca> <000b01c3fc0d$3635dd60$7d01a8c0@dbn> <002601c3fc10$564821f0$7d01a8c0@dbn>
Message-ID: <20040226233046.GB13873@namod.qld.bigpond.net.au>
On Wed, Feb 25, 2004 at 09:29:19PM -0500, Dmitry wrote:
> > | I personally prefer explicit XQuery-like constructors.
> I just copied sample with "good formatting":
>
> http://homepage.mac.com/dmitryv/TopicMaps/TMPath/TologBasedCostructorsSample.htm
Dmitry,
That does not look like XQuery at all :-))
topic($JournalPaper,$NewTopic){ ## we construct new topic based on existing one and
## $newTopic binds to it
retract publication-date($Self,_) ## delete some assertions from $newTopic
Cloning nodes and adding/retracting information looks very complex to
me. And, more generally, it favours those transformations where there
is a lot of similarity between incoming and outgoing information
(read: ontology).
Not sure whether this is good or bad.
--
I think I discussed this a while back with Lars on IRC, that
- Once we decide to generate something fancier than lists,
- we _HAVE TO_ commit ourselves to a notation for that.
For lists this is easy:
return (or for the SQL fans 'select')
# here the values go
A, B, C, ...
For XML it probably is...XML!
return/select
....
I do not think that DOM2 constructors will make us happy ;-)
For TM it is ....hmmmm?
Of course I used AsTMa= for AsTMa? All other approaches will have to
come up with some syntax as well.
\rho
From larsga@ontopia.net Thu Feb 26 23:50:34 2004
From: larsga@ontopia.net (Lars Marius Garshol)
Date: Fri, 27 Feb 2004 00:50:34 +0100
Subject: [tmql-wg] Result set requirements
In-Reply-To: <20040226233046.GB13873@namod.qld.bigpond.net.au>
References: <1077184228.5605.4.camel@mikush2>
<20040220105519.GA21106@namod.qld.bigpond.net.au>
<1077283157.7133.59.camel@mikush2>
<20040220205608.GD21106@namod.qld.bigpond.net.au>
<20040223212025.GA2817@namod.qld.bigpond.net.au>
<39F93B06-66CC-11D8-92ED-000A957183D4@cogeco.ca>
<000b01c3fc0d$3635dd60$7d01a8c0@dbn>
<002601c3fc10$564821f0$7d01a8c0@dbn>
<20040226233046.GB13873@namod.qld.bigpond.net.au>
Message-ID:
* Robert Barta
|
| I do not think that DOM2 constructors will make us happy ;-)
DOM is just evil, and DOM2 doubly so. Using the syntax is waaay
better.
| For TM it is ....hmmmm?
I think we should seriously consider the AsTMa* path of making a
compact interchange syntax as the basis for the QL and then use that
for the constructors as well (if we do decide to have constructors).
| Of course I used AsTMa= for AsTMa? All other approaches will have to
| come up with some syntax as well.
Yeah. It's best if we can make all these syntaxes consistent.
--
Lars Marius Garshol, Ontopian
GSM: +47 98 21 55 50
From dmitryv@cogeco.ca Fri Feb 27 01:43:49 2004
From: dmitryv@cogeco.ca (Dmitry)
Date: Thu, 26 Feb 2004 20:43:49 -0500
Subject: [tmql-wg] Result set requirements
In-Reply-To:
References: <1077184228.5605.4.camel@mikush2> <20040220105519.GA21106@namod.qld.bigpond.net.au> <1077283157.7133.59.camel@mikush2> <20040220205608.GD21106@namod.qld.bigpond.net.au> <20040223212025.GA2817@namod.qld.bigpond.net.au> <39F93B06-66CC-11D8-92ED-000A957183D4@cogeco.ca> <000b01c3fc0d$3635dd60$7d01a8c0@dbn>
Message-ID: <639E8BBE-68C6-11D8-9C18-000A957183D4@cogeco.ca>
On Feb 26, 2004, at 5:08 PM, Lars Marius Garshol wrote:
>
> * dmitryv@cogeco.ca
> |
> | declare default subjectIndicator
> | namespace="http://psi........com/defaultpsi/#"
> | [...]
>
> Hmmmm. Interesting. This is similar to what I thought of, though I was
> thinking of recycling the LTM syntax, something like this:
>
> evaluate
> now-associated-in-new-way-that-we-just-found-out($A : role1, $B :
> role2)
> for
> were-already-associated-in-way-one($A : role1, $C : role3),
> were-already--associated-in-way-two($C : role2, $B : role14)?
>
>
Actually, constructors can be "literal" and explicit. An LTM-like literal
constructor can be equivalent to the XML literal constructors in XQuery.
For example,
topicmap ## this is an explicit constructor
{
[johnSmith] ## literal constructors
[paper20040226]
topic ## explicit constructor
{
id{someOtherTopic}
}
is-author-of(johnSmith:author,paper20040226:work)
## but we also can include code here
{
for $X in
return
[$NewID={};{} ]
}
}
Dmitry
From dmitryv@cogeco.ca Fri Feb 27 02:40:46 2004
From: dmitryv@cogeco.ca (Dmitry)
Date: Thu, 26 Feb 2004 21:40:46 -0500
Subject: [tmql-wg] Result set requirements
In-Reply-To: <20040226233046.GB13873@namod.qld.bigpond.net.au>
References: <1077184228.5605.4.camel@mikush2> <20040220105519.GA21106@namod.qld.bigpond.net.au> <1077283157.7133.59.camel@mikush2> <20040220205608.GD21106@namod.qld.bigpond.net.au> <20040223212025.GA2817@namod.qld.bigpond.net.au> <39F93B06-66CC-11D8-92ED-000A957183D4@cogeco.ca> <000b01c3fc0d$3635dd60$7d01a8c0@dbn> <002601c3fc10$564821f0$7d01a8c0@dbn> <20040226233046.GB13873@namod.qld.bigpond.net.au>
Message-ID: <58A386E2-68CE-11D8-9C18-000A957183D4@cogeco.ca>
On Feb 26, 2004, at 6:30 PM, Robert Barta wrote:
> On Wed, Feb 25, 2004 at 09:29:19PM -0500, Dmitry wrote:
>>> | I personally prefer explicit XQuery-like constructors.
>
>> I just copied sample with "good formatting":
>>
>> http://homepage.mac.com/dmitryv/TopicMaps/TMPath/
>> TologBasedCostructorsSample.htm
>
> Dmitry,
>
> That does not look like XQuery at all :-))
>
> topic($JournalPaper,$NewTopic){ ## we construct new
> topic based on existing one and
> ## $newTopic binds to
> it
> retract publication-date($Self,_) ## delete some
> assertions from $newTopic
>
> Cloning nodes and adding/retracting information looks very complex to
> me. And, more generally, it favours those transformations where there
> is a lot of similarity between incoming and outgoing information
> (read: ontology).
>
> Not sure whether this is good or bad.
I decided to add "copying with small modifications" just because I think
it will be a typical pattern.
But it is possible to use copying and pure constructors without
modifications.
> I think I discussed this a while back with Lars on IRC, that
>
> - Once we decide to generate something fancier than lists,
>
> - we _HAVE TO_ commit ourselves to a notation for that.
>
> [...]
> For TM it is ....hmmmm?
>
> Of course I used AsTMa= for AsTMa? All other approaches will have to
> come up with some syntax as well.
I have fewer concerns about the specific notation here. It can be LTM- or
AsTMa=-based.
I would probably prefer Python-like syntax :-) , something like this:
topicMap:
johnSmith:
bn="John Smith"
who[is-author-of]:
id=paper2004-02-26
bn="Some paper name"
oc::publicationDate="2003-09-03"
....
But... I think the important idea is that TMQL should have a rich
"constructor" part.
Dmitry
From rho@bigpond.net.au Fri Feb 27 11:38:51 2004
From: rho@bigpond.net.au (Robert Barta)
Date: Fri, 27 Feb 2004 21:38:51 +1000
Subject: [tmql-wg] Result set requirements
In-Reply-To: <1077806830.5732.145.camel@mikush2>
References: <1077283157.7133.59.camel@mikush2> <20040220205608.GD21106@namod.qld.bigpond.net.au> <20040223212025.GA2817@namod.qld.bigpond.net.au> <39F93B06-66CC-11D8-92ED-000A957183D4@cogeco.ca> <000b01c3fc0d$3635dd60$7d01a8c0@dbn> <1077786624.5732.71.camel@mikush2> <2EED965E-685F-11D8-9C18-000A957183D4@cogeco.ca> <1077806830.5732.145.camel@mikush2>
Message-ID: <20040227113851.GE14304@namod.qld.bigpond.net.au>
On Thu, Feb 26, 2004 at 03:47:11PM +0100, Rani Pinchuk wrote:
> > Constructors have nothing to do with data presentation. They allow to
> > create new objects.
> > These objects can be based on TMDM, XML DOM or other data models. They
> > are data structures.
> It is not that I find it wrong to write queries within
> presentation/construction templates - I think that it SOMETIMES
> wrong (usually when the project is big, and you need separation of
> responsibilities between the people who work on different
> domains). But if this mixture is built into one language, it means
> that you cannot avoid from it in whatever circumstance.
Rani,
I do not think that we are talking anymore about presentation issues.
If we say "constructor", then we are talking about constructing data
structures.
Which structures we return, is now a design issue. The more complex
data we allow, the more complex the constructor part will be.
> But on the other hand I am not sure that TMQL should be able to return
> ANY object. And I think that the solution should be within what I call
> TMQL, and not outside of it. So I think that the selects themselves
> should be able to return something else then just strings - not any
> object, but maybe topic nodes and association nodes.
I see it here as an aut-caesar-aut-nihil (all or nothing) question. It
is difficult to sell to an engineer that - to get an occurrence - he
has to select a full topic and then (THIS IS NOW OUTSIDE TMQL) drill
into the topic to get the particular occurrence.
I would not like that. Either I have a query language which gives me
access to ALL components of a map, or not.
> My initial attitude to the whole problem was to return nothing but topic
> objects and association objects from the selects. And have in the API
> that uses the TMQL (this API for TMQL is like JDBC for SQL) methods that
> let us access the different elements in the topics. This way, the
> language becomes much simpler, and we can avoid from the above
> questions. However, it is true that there are some performance penalties
> for this attitude - some topics might hold many big resourceData
> elements, and it might be annoying to pay that price for seeing a simple
> list of topic ids...
Exactly my point: I want to get what I ask for.
> But this performance problem might be solved in other ways - for example
> using the idea that came I think from Jan Algermissen of using light
> weight objects - so a topic object that contains only the base name
> element and id if we asked to see the base names.
Argh :-))
I would rather prefer to have a reasonable way to 'stringify' parts of
a topic map and to choose (as a developer) whether I want the string
or the internal data structure.
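A minimal Python sketch of that choice, assuming a hypothetical in-memory
Topic class (none of these names come from AsTMa?, Toma or any existing
API): the same selection can hand back either plain strings or the
internal objects, and the developer picks.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    """Hypothetical stand-in for an internal topic data structure."""
    topic_id: str
    base_names: list = field(default_factory=list)
    occurrences: dict = field(default_factory=dict)


def select_base_names(topics, stringify=True):
    """Return base names as strings, or as (topic, name) pairs.

    With stringify=True the caller gets exactly what was asked for
    (the names); with stringify=False the caller keeps a handle on
    the internal structure as well.
    """
    results = []
    for topic in topics:
        for name in topic.base_names:
            results.append(name if stringify else (topic, name))
    return results
```

In this sketch the occurrences are never touched when only base names are
requested, which is the "I want to get what I ask for" point.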
\rho
From rho@bigpond.net.au Fri Feb 27 11:54:30 2004
From: rho@bigpond.net.au (Robert Barta)
Date: Fri, 27 Feb 2004 21:54:30 +1000
Subject: [tmql-wg] Result set requirements
In-Reply-To: <58A386E2-68CE-11D8-9C18-000A957183D4@cogeco.ca>
References: <1077283157.7133.59.camel@mikush2> <20040220205608.GD21106@namod.qld.bigpond.net.au> <20040223212025.GA2817@namod.qld.bigpond.net.au> <39F93B06-66CC-11D8-92ED-000A957183D4@cogeco.ca> <000b01c3fc0d$3635dd60$7d01a8c0@dbn> <002601c3fc10$564821f0$7d01a8c0@dbn> <20040226233046.GB13873@namod.qld.bigpond.net.au> <58A386E2-68CE-11D8-9C18-000A957183D4@cogeco.ca>
Message-ID: <20040227115430.GF14304@namod.qld.bigpond.net.au>
On Thu, Feb 26, 2004 at 09:40:46PM -0500, Dmitry wrote:
> >Cloning nodes and adding/retracting information looks very complex to
> >me. And, more generally, it favours those transformations where there
> >is a lot of similarity between incoming and outgoing information
> >(read: ontology).
> >
> >Not sure whether this is good or bad.
>
> I decided to add "copying with small modifications" just because I think
> it will be a typical pattern.
It probably is, yes.
But someone could argue that you are introducing through the back door
a TM update language: You have a TM information item and you would like
to modify some aspects of it.
I think we need that at some stage, but this is certainly nothing I would
do ad-hoc.
> But it is possible to use copying and pure constructors without
> modifications.
I would VERY MUCH prefer this in TMQL at this stage. Mostly because it
makes clear that the language remains explicitly "functional"
(copying and cloning and modifying has this "side effect" touch).
> >For TM it is ....hmmmm?
> >
> >Of course I used AsTMa= for AsTMa? All other approaches will have to
> >come up with some syntax as well.
>
> I have less concerns about specific notation here. It can be LTM, AsTMa
> = - based.
>
> I would probably prefer Python-like syntax :-) , something like this:
>
> topicMap:
>
> johnSmith:
> bn="John Smith"
> who[is-author-of]:
> id=paper2004-02-26
> bn="Some paper name"
> oc::publicationDate="2003-09-03"
I think we should drive something like this forward.
\rho
From Rani.Pinchuk@spaceapplications.com Fri Feb 27 11:58:37 2004
From: Rani.Pinchuk@spaceapplications.com (Rani Pinchuk)
Date: Fri, 27 Feb 2004 12:58:37 +0100
Subject: [tmql-wg] Result set requirements
In-Reply-To: <20040227113851.GE14304@namod.qld.bigpond.net.au>
References: <1077283157.7133.59.camel@mikush2>
<20040220205608.GD21106@namod.qld.bigpond.net.au>
<20040223212025.GA2817@namod.qld.bigpond.net.au>
<39F93B06-66CC-11D8-92ED-000A957183D4@cogeco.ca>
<000b01c3fc0d$3635dd60$7d01a8c0@dbn>
<1077786624.5732.71.camel@mikush2>
<2EED965E-685F-11D8-9C18-000A957183D4@cogeco.ca>
<1077806830.5732.145.camel@mikush2>
<20040227113851.GE14304@namod.qld.bigpond.net.au>
Message-ID: <1077883117.5561.35.camel@mikush2>
>
> I do not think that we are talking anymore about presentation issues.
> If we say "constructor", then we are talking about constructing data
> structures.
>
Sure, but I make the separation between the part of the query that
determines WHAT the data is, and the other part which determines HOW the
data is delivered (presented or constructed into whatever structure etc).
> I see it here as a aut-caesar-aut-nihil (all or nothing) question. It
> is difficult to sell to an engineer that - to get an occurrence - he
> has to select a full topic and then (THIS IS NOW OUTSIDE TMQL) drill
> into the topic to get the particular occurrence.
>
> I would not like that. Either I have a query language which gives me
> access to ALL components of a map, or not.
You are totally right. I find it just as important to return strings,
for exactly the reason you gave above.
>
> I would rather prefer to have a reasonable way to 'stringify' parts of
> a topic map and to choose (as a developer) whether I want the string
> or the internal data structure.
I understand 'stringify' as returning simple strings (like in SQL) - am
I right in my interpretation? If so:
I totally support that. I find that TMQL should be able to return
strings so that it is possible, let's say, to return a simple list of
topic base names. But it should also be able to return topic and
association objects (I would prefer them as light weight nodes).
What I am not sure about is having mechanisms in the language that
define HOW the data is delivered. I think it is better to have a few
built-in formats - strings, Topic Maps slices, and topic and association
objects. If it depended on me (:-)) I would not even include generic
XML (like the example with generating XML from DTD) within the TMQL,
because I can see situations where the same queries are used to generate
different XMLs.
Rani
>
> \rho
--
Rani Pinchuk
Software Engineer
Space Applications Services
Leuvensesteenweg, 325
B-1932 Zaventem
Belgium
Tel.: + 32 2 721 54 84
Fax.: + 32 2 721 54 44
http://www.spaceapplications.com
From rho@bigpond.net.au Sat Feb 28 08:38:31 2004
From: rho@bigpond.net.au (Robert Barta)
Date: Sat, 28 Feb 2004 18:38:31 +1000
Subject: [tmql-wg] aliases
In-Reply-To: <1077788793.5747.83.camel@mikush2>
References: <1077363982.5529.225.camel@mikush2> <1077788793.5747.83.camel@mikush2>
Message-ID: <20040228083831.GB2001@namod.qld.bigpond.net.au>
On Thu, Feb 26, 2004 at 10:46:33AM +0100, Rani Pinchuk wrote:
> 2. It seems to me that in many applications it is necessary to find a
> topic only by its name (for example question/answering systems), and if
> so, the alias command can become quite popular.
No doubt about this. In fact "addressing a topic with its name" will
be one of the most common query patterns.
> On the other hand, macros/functions/inference rules usually suffer in
> performance compared to a built in feature (especially here, if you take
> into account the recursive nature of the variant).
OK, as Lars already said, the variant (I desperately hope that
this 'unfeature' will go away at some stage and will be replaced by
typed basenames :-) is flattened out.
But:
First, what an 'alias' could be might be clear in your particular
application, maybe it is something different in mine. So, defining a
function or an inference rule is indispensable.
Secondly, yes, this might be combined with a performance penalty
relative to a hardwired solution (everything has a penalty relative
to a hardwired solution), but that can be kept small.
In my current AsTMa? implementation, the query engine keeps track
of the atomic access patterns which queries perform against a map. I have
not spent much time benchmarking, but from these statistics the
low-level database engine can infer that it would pay off to create
an index for particular access patterns.
And then, there would not be much difference between the hardwired and
the flexible solution. Again, this is more conjecture than proof.
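To illustrate the idea (not the actual AsTMa? implementation, whose
internals are not described here), a rough Python sketch: a store counts
how often each access pattern is used and builds an index once a pattern
has been seen often enough. The class name, the dict-based topics and the
threshold are all assumptions made for the example.

```python
from collections import Counter, defaultdict


def build_index(topics, key):
    """Map each value of `key` to the topics carrying that value."""
    index = defaultdict(list)
    for topic in topics:
        if key in topic:
            index[topic[key]].append(topic)
    return index


class TrackingStore:
    """Hypothetical store that records access patterns per property."""

    def __init__(self, topics, threshold=3):
        self.topics = topics
        self.threshold = threshold        # uses before an index pays off
        self.pattern_counts = Counter()   # statistics per access pattern
        self.indexes = {}

    def find(self, key, value):
        self.pattern_counts[key] += 1
        # Build the index lazily once the pattern has proven popular.
        if key not in self.indexes and self.pattern_counts[key] >= self.threshold:
            self.indexes[key] = build_index(self.topics, key)
        if key in self.indexes:
            return list(self.indexes[key].get(value, []))
        # Fallback: linear scan, as a generic function or rule would do.
        return [t for t in self.topics if key in t and t[key] == value]
```

Both paths return the same topics; only the cost differs, which is why the
flexible solution need not stay much slower than a hardwired one.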
\rho
PS: Can we all use the same mail quoting discipline, please? It is
very difficult to follow threads when text is just fired off
somewhere at the start of an email.
From rho@bigpond.net.au Sat Feb 28 09:07:41 2004
From: rho@bigpond.net.au (Robert Barta)
Date: Sat, 28 Feb 2004 19:07:41 +1000
Subject: [tmql-wg] Result set requirements
In-Reply-To: <1077883117.5561.35.camel@mikush2>
References: <20040223212025.GA2817@namod.qld.bigpond.net.au> <39F93B06-66CC-11D8-92ED-000A957183D4@cogeco.ca> <000b01c3fc0d$3635dd60$7d01a8c0@dbn> <1077786624.5732.71.camel@mikush2> <2EED965E-685F-11D8-9C18-000A957183D4@cogeco.ca> <1077806830.5732.145.camel@mikush2> <20040227113851.GE14304@namod.qld.bigpond.net.au> <1077883117.5561.35.camel@mikush2>
Message-ID: <20040228090741.GC2001@namod.qld.bigpond.net.au>
On Fri, Feb 27, 2004 at 12:58:37PM +0100, Rani Pinchuk wrote:
> > I do not think that we are talking anymore about presentation issues.
> > If we say "constructor", then we are talking about constructing data
> > structures.
> >
> Sure, but I make the separation between the part of the query that
> determined WHAT is the data, and the other part which determined HOW the
> data is delivered (presented or constructed to whatever structure etc).
OK, but isn't it ALWAYS the case that - if you would like to convert
from one kind of storage into another kind of storage - there MUST be
something which defines this connection?
In XSLT you have your incoming vocabulary in the 'match' and 'select'
attributes inside XPath expressions and the template body holds the
content according to the outgoing vocabulary.
These might be two different parts of a language, but in an XSLT sheet
these two things MUST be WITHIN one document. Otherwise the whole
transformation does not make sense.
I see this exactly so with TMQL: We have on the incoming side TM
content in a backend store, we detect interesting information with
patterns or predicates, drill further to detail what we want (some TM
path language) and then embed this into an outgoing structure.
> I totally support that. I find that TMQL should be able to return
> strings so it is possible let's say to return simple list of topic base
> names.
Yup.
> What I am not sure about is to have mechanisms in the language that
> define HOW the data is delivered. I think it is better to have few built
> in formats - strings, Topic Maps slices, and topic and association
> objects.
What would be the difference between a 'TM slice' and a set of
topics/association objects?
> If it would depend on me (:-)) I would even not include generic XML
> (like the example with generating XML from DTD) within the TMQL,
> because I can see situations when the same queries are used to
> generate different XMLs.
Hmm, what are 'different' XMLs?
In AsTMa?, for example, I have no problem to create XML output according
to different schemas. I can create
{
forall $t [ $a (album)
bn: $bn ] in $m
return
{$bn}
}
I can also create
{
forall ...
What I cannot see is that you try to separate this into different
documents.
\rho
From Rani.Pinchuk@spaceapplications.com Sat Feb 28 10:26:22 2004
From: Rani.Pinchuk@spaceapplications.com (Rani Pinchuk)
Date: Sat, 28 Feb 2004 11:26:22 +0100
Subject: [tmql-wg] Result set requirements
In-Reply-To: <20040228090741.GC2001@namod.qld.bigpond.net.au>
References:
<20040223212025.GA2817@namod.qld.bigpond.net.au>
<39F93B06-66CC-11D8-92ED-000A957183D4@cogeco.ca>
<000b01c3fc0d$3635dd60$7d01a8c0@dbn>
<1077786624.5732.71.camel@mikush2>
<2EED965E-685F-11D8-9C18-000A957183D4@cogeco.ca>
<1077806830.5732.145.camel@mikush2>
<20040227113851.GE14304@namod.qld.bigpond.net.au>
<1077883117.5561.35.camel@mikush2>
<20040228090741.GC2001@namod.qld.bigpond.net.au>
Message-ID: <1077963981.5562.147.camel@mikush2>
> OK, but isn't it ALWAYS the case that - if you would like to convert
> from one kind of storage into another kind of storage - there MUST be
> something which defines this connection?
Sure, I just suggest not building that "something" into the TMQL.
Because when it is not in the TMQL, it gives the user of TMQL the
opportunity to separate the queries (what data is retrieved) from the
definition of how that data is delivered.
You could say that we can add it to TMQL in a way that keeps the WHAT and
the HOW separated, but usually in the different environments you will
find other techniques for the HOW, so I don't think it is good to
force another technique on the users.
> > What I am not sure about is to have mechanisms in the language that
> > define HOW the data is delivered. I think it is better to have few built
> > in formats - strings, Topic Maps slices, and topic and association
> > objects.
>
> What would be the difference between a 'TM slice' and a set of
> topics/association objects?
"TM slice" is what I call a sub topic map. A topic map that contains all
the topics and/or associations from the query AND all the other topics
and associations to complete it to a topic map that can be used as a
stand alone topic map.
A set of topic/association objects is exactly those objects with no
additions. Such a set will not always represent a stand alone topic map
(for example, if topics are given without the topics that are their
instanceOf parents, the resulting map is incomplete).
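The difference can be sketched in a few lines of Python. This is only an
illustration under strong assumptions: topics are plain ids, and the only
dependency followed is a hypothetical `type_of` mapping from a topic to
its instanceOf parent. A real slice would also have to pull in role types,
association types, scopes and so on.

```python
def complete_slice(selected, type_of):
    """Extend a set of selected topic ids to a self-contained slice.

    selected: set of topic ids returned by the query
    type_of:  dict mapping a topic id to its instanceOf parent, if any
    """
    result = set(selected)
    pending = list(selected)
    while pending:
        topic = pending.pop()
        parent = type_of.get(topic)
        if parent is not None and parent not in result:
            result.add(parent)       # the slice needs the type topic too
            pending.append(parent)   # ... and that topic's own type
    return result
```

The plain set of objects is just `selected`; the slice is whatever
`complete_slice` returns, which can be considerably larger.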
> Hmm, what are 'different' XMLs?
My stupid way of saying XMLs with different DTDs (or schemas). This is
in line with what I write at the end of the section:
http://www.spaceapplications.com/toma/Toma.html#xml.
- It is a suggestion of how to generate XML with different DTDs that I
would NOT like to include in TMQL because of the separation argument
above.
> What I cannot see is that you try to separate this into different
> documents.
That separation served me so well in the past that I co-authored the
following:
http://jerry.cs.uiuc.edu/~plop/plop2k/proceedings/Sharon/Sharon.pdf
http://jerry.cs.uiuc.edu/~plop/plop2k/proceedings/Pinchuk/Pinchuk.pdf
Both explain why to separate and how.
Rani
From rho@bigpond.net.au Sat Feb 28 11:48:16 2004
From: rho@bigpond.net.au (Robert Barta)
Date: Sat, 28 Feb 2004 21:48:16 +1000
Subject: [tmql-wg] Result set requirements
In-Reply-To: <1077963981.5562.147.camel@mikush2>
References: <39F93B06-66CC-11D8-92ED-000A957183D4@cogeco.ca> <000b01c3fc0d$3635dd60$7d01a8c0@dbn> <1077786624.5732.71.camel@mikush2> <2EED965E-685F-11D8-9C18-000A957183D4@cogeco.ca> <1077806830.5732.145.camel@mikush2> <20040227113851.GE14304@namod.qld.bigpond.net.au> <1077883117.5561.35.camel@mikush2> <20040228090741.GC2001@namod.qld.bigpond.net.au> <1077963981.5562.147.camel@mikush2>
Message-ID: <20040228114816.GD2001@namod.qld.bigpond.net.au>
On Sat, Feb 28, 2004 at 11:26:22AM +0100, Rani Pinchuk wrote:
>
> > OK, but isn't it ALWAYS the case that - if you would like to convert
> > from one kind of storage into another kind of storage - there MUST be
> > something which defines this connection?
> Sure, I just suggest not to build that "something" into the TMQL.
And my claim is that you cannot do that if you want TMQL to have other
results than lists of topic map items. And even then you have
specified implicitly a 'list of topic map items':
select $basename, $occurrence, ......
in contrast to, say,
select $occurrence, $basename
> Because when it is not in the TMQL, it gives the user of TMQL the
> opportunity to separate the queries (what data is retrieved) from
> the definition of how that data is delivered.
No. Because if the user does not specify HOW the data (which has
been detected within the query) should be embedded in the outgoing
data structure (list, XML, whatever), then how is that data organized?
As list?
And how are the list items then mapped into the outgoing data structure?
With another template (or skin in your paper)?
If so, we would squeeze a multidimensional data structure through a
list, just to embed into a text template which - in case it is XML -
would have to be post-parsed to make accessible to the application.
This does not sound too good to me :-)
> You could say that we can add it to TMQL in a way that keep the WHAT
> and the HOW separated, but usually in the different environments you
> will find other techniques for the HOW, so I don't think it is good
> to enforce another technique on the users.
My reasoning is, that most people are familiar with list and XML
data-structures and that there are enough programmatic means (even
sometimes within the language) to post-process these.
> > What would be the difference between a 'TM slice' and a set of
> > topics/association objects?
> "TM slice" is what I call a sub topic map. A topic map that contains all
> the topics and/or associations from the query AND all the other topics
> and associations to complete it to a topic map that can be used as a
> stand alone topic map.
So, I guess, if a topic has a type, you include the type topic. If an
association has a particular role you add the topic node for that? Will
this not result in rather big submaps? And - in some cases which I could
construct - every submap would be identical with the original map :-)
> > What I cannot see is that you try to separate this into different
> > documents.
> That separation served me so well in the past that I co-authored the
> following:
> http://jerry.cs.uiuc.edu/~plop/plop2k/proceedings/Sharon/Sharon.pdf
> Both explain why to separate and how.
I do not think so. You describe there templates. They, btw, suffer
from the problem that they need 'if' and 'loop' statements. So, if you
already have an if statement and a loop statement in your development
language (in our case TMQL), then you would need yet another language
for the template. By splitting this into two languages you get no
advantage but incur costs of two languages.
I think I understand your concern, namely, if someone has a particular
query like
forall [ some pattern P ]
return
some constructor C
that you want to factor out the way the data is used in the constructor
and avoid the repetition of the pattern P
forall [ some pattern P ]
return
another constructor D
That would only be possible if you store the captured data which was
detected in the pattern in an intermediate store which you can then
reuse with different constructors C and D.
In any serious language that intermediate store is SO COMPLEX that
reusing it becomes definitely more expensive than simply repeating
the pattern.
And, worse, you would undermine a lot of optimization opportunities.
Think about this example:
forall [ $p (person)
bn: $bn ]
return
{$bn}
If the language processor could not see the constructor and would have
to keep all relevant data, it could not throw away the binding for
$p. $p is useless, because it is never used for output.
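A small Python sketch of that optimization, under the assumption that
bindings are rows of variable/value pairs and that the constructor is a
text template in which variables appear as $name. The element name in the
usage example is made up; the original markup is not shown above.

```python
import re


def used_variables(constructor):
    """Collect the variable names ($name) the constructor refers to."""
    return set(re.findall(r"\$(\w+)", constructor))


def prune_bindings(rows, constructor):
    """Drop every binding the constructor never uses for output."""
    keep = used_variables(constructor)
    return [{var: val for var, val in row.items() if var in keep}
            for row in rows]


# Usage: $p is bound by the pattern but never appears in the output.
rows = [{"p": "topic-17", "bn": "John Smith"}]
pruned = prune_bindings(rows, "<name>{$bn}</name>")
```

A processor that sees pattern and constructor together can do this pruning
before any data is fetched; one that only sees the pattern has to keep $p.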
\rho
From Rani.Pinchuk@spaceapplications.com Sat Feb 28 18:36:40 2004
From: Rani.Pinchuk@spaceapplications.com (Rani Pinchuk)
Date: Sat, 28 Feb 2004 19:36:40 +0100
Subject: [tmql-wg] Result set requirements
In-Reply-To: <20040228114816.GD2001@namod.qld.bigpond.net.au>
References: <39F93B06-66CC-11D8-92ED-000A957183D4@cogeco.ca>
<000b01c3fc0d$3635dd60$7d01a8c0@dbn>
<1077786624.5732.71.camel@mikush2>
<2EED965E-685F-11D8-9C18-000A957183D4@cogeco.ca>
<1077806830.5732.145.camel@mikush2>
<20040227113851.GE14304@namod.qld.bigpond.net.au>
<1077883117.5561.35.camel@mikush2>
<20040228090741.GC2001@namod.qld.bigpond.net.au>
<1077963981.5562.147.camel@mikush2>
<20040228114816.GD2001@namod.qld.bigpond.net.au>
Message-ID: <1077993355.5563.524.camel@mikush2>
> And my claim is that you cannot do that if you want TMQL to have other
> results than lists of topic map items. And even then you have
> specified implicitly a 'list of topic map items':
>
> select $basename, $occurrence, ......
>
> in contrast to, say,
>
> select $occurrence, $basename
It is true that any query language should be able to deliver
the results somehow. The question is how complex you make this delivery.
I see here a kind of scale. You can start with something extremely simple
like a single string. You can continue with something like what SQL gives.
You can continue further by, let's say, adding the ability to deliver
arbitrary XML. The next step might be the ability to deliver any format or
structure. Actually you can continue even further, letting the user write
algorithms within the query language.
Something like (obviously, very approximate):
---+--------+---------+----------+-----------+---------
 single    SQL       XML        Any      algorithms
 string    like                Format
The further we go to the right on this scale, the more complex and hard
to learn the language becomes. Besides, the separation that I speak about
becomes impossible because the mix is built into the language.
I am not totally sure where we should end up in that scale. I guess I
would prefer somewhere between SQL-like and XML.
You gave good reasons to include (any) XML. However, I think the query
language could do without it. I have two reasons for that:
1. It seems that in AsTMa (as well as in TMTL) the creation of XML
(or other formats) is based on processing strings. So actually the
separation is easy to achieve:
{
forall $t [ $a (album)
bn: $bn ] in $m
return
{$bn}
}
Could also be written like:
template:
$b
where "loop_over_query_results" is a callback function that gets the
query from a phrasebook of queries, runs it and places the results in
$a and $b (possibly with some extra processing of the results
before sending them to the template).
2. I am not sure what the correct way to generate XML like the
above is. Maybe if we decide in the end that TMQL can return topic
objects and association objects as excerpts of XTM (so XML elements),
we should use XSLT to generate other XMLs. But maybe this is too
complex, and we should do something like you do it in AsTMa, or
maybe we should do it using DTD and Xpath (actually your ideas
as well) described in
http://www.spaceapplications.com/toma/Toma.html#xml
So unless someone really knows what the correct way to do it is,
I think it is wrong to include it in TMQL.
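The separation described in point 1 could look roughly like the following
Python sketch. Everything in it is an assumption made for illustration:
the phrasebook entry is a placeholder rather than real query text, the
engine is a stand-in function returning made-up rows, and Python's
string.Template plays the role of the template.

```python
from string import Template

# Phrasebook: queries live in one place, keyed by name.
# The query text is a placeholder, not real Toma or AsTMa? syntax.
PHRASEBOOK = {"album-names": "QUERY TEXT FOR ALBUM NAMES"}


def loop_over_query_results(query_name, row_template, engine):
    """Fetch a query from the phrasebook, run it, fill the template per row."""
    query = PHRASEBOOK[query_name]
    rows = engine(query)  # each row maps variable names to values
    return "".join(Template(row_template).substitute(row) for row in rows)


def fake_engine(query):
    """Stand-in for a query engine; returns fixed, made-up rows."""
    return [{"a": "album-1", "b": "First Album"},
            {"a": "album-2", "b": "Second Album"}]


output = loop_over_query_results("album-names", "$b\n", fake_engine)
```

The template only names the variables it wants ($b here); the query itself
stays in the phrasebook and can be reused with a different template.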
> My reasoning is, that most people are familiar with list and XML
> data-structures and that there are enough programmatic means (even
> sometimes within the language) to post-process these.
True, and this is why I am not totally sure that we should not include
XML output in TMQL. Maybe it is a real must in a modern language. Maybe
we are missing a standard of how to do this kind of things "correctly".
Anyway this is why I did include that DTD+Xpath->XML in the
missing/future section of Toma description.
> So, I guess, if a topic has a type, you include the type topic. If an
> association has a particular role you add the topic node for that? Will
> this not result in rather big submaps. And - in some cases which I could
> construct - every submap would be identical with the original map :-)
This is correct. I think that TMQL should be able to generate back a
topicmap that can be read into a topic map engine (so nothing is missing
to make it a complete topic map). BTW, It is not my idea - I am not sure
who wrote it, but it was written by someone on tmql-wg mailing list, and
I found it very correct.
> I think I understand your concern, namely, if someone has a particular
> query like
>
> forall [ some pattern P ]
> return
> some constructor C
>
> that you want to factor out the way the data is used in the constructor
> and avoid the repetition of the pattern P
>
> forall [ some pattern P ]
> return
> another constructor D
>
> That would only be possible if you store the captured data which was
> detected in the pattern in an intermediate store which you can then
> reuse with different constructors C and D.
>
> In any serious language that intermediate store is SO COMPLEX that
> reusing it becomes definitely more expensive than simply repeating
> the pattern.
I am not sure that we understood each other. My only gain in the
separation is that the final application is more readable and
maintainable. The price I pay is performance and complexity.
It is very similar to the usual way one deals with error messages:
you can hard code them into your application, which makes it
faster to code and to run. Or you can place them in a separate file,
which makes your code more readable and maintainable at the price of
some overhead (performance and complexity) in getting the error
message from the other file.
So I don't try to avoid running the same query. I try to avoid
hard-coding the same query. And the same goes for the code that generates
the output - I try to avoid mixing those two, and hard-coding them in
more than one place.
Rani
--
Rani Pinchuk