[sc34wg3] Introduction to CTM -- and its implications for assertions

Mon Aug 14 11:44:45 EDT 2006

Before it gets too far from my mind, I want to comment on the discussion we
had on Friday of SVO and SOV structures for associations. The further
discussions of delimiters, etc., may have made this comment unnecessary, but
here it is anyway:

1. I think it's generally unwise to model computer languages on superficial
similarities to natural languages. There are enough varieties of sentence
structure (http://en.wikipedia.org/wiki/Word_order) with significant representatives
that picking one or another on the basis of surface structure is likely to
offend someone. (So English, Chinese and French are SVO and German, Korean,
and Japanese are SOV; but then Classical Arabic, Biblical Hebrew, and Welsh
are VSO; and Klingon is OVS.) 

While it's important to make entry of data convenient (a major justification
for CTM), I think it's also important to have a consistent style across a
language and to keep it within the general expectations for computer (as
opposed to natural) languages.

Furthermore, I don't think we have the expertise in SC34 to make informed
decisions about modelling the surface structure of CTM on natural languages. I
have a Ph.D. in historical linguistics, and I'm certainly uncomfortable doing
so. Thinking we can get to Chomsky's deep structure is getting in too deep.

2. Discussions of SVO and SOV miss a significant point: this is TMs, not RDF.
Although we make many binary assertions in three-part statements (e.g.,
Steve's "puccini lucca born-in" or "puccini born-in lucca") and may even
privilege some (e.g., ISA), we always say that one of the distinguishing
characteristics of TMs is their support of n-ary associations. 

We also say that TM associations are reversible, so that "puccini born-in
lucca" and "lucca birthplace-of puccini" are the same statement, just seen
from opposite ends. If one called the first SVO, would that make the second
OVS? Actually, I think not. Because of the reversibility, I am inclined to
think both are SVS, which is not a typical pattern of natural languages. What
then of n-ary assertions? We know that n-ary associations can be decomposed
into clumps of binary ones or that clumps of binary associations can be used
to generate n-ary ones. We do have a tendency to privilege one component of
n-ary associations; thus Steve says "scarpia killed-by tosca stabbing" and
restates it as "killed-by( victim scarpia; killer tosca; how stabbing)". In
that association "killed-by" is a passive-voice verb component. Furthermore,
"stabbing" is pretty close to instrumental case, something that has been lost
as a grammatical case in most Indo-European languages (but was still present in
early Anglo-Saxon). If I were to put letters on the restatement, I'd call it
something like VOSI. If we don't want to go that far, I'd say it's VSSS. So
we've moved out of simplistic structures for natural language.
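
To make the point about reversibility and n-ary associations concrete, here
is a minimal sketch in Python. It is not CTM and not any real Topic Maps API;
the Association class, the assoc() helper, and the role names "person" and
"place" are assumptions made up for illustration. The sketch treats an
association as a type plus an unordered set of (role, player) pairs, so there
is no first or last position for a subject or an object to occupy.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple


@dataclass(frozen=True)
class Association:
    """Hypothetical model: an association type plus unordered role/player pairs."""
    type: str
    roles: FrozenSet[Tuple[str, str]]


def assoc(type_: str, **roles: str) -> Association:
    """Build an association; the order of the keyword arguments is discarded."""
    return Association(type_, frozenset(roles.items()))


# Binary case: writing the players in either order gives the same association.
# The role names "person" and "place" are assumed, not taken from the thread.
a = assoc("born-in", person="puccini", place="lucca")
b = assoc("born-in", place="lucca", person="puccini")
assert a == b

# n-ary case, using the roles from Steve's restatement.
killing = assoc("killed-by", victim="scarpia", killer="tosca", how="stabbing")
assert ("how", "stabbing") in killing.roles
```

In a model like this, "born-in" and "birthplace-of" would be two labels for
reading the same association from different ends, and the three players of
"killed-by" are distinguished only by their roles, not by word order.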

To summarize: Let's look for an efficient expression as a computer language
and not mess with similarities to the syntaxes of natural languages.

Jim Mason

(FWIW: one branch of my ancestry spoke Cherokee, which is a polysynthetic
language in which a sentence boils down to a single highly inflected word of
many morphemes. Forget SVO, OSV, etc., unless you can map down to the deep
structure.)
