[sc34wg3] Whitespace agnosticism in CTM and descendants
Robert Barta
rho at bond.edu.au
Thu Apr 5 01:29:31 EDT 2007
On Tue, Apr 03, 2007 at 04:18:33AM +1200, Xuân Baldauf wrote:
> I'd like to raise the issue of whitespace agnosticism.........
> An LTM example:
> a(b:c : d) has a different meaning to
> a(b : c:d)
> As it is easy to make a "whitespace spelling" error ......., I'd like to
> recommend that whitespace is not meaningful in CTM, except in string
> literals.
I do not see this as a major problem, as CTM should be human-centric
and overloading of symbols _can_ help with this, if used wisely.
> ...................................................(and also as it is
> no longer straightforward to generate parsers for languages with mixed
> whitespace meaningfulness using common parser generators)
It may not be straightforward, but it is also not unduly difficult,
going by our experience with AsTMa.
> 1. De-overload the colon ":".
> 1. Use another symbol, like "=", "->", ":=", ",", whatever, for
> separation of "type" expressions and "player" expressions in
> "role" expressions.
Two characters are too long. And the most obvious character still is ':'.
> 2. Use another symbol, like '#', '+', '&', '%', whatever, for
> separation of "prefix" expressions and "local" expressions
> in "qname" expressions.
This breaks conceptual compatibility with everything the XML/RDF world
does. This would reduce adoption, IMHO.
> 5. Overload the colon and make the life of CTM authors and CTM parser
> authors unnecessarily harder, more troublesome and error-prone,
> forever.
I think that we do not have the complete picture yet. In the expression
a(x:y:z)
there is actually NO ambiguity:
- if x is NOT a prefix, then it MUST be a topic id, so it reads
x : y:z
- if x is a prefix, then it MUST read
x:y : z
And prefixes MUST be declared before they are used.
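
A minimal sketch of this rule in Python, for illustration only. The function
name split_role, the set of declared prefixes, and the (type, player) result
are hypothetical and not taken from any CTM draft. The check that y is a
declared prefix in the second case is likewise an added assumption.

```python
def split_role(expr, declared_prefixes):
    """Split a role expression 'x:y:z' into (type, player).

    Whitespace around the names is dropped first, so the result depends
    only on which prefixes are declared, not on spacing.
    """
    parts = [p.strip() for p in expr.split(":")]
    if len(parts) != 3:
        raise ValueError("expected exactly three colon-separated names")
    x, y, z = parts
    if x in declared_prefixes:
        # x is a prefix, so x:y is a qname (the type) and z is the player
        return (f"{x}:{y}", z)
    # x is not a prefix, so it is a topic id (the type) and y:z is a qname.
    # Assumption: y must then be a declared prefix, otherwise it is an error.
    if y not in declared_prefixes:
        raise ValueError(f"undeclared prefix: {y}")
    return (x, f"{y}:{z}")
```

With x declared as a prefix, split_role("x:y:z", {"x"}) groups the names as
x:y : z. With only y declared, the same text groups as x : y:z, regardless of
where the spaces are.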
Perhaps the editors could analyze similar situations, but if what I
suspect here is true, then the 'whitespace rule' could go away
completely.
Implementation-wise, it is probably a matter of how much intelligence
one can put into the lexer to handle the prefixing. I agree that
some older parser generators can make it very hard to handle this
flexibly.
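
As a rough sketch of what "intelligence in the lexer" could look like, here is
a hand-written Python tokenizer that keeps a table of declared prefixes and
merges prefix:local into one QNAME token. The %prefix declaration syntax, the
token names, and the function lex are all made up for this illustration; they
are not CTM syntax.

```python
import re

# Raw tokens: a (hypothetical) prefix declaration keyword, names, punctuation.
# findall skips whitespace and, in this sketch, any unrecognized character.
RAW = re.compile(r"%prefix|[A-Za-z_][A-Za-z0-9_-]*|[():,]")
PUNCT = {"(", ")", ":", ","}

def lex(text):
    raw = RAW.findall(text)
    prefixes, out, i = set(), [], 0
    while i < len(raw):
        tok = raw[i]
        if tok == "%prefix":
            if i + 1 >= len(raw) or raw[i + 1] in PUNCT:
                raise ValueError("%prefix must be followed by a name")
            prefixes.add(raw[i + 1])  # declared before use
            i += 2
        elif (tok in prefixes and i + 2 < len(raw)
              and raw[i + 1] == ":" and raw[i + 2] not in PUNCT):
            # A declared prefix followed by ':' and a name is one qname.
            out.append(("QNAME", f"{tok}:{raw[i + 2]}"))
            i += 3
        elif tok in PUNCT:
            out.append((tok, tok))
            i += 1
        else:
            out.append(("NAME", tok))
            i += 1
    return out
```

Because the grouping is decided from the prefix table alone, the token stream
for "a(x:y:z)" and "a( x : y : z )" is the same, and the grammar behind the
lexer only ever sees ':' as the role separator.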
> P.S.: To avoid these types of problems, one should really write a clean
> room reference implementation of a CTM parser (maybe even along with a
> test suite) before freezing the specs.
Seconded!
"We reject kings, presidents and voting. We believe in rough consensus
and running code". David Clark
> P.P.P.S.: One could use the comma ',' as a replacement for the colon ':'
> in "role" expressions. The comma separating roles (in "roles"
> expressions) may then be replaced by the semicolon, letting associations
> look like "type(pr:type0,pr:value0; pr:type1,pr:value1;
> pr:type2,pr:value2)".
Brrr :-))
\rho