[antlr-interest] beginner question - 'unexpected ast node' when generating from combined grammar
Nick C
nick.curry+antlr at gmail.com
Fri Feb 11 11:42:22 PST 2011
Hi,
I'm trying learn antlr by writing a parser for a simple HTML
templating language (in combination with reading the Definitive Antlr
Reference book - I'm only just past the calculator example so far
though.)
The parser should handle something like this:
{namespace My.Namespace}
{template MyTemplate}
hello
{if $name}
{print $name}
{else}
world
{/if}
<br/>
{/template}
My current attempt is to first build a simple version of the parser
without any actions, just to get it to parse valid input correctly:
grammar Test;
options { language = 'CSharp2'; }
doc : ns
WS*
(template)+
WS* ;
ns : '{namespace' WS+ DOTTED_IDENT WS* '}';
template: '{template' WS+ IDENT WS* '}' content '{/template}';
cmdSp : '{sp' WS* '/}';
cmdIf : '{if' WS* '}' content ('{elseif}' content)* ('{else}'
content)? WS* '{/if}' ;
anyCmd : cmdSp | cmdIf;
nonCmd : ~(anyCmd); /* ~('{')*;*/
content : (anyCmd | nonCmd)*;
WS : ' '|'\t'|'\r'|'\n';
IDENT : ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'_'|'0'..'9')*;
DOTTED_IDENT
: IDENT ((WS)* '.' (WS)* IDENT)*;
I was hoping to parse the first example minus the {$name} print
statement and the conditional in the {if} statement (i.e. the $name in
the {if}.)
When I try to generate the parser with the above definition, I get the
following warnings/errors:
Test.g:0:0: syntax error: buildnfa: <AST>:19:16: unexpected AST node: anyCmd
Test.g:19:14: set complement is empty
I'm guessing my use of ~(anyCmd) is incorrect, but I don't understand why?
If I try replacing that with ~('{')* as per the comment above, I get
these errors:
Test.g:19:18: Decision can match input such as "'{else}'" using
multiple alternatives: 1, 2
As a result, alternative(s) 2 were disabled for that input
... more errors like this ...
Test.g:19:18: The following alternatives can never be matched: 2
I thought this was specifying 'any character apart from {', so I don't
understand how '{else}' could be a match (or why there are multiple
alternatives - I thought I only specified one, unless * counts as
many?)
A brief explanation and/or a pointer to the section of the book I
should be reading would be most welcome.
Thanks,
Nick
More information about the antlr-interest
mailing list