[antlr-interest] Problem with an underline and semantic action
José María García Rodríguez
darthia at gmail.com
Thu Sep 15 23:37:26 PDT 2005
Your first problem is that the rule:
CONST_IDENT
options { testLiterals=true; }
: ('a'..'z') ('a'..'z'|'A'..'Z'|'0'..'9')*
;
is the one that has to match your "under_line" literal, because with
testLiterals=true the text matched by this rule is what gets checked
against the literals table. But the rule doesn't accept the '_'
character, so the lexer stops after "under". You need to add '_' to the
rule itself. As you suspected, there is nothing wrong with your
charVocabulary. So your rule would be the following:
CONST_IDENT
options { testLiterals=true; }
: ('a'..'z') ('a'..'z'|'A'..'Z'|'0'..'9'|'_')*
;
As for your second problem, I'm sorry, but I have never used C++ with
ANTLR (actually, I have hardly ever used C++ ;-)
HTH
--
José María García Rodríguez
2005/9/16, Nicola Cuomo <ncuomo at gmail.com>:
> Hi,
> I'm an ANTLR newbie trying to make a translator.
>
> After reading the manual and looking at the examples on the site I
> still have a lot of problems/questions.
>
> So let's start :)
>
> First problem
>
> I've this grammar:
>
> -------------------------------------------------------------------
> options
> {
> language = "Cpp";
> }
>
> class TestParser extends Parser;
>
> options
> {
> buildAST = false;
> k = 3;
> }
>
> spec
> : UNDERLINE CONST_IDENT
> ;
>
>
> class TestLexer extends Lexer;
>
> options
> {
> charVocabulary='\u0000'..'\u00ff';
> k = 3;
> }
>
> tokens
> {
> UNDERLINE = "under_line";
> }
>
> /* Whitespaces */
> WS
> : ( ' '
> | '\t'
> | '\f'
>
> // handle newlines
> | ( "\r\n" // DOS/Windows
> | '\r' // Macintosh
> | '\n' // Unix
> )
> { newline(); }
> )
> { $setType(antlr::Token::SKIP); }
> ;
>
> COMMENT
> : "%" (~('\n'|'\r'))*
> { $setType(antlr::Token::SKIP); }
> ;
>
> CONST_IDENT
> options { testLiterals=true; }
> : ('a'..'z') ('a'..'z'|'A'..'Z'|'0'..'9')*
> ;
> -------------------------------------------------------------------
>
> It's a test that should parse something like "under_line a123123"
>
> When I execute the program I get:
>
> $ ./main
> under_line a123123
> line 1:1: expecting "under_line", found 'under'
> Parse exception: line 1:6: unexpected char: '_'
>
> It seems to stop consuming characters when it hits the underscore,
> returning the "under" token and breaking the parse. My first thought
> was to extend the charVocabulary, but I have no clue how to do it.
>
> Shouldn't charVocabulary='\u0000'..'\u00ff'; already include all the
> ASCII characters?
>
> charVocabulary='\u0000'..'\ufffe';, which someone suggested on this ml
> for a similar problem, doesn't work in Cpp mode ("warning: underline.g:
> Vocabularies of this size still experimental in C++ mode") and the
> following compilation fails.
>
> The second problem is about semantic actions:
>
> I have the following grammar fragment:
> -----
> formula
> : expression (EQUAL|LESST) expression
> ;
> expression
> : CONST_IDENT
> | VAR_IDENT (PRIME)?
> ... and so on ...
> ;
> -----
>
> I would like to get the whole string matched by the first expression
> rule in formula.
>
> I've written something like:
>
> -----
> formula
> : exp:expression (EQUAL|LESST) expression { std::cout << exp->getText() << std::endl; }
> ;
> expression
> : CONST_IDENT
> | VAR_IDENT (PRIME)?
> ... and so on ...
> ;
> -----
>
> But the compilation fails, saying that exp is not defined. From what
> I've seen, it seems to work with terminal tokens like EQUAL.
> Is there a way to get all the text matched by a rule without having to
> build it from the subexpressions?
>
> Sorry for my English :)
>
> Thanks for the answer :P
> --
> Nicola mailto:ncuomo at gmail.com
>
>