[antlr-interest] Problem with an underline and semantic action

Thu Sep 15 23:37:26 PDT 2005

Your first problem is that the rule:

CONST_IDENT
   options { testLiterals=true; }
         : ('a'..'z') ('a'..'z'|'A'..'Z'|'0'..'9')*
         ;

is the one that has to recognise your "under_line" literal. With
testLiterals=true the literal is only looked up after CONST_IDENT has
matched the whole text, and the rule doesn't accept the '_' character,
so the lexer stops at "under". You need to add '_' to the rule itself.
As you said, there isn't any problem with your charVocabulary. So your
rule would be the following:

CONST_IDENT
   options { testLiterals=true; }
         : ('a'..'z') ('a'..'z'|'A'..'Z'|'0'..'9'|'_')*
         ;

About your second problem: I'm sorry, but I have never used C++ with
ANTLR (actually, I have hardly ever used C++ ;-)

HTH

-- 
José María García Rodríguez 

2005/9/16, Nicola Cuomo <ncuomo at gmail.com>:
> Hi,
>    I'm an ANTLR newbie trying to make a translator.
> 
> After reading the manual and looking at the examples on the site, I
> still have a lot of problems/questions.
> 
> So let's start :)
> 
> First problem
> 
> I've this grammar:
> 
> -------------------------------------------------------------------
> options
> {
>         language = "Cpp";
> }
> 
> class TestParser extends Parser;
> 
> options
> {
>         buildAST = false;
>         k = 3;
> }
> 
> spec
>         : UNDERLINE CONST_IDENT
>         ;
> 
> 
> class TestLexer extends Lexer;
> 
> options
> {
>         charVocabulary='\u0000'..'\u00ff';
>         k = 3;
> }
> 
> tokens
> {
>         UNDERLINE               = "under_line";
> }
> 
> /* Whitespaces */
> WS
>   : ( ' '
>     | '\t'
>     | '\f'
> 
>     // handle newlines
>     | ( "\r\n"  // DOS/Windows
>         | '\r'    // Macintosh
>         | '\n'    // Unix
>                         )
>       { newline(); }
>     )
>     { $setType(antlr::Token::SKIP); }
>   ;
> 
> COMMENT
>   : "%" (~('\n'|'\r'))*
>     { $setType(antlr::Token::SKIP); }
>   ;
> 
> CONST_IDENT
>   options { testLiterals=true; }
>         : ('a'..'z') ('a'..'z'|'A'..'Z'|'0'..'9')*
>         ;
> -------------------------------------------------------------------
> 
> It's a test that should parse something like "under_line a123123"
> 
> When I execute the program I get:
> 
> $ ./main
> under_line a123123
> line 1:1: expecting "under_line", found 'under'
> Parse exception: line 1:6: unexpected char: '_'
> 
> It seems to stop consuming characters when it hits the underline, returning
> the "under" token and breaking the parse. My first thought was to
> extend the charVocabulary, but I have no clue how to do it.
> 
> charVocabulary='\u0000'..'\u00ff'; shouldn't that already include all the
> ASCII characters?
> 
> charVocabulary='\u0000'..'\ufffe'; as someone suggested on this list
> for a similar problem, doesn't work in Cpp mode: "warning: underline.g:
> Vocabularies of this size still experimental in C++ mode", and the
> following compilation fails.
> 
> The "Second problem" is about semantic action:
> 
> I've the following grammar piece
> -----
> formula
>         : expression (EQUAL|LESST) expression
>         ;
> expression
>         : CONST_IDENT
>         | VAR_IDENT (PRIME)?
>          ... and so on ...
>         ;
> -----
> 
> I would like to get the whole string that matches the first expression
> rule in formula.
> 
> I've written something like:
> 
> -----
> formula
>         : exp:expression (EQUAL|LESST) expression { std::cout << exp->getText() << std::endl; }
>         ;
> expression
>         : CONST_IDENT
>         | VAR_IDENT (PRIME)?
>         ... and so on ...
>         ;
> -----
> 
> But the compilation fails, saying that exp is not defined. From what I've
> seen, it seems to work with terminal tokens like EQUAL.
> Is there a way to get all the text matched by a rule without having to
> build it from the "subexpressions"?
> 
> Sorry for my english :)
> 
> Thanks for the answer :P
> --
>  Nicola                          mailto:ncuomo at gmail.com
> 
>