[antlr-interest] ANTLR Basic Question

Jim Idle jimi at temporal-wave.com
Fri Jul 9 15:00:20 PDT 2010


First add a catch-all to your lexer as the last rule:

ANY : . { skip(); /* or error */ } ;

Then change your NONBLANCKCHAR to:

CHARSEQ : ('a'..'z')+ ; /* or whatever it is */

And put this rule after the keyword rules (DATA and LOOP).
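
Put together, the lexer part might look roughly like the sketch below. This is only an outline under assumptions: the 'a'..'z' set is a placeholder for the real CIF character set, and the dataBlockHeading change is a hypothetical adjustment showing how the parser rule could consume one CHARSEQ token in place of a run of single-character tokens.

```antlr
// Keyword rules come first.
DATA    : ('d'|'D')('a'|'A')('t'|'T')('a'|'A')'_' ;
LOOP    : ('l'|'L')('o'|'O')('o'|'O')('p'|'P')'_' ;

// One token per run of characters, not one token per character.
// Placeholder character set: replace with whatever CIF really allows.
CHARSEQ : ('a'..'z')+ ;

EOL     : '\n' | '\r\n' ;

// Catch-all, kept as the last lexer rule: anything not matched above
// is skipped here (or could be reported as an error instead).
ANY     : . { skip(); } ;

// Hypothetical parser-side change: the heading is now DATA plus one CHARSEQ.
dataBlockHeading
    :   DATA CHARSEQ EOL
    ;
```

With this shape, the "global" in data_global is consumed as a single CHARSEQ, so the lexer never starts a new token at the 'l'.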

If that fails then add a predicate.
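
One possible shape for such a predicate, as a sketch only: a gated semantic predicate that lets LOOP match only at the start of a line. This assumes loop_ always begins a line in your input, which may not hold for every CIF file, and it relies on getCharPositionInLine() from the Lexer class in the ANTLR 3 Java runtime.

```antlr
// Assumption: loop_ only ever appears in column 0.
// The gated predicate ({...}?=>) switches the rule off everywhere else,
// so an 'l' in the middle of a line cannot start a LOOP token.
LOOP
    :   { getCharPositionInLine() == 0 }?=>
        ('l'|'L')('o'|'O')('o'|'O')('p'|'P')'_'
    ;
```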

Jim



> -----Original Message-----
> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
> bounces at antlr.org] On Behalf Of Klaus Martinschitz
> Sent: Friday, July 09, 2010 12:11 PM
> To: antlr-interest at antlr.org
> Subject: [antlr-interest] ANTLR Basic Question
> 
>   Hi ANTLR Gurus,
> 
> A beginner's question.
> I want to write a compiler for the Crystallographic Information File
> format (CIF). I don't want to explain the syntax in detail, only the
> problem I am facing.
> 
> The data starts with a token
> 
> 'data_'
> 
> followed by arbitrary characters and an EOL, e.g.
> 
> data_global
> .
> 
> There is also a token
> 
> 'loop_';
> 
> Somewhere in my BNF I write something like
> 
> DATA
>      :(('d'|'D')('a'|'A')('t'|'T')('a'|'A')'_')
>      ;
> 
> LOOP
>      :
>      (('l'|'L')('o'|'O')('o'|'O')('p'|'P')'_')
>      ;
> 
> dataBlockHeading
>      :    (DATA NONBLANCKCHAR+ EOL)
>      ;
> 
> dataItem
>      :    (tag WHITESPACE value) | (LOOP loopHeader loopBody)
>      ;
> 
> The first two expressions are tokens; the second two are rules. My
> problem is the following. The file starts with
> 
> data_global
> 
> BUT the *lo* of data_g*lo*bal is matched by the LOOP token. How can
> this be if the parser is in the dataBlockHeading rule? Shouldn't the
> parser know that the characters *lo* belong to NONBLANCKCHAR and not
> to LOOP?
> 
> I have attached the whole grammar at the end of this message.
> 
> Thanks for your help
> 
> Regards,
> Klaus
> 
> grammar CIF1_1;
> 
> options{
> language=Java;
> }
> 
> @lexer::header{
> package at.netcrystals.cif_1_1.parser;
> }
> 
> @parser::header{
> package at.netcrystals.cif_1_1.parser;
> }
> 
> 
> DATA
>      :(('d'|'D')('a'|'A')('t'|'T')('a'|'A')'_')
>      ;
> 
> LOOP
>      :
>      (('l'|'L')('o'|'O')('o'|'O')('p'|'P')'_')
>      ;
> 
> fragment ORDINARYCHAR
>      :     '!' | '%' | '&' | '(' | ')' | '*' | '+' | ',' | '-' | '.' |
> '/' | '0' | '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9' | ':' |
> '<' | '=' | '>' | '?' | '@' | 'A' | 'B' | 'C' | 'D' | 'E' | 'F' | 'G' |
> 'H' | 'I' | 'J' | 'K' | 'L' | 'M' | 'N' | 'O' | 'P' | 'Q' | 'R' | 'S' |
> 'T' | 'U' | 'V' | 'W' | 'X' | 'Y' | 'Z' | '\\' | '^' | '\`' | 'a' | 'b'
> | 'c' | 'd' | 'e' | 'f' | 'g' | 'h' | 'i' | 'j' | 'k' | 'l' | 'm' | 'n'
> | 'o' | 'p' | 'q' | 'r' | 's' | 't' | 'u' | 'v' | 'w' | 'x' | 'y' | 'z'
> | '{' | '|' | '}' | '~'
>      ;
> 
> 
> NONBLANCKCHAR
>      :    ORDINARYCHAR | '"' | '#' | '$' | '\'' | '_' | ';' | '[' | ']'
>      ;
> 
> 
> 
> WHITESPACE
>      :    '\t'|' '
>      ;
> 
> 
> /************************************************************************
>      WhiteSpace and Comments
> ************************************************************************/
> 
> 
> 
> 
> 
> 
> EOL
>      :'\n'|'\r\n'
>      ;
> 
> 
> 
> 
> 
> 
> /************************************************************************
> *
> * Root
> *
> ************************************************************************/
> 
> cif
>      :      (dataBlock)   EOF
>      ;
> 
> dataBlock
>      :    (dataBlockHeading dataItems)
>      ;
> 
> dataBlockHeading
>      :    (DATA NONBLANCKCHAR+ EOL)
>      ;
> 
> 
> dataItems
>      :    dataItem* EOL
>      ;
> 
> dataItem
>      :    (tag WHITESPACE value) | (LOOP loopHeader loopBody)
>      ;
> 
> tag
>      :    NONBLANCKCHAR+
>      ;
> 
> 
> value
>      :    '.' | '?' | charString
>      ;
> 
> charString
>      :    singleQuotedString
>      ;
> 
> singleQuotedString
>      :    '\'' NONBLANCKCHAR* '\''
>      ;
> 
> loopHeader
>      :    ( (WHITESPACE tag)+)
>      ;
> 
> loopBody
>      :    value (WHITESPACE value)+
>      ;
> 
> 
> 
> 
> 
> List: http://www.antlr.org/mailman/listinfo/antlr-interest
> Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-
> email-address




