[antlr-interest] Learning the basics of ANTLR

Tue Oct 13 11:11:44 PDT 2009

You've defined GENERIC_ID as a fragment.

Fragments can only be used as part of another lexer rule; they are not stand-alone, token-producing lexer rules.  Hence the missing token exception.
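
A minimal sketch of the difference, assuming ANTLR 3 syntax as in the quoted grammar. Dropping the fragment keyword from GENERIC_ID makes the lexer emit it as a token that parser rules can match. I have left out the explicit greedy option, since greedy matching is the default for such loops. This is an illustration, not a tested fix for the whole grammar:

```antlr
// Sketch: GENERIC_ID without 'fragment', so the lexer emits it as a token
// that parser rules such as 'attribute' can reference.
GENERIC_ID
    : ( LETTER | '_' | ':' )
      ( LETTER | '0'..'9' | '.' | '-' | '_' | ':' )*
    ;

// LETTER stays a fragment: it is only ever used inside GENERIC_ID.
fragment LETTER
    : 'a'..'z'
    | 'A'..'Z'
    ;
```

VALUE and PCDATA are also declared as fragments yet referenced from parser rules, so they would presumably fail the same way. PCDATA needs more care than the others, because (~'<')+ overlaps with almost every other token.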

Also:
> I can't
>understand why it would break the recognition of "XML" when it's before
>the attribute call.

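So far as I know, the order in which parser rules appear in the .g file has no effect, so rule order is probably not the culprit here. The lexer tokenizes the input without knowing which parser rule is active, so once GENERIC_ID produces tokens, the text "xml" is most likely lexed as a single GENERIC_ID, and the 'x' 'm' 'l' literals in xmldecl never get a chance to match.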
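
One possible way to keep the XML declaration working once GENERIC_ID is a real token is to match a GENERIC_ID in xmldecl and check its text, instead of spelling out the letters as literals. The sketch below assumes the Java target, a token-producing GENERIC_ID, and a token-producing VALUE for the attribute rule. The label name is my own, and I have not run this against the rest of the grammar:

```antlr
// Sketch, assuming the Java target and a non-fragment GENERIC_ID.
// The validating predicate rejects any name other than "xml",
// compared case-insensitively to mirror ('x'|'X') ('m'|'M') ('l'|'L').
xmldecl
    : '<?' name=GENERIC_ID { $name.text.equalsIgnoreCase("xml") }? WS?
      attribute '?>'
    ;
```

With this shape the parser no longer depends on 'x', 'm' and 'l' arriving as separate tokens, which is what stops working as soon as the lexer can match the whole name as one GENERIC_ID.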

-- Graham

--------------------------------
At 10/13/2009 10:46 AM, Evan Metheny wrote:
>Alright, I am currently trying to learn ANTLR via the Definitive Guide
>book. My current question regards the following XML grammar. What I
>am trying to do is rewrite the grammar from the XMLLexer.g example
>as a combined parser and lexer grammar. I am doing this as an exercise to
>try to understand ANTLR.
>
>When debugging under ANTLRWorks 1.3 I get a missing token exception
>on GENERIC_ID within the "attribute" parser rule. I tried problem
>solving by changing it to a non-fragment lexer rule and to a parser
>rule, but this causes the beginning XML declaration to break. I can't
>understand why it would break the recognition of "XML" when it's before
>the attribute call.
>
>Any help in understanding this situation better would be much appreciated.
>
>
>XML.g:
>-----------------------------------------------------------------
>
>grammar XML;
>
>options {
>backtrack = true;
>}
>
>document
>	:	xmldecl WS? doctype
>	;
>
>doctype
>    :
>        '<!DOCTYPE' WS? GENERIC_ID
>
>        WS?
>        (
>            ( 'SYSTEM' WS? VALUE
>            | 'PUBLIC' WS? VALUE WS? VALUE
>            )
>            ( WS )?
>        )?
>        ( INTERNAL_DTD
>
>        )?
>		'>'
>	;
>
>INTERNAL_DTD : '[' (options {greedy=false;} : .)* ']' ;
>
>pi :
>        '<?' GENERIC_ID WS?
>
>        ( attribute WS? )*  '?>'
>	;
>
>xmldecl :
>        '<?' ('x'|'X') ('m'|'M') ('l'|'L') WS?
>
>        attribute  '?>'
>	;
>
>
>element
>    : ( start_tag
>            (element
>            | PCDATA
>
>            | cdata
>
>            | comment
>
>            | pi
>            )*
>            end_tag
>        | emptyelement
>        )
>    ;
>
>start_tag
>    : '<' WS? GENERIC_ID WS?
>
>        ( attribute WS? )* '>'
>    ;
>
>emptyelement
>    : '<' WS? GENERIC_ID WS?
>
>        ( attribute WS? )* '/>'
>    ;
>
>attribute
>    : GENERIC_ID WS? '=' WS? VALUE
>
>    ;
>
>end_tag
>    : '</' WS? GENERIC_ID WS? '>'
>
>    ;
>
>comment
>	:	'<!--' (options {greedy=false;} : .)* '-->'
>	;
>
>cdata
>	:	'<![CDATA[' (options {greedy=false;} : .)* ']]>'
>	;
>
>
>
>fragment GENERIC_ID
>    : ( LETTER | '_' | ':')
>        ( options {greedy=true;} :
>        LETTER | '0'..'9' | '.' | '-' | '_' | ':' )*
>	;
>
>fragment LETTER
>	: 'a'..'z'
>	| 'A'..'Z'
>	;
>
>
> WS  :
>        (   ' '
>        |   '\t'
>        |  ( '\n'
>            |	'\r\n'
>            |	'\r'
>            )
>        )+
>    ;
>
>fragment PCDATA : (~'<')+ ;
>
>fragment VALUE :
>        ( '\"' (~'\"')* '\"'
>        | '\'' (~'\'')* '\''
>        )
>	;
>