[antlr-interest] Learning the basics of ANTLR
Evan Metheny
evanpmeth at gmail.com
Tue Oct 13 10:46:23 PDT 2009
I am currently trying to learn ANTLR from the definitive guide book.
My question concerns the following XML grammar. As an exercise to
understand ANTLR better, I am trying to rewrite the XMLLexer.g example
as a combined parser and lexer grammar.
When debugging under ANTLRWorks 1.3, I get a missing token exception
on GENERIC_ID within the "attribute" parser rule. I tried to solve this
by changing GENERIC_ID to a non-fragment lexer rule, and separately to
a parser rule, but either change breaks the XML declaration at the
beginning of the input. I can't understand why it would break the
recognition of "XML" when that comes before the attribute call.
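For reference, the non-fragment variant mentioned above looks roughly
like this, with the rest of the grammar unchanged. This is a sketch of
the change, not necessarily the exact text I ran:
-----------------------------------------------------------------
// 'fragment' removed, so the lexer now emits GENERIC_ID as a token
GENERIC_ID
    : ( LETTER | '_' | ':')
      ( options {greedy=true;} :
        LETTER | '0'..'9' | '.' | '-' | '_' | ':' )*
    ;
-----------------------------------------------------------------
With this version the missing GENERIC_ID token in "attribute" goes away,
but the xmldecl rule stops matching the "xml" at the start of the input.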
Any help in understanding this would be much appreciated.
XML.g:
-----------------------------------------------------------------
grammar XML;

options {
    backtrack = true;
}

document
    : xmldecl WS? doctype
    ;

doctype
    : '<!DOCTYPE' WS? GENERIC_ID
      WS?
      (
        ( 'SYSTEM' WS? VALUE
        | 'PUBLIC' WS? VALUE WS? VALUE
        )
        ( WS )?
      )?
      ( INTERNAL_DTD
      )?
      '>'
    ;

INTERNAL_DTD : '[' (options {greedy=false;} : .)* ']' ;

pi
    : '<?' GENERIC_ID WS?
      ( attribute WS? )* '?>'
    ;

xmldecl
    : '<?' ('x'|'X') ('m'|'M') ('l'|'L') WS?
      attribute '?>'
    ;

element
    : ( start_tag
        ( element
        | PCDATA
        | cdata
        | comment
        | pi
        )*
        end_tag
      | emptyelement
      )
    ;

start_tag
    : '<' WS? GENERIC_ID WS?
      ( attribute WS? )* '>'
    ;

emptyelement
    : '<' WS? GENERIC_ID WS?
      ( attribute WS? )* '/>'
    ;

attribute
    : GENERIC_ID WS? '=' WS? VALUE
    ;

end_tag
    : '</' WS? GENERIC_ID WS? '>'
    ;

comment
    : '<!--' (options {greedy=false;} : .)* '-->'
    ;

cdata
    : '<![CDATA[' (options {greedy=false;} : .)* ']]>'
    ;

fragment GENERIC_ID
    : ( LETTER | '_' | ':')
      ( options {greedy=true;} :
        LETTER | '0'..'9' | '.' | '-' | '_' | ':' )*
    ;

fragment LETTER
    : 'a'..'z'
    | 'A'..'Z'
    ;

WS
    : ( ' '
      | '\t'
      | ( '\n'
        | '\r\n'
        | '\r'
        )
      )+
    ;

fragment PCDATA : (~'<')+ ;

fragment VALUE
    : ( '\"' (~'\"')* '\"'
      | '\'' (~'\'')* '\''
      )
    ;