[antlr-interest] Learning the basics of ANTLR

Graham Wideman gwlist at grahamwideman.com
Wed Oct 14 02:23:47 PDT 2009


Hi Evan -- couple more points on your xml grammar directly relating to your questions.

1. On checking, as I suspected yesterday, the pattern:

('x'|'X') ('m'|'M') ('l'|'L')

... is not scoring a hit because the lexer slurps up "xml" as a single GENERIC_ID token, not as a sequence of three single-character tokens for the three letters. You can verify this by temporarily changing the rule to look for 'xml' instead; you'll see that work.

FWIW, specifying a literal like this in the parser causes ANTLR to generate a token and recognizer code for it in the lexer.

To get the case-insensitive effect, you could create a lexer rule:
XML: ('x'|'X')  ('m'|'M') ('l'|'L') ;
... and use the XML token in the parser xmldecl rule.
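As a rough sketch of the suggestion above (I don't have your exact grammar in front of me, so the shape of xmldecl, the attribute rule and the GENERIC_ID definition here are assumptions, not your actual rules):

```antlr
// Parser rules: refer to the XML token, not to single-character literals.
// The overall shape of xmldecl is assumed for illustration.
xmldecl   : '<?' XML attribute* '?>' ;
attribute : GENERIC_ID '=' VALUE ;

// Lexer rules: XML is listed before GENERIC_ID on the assumption that
// the earlier rule is preferred when both can match the same text.
XML        : ('x'|'X') ('m'|'M') ('l'|'L') ;

// Hypothetical definition; substitute whatever your grammar already has.
GENERIC_ID : ('a'..'z'|'A'..'Z'|'_')
             ('a'..'z'|'A'..'Z'|'0'..'9'|'_'|'-')* ;
```

One thing to watch with this approach: anywhere else the text "xml" (in any case) could legitimately appear as a name, the lexer will now hand the parser an XML token rather than a GENERIC_ID, so those parser rules may need to accept both.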

2. The attribute parser rule refers to VALUE, which is defined as a lexer fragment. A fragment rule is only a helper that other lexer rules can call; it doesn't generate a token on its own, and consequently the parser will never see a VALUE.

(I believe it's a known issue that ANTLR doesn't flag this as an error.)
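To illustrate point 2, here is a minimal sketch of the change. The body of VALUE below is my guess at a quoted-string rule and not necessarily what your grammar contains; the only part that matters is dropping the fragment keyword:

```antlr
// Before: as a fragment, VALUE is never emitted as a token, so the
// attribute rule can never match it.
// fragment VALUE : '"' (~'"')* '"' ;

// After: without 'fragment', the lexer emits VALUE tokens.
// The quoted-string body is an assumed definition for illustration.
VALUE : '"'  (~'"')*  '"'
      | '\'' (~'\'')* '\''
      ;

attribute : GENERIC_ID '=' VALUE ;
```

If other lexer rules also call VALUE as a helper, one option is to keep a separately named fragment for that purpose and have the token rule VALUE refer to it.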

Hope that helps,

Graham



