[antlr-interest] Lexing 7-bit ASCII stream
Avid Trober
avidtrober at gmail.com
Tue Apr 21 02:59:15 PDT 2009
I'm parsing a 7-bit ASCII stream and have two questions.
Question 1: Can't I just order the lexer rules from specific to general and let matching fall through to the first rule that applies, so that there are no nondeterminisms at run time?
For example:
NULL : '\u0000'
;
SOH : '\u0001'
;
... // (EACH CONTROL CHARACTER HAS ITS OWN LEXER RULE)
HTAB : '\u0009' // horizontal tab
;
LF : '\u000A' // line feed
;
CR : '\u000D' // carriage return
;
SP : '\u0020' // SPACE
;
DQUOTE : '\u0022' // (Double Quote)
;
DIGIT : '\u0030'..'\u0039' // 0-9
;
... // (THEN I WANT TO DENOTE RANGES ... )
UPPER_CASE : '\u0041'..'\u005A' // A..Z
;
TWEEN_CASE : '\u005B'..'\u0060' // punctuation between Z and a
;
LOWER_CASE : '\u0061'..'\u007A' // a..z
;
... // (AND IF NOTHING ABOVE MATCHES, AT LEAST WE'RE MATCHING HERE ... )
CHAR : '\u0000'..'\u007F' // any 7-bit US-ASCII character
;
Question 2: I'm at a loss as to how to match the notation in the spec I'm writing a grammar for, where binary digits are '0' or '1' and digits are '0'..'9' (ABNF-ish). I would prefer the grammar's rule names to match the spec's; whether they end up as lexer or parser rules doesn't matter.
Can I write a binary_digit parser rule that works with DIGIT above somehow?
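To make the question concrete, here is one possible arrangement, offered only as an untested sketch in ANTLR 3 syntax. It assumes the DIGIT lexer rule above is replaced by two non-overlapping tokens, and that parser rules carry the spec's names. The grammar name Bits and the token names BIT and DIGIT_2_9 are placeholders chosen for illustration, not names taken from the spec.

```antlr
grammar Bits;  // placeholder grammar name

// Parser rules named after the spec's notation
binary_digit : BIT ;              // '0' or '1'
digit        : BIT | DIGIT_2_9 ;  // '0'..'9'

// Lexer rules: the two sets do not overlap, so each digit
// character maps to exactly one token type
BIT       : '0' | '1' ;
DIGIT_2_9 : '2'..'9' ;
```

With this split, any parser rule that previously referred to DIGIT would refer to digit instead.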
Thanks much for any help.