[antlr-interest] Lexing 7-bit ASCII stream
Avid Trober
avidtrober at gmail.com
Tue Apr 21 18:55:31 PDT 2009
Thanks.
org.antlr.Tool is happy with these two rules, regardless of which one is
above/below the other.
But won't the DFAs care about the order?
DQUOTE : '"' ;
DQUOTE_STRING : DQUOTE ( ~('"') )* DQUOTE ;
----- Original Message -----
From: "Gavin Lambert" <antlr at mirality.co.nz>
To: "Avid Trober" <avidtrober at gmail.com>; <antlr-interest at antlr.org>
Sent: Tuesday, April 21, 2009 6:53 AM
Subject: Re: [antlr-interest] Lexing 7-bit ASCII stream
> At 21:59 21/04/2009, Avid Trober wrote:
>>I'm parsing a 7-bit ASCII stream ... 2 questions
>>
>>Question 1: can't I just fall through the lexer rules, where lexer rules
>>are ordered specific-to-general, and avoid nondeterminism at run-time?
> [...]
>>... // (AND IF NOTHING ABOVE MATCHES, AT LEAST WE'RE MATCHING HERE ... )
>>
>>CHAR : '\u0000'..'\u007F' // any 7-bit US-ASCII character
>> ;
>
> You can specify a catch-all match like so:
>
> CHAR : .;
>
> If this is the last lexer rule, then it will behave as you're expecting.
>
>>Question 2: I'm at a loss how to match the notation in the spec I'm
>>writing a grammar for where binary digits are '0' or '1' and digits are
>>'0'..'9'. (ABNF-ish) It is preferred to make the grammar rule names match
>>that (whether lexer or parser, it doesn't matter)
>
> Generally, it's best to have the lexer match as wide as possible (i.e. have
> DIGIT, not BINARY_DIGIT) and sort it out in the parser, where you can use
> the context to give better error messages if you encounter something
> invalid.
>
>>Can I write a binary_digit parser rule that works with DIGIT above
>>somehow?
>
> Yep. Depending on the context, you may want to either use a
> lookahead-based entry predicate to avoid entering the rule if the DIGITs
> aren't binary-safe, or an exit predicate that raises an error if it turns
> out that the sequence wasn't valid binary.
>
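A rough sketch of the two predicate styles described above, assuming ANTLR 3 with the Java target. The grammar name and the rule name checked_binary_digit are made up for illustration; only DIGIT and binary_digit come from the thread. This is one possible reading of "entry" and "exit" predicates, not necessarily the exact form Gavin had in mind.

```antlr
grammar BinaryDigits;   // hypothetical name, not from the thread

// Entry check: a gated semantic predicate inspects the next token
// (input.LT(1) in the Java target) and is meant to keep the parser
// out of this rule unless that token's text is 0 or 1.
binary_digit
    : { input.LT(1).getText().equals("0") || input.LT(1).getText().equals("1") }?=> DIGIT
    ;

// Exit check: match any DIGIT first, then validate it. A validating
// predicate that evaluates to false raises FailedPredicateException.
checked_binary_digit
    : d=DIGIT { $d.text.equals("0") || $d.text.equals("1") }?
    ;

// The lexer stays general: one token type for every decimal digit.
DIGIT : '0'..'9' ;
```

The entry form suits places where a non-binary DIGIT should send the parser down a different alternative; the exit form suits places where only binary digits are legal and anything else should be reported as an error.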