[antlr-interest] Lexing 7-bit ASCII stream

Avid Trober avidtrober at gmail.com
Tue Apr 21 18:55:31 PDT 2009


Thanks.
org.antlr.Tool accepts these two rules regardless of which one comes 
first in the grammar.
But won't the generated DFAs care about the order?

DQUOTE : '"' ;
DQUOTE_STRING :  DQUOTE ( ~('"') )* DQUOTE ;



----- Original Message ----- 
From: "Gavin Lambert" <antlr at mirality.co.nz>
To: "Avid Trober" <avidtrober at gmail.com>; <antlr-interest at antlr.org>
Sent: Tuesday, April 21, 2009 6:53 AM
Subject: Re: [antlr-interest] Lexing 7-bit ASCII stream


> At 21:59 21/04/2009, Avid Trober wrote:
>>I'm parsing a 7-bit ASCII stream ... 2 questions
>>
>>Question 1: can't I just fall through the lexer rules, ordered from 
>>specific to general, and avoid nondeterminism at run time?
> [...]
>>... // (AND IF NOTHING ABOVE MATCHES, AT LEAST WE'RE MATCHING HERE ... )
>>
>>CHAR    : '\u0000'..'\u007F'  // any 7-bit US-ASCII character
>>              ;
>
> You can specify a catch-all match like so:
>
>   CHAR : .;
>
> If this is the last lexer rule, then it will behave as you're expecting.
>
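The specific-to-general fall-through asked about above can be pictured with a small first-match tokenizer in plain Java. This is only a toy sketch of the ordering idea, not how ANTLR's generated lexer actually makes its decisions; the class name FallThroughLexer, the method tokenize, and the regex stand-ins for the grammar rules are all hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FallThroughLexer {
    // Rules are tried in insertion order: most specific first, catch-all last.
    // (Hypothetical regex stand-ins for the grammar rules in this thread.)
    private static final Map<String, Pattern> RULES = new LinkedHashMap<>();
    static {
        RULES.put("DQUOTE_STRING", Pattern.compile("\"[^\"]*\""));
        RULES.put("DIGIT", Pattern.compile("[0-9]"));
        RULES.put("CHAR", Pattern.compile("[\\x00-\\x7F]")); // any 7-bit ASCII char
    }

    public static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            boolean matched = false;
            for (Map.Entry<String, Pattern> rule : RULES.entrySet()) {
                Matcher m = rule.getValue().matcher(input);
                m.region(pos, input.length());
                if (m.lookingAt()) {            // match must start exactly at pos
                    tokens.add(rule.getKey() + ":" + m.group());
                    pos = m.end();
                    matched = true;
                    break;                      // first matching rule wins
                }
            }
            if (!matched) {
                // Only reachable for characters outside 7-bit ASCII.
                throw new IllegalArgumentException(
                    "no rule matches character at offset " + pos);
            }
        }
        return tokens;
    }
}
```

With these stand-in rules, an unterminated quote falls through to CHAR because DQUOTE_STRING cannot complete, which is the "at least we're matching here" behaviour the catch-all is meant to give.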
>>Question 2: I'm at a loss as to how to match the notation in the spec I'm 
>>writing a grammar for, where binary digits are '0' or '1' and digits are 
>>'0'..'9' (ABNF-ish). It is preferred that the grammar rule names match 
>>that notation (whether lexer or parser rules, it doesn't matter).
>
> Generally, it's best to have the lexer match as wide as possible (i.e. have 
> DIGIT, not BINARY_DIGIT) and sort it out in the parser, where you can use 
> the context to give better error messages if you encounter something 
> invalid.
>
>>Can I write a binary_digit parser rule that works with DIGIT above 
>>somehow?
>
> Yep.  Depending on the context, you may want to either use a 
> lookahead-based entry predicate to avoid entering the rule if the DIGITs 
> aren't binary-safe, or an exit predicate that raises an error if it turns 
> out that the sequence wasn't valid binary.
> 
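The two predicate styles suggested above boil down to two kinds of check on the text of the matched DIGIT tokens. Here is a minimal Java sketch of both, assuming the digits have already been collected into a String; the class BinaryDigits and the helpers isBinary and parseBinary are hypothetical names, not part of ANTLR. In a grammar, something like isBinary would be called from an entry predicate and something like parseBinary from an action at the end of the rule.

```java
public class BinaryDigits {
    /** Entry-style check: true only if the text is non-empty and all '0'/'1'. */
    public static boolean isBinary(String digits) {
        if (digits.isEmpty()) {
            return false;
        }
        for (int i = 0; i < digits.length(); i++) {
            char c = digits.charAt(i);
            if (c != '0' && c != '1') {
                return false;
            }
        }
        return true;
    }

    /**
     * Exit-style check: convert the digits to a value, or raise an error
     * that names the offending character and its position.
     * (No overflow handling; fine for short fields only.)
     */
    public static int parseBinary(String digits) {
        if (digits.isEmpty()) {
            throw new IllegalArgumentException("expected at least one binary digit");
        }
        int value = 0;
        for (int i = 0; i < digits.length(); i++) {
            char c = digits.charAt(i);
            if (c != '0' && c != '1') {
                throw new IllegalArgumentException(
                    "expected binary digit but found '" + c + "' at offset " + i);
            }
            value = value * 2 + (c - '0');
        }
        return value;
    }
}
```

The exit-style variant is the one that gives the better error message, since it knows exactly which digit was out of range.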


