[antlr-interest] how to skip/read next n Characters (n is read from input)

Wed Nov 14 12:53:00 PST 2012

Hi,
I finally got a (Kindle) copy of the ANTLR reference and read some 
chapters.
The cited example generates NullPointerExceptions:

Looking through the generated code I realized that 'b' is translated in 
Java to List b_list = null;
b_list is never initialized to a valid object (there is no b_list = new 
ArrayList(); anywhere in the readNchars method).

This issue I fixed manually in the generated code...
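
For illustration, a minimal Java sketch of the situation described above 
and of the kind of manual fix applied. This is a simplified stand-in, not 
the code ANTLR actually generates: the class ListLabelSketch and both 
method names are hypothetical, and only the variable name b_list is taken 
from the description.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the pattern described above; not ANTLR output.
final class ListLabelSketch {

    // Mirrors the failing pattern: the list stays null and is then used.
    static int collectUnguarded(String[] chars) {
        List<String> b_list = null;
        for (String c : chars) {
            b_list.add(c); // throws NullPointerException on the first element
        }
        return b_list.size(); // also throws if chars is empty
    }

    // Manual fix: create the list lazily, right before the first add.
    static int collectGuarded(String[] chars) {
        List<String> b_list = null;
        for (String c : chars) {
            if (b_list == null) {
                b_list = new ArrayList<>();
            }
            b_list.add(c);
        }
        return (b_list == null) ? 0 : b_list.size();
    }
}
```

Calling collectGuarded with three elements returns 3, while 
collectUnguarded fails with a NullPointerException as soon as it touches 
the list.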

The second issue came up at runtime: an EarlyExitException is 
thrown.
According to the reference it occurs if "The recognizer did not match 
anything for a (..)+ loop."
This is (at least for me) quite odd, since I tried to match both 
(b+=.)+ and (b+=CHAR)+

Best regards
Thomas

On 2012-11-02 13:51, cd.barth at t-online.de wrote:
> Thomas, I would use a validating semantic predicate
>
> readNchars
> : NUM
>   (b+=CHAR)+ {$b.size()<=Integer.parseInt($NUM.text)}?
> ;
>
> The idea is from Ter's book The Definitive ANTLR Reference (ANTLR v3)
>
> Regards, Claus-Dieter
>
>
> -----Original Message-----
> From: Juancarlo Añez [mailto:apalala at gmail.com]
> Sent: Thursday, 1 November 2012 02:20
> To: Thomas Ruschival
> Cc: antlr-interest at antlr.org
> Subject: Re: [antlr-interest] how to skip/read next n Characters (n
> is read from input)
>
> Thomas,
>
> ANTLR may be overkill or inadequate for what you're doing.
>
> I think you'd be better off with a program with a main loop that
> dispatches to different functions based on the escape code. Each
> function can affect the input position, or do anything else it
> pleases. It would be a handcrafted state machine.
>
> You can do this in Python or any of the friendly languages.
>
> Cheers,
>
> -- Juancarlo
>
> On Wed, Oct 31, 2012 at 12:17 PM, Thomas Ruschival
> <thomas at ruschival.de>wrote:
>
>> I am a humble EE with little grammar experience, please forgive my
>> ignorance and give me a hint how professionals would do the trick.
>>
>> I came up with a grammar for detecting commands ("escape sequences")
>> in an input text (for a UnifiedPOS printer) that reads numeric and
>> boolean arguments for escape sequence commands from the input stream.
>> I can read numeric arguments and use them as function parameters;
>> which function to call is parsed correctly.
>> For instance "ESC|#rF" means "print feed reverse # lines"
>>
>> The question is how to treat "ESC|#E" which means "send the next #
>> bytes untreated to the printer" in other words:
>>
>> How can I use a number N that I detected on the input stream to read
>> and consume the next N characters 'un-lexed' and 'un-parsed' as
>> string/byte array?
>>
>> I was thinking using something like this in a parse action using the
>> 'input' member of the parser:
>>
>> for (int i=0; i<N; i++){
>>         output.append(input.LA(1));
>>         input.consume();
>> }
>>
>> But it doesn't seem very professional to me. Furthermore this gives
>> me tokens and not plain bytes....
>> Can you give me a hint?
>>
>> Thanks in advance
>> Thomas
>
>
>
> --
> Juancarlo *Añez*
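
The hand-written dispatch loop suggested in the quoted reply could look 
roughly like the following Java sketch. It is only an illustration under 
stated assumptions: the class name EscapeScannerSketch, the event strings 
("TEXT:", "CMD:", "RAW:") and the command shape (ESC, '|', optional 
digits, optional lowercase letters, one uppercase terminator) are 
guesses derived from the two examples "ESC|#rF" and "ESC|#E" in the 
thread, not from the UnifiedPOS specification. It works on Java chars; 
real printer data may need a byte[] instead.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical hand-written scanner; names and event format are assumptions.
final class EscapeScannerSketch {
    static final char ESC = '\u001B';

    static List<String> scan(String in) {
        List<String> events = new ArrayList<>();
        StringBuilder text = new StringBuilder();
        int pos = 0;
        while (pos < in.length()) {
            char c = in.charAt(pos);
            boolean isEscape = c == ESC
                    && pos + 1 < in.length()
                    && in.charAt(pos + 1) == '|';
            if (!isEscape) {
                text.append(c); // ordinary printable text
                pos++;
                continue;
            }
            flush(text, events);
            pos += 2; // skip ESC and '|'

            // Optional numeric argument (the '#' in "ESC|#E").
            int numStart = pos;
            while (pos < in.length() && Character.isDigit(in.charAt(pos))) {
                pos++;
            }
            int n = (pos > numStart)
                    ? Integer.parseInt(in.substring(numStart, pos))
                    : 0;

            // Assumed command shape: lowercase letters, then one uppercase letter.
            int cmdStart = pos;
            while (pos < in.length() && Character.isLowerCase(in.charAt(pos))) {
                pos++;
            }
            if (pos >= in.length() || !Character.isUpperCase(in.charAt(pos))) {
                throw new IllegalArgumentException(
                        "malformed escape sequence at index " + pos);
            }
            pos++; // consume the uppercase terminator
            String cmd = in.substring(cmdStart, pos);

            if (cmd.equals("E")) {
                // "ESC|#E": pass the next n characters through uninterpreted.
                if (pos + n > in.length()) {
                    throw new IllegalArgumentException(
                            "raw block of " + n + " characters is truncated");
                }
                events.add("RAW:" + in.substring(pos, pos + n));
                pos += n; // the scanner decides how far to advance
            } else {
                events.add("CMD:" + cmd + ":" + n);
            }
        }
        flush(text, events);
        return events;
    }

    // Emits pending plain text as one event and clears the buffer.
    private static void flush(StringBuilder text, List<String> events) {
        if (text.length() > 0) {
            events.add("TEXT:" + text);
            text.setLength(0);
        }
    }
}
```

Because the loop owns the input position, the raw block is skipped in one 
step and anything inside it that looks like an escape sequence is left 
untouched, which is the behaviour asked for with "ESC|#E".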