[antlr-interest] how to skip/read next n Characters (n is read from input)

Thu Nov 15 00:46:35 PST 2012

Hi Thomas, 
the sample from the ANTLR v3 reference does not (yet?) run with ANTLR v4.0b3:
(b+=CHAR)+ {$b.size()<=Integer.parseInt($NUM.text)}?

I see that {System.out.println(((StartContext)_localctx).b.size());}
works as a workaround for the counter, but synpred is not working for me either.

It would be nice to get a solution for this from Ter in the final version.

Regards Claus-Dieter

-----Original Message-----
From: Thomas Ruschival [mailto:thomas at ruschival.de] 
Sent: Wednesday, 14 November 2012 21:53
To: cd.barth at t-online.de
Cc: 'Juancarlo Añez'; antlr-interest at antlr.org
Subject: Re: AW: [antlr-interest] how to skip/read next n Characters (n is read from input)

Hi,
I finally got a (Kindle) copy of the ANTLR reference and read some chapters.
The cited example generates NullPointerExceptions.

Looking through the generated code, I realized that 'b' is translated in Java to a List b_list = null; and that b_list is never initialized to a valid object (there is no b_list = new ArrayList(); anywhere in the readNchars method).

I fixed this issue manually in the generated code.
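
A rough sketch of what such a manual fix amounts to. This is not the actual generated ANTLR code: the class ReadNCharsSketch and the method collect are made-up names for illustration, and only the List named after 'b' mirrors the generated code described above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the generated rule method; not real ANTLR output.
class ReadNCharsSketch {

    // Collects matched characters the way a "b+=CHAR" label is meant to.
    static List<String> collect(String matched) {
        // The generated code reportedly had "List b_list = null;" here.
        // Creating the list up front is the manual fix: add() and size()
        // can then never hit a null reference.
        List<String> bList = new ArrayList<>();
        for (int i = 0; i < matched.length(); i++) {
            bList.add(String.valueOf(matched.charAt(i)));
        }
        return bList;
    }

    public static void main(String[] args) {
        // Prints the number of collected characters.
        System.out.println(collect("abc").size());
    }
}
```

With the list created before the loop, a predicate that calls size() on it is safe even when nothing was matched.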

The second issue came up at runtime: an EarlyExitException is thrown.
According to the reference, it occurs if "The recognizer did not match anything for a (..)+ loop."
This is (at least for me) quite odd, since I tried to match (b+=.)+ as well as (b+=CHAR)+.

Best regards
Thomas

On 2012-11-02 13:51, cd.barth at t-online.de wrote:
> Thomas, I would use a validating semantic predicate
>
> readNchars
> : NUM
>   (b+=CHAR)+ {$b.size()<=Integer.parseInt($NUM.text)}?
> ;
>
> The idea is from Ter's book The Definitive ANTLR Reference (ANTLR v3)
>
> Regards, Claus-Dieter
>
>
> -----Original Message-----
> From: Juancarlo Añez [mailto:apalala at gmail.com]
> Sent: Thursday, 1 November 2012 02:20
> To: Thomas Ruschival
> Cc: antlr-interest at antlr.org
> Subject: Re: [antlr-interest] how to skip/read next n Characters (n is 
> read from input)
>
> Thomas,
>
> ANTLR may be overkill or inadequate for what you're doing.
>
> I think you'd be better off with a program with a main loop that 
> dispatches to different functions based on the escape code. Each 
> function can affect the input position, or do anything else it 
> pleases. It would be a handcrafted state machine.
>
> You can do this in Python or any of the friendly languages.
>
> Cheers,
>
> -- Juancarlo
>
> On Wed, Oct 31, 2012 at 12:17 PM, Thomas Ruschival
> <thomas at ruschival.de> wrote:
>
>> I am a humble EE with little grammar experience, please forgive my 
>> ignorance and give me a hint how professionals would do the trick.
>>
>> I came up with a grammar for detecting commands ("escape sequences") 
>> in an input text (for a UnifiedPOS printer) that reads numeric and 
>> boolean arguments for escape-sequence commands from the input stream.
>> I can read numeric arguments and use them as function parameters, 
>> and which function is to be called is parsed correctly.
>> For instance, "ESC|#rF" means "print feed reverse # lines".
>>
>> The question is how to treat "ESC|#E", which means "send the next # 
>> bytes untreated to the printer". In other words:
>>
>> How can I use a number N that I detected on the input stream to read 
>> and consume the next N characters 'un-lexed' and 'un-parsed' as 
>> string/byte array?
>>
>> I was thinking using something like this in a parse action using the 
>> 'input' member of the parser:
>>
>> for (int i=0; i<N; i++){
>>         output.append(input.LA(1));
>>         input.consume();
>> }
>>
>> But it doesn't seem very professional to me. Furthermore this gives 
>> me tokens and not plain bytes....
>> Can you give me a hint?
>>
>> Thanks in advance
>> Thomas
>>
>
>
>
> --
> Juancarlo *Añez*
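
For comparison, the hand-crafted main loop suggested in the quoted message could look roughly like the following in Java. It is only a sketch under assumptions: the class EscapeScanner and the method rawPayloads are invented names, the input is treated as a String of characters rather than real printer bytes, and only the "ESC|#E" command from the original question is handled.

```java
import java.util.ArrayList;
import java.util.List;

// Hand-written scanner sketch; all names here are illustrative assumptions.
class EscapeScanner {
    static final char ESC = '\u001B';

    // Returns the raw payload of every "ESC|<n>E" command found in the input.
    static List<String> rawPayloads(String input) {
        List<String> payloads = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            boolean isCommand = input.charAt(pos) == ESC
                    && pos + 1 < input.length()
                    && input.charAt(pos + 1) == '|';
            if (!isCommand) {
                pos++; // ordinary text; a real driver would print it
                continue;
            }
            // Read the decimal argument N (no overflow handling in this sketch).
            int p = pos + 2;
            int digitsStart = p;
            int n = 0;
            while (p < input.length()
                    && input.charAt(p) >= '0' && input.charAt(p) <= '9') {
                n = n * 10 + (input.charAt(p) - '0');
                p++;
            }
            if (p > digitsStart && p < input.length() && input.charAt(p) == 'E') {
                p++; // step over the 'E'
                // Take the next N characters untouched, even if they look
                // like another escape sequence; stop early at end of input.
                int end = Math.min(p + n, input.length());
                payloads.add(input.substring(p, end));
                pos = end;
            } else {
                // Any other command would be dispatched here; the sketch
                // simply continues scanning after the digits.
                pos = p;
            }
        }
        return payloads;
    }

    public static void main(String[] args) {
        String input = "ab" + ESC + "|3Exyz" + ESC + "|2rFcd";
        System.out.println(rawPayloads(input));
    }
}
```

Because the loop works on the character input directly, the payload is never tokenized, which avoids the "tokens and not plain bytes" problem from the original question. For genuine byte data the same loop could presumably run over a byte[] or an InputStream instead of a String.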