[antlr-interest] Parsing byte streams

Terence Parr parrt at cs.usfca.edu
Sat Sep 1 16:06:03 PDT 2007


This is no problem.  Just feed the lexer a stream of bytes rather than
16-bit chars, then lex and parse as normal.
Ter
On Sep 1, 2007, at 2:49 PM, Johannes Luber wrote:

> Felix Schmid wrote:
>> Hi,
>>
>> is it possible to parse byte streams, e.g. class files, using ANTLR? I
>> think I have seen the topic mentioned somewhere, maybe in the wiki, but
>> can't find it now....
>>
>>
>> Related to this, if it is possible, how could I skip over padding bytes
>> while parsing? Say, the binary stream looks something like this:
>>
>> 1 byte length of value in 4-byte words (incl. padding) || 1 byte real
>> value length in bytes || value        (|| means concatenation)
>>
>> so if 'value' was 'aa00aa00aa', the byte stream would look something
>> like '0205aa00aa00aa000000...' and the parser had to be clever enough to
>> understand that from the 8 bytes following the first byte, only the
>> first 5 are the real value and the 3 bytes at the end have to be skipped
>> (to hit the first byte of the next segment)....Could it be done with ANTLR?
>>
>> thanks for your attention,
>>
>> felix
>>
>
> Well, the last time someone asked for this, he was told to write a
> binary parser in the language of his choice. ANTLR has been written to
> parse human-readable files and even if you could create a grammar for
> that stuff without changing the lexer (the lexer probably reads Unicode
> chars with two bytes instead of one), I doubt it would be readable.
>
> Best regards,
> Johannes Luber
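
One way to read the suggestion at the top of this message (send bytes to the
lexer instead of 16-bit chars) is to map each byte to the char with the same
unsigned value, so that a char-based lexer sees exactly one symbol per byte.
The sketch below does that in plain Java by decoding with ISO-8859-1. The
class and method names (ByteChars, bytesToChars) are made up for
illustration, and how the resulting string reaches the lexer depends on the
ANTLR runtime in use, so that part is left out.

```java
import java.nio.charset.StandardCharsets;

class ByteChars {
    // ISO-8859-1 maps byte values 0x00..0xFF one-to-one onto the chars
    // U+0000..U+00FF, so no byte is lost or merged during decoding.
    static String bytesToChars(byte[] data) {
        return new String(data, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        byte[] data = {0x02, 0x05, (byte) 0xaa, 0x00, (byte) 0xaa};
        String symbols = bytesToChars(data);
        // Print each char as a two-digit hex value to show the mapping.
        for (int i = 0; i < symbols.length(); i++) {
            System.out.printf("%02x ", (int) symbols.charAt(i));
        }
        System.out.println();
    }
}
```

With this mapping, grammar rules would match on char values in the range
'\u0000'..'\u00FF' rather than on readable text.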

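For comparison, the record layout in the quoted question can also be read
with a small hand-written reader, along the lines Johannes mentions. The
sketch below is not ANTLR code. It assumes the interpretation that matches
the example '0205aa00aa00aa000000': the first byte gives the size of value
plus padding in 4-byte words, the second byte gives the real value length,
and neither header byte is counted in the padded size. The names
(PaddedRecords, readAll) are hypothetical.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

class PaddedRecords {
    // Assumed record layout: [words][len][value: len bytes][padding],
    // where value plus padding together occupy words * 4 bytes.
    static List<byte[]> readAll(byte[] data) {
        ByteBuffer buf = ByteBuffer.wrap(data);
        List<byte[]> values = new ArrayList<>();
        while (buf.hasRemaining()) {
            int words = buf.get() & 0xff;   // treat the byte as unsigned
            int len = buf.get() & 0xff;     // real value length in bytes
            int padded = words * 4;
            if (len > padded || padded > buf.remaining()) {
                throw new IllegalArgumentException(
                    "malformed record near offset " + (buf.position() - 2));
            }
            byte[] value = new byte[len];
            buf.get(value);
            // Skip the padding so the next read starts at the next record.
            buf.position(buf.position() + (padded - len));
            values.add(value);
        }
        return values;
    }

    public static void main(String[] args) {
        byte[] stream = {
            0x02, 0x05, (byte) 0xaa, 0x00, (byte) 0xaa, 0x00, (byte) 0xaa,
            0x00, 0x00, 0x00,                      // record from the question
            0x01, 0x02, 0x01, 0x02, 0x00, 0x00     // a second, made-up record
        };
        for (byte[] value : readAll(stream)) {
            System.out.println(value.length + " value bytes");
        }
    }
}
```

The same length arithmetic would be needed in an ANTLR grammar as well, most
likely in an embedded action, because the number of bytes to consume depends
on values read earlier in the stream.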

