[antlr-interest] howto ignore unknown tokenstreams/recordsets

Mon Jan 17 10:44:13 PST 2005

On Monday, 17 January 2005 15:59, Ric Klaren wrote:
>> I'm writing a parser which parses a document (datasets separated by
>> semicolons).
>> How can I ignore unknown datasets (in my example recordsets X and Y)?
>
>It depends a bit on what defines an unknown recordset. If you know
>those start with X and Y then you can just skip them in the lexer like
>whitespace. Or you can use tokenstream filtering between lexer and
>parser. Although you should take care that there's no (or very
>controlled) feedback from parser to lexer (when using tokenstream
>filtering). Another approach might be to write custom error
>handlers in your parser that skip the unrecognized bits, although that
>might interfere with normal error handling.

For an unknown recordset, the leading key (X or Y in my example,
<something else> in general) is not known. The structure of the recordsets
to be ignored is <some letters> ( ~(";") )+ ";".
Because I know neither <some letters> nor the tokens that follow up to the
";", I cannot skip them in the lexer. (Right?)

What I want is a parser that recognizes the defined recordsets and ignores all
other input (which must still be structured like a recordset).
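
To illustrate the behaviour I am after (in plain Python, not ANTLR grammar
syntax), here is a minimal sketch. The known keys A and B, the sample input,
and the function name filter_recordsets are assumptions made up for this
example; the only thing taken from my real format is the recordset shape
<some letters> ( ~(";") )+ ";".

```python
import re

# Hypothetical keys of the recordsets the real parser would handle.
KNOWN_KEYS = {"A", "B"}

# One recordset: leading letters (the key), then one or more
# characters that are not ";", then the terminating ";".
RECORD = re.compile(r"\s*([A-Za-z]+)([^;]+);")


def filter_recordsets(text, known_keys=KNOWN_KEYS):
    """Return (key, body) pairs for known recordsets and drop the rest."""
    kept = []
    for match in RECORD.finditer(text):
        key = match.group(1)
        body = match.group(2).strip()
        if key in known_keys:
            kept.append((key, body))
        # Unknown keys are skipped up to their ";" without raising an error.
    return kept


if __name__ == "__main__":
    sample = "A 1 2; X foo; B 3; Y bar baz;"
    for key, body in filter_recordsets(sample):
        print(key, body)
```

The point is that the decision to skip is made per recordset, after the key
has been read, rather than per character in the lexer.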

Hopefully you can give me some help.

With best regards,
Oliver