[antlr-interest] Lexer parsing problem and Documentation

Ric Klaren ric.klaren at gmail.com
Wed Jun 6 10:27:49 PDT 2007


Hi,

On 6/6/07, Wigg, J D <wiggjd at lsbu.ac.uk> wrote:
> In parsing C++ I'm trying to remove asm statements in the lexer as though
> they were comments.
>
> I'm using a predicate in the lexer as follows,
>
> (("__asm"|"_asm"|"asm") LPAREN)=>
>
> but this will only match "asm(" provided there is no space between "asm" and
> "(".

You can probably do without predicates. Just write a rule that matches
the complete asm statement (how exactly depends on the syntax, of
course; if it spans multiple lines it could be a bit trickier).
You'll probably have to left-factor that token with the identifier
rule into one lexer rule, since it starts with a valid identifier.
Another option is to use a token filter, or to switch lexers (like the
javadoc example). Alternatively, just extend your C++ grammar with
the asm statement. I guess the contents of an asm statement are
tokenized much like the existing C++ tokens (at least in the flavours
I've seen), so that might not be a bad idea, but it also depends on
what you want to achieve.
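
To make the left-factoring idea concrete, here is a minimal sketch in
plain Python rather than ANTLR grammar syntax. It is only an
illustration: the names ASM_KEYWORDS, skip_asm and tokenize and the
token types "ID" and "CHAR" are hypothetical, and it assumes the
parenthesised asm(...) form with balanced parentheses and
double-quoted strings. The identifier is matched first; only if it
turns out to be one of the asm spellings does the lexer skip optional
whitespace and swallow the parenthesised body like a comment.

```python
import re

# Hypothetical keyword set and identifier pattern, for illustration only.
ASM_KEYWORDS = {"asm", "_asm", "__asm"}
IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def skip_asm(src, pos):
    """Return the index just past 'ws* ( ... )' starting at pos, or None.

    Assumes balanced parentheses; double-quoted strings are skipped so
    that a ')' inside a string does not end the statement.
    """
    n = len(src)
    while pos < n and src[pos].isspace():  # whitespace before '(' is allowed
        pos += 1
    if pos >= n or src[pos] != "(":
        return None
    depth = 0
    while pos < n:
        ch = src[pos]
        if ch == '"':
            pos += 1
            while pos < n and src[pos] != '"':
                pos += 2 if src[pos] == "\\" else 1  # step over escapes
        elif ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth == 0:
                return pos + 1
        pos += 1
    return None  # unbalanced input: treat the keyword as a plain identifier


def tokenize(src):
    """Tiny lexer: identifiers and single characters, asm(...) discarded."""
    tokens = []
    pos = 0
    while pos < len(src):
        m = IDENT.match(src, pos)
        if m:
            word = m.group()
            # Left-factored decision: same prefix, then asm or identifier.
            end = skip_asm(src, m.end()) if word in ASM_KEYWORDS else None
            if end is not None:
                pos = end  # drop the whole asm statement
                continue
            tokens.append(("ID", word))
            pos = m.end()
        elif src[pos].isspace():
            pos += 1
        else:
            tokens.append(("CHAR", src[pos]))
            pos += 1
    return tokens
```

With this, tokenize('asm  ("nop"); y') drops the asm part regardless of
the space before "(", while an identifier such as asmx is left alone.
In an ANTLR lexer the same decision would live inside the combined
identifier rule, with the token type set to skip in the asm branch.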

Cheers,

Ric

