[antlr-interest] Match the start and end of a line

Gary R. Van Sickle g.r.vansickle at att.net
Thu Dec 25 01:17:43 PST 2008


> From: Gokulakannan Somasundaram
> 
> Hi,
>     I am a beginner in ANTLR and i went through the 
> documentation available fairly. But i am not able to find the 
> proper way of matching the beginning and end of a line in 
> ANTLR. Can someone help me on that?
> 
> Thanks,
> Gokul.

If I may quote one Mr. Homer Simpson, "it looks like ketchup, it tastes like
ketchup, but brother, IT AIN'T KETCHUP!"  ANTLR's lexer isn't doing regular
expression matching, even though it's using many of the same operators for
very similar purposes.  What you want to do is something like the following:

1.  Have the lexer detect EOLs and send a token up to the parser.  A lexer
rule something like this is what you're looking for:

EOL : ('\r'|'\n')+
    ;

This is untested, but should be what you want for this: eat up any number of
carriage returns and linefeeds, in any combination, and send up a single
token named "EOL".
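
To make the effect of that rule concrete, here is a rough sketch in plain
Python (not ANTLR) of the token stream it should produce.  The regular
expression, the TEXT token, and the tokenize() helper are stand-ins I made
up for illustration; ANTLR's generated lexer works differently inside.

```python
import re

# Hypothetical stand-in for the lexer: one EOL token per run of CR/LF
# characters; anything between runs becomes a made-up TEXT token.
TOKEN_RE = re.compile(r"(?P<EOL>[\r\n]+)|(?P<TEXT>[^\r\n]+)")


def tokenize(source):
    """Return a list of (token_name, matched_text) pairs."""
    return [(m.lastgroup, m.group()) for m in TOKEN_RE.finditer(source)]


print(tokenize("a = 1\r\n\r\nb = 2\n"))
# [('TEXT', 'a = 1'), ('EOL', '\r\n\r\n'), ('TEXT', 'b = 2'), ('EOL', '\n')]
```

Note that the blank line in the middle disappears into one EOL token, which
is what the "+" on the rule buys you.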

2.  With EOLs now coming up, this might be all you need, if in fact you only
need the end-of-line anchor behavior.  You'd write your parser rules
something like this and you're done:

translation_unit
    : (statement EOL)+
    ;
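
As a sanity check on what that rule accepts, here is a small hand-written
Python sketch (again, not ANTLR output) that mimics (statement EOL)+ over a
list of token names.  It assumes, purely for illustration, that each
statement has already been reduced to a single STATEMENT token; the function
name is mine.

```python
def parse_translation_unit(token_names):
    """Mimic (statement EOL)+ and return the number of statements matched."""
    pos = 0
    count = 0
    while pos < len(token_names):
        if token_names[pos] != "STATEMENT":
            raise SyntaxError(f"expected STATEMENT at token {pos}")
        if pos + 1 >= len(token_names) or token_names[pos + 1] != "EOL":
            raise SyntaxError(f"expected EOL after token {pos}")
        pos += 2
        count += 1
    if count == 0:
        # The "+" means at least one (statement EOL) pair is required.
        raise SyntaxError("expected at least one statement")
    return count


print(parse_translation_unit(["STATEMENT", "EOL", "STATEMENT", "EOL"]))  # 2
```

One consequence worth knowing: a last line with no trailing newline does not
match this shape, since every statement has to be followed by an EOL.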

3.  If you really need that start-of-line anchor (e.g. if you can't skip
whitespace at the beginning of your lines), FAQ #... this one:
<http://www.antlr.org/wiki/pages/viewpage.action?pageId=3604497>... can
probably help you out.  Each EOL implies a BOL (beginning of line)
immediately after it, so "simply" emit both an EOL and a BOL token from the
('\r'|'\n')+ match in the lexer, then make your parser rules look like this:

translation_unit
    : (BOL statement EOL)+
    ;

You'd also have to be sending WS tokens up to the parser (rather than
skipping them in the lexer), though, for that to be buying you anything.
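
Here is a rough Python sketch (not ANTLR) of the token stream I have in mind
for the BOL/EOL approach.  The WS and TEXT tokens and the tokenize() helper
are hypothetical.  Two details are my own assumptions rather than anything
from the FAQ: the sketch emits a BOL at the very start of the input, and it
does not emit one after a final newline, so that the stream fits the
(BOL statement EOL)+ shape.

```python
import re

# Hypothetical token shapes: newline runs, horizontal whitespace, other text.
TOKEN_RE = re.compile(r"(?P<EOL>[\r\n]+)|(?P<WS>[ \t]+)|(?P<TEXT>[^ \t\r\n]+)")


def tokenize(source):
    """Return (token_name, text) pairs, following each EOL with a BOL."""
    tokens = []
    if source:
        # Assumption: the first line starts a line too, so it gets a BOL.
        tokens.append(("BOL", ""))
    for m in TOKEN_RE.finditer(source):
        tokens.append((m.lastgroup, m.group()))
        # Each EOL implies a BOL right after it, unless the input ends there.
        if m.lastgroup == "EOL" and m.end() < len(source):
            tokens.append(("BOL", ""))
    return tokens


print([name for name, _ in tokenize("  x\ny\n")])
# ['BOL', 'WS', 'TEXT', 'EOL', 'BOL', 'TEXT', 'EOL']
```

With WS kept in the stream, the parser can tell "BOL WS TEXT" (an indented
line) apart from "BOL TEXT" (one that starts in column one).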

Hope that helps some.

-- 
Gary R. Van Sickle
 


