[antlr-interest] Novice Question - Token for all characters from a given point to End of Line
Gavin Lambert
antlr at mirality.co.nz
Wed Aug 6 04:13:08 PDT 2008
At 09:24 6/08/2008, Brisard, Fred D wrote:
>I am currently collecting each "word" (separated by WS) for the
>length of the line and identifying them separately. I really
>just need to get all the words as a single token - at least,
>that's what I think I want to do.
I still don't see why you would want that. That would just make
the job of figuring out what it all means much harder.
>I should describe more of what I'm doing. I'm creating a parser
>that parses a "language" and then provides the ability to display
>the information in a form-based view for editing. I will then let
>the
[...]
>In addition, the command name and keyword values have implied
>abbreviations. So if you have 2 keywords - before and after, then
>b and a are sufficient to discriminate between them.
All of this stuff is best handled in the parser -- just create
simple tokens (e.g. WORD, NUMBER, QUOTED_STRING, OPEN_BRACKET, etc.)
and work out what they actually mean at the parser level.
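
As a rough illustration of resolving the implied abbreviations once the lexer has produced a plain WORD token, here is a minimal sketch in Python (not ANTLR grammar code). The function name resolve_keyword and the rule that an exact match beats a prefix match are assumptions made for the example, not something taken from the original language definition.

```python
def resolve_keyword(word, keywords):
    """Map a possibly abbreviated word to the one keyword it identifies.

    Hypothetical helper: assumes keywords are case-insensitive and that
    any unambiguous prefix is an acceptable abbreviation.
    """
    lowered = word.lower()
    if not lowered:
        raise ValueError("empty keyword")
    # An exact match wins even if it is also a prefix of a longer keyword.
    if lowered in keywords:
        return lowered
    matches = [k for k in keywords if k.startswith(lowered)]
    if len(matches) == 1:
        return matches[0]
    if not matches:
        raise ValueError("unknown keyword: %r" % word)
    raise ValueError("ambiguous keyword %r: could be %s"
                     % (word, ", ".join(sorted(matches))))


if __name__ == "__main__":
    print(resolve_keyword("b", ["before", "after"]))
    print(resolve_keyword("a", ["before", "after"]))
```

Because the set of valid keywords depends on which command is being parsed, a lookup like this fits naturally in a parser action or a later pass over the tree, where that context is known.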
>Finally there is the concept of continuation - a statement can be
>continued by the last character on a line being a + or -. The - is
>used when whitespace at the beginning of the subsequent line is
>significant; + just ignores any whitespace at the beginning of the
>subsequent line.
This one you should handle in the lexer; you can swallow up the
intervening EOL and whitespace to hide it from the parser that
way, so it just sees a single continuous statement.