[antlr-interest] Handling explicit continuation characters

Brisard, Fred D Fred.Brisard at ca.com
Mon Jan 12 09:53:30 PST 2009


I am parsing a language that uses a minus or plus at the end of a line
to indicate a continuation.

Here are some examples:

*	command parm1-
		  parm2

*	command parm1 -
		  parm2

*	command parm1 par-
		  m2

*	command verylongparmthatextends-
		acrosstwolines-
		oreventhreelines

The general structure of the grammar is

	Command <one or more positional parameters> <one or more keyword parameters>

	A keyword parameter is a keyword with one or more optional subparameters.

I have developed a parser that successfully parses the language, but I
can't seem to handle the continuations consistently, because they can
be inserted at any point in the token stream, even in the middle of a
token.

I would like to just absorb them into the hidden stream so that the
input appears to be on a single line.  

If I made a first pass over the input and just absorbed '-\n' and
'+\n', all would be well.  But it seems pretty wasteful to make a whole
pass just for that.  It seems like I should be able to do it during the
lexical pass.
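
For illustration, the first-pass idea could be sketched roughly as
follows.  This is only a minimal sketch in Python, independent of ANTLR;
the name join_continuations is made up for the example.  It assumes
that the continuation character is the very last character on the line,
and that the leading blanks and tabs of the following line are dropped
too (which is what the 'par-' / 'm2' example suggests).

```python
import re

# Assumption: a continuation is a '+' or '-' that is the very last
# character on a line. The newline and the leading blanks/tabs of the
# next line are removed together with it.
_CONTINUATION = re.compile(r"[+-]\r?\n[ \t]*")


def join_continuations(text: str) -> str:
    """Return text with every end-of-line continuation removed."""
    return _CONTINUATION.sub("", text)


if __name__ == "__main__":
    sample = "command parm1 par-\n\t\t  m2"
    print(join_continuations(sample))  # command parm1 parm2
```

A + or - in the middle of a line is left alone, because the pattern
only matches when the character is immediately followed by the line
break.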

One other thing: + and - can also be ordinary characters within a
parameter; only a + or - at the very end of a line is treated as a
continuation.  For example

*	command 'this is a literal that includes a --
		  in the text'

should reduce to "command 'this is a literal that includes a -in the
text'"
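
To make the rule concrete, here is a rough sketch of a character-level
filter that applies it in a single pass with one character of
lookahead, so no separate pass over the whole input is needed.  It is
plain Python, not ANTLR; skip_continuations is a hypothetical name, and
how such a filter would be hooked in front of an ANTLR lexer is not
shown.  It assumes lines end in '\n' and that the indentation of the
continued line is dropped.

```python
from typing import Iterable, Iterator


def skip_continuations(chars: Iterable[str]) -> Iterator[str]:
    """Yield characters, dropping '+' or '-' at end of line, the newline
    after it, and the leading blanks/tabs of the next line."""
    pending = None           # a '+' or '-' seen but not yet emitted
    skipping_indent = False  # True right after a continuation was absorbed
    for ch in chars:
        if skipping_indent:
            if ch in " \t":
                continue     # drop indentation of the continued line
            skipping_indent = False
        if pending is not None:
            if ch == "\n":
                # The held character was a continuation: absorb both.
                pending = None
                skipping_indent = True
                continue
            yield pending    # it was an ordinary '+' or '-'
            pending = None
        if ch in "+-":
            pending = ch     # decide once the next character is known
        else:
            yield ch
    if pending is not None:
        yield pending        # '+' or '-' at end of input, no newline


if __name__ == "__main__":
    line = "command 'this is a literal that includes a --\n\t\t  in the text'"
    print("".join(skip_continuations(line)))
```

Only the last - before the line break is absorbed; the one before it is
passed through as part of the literal.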

This seems like it should be a simple thing to do, considering that
whitespace and comments are so easily absorbed.

Any advice or suggestion is appreciated.

Regards, Fred
