[antlr-interest] unquoted strings in small line-based language

Wed Oct 17 20:42:49 PDT 2007

Hi all,

I'm trying to write a parser for a small language used in 3D modelling
of frame structures. The language uses commands that start on a new line
with a four-letter keyword, then there are usually numbers and other
keywords following on each line. Some commands are multiple lines, but
most are not.

The problem comes with some lines that allow unquoted string labels to
follow the keyword, for example in the following 'named load case'
section of my input file:

CASE     2 Gravity plus 10kg dist load + 1 tonne point load
NDLD     1     -0.004      0.016     -0.037      0.000      0.000      0.000
NDLD     2      0.004      0.016     -0.037      0.000      0.000      0.000

In the above, the stuff following 'CASE    2' (where 2 could be any
integer) is a free-text label and should not be tokenised, but needs to
be captured as a string in my target code.
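
To make the desired result concrete, here is a minimal hand-written C++
sketch (no ANTLR involved) of what I would like to end up with for a CASE
line. The names CaseHeader and parse_case_line are made up for
illustration, and it assumes the label is simply everything after the
integer up to the end of the line:

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical result type: the case number plus its free-text label.
struct CaseHeader {
    int number = 0;
    std::string label;
};

// Parse a line of the form "CASE <integer> <free text to end of line>".
// Returns false if the line is not a well-formed CASE line.
// Assumption: the number is small enough not to overflow an int.
bool parse_case_line(const std::string& line, CaseHeader& out) {
    const std::string keyword = "CASE";
    if (line.compare(0, keyword.size(), keyword) != 0) return false;
    std::size_t pos = keyword.size();

    // At least one blank must separate the keyword from the number.
    const std::size_t gap_begin = pos;
    while (pos < line.size() && (line[pos] == ' ' || line[pos] == '\t')) ++pos;
    if (pos == gap_begin) return false;

    // Read the case number (decimal digits only).
    const std::size_t digits_begin = pos;
    int number = 0;
    while (pos < line.size() &&
           std::isdigit(static_cast<unsigned char>(line[pos]))) {
        number = number * 10 + (line[pos] - '0');
        ++pos;
    }
    if (pos == digits_begin) return false;

    // Skip the blanks between the number and the label.
    while (pos < line.size() && (line[pos] == ' ' || line[pos] == '\t')) ++pos;

    // Everything that remains is the label, taken verbatim apart from
    // trailing whitespace (including a stray '\r' from DOS line endings).
    std::size_t end = line.size();
    while (end > pos &&
           std::isspace(static_cast<unsigned char>(line[end - 1]))) --end;

    out.number = number;
    out.label = line.substr(pos, end - pos);
    return true;
}
```

For the CASE line above this should give number 2 and the label "Gravity
plus 10kg dist load + 1 tonne point load", with the "+" and "10kg" left
untouched. What I am after is the equivalent behaviour from an
ANTLR-generated lexer and parser.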

How can I make the lexer *not* peer into this free-text and attempt to
identify tokens in it? How would I then catch the resulting string in my
parser?

I would be happy to send my full grammar if it helps. I want to
generate C or C++ code from the grammar, to pull data out of this
file, process it and store it in another form.

Any comments or suggestions would be much appreciated.

Cheers
JP