[antlr-interest] Antlr syntax reference

Sam Kuper sam.kuper at uclmail.net
Tue Jun 24 11:41:13 PDT 2008


Hi Raphael,

Thanks for the quick reply. I think you have misunderstood my issue, so I
will clarify. Why does '\n' in the grammar match a newline in the input,
rather than matching the two literal characters '\' and 'n' in the input? It
can only be because the string '\n' has a special meaning in Antlr. (For
comparison, if I search for '\n' in a text editor, it will match the literal
text '\n' but not a newline.)
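
To make the distinction concrete, here is a minimal sketch in Python (the
target language of the grammar), not in Antlr itself. It assumes that Antlr
reads '\n' inside a quoted literal as an escape sequence in the same way
Python and Java do; that is an assumption, not something confirmed in this
thread. The helper name describe is made up for illustration.

```python
def describe(s):
    """Return the code point of each character in s, in hex."""
    return [hex(ord(ch)) for ch in s]

# Escape sequence: the two source characters \ and n become ONE character,
# the line feed (code point 0x0a).
newline = '\n'

# Escaped backslash: the string really holds a backslash followed by 'n'.
# This is what a text editor's plain search for \n looks for.
backslash_n = '\\n'

print(describe(newline))       # ['0xa']
print(describe(backslash_n))   # ['0x5c', '0x6e']
```

If Antlr behaves the same way, matching a literal backslash followed by n
would need the backslash itself escaped in the grammar.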

Where can I find a list of all such strings? The closest to a specification
document for Antlr syntax is at
http://www.antlr.org/wiki/display/ANTLR3/Grammars but it doesn't mention the
\n rule.

What I want is something like the HTML 4.01 specification (
http://www.w3.org/TR/html401/) but for Antlr; i.e. an exhaustive document
that mentions everything in the syntax, in a human-readable format. Is there
a document like this anywhere to be found? I bought Terence Parr's Antlr
book from Pragmatic but it doesn't seem to have this either.

(NB. I hope that Antlr is platform independent enough that this won't
matter, but just for info: I am prototyping on Windows, but intend to deploy
to a Linux system.)
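
The platform difference can be reproduced outside Antlr. The following is a
rough Python sketch, assuming the three line-ending conventions Raphael
describes below (LF, CR, CRLF). The name LINEBREAK mirrors his suggested
rule, split_lines is a hypothetical helper, and the sample text is made up
from my two header lines.

```python
import re

# CRLF is listed first so that a Windows line ending is consumed as one
# line break instead of a CR followed by a separate LF.
LINEBREAK = re.compile(r'\r\n|\n|\r')

def split_lines(text):
    """Split text on any of the three line-ending conventions."""
    return LINEBREAK.split(text)

# First line ends Windows-style (CRLF), second ends Unix-style (LF).
sample = '*Y 18\r\n*M October\n'
print(split_lines(sample))  # ['*Y 18', '*M October', '']
```

A grammar that hard-codes '\n' at the end of each header would only line up
with the Unix-style case in this sample.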

Many thanks,

Sam

2008/6/24 Raphael Reitzig <r_reitzi at cs.uni-kl.de>:

> Hi Sam!
>
> I am not sure what you mean by "Antlr's lexical analyser recognises that it
> denotes a newline". Do you mean "ANTLR recognizes '\n' where I have a line
> break in my input"?
>
> If so, this is nothing special. In any ASCII file, which is a linear list
> of characters stored as 8 bits each, a line break is encoded by a special
> character. For Unix, this is '\n'. So, your text editor puts a '\n' in
> your text if you press ENTER and shows it to you as a line break, hiding
> the existence of '\n' from you. Thus, ANTLR _really_ finds the character
> '\n' in your input and does nothing but follow your rules.
>
> Note that the line break character differs between operating systems.
> I.e., for Unix it's '\n' (or LF, line feed), whereas classic Mac OS uses
> '\r' (CR, carriage return). Windows uses both at once ('\r\n'). Refer to
> http://en.wikipedia.org/wiki/Linebreak for more details. To get your
> grammar working for inputs created on any of these systems, you may want
> to encode a line break in ANTLR with a token rule like
> LINEBREAK : '\n' | '\r' | '\r\n';
>
> In general, I strongly recommend Terence Parr's book about ANTLR. It may
> not address this specific issue, but it explains most (all?) aspects of
> ANTLR in an entertaining way.
>
> I hope I did not misunderstand your question.
>
> Raphael
>
> On Tue, 24 Jun 2008 15:12:16 +0100, "Sam Kuper" <sam.kuper at uclmail.net>
> wrote:
> > Dear all,
> >
> > I am looking for an exhaustive guide to Antlr 3.0.1 syntax (I am using
> > AntlrWorks 1.1.7); I'll explain why. My grammar so far looks like this:
> >
> > grammar DCP;
> > options {
> >     language=Python;
> > }
> > dcp     : DOCUMENT* EOF;
> > DOCUMENT    : HEADERS;
> > HEADERS    : YEAR_HEADER MONTH_HEADER ;
> > YEAR_HEADER    : '*Y 18\n';
> > MONTH_HEADER    : '*M October\n';
> >
> > Notice that in this grammar, I have used \n to denote new lines. But
> > although I have not declared what \n is, Antlr's lexical analyser
> > recognises that it denotes a newline; in other words, \n is a
> > pre-defined token in Antlr grammar. I'm guessing there are others, and I
> > want to be conscious of them as I work, but I have so far been unable to
> > find a document that lists all the pre-defined tokens in Antlr's
> > grammar. Presumably one exists somewhere. If you know where, please
> > could you tell me?
> >
> > Many thanks,
> >
> > Sam
>