[antlr-interest] Overloaded Lexemes!

Steve Taplin steve_taplin at yahoo.co.uk
Sat May 1 10:07:00 PDT 2004


Thanks Mark/John.

charVocab was the problem.  The Antlr doc adequately explains why this
is necessary.  I thought I'd tried this. I'll hang my head in shame...

Out of interest, the language I am parsing is quite poorly behaved.
That is, the string literals are not really identified syntactically
(only semantically).  E.g.

COMMENT TEXT(The next line prints an unblinking text string "My label")\r\n
PRINT   XCOORD(10 + X) YCOORD(20 + Y) LABEL(My label) BLINK(N)\r\n

The language follows the pattern:

COMMAND (PARAMETER)* '\r''\n'

Clearly, some values within the parentheses will need to be tokenised
further whereas string literals need to be sent as tokens there and
then.
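
The COMMAND (PARAMETER)* shape can be sketched outside ANTLR as a small
line splitter. This is only an illustration in Python, not a lexer
grammar. The name split_line is hypothetical, and the sketch assumes that
every parameter has the form NAME(value) and that any parentheses inside a
value are balanced, which free-form comment text may not guarantee.

```python
def split_line(line):
    """Split 'COMMAND NAME(value) NAME(value)...' into (command, params).

    params is a list of (name, raw_value) pairs; raw values are returned
    untouched so the caller can decide which ones to tokenise further.
    """
    text = line.rstrip("\r\n")
    parts = text.split(None, 1)
    command = parts[0] if parts else ""
    rest = parts[1] if len(parts) > 1 else ""

    params = []
    i = 0
    while i < len(rest):
        if rest[i].isspace():
            i += 1
            continue
        open_at = rest.find("(", i)
        if open_at == -1:
            raise ValueError(f"expected NAME(...) but found {rest[i:]!r}")
        name = rest[i:open_at].strip()
        # Assumption: parentheses inside a value are balanced, so the
        # value ends where the nesting depth returns to zero.
        depth = 1
        j = open_at + 1
        while j < len(rest) and depth:
            if rest[j] == "(":
                depth += 1
            elif rest[j] == ")":
                depth -= 1
            j += 1
        if depth:
            raise ValueError(f"unclosed parenthesis for parameter {name!r}")
        params.append((name, rest[open_at + 1 : j - 1]))
        i = j
    return command, params
```

Each raw value can then either be passed on for further tokenising (for
example "10 + X") or kept whole as a string literal (for example
"My label"), depending on the parameter name.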

The only solution I can think of is to pre-format the input stream and
identify string literals by bounding them in quotes (based on my
semantic understanding of the language).

Unfortunately, this means I cannot define the language with grammar
alone.
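
A minimal sketch of such a pre-formatting pass, in Python, might look like
the following. Everything named here is an assumption made for
illustration: the set STRING_PARAMS (which parameters hold free text), the
function quote_string_params, and the backslash escaping of embedded
quotes, which the lexer's string-literal rule would have to match. It also
assumes parentheses inside a value are balanced; if comment text can hold
unbalanced parentheses, the end of the value would have to be found another
way, for example by taking the last right parenthesis before the line end.

```python
# Hypothetical: parameter names whose values are free text rather than
# expressions. In practice this comes from knowledge of each command.
STRING_PARAMS = {"LABEL", "TEXT"}


def quote_string_params(line, string_params=STRING_PARAMS):
    """Return line with the values of string-valued parameters quoted."""
    out = []
    i = 0
    n = len(line)
    while i < n:
        open_at = line.find("(", i)
        if open_at == -1:
            out.append(line[i:])  # no more parameters; copy the tail
            break
        # The parameter name is the run of non-space characters that
        # immediately precedes the opening parenthesis.
        name_start = open_at
        while name_start > i and not line[name_start - 1].isspace():
            name_start -= 1
        name = line[name_start:open_at]
        # Find the matching close, assuming balanced nesting.
        depth = 1
        j = open_at + 1
        while j < n and depth:
            if line[j] == "(":
                depth += 1
            elif line[j] == ")":
                depth -= 1
            j += 1
        if depth:
            out.append(line[i:])  # unbalanced: leave the rest untouched
            break
        value = line[open_at + 1 : j - 1]
        out.append(line[i : open_at + 1])
        if name in string_params:
            escaped = value.replace("\\", "\\\\").replace('"', '\\"')
            out.append('"' + escaped + '"')
        else:
            out.append(value)
        out.append(")")
        i = j
    return "".join(out)
```

With this pass in front of the lexer, LABEL(My label) arrives as
LABEL("My label") and can be matched by an ordinary quoted-string rule,
while XCOORD(10 + X) is left alone to be tokenised as an expression.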

Does this seem an appropriate approach, or is there a better way within
the lexer grammar?

Steve.

-----Original Message-----
From: John D. Mitchell [mailto:johnm-antlr at non.net] 
Sent: 28 April 2004 17:13
To: antlr-interest at yahoogroups.com
Subject: [antlr-interest] Overloaded Lexemes!

>>>>> "steve" == steve taplin <steve_taplin at yahoo.co.uk> writes:
[...]

> I am attempting to parse a computer language that contains comments
> (that may contain any characters). They are of the form:

> COMMENT TEXT(jasdfjalk;fjkl;%$£$%lldf'slf)

> COMMENT TEXT(jas...dfjalk;fjkl;%$£$%lldf'slfsd][}{}*&fdsadsvdf#'''""")
> ...

Is the comment start delimiter literally the characters "COMMENT TEXT"
followed by a left parenthesis, or is it the characters "COMMENT"
followed by another set of characters (that you are referring to as TEXT)
followed by a left parenthesis, or something else?

Is whitespace allowed inside the comment?  Anywhere, nowhere, or just
inside the parentheses?

Is the end delimiter exactly a right parenthesis immediately followed by
a newline sequence, or can there be other whitespace in between?

In addition, must comments be contained completely on a single line or
can they span multiple lines?  If they must be on a single line, is there
some reason that you need to care about the internal structure of the
comment?

Did you make sure that you set a proper charVocab range?

Take care,
	John


 
 