[antlr-interest] Progress with Embedded Language.

Wed Feb 5 09:27:49 PST 2003

On Tuesday, February 4, 2003, at 11:46 PM, craigmain001 
<cmain at pps.co.za> wrote:

> Hi,
>
> I have made some progress with my attempt to parse an embedded
> language. Unfortunately I cannot get the darn lexer to ignore the
> input outside of the delimiters. I need some help.
>
> I need the lexer to merely reproduce all the input to stdout except
> for sections of the input between '[' and ']'. If these characters
> occur outside the delimiters, they must be escaped \[ and \].
>
> All the text within the delimiters must be parsed like a separate
> language. I have looked at the SED sample, and at the JavaDoc sample,
> but I keep getting unknown characters for all the input text. I have
> not managed to get the lexer to do what I need.
>
> Please help. I have tried myself quite extensively, and am sure there
> must be a simple solution to this.

Hi Craig,

Sounds like you have the right approach, and those examples should help, 
as you say.  Hmm... perhaps the new semantic predicates allowed on the 
left edge of lexer rules in 2.7.2 will do the trick.

For example, in my TML (Terence's Markup Language), [...] means table 
and | means next column.  However, | has no special meaning outside of 
[...].  I use rules like this:

protected
TABLE
     :   '['! {context.inTable=true; captureText(); translator.begin_table();}
     ;

protected
END_TABLE
     :   ']'! {context.inTable=false; captureText(); translator.end_table();}
     ;

COL_SEP
     :   {context.inTable}? '|'! {captureText(); translator.col();}
     ;
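
To make the gating idea concrete outside of ANTLR, here is a minimal 
hand-written sketch in Java.  It is not the TML lexer: the class 
EmbeddedScanner, its scan method, and the Result type are made up for 
illustration.  It assumes that \[ and \] outside the delimiters are 
echoed as plain [ and ], that delimiters do not nest, and that an 
unterminated [ simply drops its partial contents.  The inTable flag 
plays the role of the {context.inTable}? predicate: | separates cells 
only while the flag is set and is echoed like any other character 
otherwise.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; none of these names come from ANTLR or TML.
final class EmbeddedScanner {

    /** Text echoed from outside the delimiters, plus the cells of each [...] section. */
    static final class Result {
        final String echoed;
        final List<List<String>> tables;

        Result(String echoed, List<List<String>> tables) {
            this.echoed = echoed;
            this.tables = tables;
        }
    }

    static Result scan(String input) {
        StringBuilder echoed = new StringBuilder();
        List<List<String>> tables = new ArrayList<>();
        List<String> cells = new ArrayList<>();
        StringBuilder cell = new StringBuilder();
        boolean inTable = false; // same role as context.inTable in the grammar

        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (!inTable) {
                boolean escapedDelimiter = c == '\\' && i + 1 < input.length()
                        && (input.charAt(i + 1) == '[' || input.charAt(i + 1) == ']');
                if (escapedDelimiter) {
                    // Assumption: echo the delimiter itself and drop the backslash.
                    echoed.append(input.charAt(++i));
                } else if (c == '[') {
                    // Opening delimiter: switch modes and start a fresh table.
                    inTable = true;
                    cells = new ArrayList<>();
                    cell.setLength(0);
                } else {
                    // Everything else, including '|', is ordinary text out here.
                    echoed.append(c);
                }
            } else if (c == ']') {
                // Closing delimiter: finish the last cell and leave table mode.
                cells.add(cell.toString());
                tables.add(cells);
                inTable = false;
            } else if (c == '|') {
                // Only reached while inTable is true, like the gated COL_SEP rule.
                cells.add(cell.toString());
                cell.setLength(0);
            } else {
                cell.append(c);
            }
        }
        return new Result(echoed.toString(), tables);
    }
}
```

For example, scanning the text a|b [x|y] c\[d\] with this sketch echoes 
"a|b  c[d]" and records one table with the cells x and y.  In a real 
ANTLR grammar the embedded section would be handed to its own lexer and 
parser rather than split on | by hand.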

Does this help?

Ter
--
Co-founder, http://www.jguru.com
Creator, ANTLR Parser Generator: http://www.antlr.org
Lecturer in Comp. Sci., University of San Francisco
