[antlr-interest] Lexical hacks ... again.

Sun Feb 27 11:40:35 PST 2011

On 02/27/2011 09:40 AM, g4 at novadsp.com wrote:
> Right. ANTLR != PCCTS. Wow.

Well, er, ah, it was at one time.  PCCTS antlr was ANTLR v1.
I did my first project using it.  I even got to submit a bug fix for its
look-ahead stuff that I needed for my project (in one of the MR releases
long after Ter had moved on to ANTLR v2).

> I've written a modest utility to transform CoCo/R grammar specs to ANTLR 
> 3. CoCo/R allows syntactic predicates (for LL(1) disambiguation) to be 
> embedded as 'IF' '(' anything ')'. Not wanting to parse all the code 
> I've dealt with this by having the lexer eat all content after spotting 
> the 'IF'.
> 
> Question is this: Is there any way to embed a 'sub-lexer' at this point? 
> i.e. grab the input stream and point it at a lexer for C#/C code?

Sounds like you should treat them much the way you would treat a
comment.  Have the lexer consume all the text and emit it either as raw
text to a separate stream or as a single token in a separate stream.
Then you could deal with that text however you wish (for example, in a
separate lexer instantiation).  Sorry, I'm thinking off the top of my
head; I have no implementation details.
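
To make the idea a bit more concrete, here is a rough sketch in plain
Python.  It does not use the ANTLR runtime at all.  The names
capture_predicate and sub_lex, and the tiny token set, are made up for
illustration.  It assumes the predicate body has balanced parentheses
and contains no parentheses inside string or character literals.

```python
import re
from typing import List, Tuple

Token = Tuple[str, str]  # (kind, text)

# Hypothetical token set for the embedded C-like predicate text.
# A real C#/C sub-lexer would need far more than this.
SUB_TOKEN = re.compile(r"""
    (?P<ws>\s+)
  | (?P<ident>[A-Za-z_]\w*)
  | (?P<number>\d+)
  | (?P<op>==|!=|<=|>=|&&|\|\||[-+*/<>!.,()\[\]])
""", re.VERBOSE)


def capture_predicate(src: str, pos: int) -> Tuple[str, int]:
    """Starting just after 'IF', swallow a balanced '(' ... ')' group.

    Returns the text between the outer parentheses and the index of
    the first character after the closing ')'.
    """
    while pos < len(src) and src[pos].isspace():
        pos += 1
    if pos >= len(src) or src[pos] != "(":
        raise ValueError("expected '(' after IF")
    start = pos + 1
    depth = 0
    while pos < len(src):
        ch = src[pos]
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth == 0:
                return src[start:pos], pos + 1
        pos += 1
    raise ValueError("unbalanced parentheses in predicate")


def sub_lex(text: str) -> List[Token]:
    """Run the second lexer over the captured text, dropping whitespace."""
    tokens: List[Token] = []
    pos = 0
    while pos < len(text):
        m = SUB_TOKEN.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character {text[pos]!r} at {pos}")
        if m.lastgroup != "ws":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens
```

The outer lexer would call capture_predicate when it sees 'IF', keep
the returned text as one token, and resume at the returned index.
Running sub_lex over that text can happen right away or in a later
pass, whichever suits the rest of the tool.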

> I don't need to evaluate the expression itself but storing lexical 
> tokens rather than characters would be a plus in later stages of analysis.

Especially if you could feed them to the same parser at some future
point....

Ter talked about multiple token streams (channels?) in ANTLR v2.  I've
never used them myself.  I'm not sure what support there is for them in v3.

> Thx++
> 
> Jerry.

-- 
Kevin J. Cummings
kjchome at verizon.net
cummings at kjchome.homeip.net
cummings at kjc386.framingham.ma.us
Registered Linux User #1232 (http://counter.li.org)