[stringtemplate-interest] Two-Character-Bracket.templateLexer
Dreyer Ulf (CR/APA3)
Ulf.Dreyer at de.bosch.com
Tue Mar 13 08:22:29 PDT 2007
Hi Terence,
thanks for looking into this.
I've made some progress changing the AngleBracketLexer
and today I'm going to write some tests.
Contents:
-) Some comments on Terence's answer
-) [What I did] ... to create said lexer
-) [Testresults] / encountered problems
> -----Original Message-----
> Howdy. Note I just submitted the ANTLR book to publisher...I'm now
> going to try to catch up with ANTLR and ST bug reports etc...
I think I'll get myself a copy, even though I am currently only using ST.
> > I'd like to try for <$ ...$> as it is not used in my target
> > language AND it is possible to
> > differentiate begin- and end-delimiters.
> Hmm...two char, eh? well, a syntactic predicate ought to work..hang
> on...hmm...grammar doesn't look too hard to change. Try using
> literals not a rule ref.
[What I did]: (sorry Terence but I find this easier on the eyes ;)
Change all occurrences of '<' to ACTIONBEGIN,
change all occurrences of '>' to ACTIONEND,
and define
ACTIONBEGIN
: "<$"
;
ACTIONEND
: "$>"
;
This is mostly fine for the ACTIONBEGIN part.
In the grammar, ACTIONEND (formerly '>') is often inverted ( ~('>') ),
and inversion does NOT work with rule references or with two-character sequences.
So I wrote two predicates, upcomingACTIONBEGIN(int i) and upcomingACTIONEND(int i),
where i is the lookahead position; each checks LA(i) and LA(i+1).
Now most occurrences of ~(ACTIONEND, '<somechar>', '<someOtherChar>') can be
changed to a semantic predicate followed by the remaining inverted set:
{!upcomingACTIONEND(1)}? ~('<somechar>', '<someOtherChar>')
which ought to be equivalent.
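
To make the two-character lookahead concrete, here is a minimal Java sketch of what such predicates might look like. It is not the actual grammar code: the class DelimiterLookahead, its LA/consume methods and readUntilActionEnd are hypothetical stand-ins that only imitate a character lexer's lookahead, and the end-of-input marker is an assumption.

```java
/** Hypothetical stand-in for a character lexer; it only imitates LA(i)/consume(). */
final class DelimiterLookahead {
    // Assumed end-of-input marker for this sketch.
    static final char EOF_CHAR = (char) -1;

    private final String input;
    private int pos; // index of the next unread character

    DelimiterLookahead(String input) {
        this.input = input;
    }

    /** Returns the i-th character ahead (1-based), or EOF_CHAR past the end. */
    char LA(int i) {
        int idx = pos + i - 1;
        return idx < input.length() ? input.charAt(idx) : EOF_CHAR;
    }

    void consume() {
        pos++;
    }

    /** True if the two characters starting at lookahead i are "<$". */
    boolean upcomingACTIONBEGIN(int i) {
        return LA(i) == '<' && LA(i + 1) == '$';
    }

    /** True if the two characters starting at lookahead i are "$>". */
    boolean upcomingACTIONEND(int i) {
        return LA(i) == '$' && LA(i + 1) == '>';
    }

    /** Consumes everything up to, but not including, the next "$>". */
    String readUntilActionEnd() {
        StringBuilder sb = new StringBuilder();
        while (LA(1) != EOF_CHAR && !upcomingACTIONEND(1)) {
            sb.append(LA(1));
            consume();
        }
        return sb.toString();
    }
}
```

The loop plays the role of the guarded inverted set: a lone '$' or '>' is consumed as ordinary text, and only the pair "$>" stops the scan.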
The only part I am really unsure about is the handling of the escape character.
The easy way would be to disallow escaping of "<$" and "$>" (those delimiters were
chosen precisely BECAUSE they don't occur in the target language), but I feel this is
somewhat sloppy.
If my tests are successful I may post the entire new grammar (or a diff)
if anyone is interested.
[Testresults]
These modifications failed at an early test:
group test;
top(foo,bar) ::= <<<$foo$>___<$bar$>>>
^
This input yields a TokenStreamRecognitionException
Message="expecting '$', found 'r'"
It took me quite a while to figure out that this error comes
not from the template lexer but from the group lexer.
The BIGSTRING rule treats the first ">>" as its terminator and discards the third ">",
giving only "<<<$foo$>___<$bar$" to the template lexer.
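
The truncation can be reproduced outside the lexer. The following Java sketch is a deliberately simplified model, not the real BIGSTRING rule: the class BigStringDemo and the method naiveBody are hypothetical, and the only behaviour assumed is "the body ends at the first >> after the opening <<".

```java
/** Simplified, hypothetical model of a rule that ends a <<...>> body at the first ">>". */
final class BigStringDemo {
    /** Returns the text between "<<" and the first ">>" after it, or null if either is missing. */
    static String naiveBody(String s) {
        int start = s.indexOf("<<");
        if (start < 0) {
            return null;
        }
        int end = s.indexOf(">>", start + 2);
        return end < 0 ? null : s.substring(start + 2, end);
    }

    public static void main(String[] args) {
        // The closing "$>" of the second expression is cut in half.
        System.out.println(naiveBody("<<<$foo$>___<$bar$>>>"));
    }
}
```

In this model the body handed on for the test template is <$foo$>___<$bar$, so the second expression never sees its closing "$>".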
@Terence: 1) I don't think this is easily fixed (especially for arbitrary delimiters), or is it?
We would need a predicate that tests for the current template delimiter and only matches ">>"
if it's not part of that delimiter.
2) Template lexers are very easily plugged in, but is there a mechanism to change
the group lexer?
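
As a rough illustration of the kind of check question 1) asks for, here is a Java sketch of a delimiter-aware scan. It is only an assumption about how such a test could work, not anything the group lexer provides: the class DelimiterAwareBigString, the method body and the actionEnd parameter are all hypothetical.

```java
/** Hypothetical delimiter-aware scan: skips the template's end delimiter before testing for ">>". */
final class DelimiterAwareBigString {
    /**
     * Returns the text between "<<" and the first ">>" that does not start
     * inside an occurrence of actionEnd, or null if no such ">>" exists.
     */
    static String body(String s, String actionEnd) {
        if (actionEnd.isEmpty()) {
            throw new IllegalArgumentException("actionEnd must not be empty");
        }
        int start = s.indexOf("<<");
        if (start < 0) {
            return null;
        }
        int i = start + 2;
        while (i < s.length()) {
            if (s.startsWith(actionEnd, i)) {
                i += actionEnd.length(); // skip the whole delimiter, e.g. "$>"
            } else if (s.startsWith(">>", i)) {
                return s.substring(start + 2, i);
            } else {
                i++;
            }
        }
        return null;
    }
}
```

For the failing test, body("<<<$foo$>___<$bar$>>>", "$>") keeps the final "$>" intact. The sketch is still ambiguous when a template ends in a literal '$' directly before ">>", which is one reason a general fix for arbitrary delimiters looks hard.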
[Testresults] continued:
Some quick tests (VERY simple cases) for escaping (LITERAL) and if-else-endif
seem to work as expected.
That's all for today - ... to be continued ;)
-Ulf
--
Dipl. Inf. Ulf Dreyer
Robert Bosch GmbH
Zentralbereich Forschung und Vorausentwicklung
Software und Systemengineering in der Fertigungsautomatisierung CR/APA3
Postfach 30 02 40 D-70442 Stuttgart
Tel.: 0711/811- 34365
Fax: 0711/811-518 34365
eMail: ulf . dreyer at de . bosch . com
Robert Bosch GmbH, Sitz: Stuttgart, Registergericht: Amtsgericht Stuttgart HRB 14000 Aufsichtsratsvorsitzender: Hermann Scholl; Geschäftsführung: Franz Fehrenbach, Siegfried Dais; Bernd Bohr, Wolfgang Chur, Rudolf Colm, Gerhard Kümmel, Wolfgang Malchow, Peter Marks; Volkmar Denner, Peter Tyroller.