[antlr-interest] Some ANTLR questions
Foolish Ewe
foolishewe at hotmail.com
Thu Nov 9 05:36:28 PST 2006
Hello All:
At this point, I've gotten the entire language I'm working on recognized and
have had some good experiences for the most part with ANTLRworks (although
I've
tripped a few of those sorts of things that might be keeping it in Beta).
I have a few questions
1) In my language UP and DOWN are keywords, yet when I tried to create rules
to scan
them, ANTLRworks treated them strangely, so I changed the token names to
MYUP and MYDOWN
and it worked. Are UP and DOWN keywords in ANTLR or Java (my parser
creates a Java file).
2) I'm now ready to actually do something with the language I'm recognizing,
and I'm wondering
if the smart thing is to go with an AST or to try to use rules to do
attribute propagation
(back when dinosaurs ruled the earth, i did a lex/yacc, flex/bison style
compiler which uses
rules to do it). If I do an AST, do I need to give AST management
annotations to each
production, or can I do it incrementally for unit testing?
3) I had some scanning issues, since my language includes undelimited
strings (sort of like
identifiers in many languages). I created a parse rule that matched all
alpha numeric
keywords and created scanning rules at the end which looked like:
ASTERISK: '*';
//The following rule was not well received by the scanner :-(
ALPHANUMSTRINGWITHWILDCARD : (ALPHANUM)+ ASTERISK;
// I think this needs to be last to avoid recognizing keywords.
NONKEYWORDUNDELIMITEDSTRING : (ALPHANUM)+;
ANTLR3 and ANTLRworks didn't like it and the debugger hung, leaving the
TCP/IP
port (I think it was 49153) unavailable under Windows XP (my employer's
development
environment) and preventing future debugging attempts (I didn't bother
resetting
the debugger's port number) until reboot.
As a work around, I created a rule:
alphaNumStringWithWildcard
:
NONKEYWORDUNDELIMITEDSTRING ASTERISK // this is a bit of a hack, it allows
white space between the two
;
But that since I have a rule that discards white space by sending it
to channel=99, then
this should accept say 'abc*' and 'abc *' so it is a bit over
permissive, so I would rather
use the lexer (or fix this rule somehow).
What is the best fix for this kind of rule?
Regards:
Bill M.
_________________________________________________________________
Try Search Survival Kits: Fix up your home and better handle your cash with
Live Search!
http://imagine-windowslive.com/search/kits/default.aspx?kit=improve&locale=en-US&source=hmtagline
More information about the antlr-interest
mailing list