[antlr-interest] Some ANTLR questions
Jim Idle
jimi at intersystems.com
Thu Nov 9 10:07:04 PST 2006
1) UP and DOWN are indicators of tree structure when the output is an AST. There are a few other things like this, depending on your target language too. So, yes you cannot use these. ANTLR3 error checking is a bit lax right now as this will be filled in when ANTLR3 is written in ANTLR3. It becomes "obvious" as you use it right now ;-)
2) There are differing opinions of course, but if you are doing anything other than just straight output of something that can be done without reference to much else you have parsed, then always use a tree. The tree parser is easy to construct as the syntax of it is basically the same as the rewrite rules in the parser. You will need to cover all the structure in your parser really, but it isn't onerous (IMO).
3) Both ASTERISK and ALPHAblah are lexer rules so antlr will match the first one it finds basically. If you define ASTERISK as a fragment and then use it in two other rules, in order of which should be matched first, you will probably have more success, something like:
fragment
ASTERISK: '*' ;
fragment
ALPHANUM: 'a'..'z' | 0'..'9';
KEY1: 'X*Y' ;
KEY2: 'X' ;
WILD: ALPHANUM+ ASTERISK;
MULT: ASTERISK;
You can of course use lexer predicates if it isn’t as simple as ordering. If only the parser can distinguish because of context, then create LEXER rules that supply the bits for the parser to do so using whatever predicates might be needed, something like:
expr: id (MULT^^ id)*;
wild: ID ( (MULT)=> wild=MULT)?
-> {wild != null}? ^(WILDID $ID)
...
ID: ALPHANUM+;
MULT: '*';
-----Original Message-----
From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
Hello All:
At this point, I've gotten the entire language I'm working on recognized and
have had some good experiences for the most part with ANTLRworks (although
I've
tripped a few of those sorts of things that might be keeping it in Beta).
I have a few questions
1) In my language UP and DOWN are keywords, yet when I tried to create rules
to scan
them, ANTLRworks treated them strangely, so I changed the token names to
MYUP and MYDOWN
and it worked. Are UP and DOWN keywords in ANTLR or Java (my parser
creates a Java file).
2) I'm now ready to actually do something with the language I'm recognizing,
and I'm wondering
if the smart thing is to go with an AST or to try to use rules to do
attribute propagation
(back when dinosaurs ruled the earth, i did a lex/yacc, flex/bison style
compiler which uses
rules to do it). If I do an AST, do I need to give AST management
annotations to each
production, or can I do it incrementally for unit testing?
3) I had some scanning issues, since my language includes undelimited
strings (sort of like
identifiers in many languages). I created a parse rule that matched all
alpha numeric
keywords and created scanning rules at the end which looked like:
ASTERISK: '*';
//The following rule was not well received by the scanner :-(
ALPHANUMSTRINGWITHWILDCARD : (ALPHANUM)+ ASTERISK;
// I think this needs to be last to avoid recognizing keywords.
NONKEYWORDUNDELIMITEDSTRING : (ALPHANUM)+;
ANTLR3 and ANTLRworks didn't like it and the debugger hung, leaving the
TCP/IP
port (I think it was 49153) unavailable under Windows XP (my employer's
development
environment) and preventing future debugging attempts (I didn't bother
resetting
the debugger's port number) until reboot.
As a work around, I created a rule:
alphaNumStringWithWildcard
:
NONKEYWORDUNDELIMITEDSTRING ASTERISK // this is a bit of a hack, it allows
white space between the two
;
But that since I have a rule that discards white space by sending it
to channel=99, then
this should accept say 'abc*' and 'abc *' so it is a bit over
permissive, so I would rather
use the lexer (or fix this rule somehow).
What is the best fix for this kind of rule?
Regards:
Bill M.
_________________________________________________________________
Try Search Survival Kits: Fix up your home and better handle your cash with
Live Search!
http://imagine-windowslive.com/search/kits/default.aspx?kit=improve&locale=en-US&source=hmtagline
--
No virus found in this incoming message.
Checked by AVG Free Edition.
Version: 7.5.430 / Virus Database: 268.14.0/524 - Release Date: 11/8/2006 1:40 PM
--
No virus found in this outgoing message.
Checked by AVG Free Edition.
Version: 7.5.430 / Virus Database: 268.14.0/524 - Release Date: 11/8/2006 1:40 PM
More information about the antlr-interest
mailing list