[antlr-interest] literals, identifiers, tokens oh my
ronald.petty at milliman.com
Fri Apr 9 13:41:10 PDT 2004
I have the following parser rule
type
: "string"
;
and the following lexer rules
ID
options {
testLiterals=true;
paraphrase = "an identifier";
}
: ('a'..'z') ('a'..'z'|'0'..'9'|'_'|'.')*
;
When I run my parser and give it the input "string", I get the following:
ANTLR Parser Generator Version 2.7.3 1989-2004 jGuru.com
Note: * uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
> program; string
> lexer mID; c==s
< lexer mID; c==
LA(1)==string
< program; LA(1)==string
Doesn't "string" become a literal because it is in a parser rule? I have
the start up parser rule of
start
: (type (WS)+)+
;
So I assume I would be able to type in string whitespace string
whitespace etc...
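For what it's worth, my reading of the docs is that testLiterals makes the lexer look up the text it just matched for ID in a literals table and swap the token type if it finds an entry. Below is a minimal Java sketch of that reading. It is not ANTLR's actual code: the class name, the method names, and the token type numbers are all made up for illustration.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

public class LiteralsSketch {
    // Hypothetical token type numbers, chosen only for this sketch.
    public static final int ID = 4;
    public static final int LITERAL_string = 5;

    // Stand-in for the lexer's literals table: literal text -> token type.
    private final Map<String, Integer> literals = new HashMap<>();
    private final boolean caseSensitiveLiterals;

    public LiteralsSketch(boolean caseSensitiveLiterals) {
        this.caseSensitiveLiterals = caseSensitiveLiterals;
    }

    // Register a literal, as I assume happens for "string" and the tokens { } entries.
    public void addLiteral(String text, int type) {
        literals.put(text, type);
    }

    // What I expect testLiterals=true to do once the ID rule has matched `text`:
    // return the literal's type if the table has an entry, else keep the matched type.
    public int testLiteral(String text, int matchedType) {
        String key = caseSensitiveLiterals ? text : text.toLowerCase(Locale.ROOT);
        Integer literalType = literals.get(key);
        return literalType != null ? literalType : matchedType;
    }

    public static void main(String[] args) {
        LiteralsSketch lexer = new LiteralsSketch(false);
        // Table is empty here, so the token keeps the type passed in.
        System.out.println(lexer.testLiteral("string", ID));
        lexer.addLiteral("string", LITERAL_string);
        // Now the lookup finds an entry and the type is swapped.
        System.out.println(lexer.testLiteral("string", ID));
        System.out.println(lexer.testLiteral("foo", ID));
    }
}
```

If the table has no entry for "string" (say, because the lexer never learns about the parser's literals), the lookup falls through and the token stays an ID. I suspect that may be what is happening to me, but I am not sure.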
Could someone clear up how to use Tokens and Literals, and how to separate
them from Identifiers? I have read the docs about 3 times now, and will start
the 4th run now.
Thanks
Ron
P.S. Here is the parser/lexer in the flesh, in case I described something
wrong (which I do sometimes). Also, if anyone has other advice on how to
split languages up into subrules, I'd like to hear it.
options {
}
{
import java.io.*;
class Main {
    public static void main(String[] args) {
        try {
            VB6Lexer lexer = new VB6Lexer(System.in);
            VB6Parser parser = new VB6Parser(lexer);
            parser.program();
        } catch(Exception e) {
            e.printStackTrace();
        }
    }
}
}
class VB6Parser extends Parser;
options {
importVocab=VB6;
}
//not sure about Attribute blah
//not sure about Option blah
program
: (declaration | WS | NL | SL_COMMENT)*
;
declaration
: variable
| sub
| function
| type
;
variable
: DIM
( WS | NL | SL_COMMENT )+
ID
( WS | NL | SL_COMMENT )+
( AS ( WS | NL )+ VOID )+
{ System.out.println("MATCHED DIM"); }
;
sub
: SUB ( WS | NL )+ ID
{ System.out.println("MATCHED SUB"); }
;
function
: FUNCTION ( WS | NL )+ ID
{ System.out.println("MATCHED FUNCTION"); }
;
type
: "string"
;
options {
}
class VB6Lexer extends Lexer;
options {
exportVocab=VB6;
charVocabulary='\3'..'\377'; //Latin, need to figure out Japanese character sets for UNICODE, you can do non continuous ranges
caseSensitive=false;
caseSensitiveLiterals=false;
}
tokens {
DIM = "dim";
FUNCTION = "function";
SUB = "sub";
AS = "as";
VOID = "void";
}
WS
: ' '
| '\t'
{ $setType(Token.SKIP); }
;
NL : '\r' '\n' { newline(); }
| '\n' { newline(); }
;
SEMI : ','
;
ID
options {
testLiterals=true;
paraphrase = "an identifier";
}
: ('a'..'z') ('a'..'z'|'0'..'9'|'_'|'.')*
;
SL_COMMENT
: "'" (~('\r'|'\n'))* (("\r\n")=>'\r''\n'|'\n')
;