[antlr-interest] Re: Reading contents of file using Antlr
micheal_jor <open.zone at virgin.net>
Thu Jan 23 03:03:12 PST 2003
Hi Sharon,
I'll touch on a few areas where ANTLR can make your life easier.
Completing the grammar should then be a much easier (and more rewarding)
experience.
[Caveat: I haven't run your example with ANTLR. Just looked at it on
the site. I may have missed some issues.]
1. Keywords - See changes ###[1] below.
ANTLR provides the tokens {...} section to let you specify the
keywords in your language. The assumption is that keywords can't also
be identifiers.
NOTE: You can override that assumption in your Parser grammar for
individual keywords as shown in:
http://groups.yahoo.com/group/antlr-interest/message/6503
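To illustrate the idea behind the tokens section, here is a minimal Java sketch (not ANTLR's generated code): text that has been matched as an identifier is looked up in a table of keyword literals and only falls back to ID when it isn't found. The class name KeywordLookup, the classify method, the string token names and the exact-match lookup are all assumptions made for illustration; only the three keywords come from the grammar.

```java
import java.util.HashMap;
import java.util.Map;

public class KeywordLookup {
    // Hypothetical stand-in for a lexer's keyword table; the entries mirror
    // three of the keywords listed in the tokens section of the grammar.
    private static final Map<String, String> LITERALS = new HashMap<>();
    static {
        LITERALS.put("angle", "ANGLE");
        LITERALS.put("factor", "FACTOR");
        LITERALS.put("initial", "INITIAL");
    }

    // Returns the keyword token name when the text is a keyword, otherwise "ID".
    static String classify(String text) {
        return LITERALS.getOrDefault(text, "ID");
    }

    public static void main(String[] args) {
        System.out.println("angle -> " + classify("angle"));
        System.out.println("focus -> " + classify("focus"));
    }
}
```

This is also why the assumption matters: once "angle" is in the table, it can no longer come out of the lexer as a plain identifier unless the parser explicitly allows for it.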
2. NUMERIC rule -
Are you sure this is what NUMERIC values look like in your system?
This rule will accept the following:
-...,-
,,9-0,..
,,,
... etc.
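A quick way to convince yourself is to try the same character set as a regular expression. The Java sketch below assumes that "[0-9,.\-]+" is a fair translation of the original NUMERIC rule; the class and method names are made up for the example.

```java
import java.util.regex.Pattern;

public class LooseNumeric {
    // Assumed regex equivalent of: NUMERIC : ('0'..'9'|','|'.'|'-')+ ;
    private static final Pattern NUMERIC = Pattern.compile("[0-9,.\\-]+");

    // True when the whole string is accepted by the loose NUMERIC pattern.
    static boolean accepts(String s) {
        return NUMERIC.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"0.0005", "-...,-", ",,9-0,..", ",,,"};
        for (String s : samples) {
            System.out.println(s + " -> " + accepts(s));
        }
    }
}
```

Every sample, sensible or not, is accepted, because the rule only constrains which characters may appear, not their order.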
Studying a few samples of what a NUMERIC looks like (and mustn't look
like) would help make it clearer. I did try to clean it up, though be
aware that RECORD and NUMERIC properly belong in the Parser, not the
Lexer, where they would be renamed to start with a lower-case
letter of course.
// Is this what you really meant?
protected LETTER : ( 'a'..'z' | 'A'..'Z' ) ;
protected DIGIT : ( '0'..'9' ) ;
protected NUMBER : ('-')? ((DIGIT)* '.')? (DIGIT)+ ;
ID : LETTER ( DIGIT | LETTER )* ;
NUMERIC : NUMBER ( ',' NUMBER )* ;
RECORD : (ID | NUMERIC)(!~('\r'|'\n'|':'))+ ;
// Your original
RECORD : ('a'..'z' | 'A'..'Z'| NUMERIC)(!~('\r'|'\n'|':'))+ ;
NUMERIC : ('0'..'9'|','|'.'|'-')+;
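To see what the tightened NUMBER and NUMERIC rules are meant to accept, here is a rough Java regex translation. It is a sketch of the intended language only: a regex backtracks, whereas an ANTLR lexer predicts with fixed lookahead, so this says nothing about whether ANTLR will report nondeterminism for the rules. The class and method names are invented for the example.

```java
import java.util.regex.Pattern;

public class StrictNumeric {
    // Assumed translation of: NUMBER : ('-')? ((DIGIT)* '.')? (DIGIT)+ ;
    private static final String NUMBER = "-?(?:[0-9]*\\.)?[0-9]+";

    // Assumed translation of: NUMERIC : NUMBER ( ',' NUMBER )* ;
    private static final Pattern NUMERIC =
            Pattern.compile(NUMBER + "(?:," + NUMBER + ")*");

    // True when the whole string is a comma-separated list of NUMBERs.
    static boolean accepts(String s) {
        return NUMERIC.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"0.0005", "-12", "1,2.5,-3", "-...,-", ",,9-0,..", ",,,"};
        for (String s : samples) {
            System.out.println(s + " -> " + accepts(s));
        }
    }
}
```

The first three samples are accepted and the last three are rejected, which is the behaviour the cleaned-up rules are aiming for.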
3. RECORD rule
What does the trailing pattern '(!~('\r'|'\n'|':'))+' do? What else
is part of a record that you are trying to match? Whitespace and
SEMICOLON are already taken care of.
Anyways, good luck and be sure to give Ter's Getting started guide a
good workover ;-)
Micheal
>
> Hi,
>
> Thanks for the help. Below is the code for my grammar file.
> /***********************************************************************/
> class CSVParser extends Parser;
> options { k=4; }
> {
> LsystemsString ls = new LsystemsString();
>
> public LsystemsString getLsystemsString(){
> return ls;
> }
> }
> file : ( line (NEWLINE line)*(NEWLINE)? EOF)
> {System.out.println("file matched");}
> ;
> line : ((record)+ )
> ;
> record : ((r:KEYWORD) (sc: SEMICOLON)? (n:RECORD)* (COMMENT)?)
> {
> System.out.println("attribute = " + r.getText());
> System.out.println("value = "+n.getText());
> ls.addNext(r.getText(),n.getText());
> System.out.println("LS size: "+ls.getArrayListSize());
> }
> ;
>
> class CSVLexer extends Lexer;
> options {
> charVocabulary='\3'..'\377';
> k = 4;
> }
// change ###[1] - use tokens for keywords
tokens
{
ANGLE = "angle";
FACTOR = "factor";
INITIAL = "initial";
.........
.........
.........
Z = "z";
ELASTICITY = "elasticity";
INCREMENT = "increment";
RENDER = "render";
MODE = "mode";
}
> RECORD : ('a'..'z' | 'A'..'Z'| NUMERIC)(!~('\r'|'\n'|':'))+ ;
> NUMERIC : ('0'..'9'|','|'.'|'-')+;
> SEMICOLON : ':';
> BRACKET : ('(' | ')');
> COMMENT : "/*" (options {greedy=false;} :.)* "*/" ;
> NEWLINE : ('\r''\n')=> '\r''\n' //DOS
> | '\r' //MAC
> | '\n' //UNIX
> { newline(); }
> ;
>
> WS : (' '|'\t') { $setType(Token.SKIP); } ;
> /************************************************************************************/
> mzukowski at y... wrote:
> Lexical nondeterminism means you have two lexical rules that are in
> conflict, meaning they have the same prefix. Post a small but complete
> example which has the error message and we'll be able to help you.
>
> Monty
>
>
> -----Original Message-----
> From: Sharon Li [mailto:hushlee83 at y...]
> Sent: Tuesday, January 21, 2003 11:43 PM
> To: Antlr Interest Group
> Subject: [antlr-interest] Reading contents of file using Antlr
>
>
> Hi,
> I'm a Java programmer and relatively new to Antlr. I need to write Antlr
> code to read in a text file and extract only the necessary information. How
> can I go about doing that? An example of the contents of the file might look
> like this:
> angle focus : 0.0005
> color : blue
> line width : 12
> I often get the error msg:
> warning : lexical nondeterminism upon ...
> Also when do we use the TreeParser and what is the difference between a
> Parser and a TreeParser? When do we define tokens and what are they for? Pls
> help! Thanks very much.