[antlr-interest] Re: Reading contents of file using Antlr

Sharon Li hushlee83 at yahoo.com.sg
Thu Jan 23 15:46:28 PST 2003


Hi Micheal,
G'day. Thanks for being a great help. I guess I understand Antlr a lot better now. I'm still trying to understand the language, so pls pardon my mistakes =P Ter's tutorial is a great source of information for beginners like me and I appreciate that. Hope to see more examples.
Cheers,
Sharon

"micheal_jor <open.zone at virgin.net>" <open.zone at virgin.net> wrote:

Hi Sharon,

I'll touch on a few areas where ANTLR can make your life easier. 
Completing the grammar should then be a much easier (and more rewarding)
experience.

[Caveat: I haven't run your example with ANTLR. Just looked at it on 
the site. I may have missed some issues.]

1. Keywords - See changes ###[1] below.

ANTLR provides the tokens {...} section to let you specify the 
keywords in your language. The assumption is that keywords can't also 
be identifiers.
NOTE: You can override that assumption in your Parser grammar for 
individual keywords as shown in:
http://groups.yahoo.com/group/antlr-interest/message/6503
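
To picture what the tokens section buys you, here is a rough Java sketch of the idea. It is not ANTLR's generated code, just an illustration: a word is first matched as an identifier, then looked up in a table of literals, and its token type is switched if it turns out to be a keyword. The class name KeywordLookup and the TokenType enum are made up for this example.

```java
import java.util.HashMap;
import java.util.Map;

public class KeywordLookup {
    // Hypothetical token types; ANTLR assigns its own integer constants.
    enum TokenType { ID, ANGLE, FACTOR, INITIAL }

    // Stand-in for the table of literals declared in tokens { ... }.
    private static final Map<String, TokenType> LITERALS = new HashMap<>();
    static {
        LITERALS.put("angle", TokenType.ANGLE);
        LITERALS.put("factor", TokenType.FACTOR);
        LITERALS.put("initial", TokenType.INITIAL);
    }

    // Text that matched the identifier rule becomes a keyword token
    // if it appears in the table; otherwise it stays a plain ID.
    static TokenType classify(String text) {
        return LITERALS.getOrDefault(text, TokenType.ID);
    }

    public static void main(String[] args) {
        System.out.println(classify("angle"));  // keyword
        System.out.println(classify("focus"));  // ordinary identifier
    }
}
```

With this arrangement "angle" can never come back as an identifier, which is the assumption mentioned above.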

2. NUMERIC rule - 

Are you sure this is what NUMERIC values look like in your system?
This rule will accept the following: 
-...,-
,,9-0,..
,,,
... etc.
Studying a few samples of what a NUMERIC looks like (and mustn't look 
like) would help make it clearer. I did try to clean it up, though be 
aware that RECORD and NUMERIC properly belong in the Parser, not the 
Lexer, where they would be renamed to start with a lower-case 
letter of course.

// Is this what you really meant?
protected LETTER : ( 'a'..'z' | 'A'..'Z' ) ; 
protected DIGIT : ( '0'..'9' ) ; 
protected NUMBER : ('-')? ((DIGIT)* '.')? (DIGIT)+ ;
ID : LETTER ( DIGIT | LETTER )* ;
NUMERIC : NUMBER ( ',' NUMBER )* ;
RECORD : (ID | NUMERIC)(!~('\r'|'\n'|':'))+ ; 

// Your original
RECORD : ('a'..'z' | 'A'..'Z'| NUMERIC)(!~('\r'|'\n'|':'))+ ; 
NUMERIC : ('0'..'9'|','|'.'|'-')+; 
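
To compare the two NUMERIC rules without running ANTLR, here is a small Java sketch that uses regular expressions as stand-ins for the lexer rules. The regexes are my translation of the rules, and the class name NumericCheck is only for illustration. The loose pattern accepts every sample listed earlier, while the tightened one accepts only things shaped like numbers.

```java
import java.util.regex.Pattern;

public class NumericCheck {
    // Regex analogue of the original rule: ('0'..'9'|','|'.'|'-')+
    static final Pattern LOOSE = Pattern.compile("[0-9,.\\-]+");

    // Regex analogue of the tightened rules:
    //   NUMBER  : ('-')? ((DIGIT)* '.')? (DIGIT)+
    //   NUMERIC : NUMBER ( ',' NUMBER )*
    static final String NUMBER = "-?(?:[0-9]*\\.)?[0-9]+";
    static final Pattern STRICT = Pattern.compile(NUMBER + "(?:," + NUMBER + ")*");

    // Both helpers require the whole string to match.
    static boolean loose(String s)  { return LOOSE.matcher(s).matches(); }
    static boolean strict(String s) { return STRICT.matcher(s).matches(); }

    public static void main(String[] args) {
        String[] samples = { "-...,-", ",,9-0,..", ",,,", "0.0005", "12", "1,2.5,-3" };
        for (String s : samples) {
            System.out.println(s + "  loose=" + loose(s) + "  strict=" + strict(s));
        }
    }
}
```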

3. RECORD rule

What does the trailing pattern '(!~('\r'|'\n'|':'))+' do? What else 
is part of a record that you are trying to match? Whitespace and 
SEMICOLON are already taken care of.
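
As a rough way to see which characters that trailing pattern covers, the Java sketch below models the original RECORD rule with a regular expression. This is an approximation, not what ANTLR does internally: the '!' is left out and only the character sets are modelled. The class name RecordPrefix is made up for the example. Because the complement set excludes only '\r', '\n' and ':', it also takes spaces and any further words up to the colon.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RecordPrefix {
    // Regex analogue of: ('a'..'z' | 'A'..'Z' | NUMERIC) (~('\r'|'\n'|':'))+
    // The '!' from the original rule is not modelled here.
    static final Pattern RECORD =
        Pattern.compile("(?:[A-Za-z]|[0-9,.\\-]+)[^\\r\\n:]+");

    // Returns the text the pattern takes from the start of the line,
    // or null if the line does not start with a match.
    static String leadingRecord(String line) {
        Matcher m = RECORD.matcher(line);
        return m.lookingAt() ? m.group() : null;
    }

    public static void main(String[] args) {
        // Brackets make any swallowed whitespace visible.
        System.out.println("[" + leadingRecord("angle focus : 0.0005") + "]");
    }
}
```

On the sample line from the original question this takes "angle focus " including the trailing space, so the keyword never reaches the parser as its own token.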

Anyways, good luck and be sure to give Ter's Getting started guide a 
good workover ;-)

Micheal

> 
> Hi,
> 
> Thanks for the help. Below is the code for my grammar file.
> /***********************************************************************/
> class CSVParser extends Parser;
> options { k=4; }
> {
> LsystemsString ls = new LsystemsString();
> 
> public LsystemsString getLsystemsString(){
> return ls;
> }
> }
> file : ( line (NEWLINE line)*(NEWLINE)? EOF)
> {System.out.println("file matched");}
> ;
> line : ((record)+ )
> ;
> record : ((r:KEYWORD) (sc: SEMICOLON)? (n:RECORD)* (COMMENT)?)
> {
> System.out.println("attribute = " + r.getText());
> System.out.println("value = "+n.getText());
> ls.addNext(r.getText(),n.getText());
> System.out.println("LS size: "+ls.getArrayListSize());
> }
> ;
> 
> class CSVLexer extends Lexer;
> options { 
> charVocabulary='\3'..'\377'; 
> k = 4;
> }

// change ###[1] - use tokens for keywords
tokens
{
ANGLE = "angle";
FACTOR = "factor";
INITIAL = "initial";
.........
.........
.........
Z = "z";
ELASTICITY = "elasticity";
INCREMENT = "increment";
RENDER = "render";
MODE = "mode";

}

> RECORD : ('a'..'z' | 'A'..'Z'| NUMERIC)(!~('\r'|'\n'|':'))+ ;
> NUMERIC : ('0'..'9'|','|'.'|'-')+;
> SEMICOLON : ':';
> BRACKET : ('(' | ')');
> COMMENT : "/*" (options {greedy=false;} :.)* "*/" ;
> NEWLINE : ('\r''\n')=> '\r''\n' //DOS
> | '\r' //MAC
> | '\n' //UNIX
> { newline(); }
> ;
> 
> WS : (' '|'\t') { $setType(Token.SKIP); } ;
> /***********************************************************************/
> mzukowski at y... wrote:
> Lexical nondeterminism means you have two lexical rules that are in
> conflict, meaning they have the same prefix. Post a small but complete
> example which has the error message and we'll be able to help you.
> 
> Monty
> 
> 
> -----Original Message-----
> From: Sharon Li [mailto:hushlee83 at y...]
> Sent: Tuesday, January 21, 2003 11:43 PM
> To: Antlr Interest Group
> Subject: [antlr-interest] Reading contents of file using Antlr
> 
> 
> Hi, 
> I'm a Java programmer and relatively new to Antlr. I need to write Antlr
> code to read in a text file and extract only the necessary information. How
> can I go about doing that? An example of the contents of the file might look
> like this:
> angle focus : 0.0005
> color : blue
> line width : 12
> I often get the error msg:
> warning : lexical nondeterminism upon ...
> Also when do we use the TreeParser and what is the difference between a
> Parser and a TreeParser? When do we define tokens and what are they for? Pls
> help! Thanks very much.



