[antlr-interest] Simple Grammar Question
John Gardener
John.Gardener at carrotgarden.com
Sat Jan 17 11:34:49 PST 2009
*Johannes*, hi
Thank you for the answer.
Now I know:
*1) "If the lexer can match the same input via more than one rule,
it chooses the rule which consumes the most input"
2) "Do not call a token from a token; instead, call fragments from a
token"
*
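To see rule 1 outside ANTLR, here is a minimal sketch in plain Java. It is a simplified longest-match model, not the lexer ANTLR actually generates: `LongestMatchSketch`, `tokenize`, `originalRules` and `fixedRules` are names made up for this illustration, the lexer rules are approximated with `java.util.regex` patterns, and the sketch assumes that the rule listed first wins a tie.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LongestMatchSketch {

    // Hypothetical helper, not ANTLR API: at each position try every rule
    // and keep the longest match; on a tie the rule listed first wins.
    static List<String> tokenize(Map<String, Pattern> rules, String input) {
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            String bestName = null;
            int bestEnd = pos;
            for (Map.Entry<String, Pattern> rule : rules.entrySet()) {
                Matcher m = rule.getValue().matcher(input);
                m.region(pos, input.length());
                // lookingAt() only accepts a match that starts at 'pos'
                if (m.lookingAt() && m.end() > bestEnd) {
                    bestName = rule.getKey();
                    bestEnd = m.end();
                }
            }
            if (bestName == null) {
                throw new IllegalArgumentException("no rule matches at " + pos);
            }
            tokens.add(bestName + ":" + input.substring(pos, bestEnd));
            pos = bestEnd;
        }
        return tokens;
    }

    // Approximation of the original lexer rules: NAME may start with a digit.
    static Map<String, Pattern> originalRules() {
        Map<String, Pattern> rules = new LinkedHashMap<>();
        rules.put("DIGIT", Pattern.compile("[0-9]"));
        rules.put("LETTER", Pattern.compile("[A-Z]"));
        rules.put("NAME", Pattern.compile("[A-Z0-9]+"));
        return rules;
    }

    // Approximation of the suggested rules: NAME must start with a letter.
    static Map<String, Pattern> fixedRules() {
        Map<String, Pattern> rules = new LinkedHashMap<>();
        rules.put("DIGIT", Pattern.compile("[0-9]"));
        rules.put("NAME", Pattern.compile("[A-Z][A-Z0-9]*"));
        return rules;
    }

    public static void main(String[] args) {
        System.out.println(tokenize(originalRules(), "3B5A")); // [NAME:3B5A]
        System.out.println(tokenize(fixedRules(), "3B5A"));    // [DIGIT:3, NAME:B5A]
    }
}
```

With the original rules NAME can match all of "3B5A", so it beats the one-character DIGIT; once NAME has to start with a letter, the same input splits into DIGIT "3" and NAME "B5A".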
I wonder if the ANTLR community has "10 commandments" (or 100?) posted
anywhere? :-)
John
-------- Original Message --------
Subject: Re: [antlr-interest] Simple Grammar Question
From: Johannes Luber <jaluber at gmx.de>
To: John Gardener <John.Gardener at carrotgarden.com>
Cc: antlr-interest at antlr.org
Date: Sat 17 Jan 2009 11:38:33 AM CST
> John Gardener wrote:
>
>> *Hello;*
>>
>> I am stuck with a simple grammar; any help is very welcome.
>>
>> I want to parse two-term sentences, such as:
>> <1: single digit > <2: name containing letters and digits > EOF
>>
>> Below are:
>> 1) grammar
>> 2) test rig
>> 3) output
>>
>> PROBLEM:
>> The second term (name) seems to greedily consume the whole input.
>>
>> Please let me know the proper way to deal with this.
>>
>
> If the lexer can match the same input via more than one rule, it chooses
> the rule which consumes the most input. Try the following rules instead:
>
> fragment NAME : ;
>
> DIGIT : ( '0'..'9' | 'A'..'Z' {$type=NAME;} )
>         ( '0'..'9' {$type=NAME;} | 'A'..'Z' {$type=NAME;} )*
>         { out.println("+DIGIT: " + $text ); } ;
>
> It should only generate DIGITs if no more than one character is matched
> and that character is a digit. But can names start with digits anyway?
> If not, this may work, too:
>
> DIGIT : '0'..'9'
> { out.println("+DIGIT: " + $text ); } ;
>
> NAME : 'A'..'Z' ( 'A'..'Z' | '0'..'9' ) *
> { out.println("+NAME: " + $text ); } ;
>
> Calling non-fragment lexer rules (tokens) from other lexer rules may
> give strange results and is discouraged by experienced users. With
> fragments, the version above looks like this:
>
> fragment DIGIT : '0'..'9';
>
> fragment ALPHA : 'A'..'Z';
>
> NUMBER : DIGIT { out.println("+NUMBER: " + $text ); } ;
>
>
> NAME : ALPHA ( ALPHA | DIGIT ) *
> { out.println("+NAME: " + $text ); } ;
>
> Johannes
>
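The single-rule suggestion above boils down to this: match any run of digits and letters, and call the result a DIGIT only when the whole match is exactly one digit. A rough sketch of that decision in plain Java follows; `RetypeSketch`, `TokenType` and `classify` are hypothetical names for this illustration and are not ANTLR API, and the character ranges are the ones used in the grammar in this thread.

```java
class RetypeSketch {

    enum TokenType { DIGIT, NAME }

    // Hypothetical helper: decide the token type for text that one rule
    // matching ('0'..'9' | 'A'..'Z')+ would have consumed as a whole.
    static TokenType classify(String lexeme) {
        if (lexeme.isEmpty()) {
            throw new IllegalArgumentException("empty lexeme");
        }
        for (char c : lexeme.toCharArray()) {
            boolean digit = c >= '0' && c <= '9';
            boolean upper = c >= 'A' && c <= 'Z';
            if (!digit && !upper) {
                throw new IllegalArgumentException("unexpected character: " + c);
            }
        }
        char first = lexeme.charAt(0);
        boolean singleDigit = lexeme.length() == 1 && first >= '0' && first <= '9';
        // Anything longer than one character, or a single letter, is a NAME.
        return singleDigit ? TokenType.DIGIT : TokenType.NAME;
    }

    public static void main(String[] args) {
        System.out.println(classify("3"));    // DIGIT
        System.out.println(classify("B"));    // NAME
        System.out.println(classify("3B5A")); // NAME
    }
}
```

Under this rule the whole of "3B5A" is still a single NAME token, which is why it matters whether names may start with a digit.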
>
>> *1) GRAMMAR*
>>
>> grammar Simple;
>>
>> options {
>> language = Java;
>> }
>>
>> @parser::header {
>> package simple;
>> import static java.lang.System.out;
>> }
>>
>> @lexer::header{
>> package simple;
>> import static java.lang.System.out;
>> }
>>
>> // PARSER
>>
>> record :
>> digit name EOF
>> { out.println( "+record: " + $text ); };
>>
>> digit : DIGIT
>> { out.println( "+digit: " + $text ); };
>>
>> name : NAME
>> { out.println( "+name: " + $text ); };
>>
>>
>> // LEXER
>>
>> DIGIT : '0'..'9'
>> { out.println("+DIGIT: " + $text ); } ;
>>
>> LETTER : 'A'..'Z'
>> { out.println("+LETTER: " + $text ); } ;
>>
>> NAME : ( LETTER | DIGIT ) +
>> { out.println("+NAME: " + $text ); } ;
>>
>>
>> *2) TEST RIG*
>>
>> package simple;
>>
>> import java.io.ByteArrayInputStream;
>>
>> import org.antlr.runtime.ANTLRInputStream;
>> import org.antlr.runtime.CommonTokenStream;
>>
>> import static java.lang.System.out;
>>
>> public class SimpleTest {
>>
>>     public static void main(String[] args) throws Exception {
>>
>>         String record = "3B5A";
>>
>>         ByteArrayInputStream stream =
>>                 new ByteArrayInputStream(record.getBytes());
>>         ANTLRInputStream input = new ANTLRInputStream(stream);
>>         SimpleLexer lexer = new SimpleLexer(input);
>>         CommonTokenStream tokens = new CommonTokenStream(lexer);
>>         SimpleParser parser = new SimpleParser(tokens);
>>
>>         parser.record();
>>
>>         out.println(record);
>>     }
>> }
>>
>>
>> *3) TEST OUTPUT*
>>
>> +DIGIT: 3
>> +LETTER: 3B
>> +DIGIT: 3B5
>> +LETTER: 3B5A
>> +NAME: 3B5A
>> line 1:0 missing DIGIT at '3B5A'
>> +digit: null
>> +name: 3B5A
>> +record: 3B5A
>> 3B5A
>>
>>
>> *Thank you, *
>>
>> John
>>
>>