[antlr-interest] Simple grammar doesn't complain about illegal input

amartinez at atc.ugr.es amartinez at atc.ugr.es
Thu Nov 13 11:29:57 PST 2008


If I do what you say, I obtain this error:
line 7:8 missing EOF at 'adds'

Shouldn't this error be something like this?:
line 7:8 required (...)+ loop did not match anything at input 'adds'

I obtain this second error by calling g.prog() twice, which is wrong,
but for now it's the only way to make ANTLR complain about the
illegal input.

Isn't there any other way to do this?
Isn't the second error more accurate than the first one? I wonder.
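
One possible reading of the two messages, sketched as a hand-written
simulation rather than the ANTLR runtime: the (...)+ loop in prog has
already matched one 'add' line when it reaches the bad token, so leaving
the loop is legal and only a trailing EOF can report the leftover input.
A second call to prog() starts a fresh loop that matches nothing, which
would explain the other message. The class EofDemo, its token names
(ADD, OPERAND, COMMA, ID) and the message strings are assumptions made
for illustration; they are not ANTLR output or API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a generated parser; it only mimics the control
// flow of  prog : (add NEWLINE)+ ;  with an optional EOF check at the end.
class EofDemo {
    private final List<String> tokens;
    private int pos = 0;

    public EofDemo(List<String> tokens) {
        this.tokens = tokens;
    }

    // Look at the current token type without consuming it.
    private String peek() {
        return pos < tokens.size() ? tokens.get(pos) : "EOF";
    }

    // Runs the rule once and returns the errors reported by that call.
    public List<String> prog(boolean requireEof) {
        List<String> errors = new ArrayList<>();
        int iterations = 0;
        // Enter the loop body only while the lookahead can start 'add'.
        while (peek().equals("ADD")) {
            pos += 5; // ADD OPERAND COMMA OPERAND NEWLINE (assumed well-formed)
            iterations++;
        }
        if (iterations == 0) {
            // A (...)+ loop must match at least once.
            errors.add("required (...)+ loop did not match anything at input " + peek());
        } else if (requireEof && !peek().equals("EOF")) {
            // The loop exited legally; only the EOF check sees the leftovers.
            errors.add("missing EOF at " + peek());
        }
        return errors;
    }

    // One valid 'add' line, then a line whose first word is lexed as an ID.
    public static List<String> sample() {
        return Arrays.asList(
            "ADD", "OPERAND", "COMMA", "OPERAND", "NEWLINE",
            "ID", "OPERAND", "COMMA", "OPERAND", "NEWLINE");
    }

    public static void main(String[] args) {
        // Rule without EOF: stops silently at the ID.
        System.out.println(new EofDemo(sample()).prog(false)); // prints []
        // Rule with EOF: the leftover input is reported.
        System.out.println(new EofDemo(sample()).prog(true));  // prints [missing EOF at ID]
        // Rule without EOF, called twice: the second call matches nothing.
        EofDemo twice = new EofDemo(sample());
        twice.prog(false);
        System.out.println(twice.prog(false));
        // prints [required (...)+ loop did not match anything at input ID]
    }
}
```

Under this reading both messages are accurate for the rule invocation that
produced them: the first call did match the loop once, so only the EOF
check can flag what remains.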

Thanks in advance.

On Thu, 13-11-2008 at 12:46 -0600, Sam Harwell wrote:
> Your grammar actually just stopped parsing at addj. You need to add an
> EOF to the end of the prog rule to make sure it prints an error rather
> than stops processing the file:
>
> prog: (add NEWLINE)+ EOF ;
>
> Sam
>> -----Original Message-----
>> From: antlr-interest-bounces at antlr.org
>> [mailto:antlr-interest-bounces at antlr.org] On Behalf Of
>> amartinez at atc.ugr.es
>> Sent: Thursday, November 13, 2008 12:23 PM
>> To: antlr-interest at antlr.org
>> Subject: [antlr-interest] Simple grammar doesn't complain about illegal
>> input
>>
>> Hi all,
>> I'm having problems in grammars that do not complain about illegal input
>> (throwing a recognition exception).
>> I want to parse the source of a very small, restricted assembly language;
>> in the attached example only the 'add' instruction is processed for now.
>>
>> The grammar should process this input:
>> add r1, 23
>> add r4, r5
>>
>> Everything seems to work fine, but if I try this source:
>> add  r1, 23
>> addj r4,56
>>
>> the parser does not say anything about the inappropriate 'addj' (which is
>> not a legal assembly token). I have even created an AST from the original
>> grammar and debugged it in ANTLRWorks, and I have seen that this
>> environment also does not complain about this input.
>>
>> Where is the mistake?
>>
>> Thanks in advance, best regards
>>
>> Attached is an example of a grammar that reproduces my error:
>>
>> grammar T;
>> tokens {ADD;}
>> prog                    :       (add NEWLINE)+ ;
>> add                     :       TOKEN_ADD renreg ',' renreg ;
>> renreg          :       RX | UINT8 | ID ;
>>
>> RX                      :       ('r'|'R') HEXDIGIT;
>> TOKEN_NAMEREG   :       ('namereg' | 'Namereg' | 'NAMEREG');
>> TOKEN_CONST             :       ('const' | 'Const' | 'CONST');
>> TOKEN_ADD               :       'add' ;
>>
>> ID                      :       ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'_'|'.'|'0'..'9')* ;
>> UINT8                   :       HEXDIGIT? HEXDIGIT;
>> fragment
>> HEXDIGIT                :       ('0'..'9'|'a'..'f'|'A'..'F');
>> NEWLINE                 :       {getCharPositionInLine() > 0}?  => ('\r'? '\n')+ ;
>> NEWLINE_AT_COLUM_ZERO   :       {getCharPositionInLine() == 0}? => ('\r'? '\n')+ { $channel=HIDDEN; } ;
>> WS                      :       (' '|'\t') { $channel=HIDDEN; };
>> LINE_COMMENT    :       (';'|'//') (~'\n')* { $channel=HIDDEN; } ;
>>
>> // Java code:
>> import java.io.*;
>> import org.antlr.runtime.*;
>> import org.antlr.runtime.tree.*;
>>
>>
>> public class Main {
>>
>>     public static void main(String args[]) throws Exception {
>>
>>         CharStream input = new ANTLRFileStream(args[0]);
>>         TLexer lex = new TLexer(input);
>>         CommonTokenStream tokens = new CommonTokenStream(lex);
>>         TParser g = new TParser(tokens);
>>         g.prog();
>>     }
>> }
>>

