[antlr-interest] Simple grammar doesn't complain about illegal input

Sam Harwell sharwell at pixelminegames.com
Thu Nov 13 10:46:41 PST 2008

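Your parser simply stopped at addj without reporting anything: the
(add NEWLINE)+ loop exits as soon as the next token cannot start another
add, and prog then returns normally. Add EOF to the end of the prog rule
so that leftover input produces an error instead of being silently
ignored: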

prog: (add NEWLINE)+ EOF ;
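
To illustrate why the trailing EOF matters, here is a minimal sketch in
plain Java. It is not the ANTLR-generated parser: EofDemo, parseProg and
the token-type strings are hypothetical stand-ins, and the token list
assumes the lexer turns 'addj' into a single ID token. It only mimics how
a (add NEWLINE)+ loop can exit quietly when the next token is not
TOKEN_ADD.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class EofDemo {

    // An operand slot accepts RX, UINT8 or ID (like the renreg rule).
    static boolean matches(String want, String got) {
        if (want.equals("operand")) {
            return got.equals("RX") || got.equals("UINT8") || got.equals("ID");
        }
        return want.equals(got);
    }

    // Hand-written stand-in for: prog : (add NEWLINE)+ [EOF] ;
    // The token list must end with "EOF". Returns the error messages;
    // an empty list means the parser had no complaints.
    static List<String> parseProg(List<String> tokens, boolean requireEof) {
        List<String> errors = new ArrayList<>();
        String[] line = {"TOKEN_ADD", "operand", "COMMA", "operand", "NEWLINE"};
        int pos = 0;
        int iterations = 0;

        // (add NEWLINE)+ : keep looping only while the next token can start 'add'.
        while (tokens.get(pos).equals("TOKEN_ADD")) {
            for (String want : line) {
                if (!matches(want, tokens.get(pos))) {
                    errors.add("token " + pos + ": expected " + want
                            + " but found " + tokens.get(pos));
                    return errors;
                }
                pos++;
            }
            iterations++;
        }
        if (iterations == 0) {
            errors.add("token " + pos + ": expected TOKEN_ADD but found " + tokens.get(pos));
            return errors;
        }
        // Without this check the rule returns here and the rest is never looked at.
        if (requireEof && !tokens.get(pos).equals("EOF")) {
            errors.add("token " + pos + ": expected EOF but found " + tokens.get(pos));
        }
        return errors;
    }

    public static void main(String[] args) {
        // Assumed token types for:  add r1, 23 <NL> addj r4,56 <NL>
        List<String> tokens = Arrays.asList(
                "TOKEN_ADD", "RX", "COMMA", "UINT8", "NEWLINE",
                "ID", "RX", "COMMA", "UINT8", "NEWLINE", "EOF");
        System.out.println("without EOF: " + parseProg(tokens, false));
        System.out.println("with EOF:    " + parseProg(tokens, true));
    }
}
```

In this sketch the first call returns an empty error list even though
five tokens were never consumed, while the second call reports that EOF
was expected at the ID token.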


-----Original Message-----
From: antlr-interest-bounces at antlr.org
[mailto:antlr-interest-bounces at antlr.org] On Behalf Of
amartinez at atc.ugr.es
Sent: Thursday, November 13, 2008 12:23 PM
To: antlr-interest at antlr.org
Subject: [antlr-interest] Simple grammar doesn't complain about illegal
input

Hi all,
I'm having problems with grammars that do not complain about illegal
input (by throwing a recognition exception).
I want to parse source code in a very small, restricted assembly
language; in the attached example only the 'add' instruction is
processed for now.

The grammar should process this input:
add r1, 23
add r4, r5

Everything seems to work fine, but if I try this source:
add  r1, 23
addj r4,56

the parser does not say anything about the inappropriate 'addj' (which
is not a legal assembly token). I have even built an AST from the
original grammar and debugged it in ANTLRWorks, and it does not
complain about this input either.

Where is the mistake?

Thanks in advance, best regards

Attached is an example of a grammar that reproduces my error:

grammar T;
tokens {ADD;}
prog                    :       (add NEWLINE)+ ;
add                     :       TOKEN_ADD renreg ',' renreg ;
renreg          :       RX | UINT8 | ID ;

RX                      :       ('r'|'R') HEXDIGIT;
TOKEN_NAMEREG   :       ('namereg' | 'Namereg' | 'NAMEREG');
TOKEN_CONST             :       ('const' | 'Const' | 'CONST');
TOKEN_ADD               :       'add' ;

ID                      :       ('a'..'z'|'A'..'Z'|'_')
('a'..'z'|'A'..'Z'|'_'|'.'|'0'..'9')* ;
UINT8                   :       HEXDIGIT? HEXDIGIT;
HEXDIGIT                :       ('0'..'9'|'a'..'f'|'A'..'F');
NEWLINE                 :       {getCharPositionInLine() > 0}?  =>
('\n')+ ;
NEWLINE_AT_COLUM_ZERO   :       {getCharPositionInLine() == 0}? =>
('\n')+ { $channel=HIDDEN; } ;
WS                      :       (' '|'\t') { $channel=HIDDEN; };
LINE_COMMENT    :       (';'|'//') (~'\n')* { $channel=HIDDEN; } ;

// Java code:
import java.io.*;
import org.antlr.runtime.*;
import org.antlr.runtime.tree.*;

public class Main {

    public static void main(String args[]) throws Exception {

        // Read the source file named on the command line and parse it.
        CharStream input = new ANTLRFileStream(args[0]);
        TLexer lex = new TLexer(input);
        CommonTokenStream tokens = new CommonTokenStream(lex);
        TParser g = new TParser(tokens);
        g.prog();
    }
}
