[antlr-interest] rule vs. subrule conflict, or

Cameron Palmer cameron.palmer at gmail.com
Sun May 20 00:42:27 PDT 2007


Jim - Thank you very much. I believe I have conquered comment handling, and
I also found a bit on the case-insensitivity issue at
http://www.antlr.org/wiki/display/ANTLR3/Migrating+from+ANTLR+2+to+ANTLR+3.

I am encountering an interesting conflict between a rule and a subrule.

In the dsBody rule, if I choose an ID that matches one of the literals in the
iStructElement rule, I get a MismatchedTokenException or, with a slightly
different formulation, a NoViableAltException.

Example Input that causes error:
.data
i:12    <- problem; if you change 'i' to 'b', all is well
i 12
i 13
i 14
.end

What I find strange is that I would assume the ID COLON INT EOL sequence in
the grammar below would be matched first, so there would be no ambiguity. The
syntax diagram in ANTLRWorks and the source code generated by ANTLR lead me
to believe this is the correct interpretation.

Grammar fragment:
dataSection
    : DATABEGIN EOL
    dsBody
    DATAEND EOL
    ;

dsBody    : ID COLON INT EOL
    (iStructElement EOL)+
    ;

iStructElement
    : 'i' INT
    | 'ri' INT INT
    | 'f'
    | 'rf' INT
    | 'a' INT
    | 'ra' INT INT
    | 'u'
    | 'ru' INT
    ;
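
One possible explanation, offered as an assumption and not confirmed anywhere
in this thread: the quoted literals in iStructElement ('i', 'ri', ...) may
become lexer tokens of their own that take priority over ID, so the leading
"i" in "i:12" never reaches the parser as an ID. The Java sketch below is not
ANTLR-generated code. It is a hand-written toy classifier (LiteralShadowDemo,
classify and tokenize are hypothetical names) that only illustrates that
assumed priority order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical demo class; none of these names come from ANTLR or the SDF grammar.
final class LiteralShadowDemo {
    // The quoted literals used by iStructElement in the grammar fragment.
    static final Set<String> LITERALS = new HashSet<>(
            Arrays.asList("i", "ri", "f", "rf", "a", "ra", "u", "ru"));

    // Classify one whitespace-free chunk, checking literals BEFORE ID.
    // That ordering is the assumption this sketch is meant to illustrate.
    static String classify(String text) {
        if (LITERALS.contains(text)) return "'" + text + "'";
        if (text.equals(":")) return "COLON";
        if (text.matches("-?[0-9]+")) return "INT";
        if (text.matches("[A-Za-z_][A-Za-z_.0-9]*")) return "ID";
        return "UNKNOWN";
    }

    // Split a line such as "i:12" into a list of token type names.
    static List<String> tokenize(String line) {
        List<String> types = new ArrayList<>();
        // Put spaces around ':' so that it becomes its own chunk.
        for (String chunk : line.replace(":", " : ").trim().split("\\s+")) {
            types.add(classify(chunk));
        }
        return types;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("i:12")); // first token is the literal 'i', not ID
        System.out.println(tokenize("b:12")); // first token is ID
    }
}
```

If this is what happens, dsBody would receive the literal token where it
expects ID, which would be consistent with the exception on "i:12" and with
"b:12" parsing cleanly.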

Thank you,

Cameron.

On 5/18/07, Jim Idle <jimi at temporal-wave.com> wrote:
>
>  Cameron,
>
>
>
> Your comment rule for the lexer needs to change a little so that it
> terminates in a non-greedy fashion and does not attempt to embed one rule
> (EOL) in the other; otherwise you need to use a separate "fragment" and
> reference that. I prefer the formulation below, but you may or may not want
> the EOL to be part of the comment token (I assume not):
>
>
>
> COMMENT
>     : SEMI (~( '\r' | '\n'))* {$channel=HIDDEN};
>
>
>
> Case sensitivity is normal for the ANTLR 3 token stream; however, I have
> posted (or at least someone posted my code, if I didn't post it ;-) a way to
> override the default stream with a method that compares in a
> case-insensitive way but preserves the original case in the token text
> (useful in some cases, such as formatters, and for literals and so on). See:
>
>
>
>
> http://www.antlr.org/pipermail/antlr-interest/2007-January/019008.html
>
>
>
> I need to move this code into the wiki, but then, I need to do a lot of
> things ;-)
>
>
> Jim
>
>
>
> *From:* antlr-interest-bounces at antlr.org [mailto:
> antlr-interest-bounces at antlr.org] *On Behalf Of *Cameron Palmer
> *Sent:* Thursday, May 17, 2007 10:17 PM
> *To:* antlr-interest at antlr.org
> *Subject:* [antlr-interest] SDF assembly language grammar
>
>
>
> I am a student rewriting the assembler for an ISA called SDF (Scheduled
> Dataflow). You can check out our work at http://csrl.unt.edu/sdf . Anyway
> I am just writing the grammar and am having trouble making the leap to code
> generation. I had started with v2 but figure I might just convert the
> grammar to v3 today. The goal of this grammar is to emit opcodes for each
> instruction and maintain numeric translations for the labels. I think I
> would rather do the label translation as an AST rather than backpatching. I
> have attached my grammar and sample assembly. My questions:
> Why does it seem to stop parsing at the comment line (what am I doing
> wrong)?
> Where did the caseSensitiveLiterals option disappear?
>
> Thank you,
>
> Cameron.
>
> A sample of the assembly:
> version 2.0.0
> .data
> .end
> code main
>         putr1 main.1
>         forkep r1
>         stop
> main.1:
>         ; alloc sum frame
>         put sum, r4
>         put 4, r5
>         falloc r4, r5, r6
>         ; alloc result frame
>         put main_res, r4
>         put 1, r5
>         falloc r4, r5, r7
>         ;
>         putr1 main.2
>         forksp r1
>         stop
>
>
> The Grammar:
> grammar SDF;
>
> @header {
> package edu.unt.csrl.sdf.asm;
> import java.util.HashMap;
> }
> @members {
> // Something like this to hold label values that will be rewritten
> HashMap memory = new HashMap();
> }
> program
>     : (EOL)?
>     versionSection
>     dataSection
>     (codeBlock)*
>     ;
>
> versionSection
>     : 'version' (.)+ EOL;
>
> dataSection
>     : DATABEGIN EOL
>         dsBody
>       DATAEND EOL;
>
> dsBody
>     : ID COLON INT EOL
>         iStructBody
>         dsBody
>     | ;
>
> iStructBody
>     : ('i' | 'ri' | 'f' | 'rf' | 'a' | 'ra' | 'u' | 'ru')=>
>         iStructElement
>         iStructBody
>     | ;
>
> iStructElement
>     : 'i' INT EOL
>     | 'ri' INT INT EOL
>     | 'f' FLOAT EOL
>     | 'rf' INT FLOAT EOL
>     | 'a' INT EOL
>     | 'ra' INT INT EOL
>     | 'u' EOL
>     | 'ru' INT EOL;
>
> codeBlock
>     : 'code' ID EOL
>         code;
>
> code
>     : label code
>     | instruction code
>     ;
>
> label
>     : ID COLON (EOL)?;
>
> instruction
>     : 'nop' EOL
>     | 'add' reg['r'] COMMA reg['r'] COMMA reg['r'] EOL
>     | 'sub' reg['r'] COMMA reg['r'] COMMA reg['r'] EOL
>     | 'mul' reg['r'] COMMA reg['r'] COMMA reg['r'] EOL
>     | 'div' reg['r'] COMMA reg['r'] COMMA reg['r'] EOL
>     | 'mod' reg['r'] COMMA reg['r'] COMMA reg['r'] EOL
>     | 'and' reg['r'] COMMA reg['r'] COMMA reg['r'] EOL
>     | 'or' reg['r'] COMMA reg['r'] COMMA reg['r'] EOL
>     | 'not' reg['r'] COMMA reg['r'] EOL
>     | 'xor' reg['r'] COMMA reg['r'] COMMA reg['r'] EOL
>     | 'band' reg['r'] COMMA reg['r'] COMMA reg['r'] EOL
>     | 'bor' reg['r'] COMMA reg['r'] COMMA reg['r'] EOL
>     | 'bnot' reg['r'] COMMA reg['r'] EOL
>     | 'shl' reg['r'] COMMA reg['r'] COMMA reg['r'] EOL
>     | 'shr' reg['r'] COMMA reg['r'] COMMA reg['r'] EOL
>     | 'sar' reg['r'] COMMA reg['r'] COMMA reg['r'] EOL
>     | 'neg' reg['r'] COMMA reg['r'] EOL
>     | 'max' reg['r'] COMMA reg['r'] COMMA reg['r'] EOL
>     | 'min' reg['r'] COMMA reg['r'] COMMA reg['r'] EOL
>     | 'abs' reg['r'] COMMA reg['r'] EOL
>     | 'lt' reg['r'] COMMA reg['r'] COMMA reg['r'] EOL
>     | 'le' reg['r'] COMMA reg['r'] COMMA reg['r'] EOL
>     | 'eq' reg['r'] COMMA reg['r'] COMMA reg['r'] EOL
>     | 'ne' reg['r'] COMMA reg['r'] COMMA reg['r'] EOL
>     | 'ge' reg['r'] COMMA reg['r'] COMMA reg['r'] EOL
>     | 'gt' reg['r'] COMMA reg['r'] COMMA reg['r'] EOL
>     | 'tbl' reg['r'] COMMA reg['r'] EOL
>     | 'tch' reg['r'] COMMA reg['r'] EOL
>     | 'fadd' reg['f'] COMMA reg['f'] COMMA reg['f'] EOL
>     | 'fsub' reg['f'] COMMA reg['f'] COMMA reg['f'] EOL
>     | 'fmul' reg['f'] COMMA reg['f'] COMMA reg['f'] EOL
>     | 'fdiv' reg['f'] COMMA reg['f'] COMMA reg['f'] EOL
>     | 'flr' reg['f'] COMMA reg['f'] EOL
>     | 'ceil' reg['f'] COMMA reg['f'] EOL
>     | 'flt' reg['f'] COMMA reg['f'] COMMA reg['r'] EOL
>     | 'fle' reg['f'] COMMA reg['f'] COMMA reg['r'] EOL
>     | 'feq' reg['f'] COMMA reg['f'] COMMA reg['r'] EOL
>     | 'fne' reg['f'] COMMA reg['f'] COMMA reg['r'] EOL
>     | 'fge' reg['f'] COMMA reg['f'] COMMA reg['r'] EOL
>     | 'fgt' reg['f'] COMMA reg['f'] COMMA reg['r'] EOL
>     | 'trl' reg['r'] COMMA reg['f'] EOL
>     | 'tdb' EOL // TODO
>     | 'tin' reg['f'] COMMA reg['r'] EOL
>     | 'gput' INT COMMA reg['g'] EOL
>     | 'gadd' reg['g'] COMMA reg['g'] COMMA reg['g'] EOL
>     | 'gsub' reg['g'] COMMA reg['g'] COMMA reg['g'] EOL
>     | 'gmul' reg['g'] COMMA reg['g'] COMMA reg['g'] EOL
>     | 'gtl' reg['g'] COMMA reg['r'] EOL
>     | 'ltg' reg['r'] COMMA reg['g'] EOL
>     | 'move' reg['r'] COMMA reg['r'] EOL
>     | 'putr1' (INT | ID) EOL
>     | 'put' (INT | ID) COMMA reg['r'] EOL
>     | 'load' reg['r'] PIPE (reg['r'] | INT) COMMA reg['r'] EOL
>     | 'store' reg['r'] COMMA reg['r'] PIPE (reg['r'] | INT) EOL
>     | 'store1' EOL // TODO
>     | 'rstore' EOL // TODO
>     | 'ialloc' reg['r'] COMMA reg['r'] EOL
>     | 'ifree' reg['r'] EOL
>     | 'ifetch' reg['r'] PIPE (reg['r'] | INT) COMMA reg['r'] EOL
>     | 'istore' reg['r'] COMMA reg['r'] PIPE (reg['r'] | INT) EOL
>     | 'forksp' reg['r'] ( | COMMA reg['r']) EOL
>     | 'forkep' reg['r'] ( | COMMA reg['r']) EOL
>     | 'stop' EOL
>     | 'falloc' reg['r'] COMMA reg['r'] COMMA reg['r'] EOL
>     | 'ralloc' EOL // TODO
>     | 'ffree' ( | reg['r']) EOL
>     | 'input' INT COMMA reg['r'] EOL
>     | 'output' reg['r'] COMMA INT EOL
>     | 'skip' EOL // TODO
>     | 'ifchcl' EOL // TODO
>     | 'pow' reg['r'] COMMA reg['r'] COMMA reg['r'] EOL
>     | 'sfalloc' EOL // TODO
>     | 'fput' FLOAT COMMA reg['f'] EOL
>     | 'finput' INT COMMA reg['f'] EOL
>     | 'halt' EOL
>     | 'fmove' reg['f'] COMMA reg['f'] EOL
>     | 'fneg' reg['f'] COMMA reg['f'] EOL
>     | 'fabs' reg['f'] COMMA reg['f'] EOL
>     | 'fmath' INT COMMA reg['f'] COMMA reg['f'] EOL
>     | 'sforkep' EOL // TODO
>     | 'sforksp' EOL // TODO
>     | 'scforksp' EOL // TODO
>     | 'sffree' EOL // TODO
>     | 'sread' EOL
>     ;
>
> reg[char type] returns[int rNum = -1]
>     : rStr=ID {
>         try {
>             if(Character.toLowerCase (rStr.charAt(0)) != type)
>                 throw new SemanticException(
>                     "expected " + type + "x, found " + rStr);
>
>             if(rStr.equals("RFP"))
>                 rNum = 63;
>             else
>                 rNum = Integer.parseInt(rStr.substring(1));
>         } catch(Exception ex) {
>             throw new SemanticException(
>                 "expected " + type + "x, found " + rStr);
>         }
>     };
>
> DATABEGIN:
>     '.data';
> DATAEND:
>     '.end';
> COMMENT
>     : SEMI (.)* EOL {$channel=HIDDEN};
> fragment DIGIT
>     : ('0'..'9');
> INT
>     : ('-')? (DIGIT)+;
> FLOAT
>     : INT '.' (DIGIT)+;
> SEMI
>     : ';';
> COLON
>     : ':';
> COMMA
>     : ',';
> PIPE
>     : '|';
>
> ID
>     : ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'_'|'.'|'0'..'9')*;
>
> EOL
>     : (('\r')? '\n') {newline();};
>
> WS
>     : (
>     | (' ' | '\t')
>     ) { $channel=HIDDEN; }
>     ;
>