[antlr-interest] Cannot match strings combining terminals w/o spaces between

Samuel Lampa samuel.lampa at scilifelab.uu.se
Wed Jul 27 08:56:50 PDT 2011


On 07/27/2011 12:13 PM, Bart Kiers wrote:
> Your input:
>
>     "history":
>
> is not tokenized as a STRING but as a WORD. You need to exclude
> the double quote in your WORD rule.
>
> Also, you put '\n' on the HIDDEN channel, yet you use it in your 
> parser rule 'command'. This will cause the rule to never match 
> properly: you need to remove the '\n' from the 'command' rule, or 
> don't put '\n' on the HIDDEN channel.
>

Right, thanks!

It turned out I had a few more similar problems, but after quite some
restructuring in line with your hints, I finally got it working with the
code below.

Cheers
// Samuel

=== The working code ===

grammar GalaxyToolConfig;
options {output=AST;}

command    :    binary (ifstatement param+ (ELSE param+)? ENDIF | param)*
     ;

binary     :    WORD
     ;

ifstatement
     :    IF ( STRING | VARIABLE ) EQTEST ( STRING | VARIABLE )(COLON)
     ;

param     :    (DBLDASH)(WORD)*(EQ)(VARIABLE|STRING)
     ;

text     :    WORD+
     ;

IF    :    '#if'
     ;

ELSE    :    '#else'
     ;

ENDIF     :    '#end if'
     ;

EQTEST     :    '=='
     ;


DBLDASH    :    '--'
     ;

EQ    :    '='
     ;

COLON     :    ':'
     ;


STRING    :    '"'('a'..'z'|'A'..'Z')+'"'
     ;

VARIABLE
     :    '$'('{')?WORD('}')?
     ;

WORD    :    ('a'..'z'|'A'..'Z')('a'..'z'|'A'..'Z'|'.'|'_'|'0'..'9')*
     ;

WS  :   ( ' '
         | '\t'
         | '\r'
         | '\n'
         ) {$channel=HIDDEN;}
     ;

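Addendum: for anyone reading this in the archive, here is a rough sketch (in Python, not ANTLR) of the two points in Bart's reply. It assumes a simple longest-match tokenizer, which only approximates what the ANTLR 3 lexer really does; the names tokenize, COMMON, ORIGINAL and WORKING are made up for this illustration. The regular expressions are my own transcriptions of the STRING, COLON, WS and WORD rules from the two grammars.

```python
import re

def tokenize(text, rules):
    """Greedy longest-match tokenizer; ties go to the rule listed first.

    This only approximates an ANTLR 3 lexer (an assumption of this sketch).
    """
    compiled = [(name, re.compile(pattern)) for name, pattern in rules]
    tokens, pos = [], 0
    while pos < len(text):
        best_name, best_end = None, pos
        for name, regex in compiled:
            m = regex.match(text, pos)
            if m and m.end() > best_end:
                best_name, best_end = name, m.end()
        if best_name is None:
            raise ValueError(f"no rule matches at position {pos}: {text[pos]!r}")
        if best_name != "WS":  # whitespace is dropped, like the HIDDEN channel
            tokens.append((best_name, text[pos:best_end]))
        pos = best_end
    return tokens

COMMON = [
    ("STRING", r'"[a-zA-Z]+"'),
    ("COLON", r":"),
    ("WS", r"[ \t\r\n]"),
]

# WORD as in my first grammar: any run of non-whitespace characters.
ORIGINAL = COMMON + [("WORD", r"[^ \t\r\n]+")]

# WORD as in the working grammar: letters, digits, '.' and '_' only.
WORKING = COMMON + [("WORD", r"[a-zA-Z][a-zA-Z0-9._]*")]

# Point 1: the greedy WORD rule swallows the quotes and the colon.
print(tokenize('"history":', ORIGINAL))  # [('WORD', '"history":')]
print(tokenize('"history":', WORKING))   # [('STRING', '"history"'), ('COLON', ':')]

# Point 2: a newline sent to the hidden channel never reaches the parser,
# so a parser rule that expects '\n' has nothing to match against.
print(tokenize('"a"\n"b"', WORKING))     # [('STRING', '"a"'), ('STRING', '"b"')]
```

With the original WORD rule the whole input comes out as one WORD token, so a parser rule expecting STRING followed by ':' never sees either. Restricting WORD to letters, digits, '.' and '_' leaves the quotes and the colon to STRING and COLON.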

> Regards,
>
> Bart.
>
>
> On Wed, Jul 27, 2011 at 11:35 AM, Samuel Lampa 
> <samuel.lampa at scilifelab.uu.se>
> wrote:
>
>     I got problems matching the string:
>     "history":
>
>     ... with the following ANTLR code (work in progress, really):
>     (STRING)':'
>
>     Where I have the STRING terminal defined as:
>     STRING    :    '"'('a'..'z'|'A'..'Z')+'"'
>         ;
>
>     It works if I add the ending colon in the STRING definition
>     itself, like so (and then remove it from the parent rule):
>     STRING    :    '"'('a'..'z'|'A'..'Z')+'"'':'
>         ;
>
>     ... but this of course makes for a less general string
>     definition :/ ...
>     Any ideas how I should go about this?
>
>     Best regards
>     // Samuel
>
>
>     Addendum: The full input string and EBNF code is as follows:
>
>     === Input string ===
>
>         sam_to_bam.py
>           --input1=$source.input1
>           --dbkey=${input1.metadata.dbkey}
>           #if $source.index_source == "history":
>             --ref_file=$source.ref_file
>           #else
>             --ref_file="None"
>           #end if
>           --output1=$output1
>           --index_dir=${GALAXY_DATA_INDEX_DIR}
>
>
>     === ANTLR code ===
>
>     grammar GalaxyToolConfig;
>     options {output=AST;}
>
>     command    :    binary param* ifstatement '\n' text? ELSE text? ENDIF text?
>         ;
>
>     binary     :    WORD
>         ;
>
>     param     :    '--' PARAMNAME '=' ( VARIABLE | STRING )
>         ;
>
>     ifstatement
>         :    IF ( STRING | VARIABLE ) EQ ( (STRING)':' | (VARIABLE)':' )
>         ;
>
>     text     :    WORD WORD*
>         ;
>
>     IF    :    '#if'
>         ;
>
>     ELSE    :    '#else'
>         ;
>
>     ENDIF     :    '#end if'
>         ;
>
>     EQ     :    '=='
>         ;
>
>     COLON     :    ':'
>         ;
>
>     PARAMNAME:    ('a'..'z')('a'..'z'|'A'..'Z'|'0'..'9'|'.'|'_')*
>         ;
>
>     STRING    :    '"'('a'..'z'|'A'..'Z')+'"'
>         ;
>
>     VARIABLE
>         :    '$''{'?PARAMNAME'}'?
>         ;
>
>
>     // CHAR    :    ('a'..'z'|'A'..'Z'|'0'..'9'|'_'|'.'|'$'|'{'|'}'|'='|'"'|'-'|':'|';')
>     //     ;
>
>
>     WORD    :    (~(' '|'\t'|'\r'|'\n'))+
>         ;
>
>     WS  :   ( ' '
>             | '\t'
>             | '\r'
>             | '\n'
>             ) {$channel=HIDDEN;}
>         ;


-- 
System Expert / Bioinformatician
SNIC-UPPMAX / SciLifeLab Uppsala
Uppsala University, Sweden
--------------------------------------
E-mail: samuel.lampa at scilifelab.uu.se
Phone: +46 (0)18 - 471 1060
WWW: http://www.uppmax.uu.se
Uppnex: https://www.uppnex.uu.se


