[antlr-interest] Cannot match strings combining terminals w/o spaces between
Samuel Lampa
samuel.lampa at scilifelab.uu.se
Wed Jul 27 08:56:50 PDT 2011
On 07/27/2011 12:13 PM, Bart Kiers wrote:
> Your input:
>
> "history":
>
> is not tokenized as a STRING but as a WORD. You need to exclude
> the double quote from your WORD rule.
>
> Also, you put '\n' on the HIDDEN channel, yet you use it in your
> parser rule 'command'. This will cause the rule to never match
> properly: you need to remove the '\n' from the 'command' rule, or
> not put '\n' on the HIDDEN channel.
>
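The first point above can be reproduced outside ANTLR. The Python sketch below is an illustration only: it emulates a plain longest-match lexer with regular expressions, which approximates the outcome described rather than ANTLR 3's actual prediction mechanism. The names OLD_RULES, NEW_RULES and tokenize are made up for this example, and only three of the grammar's lexer rules are modelled.

```python
import re

# Hypothetical stand-ins for three lexer rules. OLD_RULES uses the original
# catch-all WORD (any run of non-whitespace); NEW_RULES uses the restricted
# WORD that cannot contain a double quote or a colon.
OLD_RULES = [
    ("STRING", re.compile(r'"[a-zA-Z]+"')),
    ("COLON", re.compile(r':')),
    ("WORD", re.compile(r'[^ \t\r\n]+')),
]
NEW_RULES = [
    ("STRING", re.compile(r'"[a-zA-Z]+"')),
    ("COLON", re.compile(r':')),
    ("WORD", re.compile(r'[a-zA-Z][a-zA-Z._0-9]*')),
]

def tokenize(text, rules):
    """Greedy longest-match tokenizer; whitespace is skipped."""
    pos, tokens = 0, []
    while pos < len(text):
        if text[pos] in " \t\r\n":
            pos += 1
            continue
        best_name, best_len = None, 0
        for name, pattern in rules:
            m = pattern.match(text, pos)
            # Strictly longer wins, so ties go to the rule listed first.
            if m and len(m.group()) > best_len:
                best_name, best_len = name, len(m.group())
        if best_name is None:
            raise ValueError(f"no rule matches at offset {pos}")
        tokens.append((best_name, text[pos:pos + best_len]))
        pos += best_len
    return tokens

print(tokenize('"history":', OLD_RULES))  # [('WORD', '"history":')]
print(tokenize('"history":', NEW_RULES))  # [('STRING', '"history"'), ('COLON', ':')]
```

With the catch-all WORD rule the whole input becomes a single WORD token, so a parser rule expecting STRING followed by ':' cannot match. Restricting WORD to letters, digits, '.' and '_' lets STRING and COLON come through as separate tokens.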
Right, thanks!
It seems I had a few more similar problems as well, but after quite a
bit of restructuring along the lines of your hints, I finally got it
working with the code below.
Cheers
// Samuel
=== The working code ===
grammar GalaxyToolConfig;
options {output=AST;}
command : binary (ifstatement param+ (ELSE param+)? ENDIF | param)*
;
binary : WORD
;
ifstatement
: IF ( STRING | VARIABLE ) EQTEST ( STRING | VARIABLE )(COLON)
;
param : (DBLDASH)(WORD)*(EQ)(VARIABLE|STRING)
;
text : WORD+
;
IF : '#if'
;
ELSE : '#else'
;
ENDIF : '#end if'
;
EQTEST : '=='
;
DBLDASH : '--'
;
EQ : '='
;
COLON : ':'
;
STRING : '"'('a'..'z'|'A'..'Z')+'"'
;
VARIABLE
: '$'('{')?WORD('}')?
;
WORD : ('a'..'z'|'A'..'Z')('a'..'z'|'A'..'Z'|'.'|'_'|'0'..'9')*
;
WS : ( ' '
| '\t'
| '\r'
| '\n'
) {$channel=HIDDEN;}
;
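As a rough cross-check of the lexer rules above, here is a Python sketch that approximates them with regular expressions and a longest-match loop. It is not generated by ANTLR and does not reproduce ANTLR 3's lookahead-based rule prediction; RULES, tokenize and parser_view are hypothetical names chosen for the example. It also shows why '\n' must not appear in a parser rule once WS is sent to the HIDDEN channel: the parser only sees tokens on the default channel.

```python
import re

# Regex approximations of the lexer rules in the grammar above.
WORD = r'[a-zA-Z][a-zA-Z._0-9]*'
RULES = [
    ("IF", r'#if'),
    ("ELSE", r'#else'),
    ("ENDIF", r'#end if'),
    ("EQTEST", r'=='),
    ("DBLDASH", r'--'),
    ("EQ", r'='),
    ("COLON", r':'),
    ("STRING", r'"[a-zA-Z]+"'),
    ("VARIABLE", r'\$\{?' + WORD + r'\}?'),
    ("WORD", WORD),
    ("WS", r'[ \t\r\n]'),
]
COMPILED = [(name, re.compile(pattern)) for name, pattern in RULES]

def tokenize(text):
    """Return (type, text, channel) triples using longest match."""
    pos, out = 0, []
    while pos < len(text):
        best = None
        for name, pattern in COMPILED:
            m = pattern.match(text, pos)
            # Strictly longer wins, so ties go to the rule listed first.
            if m and (best is None or m.end() > best[1].end()):
                best = (name, m)
        if best is None:
            raise ValueError(f"no rule matches at offset {pos}")
        name, m = best
        channel = "HIDDEN" if name == "WS" else "DEFAULT"
        out.append((name, m.group(), channel))
        pos = m.end()
    return out

def parser_view(tokens):
    """Token types the parser would see: default channel only."""
    return [name for name, _, channel in tokens if channel == "DEFAULT"]

sample = '#if $source.index_source == "history":\n--ref_file=$source.ref_file\n'
print(parser_view(tokenize(sample)))
# ['IF', 'VARIABLE', 'EQTEST', 'STRING', 'COLON',
#  'DBLDASH', 'WORD', 'EQ', 'VARIABLE']
```

The newline tokens are still produced by the lexer, but they carry the hidden channel and never reach the parser, which is why the 'command' rule no longer mentions '\n'.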
> Regards,
>
> Bart.
>
>
> On Wed, Jul 27, 2011 at 11:35 AM, Samuel Lampa
> <samuel.lampa at scilifelab.uu.se> wrote:
>
> I have problems matching the string:
> "history":
>
> ... with the following ANTLR code (work in progress, really):
> (STRING)':'
>
> Where I have the STRING terminal defined as:
> STRING : '"'('a'..'z'|'A'..'Z')+'"'
> ;
>
> It works if I add the ending colon to the STRING definition
> itself, like so (and then remove it from the parent rule):
> STRING : '"'('a'..'z'|'A'..'Z')+'"'':'
> ;
>
> ... but this of course makes for a less general string definition
> :/ ...
> Any ideas how I should go about this?
>
> Best regards
> // Samuel
>
>
> Addendum: The full input string and ANTLR grammar are as follows:
>
> === Input string ===
>
> sam_to_bam.py
> --input1=$source.input1
> --dbkey=${input1.metadata.dbkey}
> #if $source.index_source == "history":
> --ref_file=$source.ref_file
> #else
> --ref_file="None"
> #end if
> --output1=$output1
> --index_dir=${GALAXY_DATA_INDEX_DIR}
>
>
> === ANTLR code ===
>
> grammar GalaxyToolConfig;
> options {output=AST;}
>
> command : binary param* ifstatement '\n' text? ELSE text?
> ENDIF text?
> ;
>
> binary : WORD
> ;
>
> param : '--' PARAMNAME '=' ( VARIABLE | STRING )
> ;
>
> ifstatement
> : IF ( STRING | VARIABLE ) EQ ( (STRING)':' | (VARIABLE)':' )
> ;
>
> text : WORD WORD*
> ;
>
> IF : '#if'
> ;
>
> ELSE : '#else'
> ;
>
> ENDIF : '#end if'
> ;
>
> EQ : '=='
> ;
>
> COLON : ':'
> ;
>
> PARAMNAME: ('a'..'z')('a'..'z'|'A'..'Z'|'0'..'9'|'.'|'_')*
> ;
>
> STRING : '"'('a'..'z'|'A'..'Z')+'"'
> ;
>
> VARIABLE
> : '$''{'?PARAMNAME'}'?
> ;
>
>
> // CHAR : ('a'..'z'|'A'..'Z'|'0'..'9'|'_'|'.'|'$'|'{'|'}'|'='|'"'|'-'|':'|';')
> // ;
>
>
> WORD : (~(' '|'\t'|'\r'|'\n'))+
> ;
>
> WS : ( ' '
> | '\t'
> | '\r'
> | '\n'
> ) {$channel=HIDDEN;}
> ;
>
--
System Expert / Bioinformatician
SNIC-UPPMAX / SciLifeLab Uppsala
Uppsala University, Sweden
--------------------------------------
E-mail: samuel.lampa at scilifelab.uu.se
Phone: +46 (0)18 - 471 1060
WWW: http://www.uppmax.uu.se
Uppnex: https://www.uppnex.uu.se