[antlr-interest] Escaping single quotes in a lexer
Emanuele Gesuato
egesuato at ibc.it
Fri Mar 27 09:25:51 PDT 2009
Hi there,
For the problem described below, Gavin Lambert has gently answered me to
use the following:
DELIM : "\'' ('\\' ('\'')? | ~('\\' | '\''))* '\'";
As described below i'm using antlr 2.7.6. I've tried to use it as a
delimiter to my string using:
STRING : DELIM((JOLLY)?(PAROLE|INTEGER)(JOLLY)?)DELIM;
But i've got the following warning:
[antlr] ANTLR Parser Generator Version 2.7.6 (2005-12-22)
warning:lexical nondeterminism between rules DELIM and STRING upon
SQL.g: k==1:'\''
SQL.g: k==2:'\''
SQL.g: k==3:' '
SQL.g: k==4:'('
If i use DELIM using this:
STRING : "'"((JOLLY)?(PAROLE|INTEGER|DELIM)(JOLLY)?)"'";
i don't have any difference in the generated java classes.
How to escape single quotes in the original STRING ?
( STRING : "'"((JOLLY)?(PAROLE|INTEGER)(JOLLY)?)"'"; )
Thanks,
Emanuele
On Tue, Mar 17, 2009 at 11:05 PM, Emanuele Gesuato
<emanuele.gesuato at gmail.com> wrote:
> Hi there,
>
> I'm quite new to the antlr world so my question could be obvious. I'm
> using antlr 2.7.6 in java 5 for generating a lexer class. In this
> lexer (written by an ex-collegue) i'm trying to resolve string like
>
> Invoice.customer='Tom'
>
> to build an hibernate restriction.
>
> I would like to use the ' character inside the string something
similar to:
>
> Invoice.customer='Tom L\'oreal'
> or (better)
> Invoice.customer="Tom L'oreal"
>
> I've got the String definition for such fields that is the following:
> STRING : "'"((JOLLY)?(PAROLE|INTEGER)(JOLLY)?)"'";
> where:
> protected CIFRA : '0'..'9';
> protected LETTERA : ('a'..'z'|'A'..'Z'|'_'|'\\'|'.'|'-');
> protected PAROLA : LETTERA(CIFRA|LETTERA)*;
> protected PAROLE : PAROLA((SPAZIO)+(PAROLA))*;
> protected INTEGER : (CIFRA)+;
>
> and i've tried to use:
> STRING : (" ' "((JOLLY)?(PAROLE|INTEGER)(JOLLY)?)" ' ") | (' "
> '((JOLLY)?(PAROLE|INTEGER)(JOLLY)?)' " ');
>
> (added spaces for more clarity) but it does recognize the string
> "Tom L'oreal". The java class created is no different from the
previous one.
>
>
>
> Here is the original full grammar:
> *************************
> header{
> package it.ibc.jstore.util.parser;
> }
>
> // Lexer ********************************************
> {import it.ibc.jstore.base.Log;}
> class RestrictionsLexer extends Lexer;
>
> options { k=4; }
>
> WHITESPACE : (' '
> | '\t'
> | '\r' '\n' { newline(); }
> | '\n' { newline(); }
> ) { $setType(Token.SKIP); }
> ;
>
> protected SPAZIO : ' ';
> protected CIFRA : '0'..'9';
> protected LETTERA : ('a'..'z'|'A'..'Z'|'_'|'\\'|'.'|'-');
> protected PAROLA : LETTERA(CIFRA|LETTERA)*;
> protected PAROLE : PAROLA((SPAZIO)+(PAROLA))*;
> protected INTEGER : (CIFRA)+;
> protected LONG : INTEGER('L'|'l');
> protected LIKE : ("LIKE"|"like"|"Like");
> protected OR : ("OR"|"or"|"Or");
> protected AND : ("AND"|"and"|"And");
> protected IN : ("IN"|"in"|"In");
>
>
>
>
> UGUALE : "=";
> DIVERSO : "<>";
> MAGGIORE : '>';
> MINORE : '<';
> MAGGIOREUGUALE : ">=";
> MINOREUGUALE : "<=";
> JOLLY : "*";
> LPAREN : '(';
> RPAREN : ')';
> SEPARATORE : ('/');
> VIRGOLA : ",";
> NUMERO : (LONG) => LONG { $setType(LONG); }
> | INTEGER { $setType(INTEGER); }
> ;
> STRING : "'"((JOLLY)?(PAROLE|INTEGER)(JOLLY)?)"'"
> CAMPO : (LIKE) => LIKE { $setType(LIKE); }
> | (OR) => OR { $setType(OR); }
> | (AND) => AND { $setType(AND); }
> | (IN) => IN { $setType(IN); }
> | PAROLA { $setType(CAMPO); }
> ;
>
> // Parser *******************************************
> class RestrictionsParser extends Parser;
> options { buildAST=true; }
>
> valore : STRING | LONG | INTEGER;
> expr : LPAREN^ orExpr RPAREN! ;
> orExpr : andExpr ((OR^) andExpr)* ;
> andExpr : relExpr ((AND^) relExpr)* ;
> relExpr : atom
>
(((UGUALE^|DIVERSO^|MAGGIORE^|MINORE^|MINOREUGUALE^|MAGGIOREUGUALE^|LIKE^)
> rparm) | (IN^ list))* ;
> atom : CAMPO | expr ;
> rparm : atom | valore ;
> list : LPAREN! valore (VIRGOLA^ valore)* RPAREN! ;
>
>
> // Parser dell'albero *******************************
> {
> import it.ibc.jstore.data.Restrictions;
> import it.ibc.jstore.data.MatchMode;
> import java.util.List;
> import java.util.ArrayList;
> }
> class RestrictionsTreeWalker extends TreeParser;
>
> // Elemento base (un campo, un intero..)
> base returns [Object s]
> { s=null; }
> : i:CAMPO { s=i.getText(); }
> | j:INTEGER { s=Integer.valueOf(j.getText()); }
> | k:LONG { int lunghezza=k.getText().length();
> s=Long.valueOf(k.getText().substring(0,lunghezza-1)); }
> | l:STRING { int lunghezza=l.getText().length();
> s=l.getText().substring(1,lunghezza-1); }
> ;
>
> campo returns [String s]
> { s=null; }
> : i:CAMPO { s=i.getText(); }
> ;
>
> stringa returns [String s]
> { s=null; }
> : l:STRING { int lunghezza=l.getText().length();
> s=l.getText().substring(1,lunghezza-1); }
> ;
>
> lista returns [List l]
> { l=new ArrayList(); List t,v; Object a; }
> : #(VIRGOLA v=lista t=lista) { l.addAll(v); l.addAll(t); } // Una
> lista e' un'elenco di liste separate da virgola
> | a=base { l.add(a); } // E questo e' l'elemento base della lista
> ;
>
> expr returns [Restrictions r]
> { Object a,b; Restrictions t,v; r=new Restrictions(); }
> : #(UGUALE a=base b=base) { r.eq((String)a,b); }
> | #(DIVERSO a=base b=base) { r.ne((String)a,b); }
> | #(MINOREUGUALE a=base b=base) { r.le((String)a,b); }
> | #(MAGGIOREUGUALE a=base b=base) { r.ge((String)a,b); }
> | #(MINORE a=base b=base) { r.lt((String)a,b); }
> | #(MAGGIORE a=base b=base) { r.gt((String)a,b); }
> | #(LIKE a=campo b=stringa) { r.ilike((String)a,(String)b,
MatchMode.GUESS); }
> | #(IN a=campo b=lista) { r.in((String)a,(List)b); }
> | #(AND t=expr v=expr) {r.and(t,v);}
> | #(OR t=expr v=expr) {r.or(t,v);}
> | #(LPAREN t=expr) { r=t; }
> ;
>
More information about the antlr-interest
mailing list