[antlr-interest] Pass parameters to DFAs for semantic predicate (or AntLR 3.3 wish list? :o) )
loic.lefevre at bnpparibas.com
Wed Dec 16 07:16:23 PST 2009
Hello again,
I continue to struggle with AntLR :o)
I think I've got a real problem now.
My grammar is thoroughly ambiguous, which is why I absolutely need
backtracking :o)
It is so ambiguous that I also need variable-length tokens.
For example, when I need to parse at most 16 chars (for a given data
type), I've got:
data_x[ int length ]
returns[ String s ]
@init {
final StringBuilder sb = new StringBuilder();
}
@after {
s = sb.toString();
}
: ( ( d=DIGIT { sb.append($d.text); if( sb.length() == length ) { return sb.toString(); } }
    | l=LETTER { sb.append($l.text); if( sb.length() == length ) { return sb.toString(); } }
    | cl=CAPITAL_LETTER { sb.append($cl.text); if( sb.length() == length ) { return sb.toString(); } }
    | SLASH { sb.append('/'); if( sb.length() == length ) { return sb.toString(); } }
    | SPACE { sb.append(' '); if( sb.length() == length ) { return sb.toString(); } }
    | ANTI_SLASH { sb.append('\\'); if( sb.length() == length ) { return sb.toString(); } }
    | MINUS { sb.append('-'); if( sb.length() == length ) { return sb.toString(); } }
    | COLON { sb.append(':'); if( sb.length() == length ) { return sb.toString(); } }
    | LPAREN { sb.append('('); if( sb.length() == length ) { return sb.toString(); } }
    | RPAREN { sb.append(')'); if( sb.length() == length ) { return sb.toString(); } }
    | DOT { sb.append('.'); if( sb.length() == length ) { return sb.toString(); } }
    | COMMA { sb.append(','); if( sb.length() == length ) { return sb.toString(); } }
    | PLUS { sb.append('+'); if( sb.length() == length ) { return sb.toString(); } }
    | QUOTE { sb.append('\''); if( sb.length() == length ) { return sb.toString(); } }
    | QUESTION_MARK { sb.append('?'); if( sb.length() == length ) { return sb.toString(); } }
    )
  )+
;
I know this is awful, but at least it works or, to be precise, it worked.
The problem here is that I can't use a disambiguating semantic predicate
such as:
data_x[ int length ]
returns[ String s ]
@init {
final StringBuilder sb = new StringBuilder();
}
@after {
s = sb.toString();
}
: (
{sb.length() < length}?
( d=DIGIT { sb.append($d.text); if( sb.length() == length ) { return sb.toString(); } } |
l=LETTER { sb.append($l.text); if( sb.length() == length ) { return sb.toString(); } } |
...
since the sb and length variables are not visible inside the generated DFA :o(
It would be useful to have at least the length parameter "pushed"
into the DFA, for example via a generated setter:
class DFA149 extends DFA {
private int length;
public DFA149(BaseRecognizer recognizer) {
...
}
public void setLength( int length ) {
this.length = length;
}
public String getDescription() {
return "()+ loopback of 1163:3: ({...}? (d= DIGIT | l= LETTER | cl= CAPITAL_LETTER | SLASH | SPACE | ANTI_SLASH | MINUS | COLON | LPAREN | RPAREN | DOT | COMMA | PLUS | QUOTE | QUESTION_MARK ) )+";
}
public int specialStateTransition(int s, IntStream _input) throws NoViableAltException {
TokenStream input = (TokenStream)_input;
int _s = s;
switch ( s ) {
case 0 :
int LA149_14 = input.LA(1);
int index149_14 = input.index();
input.rewind();
s = -1;
if ( ((synpred230_SWIFTMT()&&(sb.length() < length))) ) {s = 17;}
else if ( ((sb.length() < length)) ) {s = 1;}
...
The length parameter could then be used inside the specialStateTransition
method, and the same principle could also be applied to the
synpred230_SWIFTMT() methods.
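A possible workaround in the meantime, sketched in plain Java rather than as real ANTLR output: the generated specialStateTransition above already calls synpred230_SWIFTMT(), a method of the parser, so parser-level members appear to be reachable from the DFA class, while rule parameters and rule locals are not. Copying the parameter into a parser field (for instance one declared in an @members section, which is an assumption here) would make the predicate compile inside the DFA. Every name below is hypothetical.

```java
// Hypothetical stand-in for a generated parser; this is not ANTLR output.
public class FieldVisibilitySketch {
    // Assumption: in a real grammar these would be declared in @members.
    private int maxLength;
    private final StringBuilder sb = new StringBuilder();

    // Stand-in for a generated DFA: an inner class can read the outer fields.
    class LoopDecision {
        // Mirrors the predicate {sb.length() < length}? from the rule.
        boolean predicateHolds() {
            return sb.length() < maxLength;
        }
    }

    // Stand-in for the rule method data_x[int length].
    public String dataX(int length, String input) {
        maxLength = length;   // copy the rule parameter into the field
        sb.setLength(0);      // reset the per-invocation buffer
        LoopDecision decision = new LoopDecision();
        int i = 0;
        while (i < input.length() && decision.predicateHolds()) {
            sb.append(input.charAt(i++));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        FieldVisibilitySketch parser = new FieldVisibilitySketch();
        System.out.println(parser.dataX(3, "ABCDEF")); // prints "ABC"
    }
}
```

A single field only works if the rule is never re-entered while it is still active; a recursive rule would need a stack of values instead.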
One point I don't understand is why my predicate is not placed before the
generated syntactic predicate, like this:
if ( (((sb.length() < length)&&synpred230_SWIFTMT())) ) {s = 17;}
instead of
if ( ((synpred230_SWIFTMT()&&(sb.length() < length))) ) {s = 17;}
My comparison is faster :o) Maybe there are reasons for this order;
could someone explain them to me?
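To make the cost difference concrete: Java's && evaluates its operands left to right and stops at the first false one, so the operand order decides whether the expensive check runs at all. The sketch below only illustrates that language rule; expensiveSynpred is a hypothetical stand-in for synpred230_SWIFTMT(), and nothing here explains why ANTLR chooses the order it does.

```java
public class PredicateOrderSketch {
    // Counts how often the expensive check is actually evaluated.
    static int synpredCalls = 0;

    // Hypothetical stand-in for a syntactic predicate such as synpred230_SWIFTMT().
    static boolean expensiveSynpred() {
        synpredCalls++;
        return true;
    }

    // Cheap comparison first: the synpred is skipped when the length test fails.
    static boolean cheapFirst(int current, int length) {
        return current < length && expensiveSynpred();
    }

    // Synpred first: it always runs, even when the length test would fail.
    static boolean expensiveFirst(int current, int length) {
        return expensiveSynpred() && current < length;
    }

    public static void main(String[] args) {
        cheapFirst(16, 16);
        System.out.println("synpred calls after cheapFirst: " + synpredCalls);     // 0
        expensiveFirst(16, 16);
        System.out.println("synpred calls after expensiveFirst: " + synpredCalls); // 1
    }
}
```

Both orders yield the same boolean result; only the number of synpred evaluations differs.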
Finally, I've of course got another problem with the kind of action I use:
if( sb.length() == length ) { return sb.toString(); }
I just return from the rule once I have reached the maximum length. This works
well since the catch and finally blocks properly handle what
needs to be done (backtracking / error handling).
However, when backtracking, the action is not run; see the generated code:
case 1 :
// C:\\GRP_Head\\GRP_Dev\\Development\\frameworks\\Foxhound\\target\\generated\\com\\bnpparibas\\acetp\\foxhound\\spec2009\\parser\\SWIFTMT.g:1108:6: cl= CAPITAL_LETTER
{
cl=(Token)match(input,CAPITAL_LETTER,FOLLOW_CAPITAL_LETTER_in_data_a8285);
if (state.failed) return s;
if ( state.backtracking==0 ) {
sb.append((cl!=null?cl.getText():null)); if( sb.length() == length ) { return sb.toString(); }
}
}
break;
So this "trick" no longer works (although it used to).
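The effect of the state.backtracking==0 guard can be reproduced outside ANTLR. The following is a minimal simulation, not generated code: the backtracking argument plays the role of state.backtracking, and charAt stands in for match(...). While backtracking, nothing is appended, so the early return never fires and the loop keeps consuming input past the intended length.

```java
public class BacktrackingGuardSketch {
    // Returns how many characters the loop consumed before stopping.
    static int consume(String input, int length, int backtracking) {
        StringBuilder sb = new StringBuilder();
        int pos = 0;
        while (pos < input.length()) {
            char c = input.charAt(pos++);   // stand-in for match(...)
            if (backtracking == 0) {        // actions run only when not backtracking
                sb.append(c);
                if (sb.length() == length) {
                    return pos;             // early exit, as in the grammar action
                }
            }
        }
        return pos;                         // ran to the end of the input
    }

    public static void main(String[] args) {
        System.out.println(consume("ABCDEF", 3, 0)); // prints 3: stops at the limit
        System.out.println(consume("ABCDEF", 3, 1)); // prints 6: the limit is ignored
    }
}
```

This suggests the length limit has to be enforced by something that is also evaluated during backtracking, such as a semantic predicate, rather than by an action.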
With a grammar handling 2 message types (see previous posts), there is no problem.
With a third one, I get the following error message:
line 2:5 no viable alternative at input 'C'
I am beginning to doubt that ANTLR v3 will be able to parse SWIFT MT messages :o(
Regards,
Loïc