[antlr-interest] Recognition of dynamic ID-definitions
Christian Mrugalla
christian at mrugalla.info
Mon Jan 31 03:07:12 PST 2011
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Hello Michael,
I already had some preprocessing in mind as a fallback, in case
ANTLR is not powerful enough to express such dynamics. Thank you for
your suggestion.
I got two answers directly by e-mail, both outlining the solution as:
expr : t=ID {check.isValidRuntimeID(t.getText())}? ( '+' ID )* ;
Now I had the time to check if this elegant solution works. The
remaining problem is how to define ID!
I concretely tried:
grammar simple_example;
@header {import RT.RuntimeIDs;}
@lexer::header{import RT.RuntimeIDs;}
expr : t=ID {RuntimeIDs.isElem(t.getText())}? ('+' ID)*;
ID: (.)*;
This yields the error message "The following alternatives can never be
matched", pointing to the line "ID: (.)*;".
After replacing this line with "ID: (options {greedy=true;} : .)*;" the
parser compiles, but it does not work at runtime. Assuming
RuntimeIDs.isElem returns true iff its argument is "a" or "b", and the
input stream to be parsed is "a+b",
I get a "rule expr failed predicate" error.
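My guess at the cause, as a plain-Java sketch outside ANTLR: a greedy "match any character" ID rule presumably swallows the whole input as one token, so the predicate is asked about "a+b" rather than "a". The class RuntimeIdDemo and the helper greedyAnyCharToken are hypothetical names made up for this illustration; isElem only mirrors the assumed behaviour of RT.RuntimeIDs.isElem described above.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class RuntimeIdDemo {
    // Hypothetical stand-in for RT.RuntimeIDs: fixed here, but in the real
    // setting the set would only be known at parser-execution time.
    private static final Set<String> IDS = new HashSet<>(Arrays.asList("a", "b"));

    // True iff the text is one of the currently allowed IDs.
    static boolean isElem(String text) {
        return IDS.contains(text);
    }

    // Assumption: a greedy "(.)*" lexer rule consumes everything up to the
    // end of the input, so the whole input becomes a single ID token.
    static String greedyAnyCharToken(String input) {
        return input;
    }

    public static void main(String[] args) {
        String input = "a+b";

        // What the predicate sees with the greedy any-character rule.
        String token = greedyAnyCharToken(input);
        System.out.println("ID \"" + token + "\" valid: " + isElem(token));

        // What it would see if the lexer split the input at '+'.
        for (String part : input.split("\\+")) {
            System.out.println("ID \"" + part + "\" valid: " + isElem(part));
        }
    }
}
```

If this reading is right, the predicate itself is fine and the problem lies entirely in how the lexer delimits ID tokens.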
A conventional ID definition such as
ID: 'a'..'b';
works, however.
Any other ideas, apart from handwritten preprocessing, for writing
ANTLR grammars with IDs defined at runtime?
Kind regards,
Christian Mrugalla
Michael Bedward wrote:
> Hello Christian,
>
> I've been waiting to see if anyone else would answer this question
> before venturing a response.
>
> I'd first create a pre-processor that runs at parser execution time,
> feeding your 'real' parser with source transformed according to a
> current list of characters recognized as operators. Below is a
> toy grammar for such a pre-processor, where the start rule takes as an
> argument a List<String> of current operators.
>
> Given the input "a+b" and a List of operators that includes "+" it
> will produce output var<a> op<+> var<b>. If the List excludes "+" the
> output will be var<a+b>.
>
> It scores low on efficiency and elegance but might get you started.
>
> Michael
>
>
> grammar Dynamic;
>
> @header {
> package dynamic;
> import java.util.ArrayList;
> }
>
> @lexer::header {
> package dynamic;
> }
>
> @members {
>     List<String> operators;
>
>     StringBuilder topSB = new StringBuilder();
>
>     void addVar(String var) {
>         if (var.length() > 0) {
>             topSB.append("var<").append(var).append("> ");
>         }
>     }
>
>     void addOp(String op) {
>         topSB.append("op<").append(op).append("> ");
>     }
> }
>
> // Parser rules
> prog[List<String> operators]
> @init {
>     this.operators = operators == null ? new ArrayList<String>() : operators;
> }
> @after {
>     System.out.println( topSB.toString() );
> }
>     : statement+
>     ;
>
> statement
> @init {
>     StringBuilder sb = new StringBuilder();
> }
> @after {
>     addVar(sb.toString());
> }
>     : (element {
>           if ($element.isOp) {
>               addVar(sb.toString());
>               addOp($element.src);
>               sb = new StringBuilder();
>           } else {
>               sb.append($element.src);
>           }
>       })+ DELIM
>     ;
>
> element returns [String src, boolean isOp]
>     : WORD {$src = $WORD.text; $isOp = false; }
>     | OP {$src = $OP.text; $isOp = operators.contains($OP.text);}
>     ;
>
> // Lexer rules
> WORD : LETTER+
> ;
>
> // All potential operator chars
> OP : ('+' | '-')
> ;
>
> DELIM : ';'
> ;
>
> fragment
> LETTER : ('a'..'z' | 'A'..'Z')
> ;
>
> WS : (' '|'\r'|'\t'|'\n') {$channel=HIDDEN;}
> ;
>
>
>
> On 26 January 2011 09:21, Christian Mrugalla <christian at mrugalla.info> wrote:
> Dear all,
>
> is it possible to recognize a language which is (so to say)
> 'parameterized' by a finite set of arbitrary named identifiers, using
> ANTLR?
>
> Trivial Example:
>
> expr : ID ( '+' ID)* ;
>
> ID is not defined at parser-generation-time, it is only known that at
> parser-execution-time there exists a finite set S of arbitrary Strings
> which contains the allowed values for ID.
>
> That is in particular: It depends on S, if "a+b" is:
> - built from '+'-connected 'a'- and 'b'-IDs
> - an ID named 'a+b'
> - invalid, because S contains the IDs "a+" and "b"
>
> I did not find any hint concerning this kind of
> language parameterization in "The Definitive ANTLR Reference".
>
> Thank you in advance for your help!
>
> Kind Regards,
> Christian Mrugalla
>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
iEYEARECAAYFAk1Gl+AACgkQz2D7mOZ/GFzUYQCeJWh23D6IAY4x9m9+0LmUUDyN
xvoAoI9cxOddv6OxHiFOx/OWEpKIyiJ1
=GqKl
-----END PGP SIGNATURE-----