[antlr-interest] Getting the Previously Matched Lexer Token in the C Target
John B. Brodie
jbb at acm.org
Mon Jul 19 19:39:20 PDT 2010
Greetings!
On Mon, 2010-07-19 at 21:00 -0400, Billy O'Neal wrote:
> Hello, Kirby Bohling.
>
> It's similar to Keyword Vs. ID, but not exact. Consider the following inputs:
>
> -arg#hashed#
> Result:
> ARGUMENT (Text="arg")
> ARGEXTRA (Text="hashed")
>
> -arg#hashed# #otherData#
> Result:
> ARGUMENT (Text="arg")
> ARGEXTRA (Text="hashed")
> OTHER (Text="#otherData#") <-- Note that the hashes need to be
> included at this point, but excluded in the ARGEXTRA token type
>
> #otherData#andsomemorethings
> Result:
> OTHER (Text="#otherData#andsomemorethings") <-- If I just use a
> common token for that, then there needs to be a lot of stitching going
> on in the parser, posing a problem.
>
> Finally, this:
> -arg #hashed#
> needs to be:
> ARGUMENT (Text="arg")
> OTHER (Text="hashed")
>
> If I use a common token for things there, then the parser can't
> correctly discern what to do here -- stitching together here would
> actually be invalid because of the space, and because the whitespace
> is dropped by the lexer, the parser cannot make that determination.
>
i had some fun with this. thanks! see attached (yes, i am weird)
-jbb
-------------- next part --------------
grammar Test;

options {
    output = AST;
    ASTLabelType = CommonTree;
}

tokens { ARGUMENT; ARGEXTRA; OTHER; }
@members {
    private static final String [] x = new String[] {
        "-arg#hashed#",
        "-arg#hashed# #otherData#",
        "#otherData#andsomemorethings",
        "-arg #hashed#"
    };

    public static void main(String [] args) {
        for( int i = 0; i < x.length; ++i ) {
            try {
                System.out.println("about to parse:`"+x[i]+"`");
                TestLexer lexer = new TestLexer(new ANTLRStringStream(x[i]));
                CommonTokenStream tokens = new CommonTokenStream(lexer);
                TestParser parser = new TestParser(tokens);
                TestParser.start_return p_result = parser.start();
                CommonTree ast = p_result.tree;
                if( ast == null ) {
                    System.out.println("resultant tree: is NULL");
                } else {
                    System.out.println("resultant tree: " + ast.toStringTree());
                }
                System.out.println();
            } catch(Exception e) {
                e.printStackTrace();
            }
        }
    }
}

start : (arg|other) EOF! ;

// WS is an ordinary (not hidden) token here, so the parser can tell
// "-arg#hashed#" apart from "-arg #hashed#".
arg : argument
      ( ( ( argextra (WS other)? )? -> argument (argextra other?)? )
      | ( WS HASH ID HASH           -> argument OTHER[$ID.text] )
      )
    ;

argument : DASH ID      -> ARGUMENT[$ID.text] ;
argextra : HASH ID HASH -> ARGEXTRA[$ID.text] ;
other    : o_data       -> OTHER[$o_data.text] ;
o_data   : ( HASH! | ID! )+ ; // avoid building a tree here, just want $text

DASH : '-' ;
HASH : '#' ;
ID   : ('a'..'z'|'A'..'Z')+ ;
WS   : ' '+ ;
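
As a cross-check on the four cases from the quoted message, here is a minimal plain-Java sketch that mirrors the shape of the arg and other rules with regular expressions instead of ANTLR. The class name ArgSketch, the method classify, and the "TYPE(text)" output format are made up for illustration and are not part of the grammar or of any ANTLR API. It only shows why the whitespace has to stay visible to decide between ARGEXTRA and OTHER.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not generated by or related to ANTLR.
class ArgSketch {
    // Mirrors: argument ( argextra (WS other)? | WS HASH ID HASH )?
    private static final Pattern ARG = Pattern.compile(
          "-([A-Za-z]+)"                            // DASH ID
        + "(?:#([A-Za-z]+)#(?: +([#A-Za-z]+))?"     // argextra (WS other)?
        + "| +#([A-Za-z]+)#)?");                    // WS HASH ID HASH

    // Mirrors: other : ( HASH | ID )+
    private static final Pattern OTHER = Pattern.compile("[#A-Za-z]+");

    static List<String> classify(String input) {
        List<String> out = new ArrayList<String>();
        Matcher m = ARG.matcher(input);
        if (m.matches()) {
            out.add("ARGUMENT(" + m.group(1) + ")");
            // Hashed text directly after the argument: hashes are dropped.
            if (m.group(2) != null) out.add("ARGEXTRA(" + m.group(2) + ")");
            // Trailing data after ARGEXTRA and a space: hashes are kept.
            if (m.group(3) != null) out.add("OTHER(" + m.group(3) + ")");
            // Hashed text separated from the argument by a space:
            // reported as OTHER, with the hashes dropped.
            if (m.group(4) != null) out.add("OTHER(" + m.group(4) + ")");
        } else if (OTHER.matcher(input).matches()) {
            out.add("OTHER(" + input + ")");
        } else {
            throw new IllegalArgumentException("unrecognized input: " + input);
        }
        return out;
    }

    public static void main(String[] args) {
        String[] inputs = {
            "-arg#hashed#",
            "-arg#hashed# #otherData#",
            "#otherData#andsomemorethings",
            "-arg #hashed#"
        };
        for (String s : inputs) {
            System.out.println(s + " => " + classify(s));
        }
    }
}
```

The two branches of the ARG pattern differ only in whether a space comes before the first '#', which is the same distinction the grammar makes by matching WS explicitly in the parser.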