[antlr-interest] newbie needs help

Sun Jan 24 10:18:56 PST 2010

Thank you all.
But I have another problem:
my file also contains functions in the following format:

FUNCTION_A      //without parameters
FUNCTION_B   /opt1 /opt2   //2 parameters
FUNCTION_C A0 B0 %VAR_MYVARIABLE // data with the bytes format

Function names always start with FUNCTION_.

The problem is that wherever a NEWLINE is detected, it is treated as a
"bytes" match, which breaks these function lines:
a function like FUNCTION_C is not recognized correctly.
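
To pin down the expected result independently of ANTLR, here is a minimal Python sketch of the line format described above. It treats the line break as the end of a record rather than as data. It rests on assumptions: that `//` starts a comment, that arguments can stay as raw strings, and that the name `parse_line` and the tuple shapes are made up for illustration.

```python
import re

# One or more two-character hex bytes written without spaces, e.g. A4B5.
HEX_WORD = re.compile(r"(?:[0-9A-Fa-f]{2})+")


def parse_line(line):
    """Classify one line as empty, a function call, or a data line."""
    text = line.split("//", 1)[0]      # assumption: // starts a comment
    words = text.split()               # the line break ends the record
    if not words:
        return ("empty",)
    if words[0].startswith("FUNCTION_"):
        # Arguments are kept as raw strings (/opt1, A0, %VAR_X, ...).
        return ("function", words[0], words[1:])
    items = []
    for word in words:
        if word.startswith("VAR_"):
            items.append(word)
        elif HEX_WORD.fullmatch(word):
            # Split a word such as A4B5 into its two-character bytes.
            items.extend(word[i:i + 2] for i in range(0, len(word), 2))
        else:
            raise ValueError("unexpected word: " + word)
    return ("data", items)
```

With this reading, `FUNCTION_C A0 B0 %VAR_MYVARIABLE` is one function record with three arguments, and the newline after it never reaches the byte handling.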

Could you please help?

Thanks in advance.

John B. Brodie wrote:
> Greetings!
>
> On Thu, 2010-01-21 at 20:20 +0100, Hugo wrote:
>   
>> I started using ANTLR to parse a specific file format.
>> The problem is that I don't know how to write my grammar correctly.
>>
>> The file has the following format.
>> It contains multiple lines, and each line can have one of the following forms:
>>
>> One or more hexadecimal bytes, separated by spaces or not
>> ex: A0 A4 B5 77
>> or: A0
>>
>> A single variable identifier of the form VAR_XXX
>> ex: VAR_MY_VARIABLE
>>
>> Or a combination of the two previous formats
>> ex:
>> A0 A4B5 VAR_MY_VARIABLE 77 98 VAR_MY_VARIABLE2
>> or
>> VAR_MY_VARIABLE AA BB
>> or
>> AA BB VAR_MY_VARIABLE
>>
>>
>> What I want to do is build an AST.
>>     
>
> attached please find a grammar file that is *almost* what I think you
> are trying to do.
>
> It does not have a MULTIPLE_BYTES_DEF node because the grouping of a
> collection of single_byte instances into a multibyte is ambiguous.
> Consider
>
> 11 22 33 44 55 66 77 88
>
> is this 8 single bytes? 1 single byte and 7-long multi? is it 4 multi
> pairs? a triple, a single and a quad?
>
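
To make the ambiguity above concrete: if a group of one is a single byte and any longer group is a multi-byte, then every way of cutting the run into consecutive groups is a different reading. A small Python sketch (the name `groupings` is made up for illustration) enumerates them:

```python
def groupings(n):
    """Yield every way to cut a run of n bytes into consecutive groups.

    Each result is a list of group sizes: a size of 1 stands for a
    single byte, any larger size for a multi-byte group.
    """
    if n == 0:
        yield []
        return
    for first in range(1, n + 1):
        for rest in groupings(n - first):
            yield [first] + rest


all_readings = list(groupings(8))
print(len(all_readings))          # 128, i.e. 2**7
print([1, 7] in all_readings)     # True: 1 single byte and a 7-long multi
print([3, 1, 4] in all_readings)  # True: a triple, a single and a quad
```

Each of the 7 gaps between the 8 bytes is either a cut or not, hence 2**7 readings, and nothing in the token stream tells the parser which one is meant.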
> i kinda expect you want it to be a single 8-long multi, e.g. any run of
> single bytes becomes a multi. But that is a semantic of your language
> and getting a parser to do semantics isn't always possible....
>
> if you really need the MULTIPLE_BYTES_DEF node, you might be best served
> by parsing using something like my code (e.g. the parser produces only
> BYTE_DEF nodes) and then writing a tree-walker that transforms the AST
> resulting from the parse into a new AST that contains the requisite
> MULTIPLE_BYTES_DEF nodes, e.g. scan for and collapse sequences of
> consecutive EXPR_DEF nodes that have BYTE_DEF children into a single
> EXPR_DEF node containing a single MULTIPLE_BYTES_DEF child.
>
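
The collapsing pass described above can be sketched in a few lines of Python. This is only an illustration under assumptions: it uses a stand-in `Node` class rather than the ANTLR runtime tree types, it assumes the parse yields a flat list of EXPR_DEF nodes for one line, each with a single child, and it leaves an isolated byte as a plain BYTE_DEF. The multi node is named MULTIPLE_BYTES_DEF as in the original grammar, and the helper names are made up.

```python
from dataclasses import dataclass, field
from itertools import groupby


@dataclass
class Node:
    """Stand-in for an AST node (not the ANTLR runtime class)."""
    kind: str
    text: str = ""
    children: list = field(default_factory=list)


def is_byte_expr(node):
    """True for an EXPR_DEF whose only child is a BYTE_DEF."""
    return (node.kind == "EXPR_DEF"
            and len(node.children) == 1
            and node.children[0].kind == "BYTE_DEF")


def collapse_byte_runs(exprs):
    """Return a new list where each run of 2+ byte EXPR_DEFs is merged
    into one EXPR_DEF holding a single MULTIPLE_BYTES_DEF child."""
    result = []
    for is_byte, run in groupby(exprs, key=is_byte_expr):
        run = list(run)
        if is_byte and len(run) > 1:
            multi = Node("MULTIPLE_BYTES_DEF",
                         children=[e.children[0] for e in run])
            result.append(Node("EXPR_DEF", children=[multi]))
        else:
            # Variables and isolated bytes pass through unchanged.
            result.extend(run)
    return result
```

Running the pass once per line keeps a run from spanning a line break. If a single byte should also become a one-element multi, the `len(run) > 1` test can simply be dropped.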
>   
>> The problem is that I don't know how to do this with ANTLR. The tool
>> always tells me that multiple rules can be applied with my grammar.
>>
>> Please help me solve my problem.
>>
>> Here is my grammar:
>>
>> stmts               : bytes+ ;
>>
>>
>> bytes : multiple_byte bytes? -> ^(EXPR_DEF multiple_byte  bytes? )
>>
>> | define_expression bytes? -> ^(EXPR_DEF define_expression bytes? )
>>
>> | NEWLINE ;
>>
>> define_expression : define_var -> ^(DEFINE_VAR_DEF define_var) ;
>>
>> define_var : DEFINE_VARIABLE ;
>> multiple_byte : single_byte (single_byte)+ -> ^(MULTIPLE_BYTES_DEF
>> single_byte single_byte+) ;
>>
>>
>> single_byte : byte_digit -> ^(BYTES_DEF byte_digit) ;
>>
>> byte_digit : BYTE_DIGIT ;
>>
>> DEFINE_VARIABLE :
>> 'VAR_'('a'..'z'|'A'..'Z'|'_')('a'..'z'|'A'..'Z'|'0'..'9'|'_')*;
>>
>> BYTE_DIGIT :('0'..'9'| 'A'..'F'|'a'..'f')('0'..'9'| 'A'..'F'|'a'..'f') ;
>>
>> // Ignore whitespace, tab and escape sequence
>> WS : (' '|'\t'|'\\\r\n')+ {$channel = HIDDEN;} ;
>>
>> // a new line
>> NEWLINE : '\r'? '\n' ;
>>
>> thanks a lot
>>     
>
> hope this helps...
>    -jbb
>
>