[antlr-interest] more newbie help required

karl wettin kalle at snigel.net
Wed Feb 15 05:33:02 PST 2006


On 14 Feb 2006, at 15:42, Martin Probst wrote:
> You're currently doing all the work in the lexer and don't use a parser
> at all. If you have a completely flat file format, that's exactly what
> you want to do, though.

Great news, as I take it "completely flat" is what I want. Or is it?
What does it mean? The visitor I hacked into my lexer does the trick.
What would then be the correct way of implementing a visitor like the
one in my code? I doubt that hacking a bean property into the lexer
after generation is the correct solution.

FOO : bar:(<WORD>) { visitor.foo(bar); };


> The 'normal' way of using a Lexer however is something like this:
> Lexer myLexer = new Lexer(input);
> Token tok = null;
> while ((tok = myLexer.nextToken()).getType() != Token.EOF) {
>   // do something with the token
> }
>
> You would then try to break up the input into single parts (e.g. NAME,
> SEASON, EPISODE, etc.; that's why the Lexer is also called a Tokenizer)
> so you can easily handle the single parts from the outside. But that
> might not work for you, at least not with your current way of parsing.

I am not completely sure I understand what you are telling me. If I
want it "normal", what should I do with my grammar, and what should I
move to the parser?

However, I did try something like your suggested code and got only two
tokens as a result: one composite root expression and an EOF token.
I did not get the tokens for the parts that make up the composite root.
Is this due to the fact that I only use the lexer? No rules are
protected. Or is it because all my rules are coupled to each other, so
that the first rule in my lexer consumes the whole text by calling the
other rules? Should this be done in the parser for the code below to
return the part tokens of the root expression?

	public static void test(String text) throws Exception {
		System.out.println("Testing " + text);
		MyLexer lexer = new MyLexer(new StringReader(text));
		while (true) {
			Token token = lexer.nextToken();
			System.out.println(token.getType() + "\t" + token.getText());			
			if (token.getType() == MyLexerTokenTypes.EOF) {
				break;
			}
		}
	}
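
For comparison, here is a minimal sketch of the "flat" tokenizing that
Martin describes, written as a plain Java stand-in and not as an ANTLR
grammar. As far as I understand ANTLR 2, nextToken() hands back one token
per top-level lexer rule that matched, so a rule that calls other rules
still yields a single composite token. In the sketch no token pattern
calls another, so each part comes back on its own and the visitor is
driven from outside the lexer. Everything here is an assumption for
illustration: the class FlatTokenSketch, the Visitor interface, and the
"Lost S02E05" input format are hypothetical and not taken from the
attached grammar.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FlatTokenSketch {

    /** Hypothetical visitor with one callback per token type. */
    public interface Visitor {
        void name(String text);
        void season(String text);
        void episode(String text);
    }

    // One alternative per token type. No alternative refers to another,
    // so every part of the input is returned as its own token.
    // The S<digits>/E<digits> syntax is an assumed example format.
    private static final Pattern TOKEN = Pattern.compile(
            "(?<SEASON>[Ss]\\d+)|(?<EPISODE>[Ee]\\d+)|(?<NAME>[A-Za-z]+)|(?<WS>\\s+)");

    /** Splits the input into flat tokens, calls the visitor for each one,
     *  and returns the token types in the order they were seen. */
    public static List<String> tokenize(String input, Visitor visitor) {
        List<String> types = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        int pos = 0;
        while (pos < input.length()) {
            m.region(pos, input.length());
            if (!m.lookingAt()) {
                throw new IllegalArgumentException("Unexpected character at offset " + pos);
            }
            if (m.group("SEASON") != null) {
                types.add("SEASON");
                visitor.season(m.group());
            } else if (m.group("EPISODE") != null) {
                types.add("EPISODE");
                visitor.episode(m.group());
            } else if (m.group("NAME") != null) {
                types.add("NAME");
                visitor.name(m.group());
            }
            // Whitespace (WS) is matched but produces no token.
            pos = m.end();
        }
        types.add("EOF");
        return types;
    }

    public static void main(String[] args) {
        List<String> types = tokenize("Lost S02E05", new Visitor() {
            public void name(String text) { System.out.println("NAME\t" + text); }
            public void season(String text) { System.out.println("SEASON\t" + text); }
            public void episode(String text) { System.out.println("EPISODE\t" + text); }
        });
        System.out.println(types);
    }
}
```

The design point is only that the visitor calls sit in the loop that
consumes tokens, not inside the token rules themselves. Whether the real
grammar can be split this way depends on rules that are not shown here.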

-- 
karl

> Martin
>
> On Tue, 2006-02-14 at 13:21 +0100, karl wettin wrote:
>> On 14 Feb 2006, at 10:13, Martin Probst wrote:
>>
>> Hi, and thanks for your reply!
>>
>>>>
>>>> I just want a simple Visitor
>>>
>>> You can create member variables in your parser class and custom
>>> constructors, methods etc., so you have all Java-power at your
>>> fingertips.
>>
>> I hacked a visitor into my Lexer. If it's as easy as it sounds, I
>> guess I could just copy my code to the parser. A simple example,
>> perhaps based on my grammar, would be appreciated.
>>
>>>> If the input is badly formatted data and WELL_FORMATTED comes
>>>> before BAD_FORMATTED in the lexer, it will not match.
>>>
>>> Could you post the Lexer rules for WELL_FORMATTED and for
>>> BAD_FORMATTED? I'd guess that your input text is matching both of
>>> them (e.g. your language is non-deterministic - both rules can match
>>> the same input) or at least the same prefix. Matching something
>>> "slightly wrong" can be very difficult.
>>
>> Attached are the generated visitor-hack lexer, the visitor, the
>> visitor-coupled grammar and a test. I think it should run without any
>> problems.
>>
>>
>
>


