[antlr-interest] Lists. Lexer or Parser?

Dave Pawson dave.pawson at gmail.com
Wed Sep 17 01:09:27 PDT 2008


2008/9/13 Gavin Lambert <antlr at mirality.co.nz>:
> At 01:00 14/09/2008, Dave Pawson wrote:
>>CONTENT: ~(NEWLINE)+;
> [...]
>>line:  (c=CONTENT NEWLINE ) {
>>            System.out.println("<para>"+ $c.text +"</para>\n" );}|
>>     STAR c=CONTENT+ NEWLINE+ {
>>            System.out.println("<list>"+ $c.text );}   ;
> [...]
>>The output is
>><para>content only</para>
>>
>><para>* LIST list content</para>
>>
>><para>* LIST list content more</para>
>
> You'll note that "<list>" doesn't appear in the output -- that's a sign that
> you're never hitting the second alt, which suggests that the STAR is getting
> absorbed by the CONTENT rule.  Try changing CONTENT to this:
>
> CONTENT: ~(STAR | NEWLINE) (~NEWLINE)*;

That still isn't the 'double' markup I'm chasing:
<list>
<item>list item content</item>
<item>Second item content</item>

With just the above change I'm getting:

<para>content only</para>

<list> LIST list content
<list> LIST list content more


So it is separating the two ('normal' content and lists), but not
answering my original requirement.







>
>
> Another option would be to do all the matching in the lexer:
>
> NEWLINE : ('\r' | '\n') { $channel = HIDDEN; };
> LISTITEM : '*' (~NEWLINE)* { setText(getText().substring(1)); };
> TEXT : ~('*' | NEWLINE) (~NEWLINE)*;
>
> line : TEXT { System.out.println("<para>" + $TEXT.text + "</para>"); }
>     | LISTITEM { System.out.println("<item>" + $LISTITEM.text + "</item>"); }
>     ;
>
> It wouldn't be hard from there to generate a surrounding "<list>" element
> for groupings of LISTITEMs:
>
> line : TEXT { System.out.println("<para>" + $TEXT.text + "</para>"); }
>     | list
>     ;
>
> list : (LISTITEM) => { System.out.println("<list>"); }
>         (LISTITEM { System.out.println("<item>" + $LISTITEM.text +
> "</item>"); })+
>       { System.out.println("</list>"); }
>     ;
>
> (You probably don't even need the predicate there, since ANTLR shouldn't try
> to enter the list rule unless there's a LISTITEM present anyway.  But it
> never hurts to be paranoid.)
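
For comparison, the grouping idea above can be sketched in plain Java,
independent of ANTLR. This is only an illustration of the intended output
shape, not the generated parser: the class name ListGrouping and the method
toMarkup are hypothetical, and it assumes one logical line per list entry,
with blank lines ignored (much as a hidden NEWLINE would be). Unlike the
grammar above, it also trims the whitespace after the '*'.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ListGrouping {

    // Hypothetical helper: wraps runs of consecutive "*" lines in a single
    // <list> element and every other non-blank line in <para>.
    public static List<String> toMarkup(List<String> lines) {
        List<String> out = new ArrayList<String>();
        boolean inList = false;
        for (String line : lines) {
            if (line.isEmpty()) {
                continue; // assumption: blank lines carry no meaning
            }
            if (line.startsWith("*")) {
                if (!inList) {
                    out.add("<list>"); // open the wrapper once per run
                    inList = true;
                }
                // drop the leading '*' and any surrounding whitespace
                out.add("<item>" + line.substring(1).trim() + "</item>");
            } else {
                if (inList) {
                    out.add("</list>"); // a non-list line ends the run
                    inList = false;
                }
                out.add("<para>" + line + "</para>");
            }
        }
        if (inList) {
            out.add("</list>"); // close a list that runs to end of input
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList(
            "content only",
            "* list item content",
            "* Second item content");
        for (String s : toMarkup(input)) {
            System.out.println(s);
        }
    }
}
```

Run on the three sample lines, this prints one <para> line followed by
<list>, two <item> lines, and </list>. The open/close bookkeeping done here
by the inList flag is what the list rule does with its actions before and
after the (LISTITEM ...)+ loop.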



Using the second definition of line I'm getting

ANTLR Parser Generator  Version 3.1 (August 12, 2008)  1989-2008
warning(200): Test.g:23:81: Decision can match input such as
"LISTITEM" using multiple alternatives: 1, 2
As a result, alternative(s) 2 were disabled for that input
Semantic predicates were present but were hidden by actions.


Thanks for the help, I think I need quite a bit more reading!


regards



-- 
Dave Pawson
XSLT XSL-FO FAQ.
Docbook FAQ.
http://www.dpawson.co.uk

