[antlr-interest] Newbie! how can I convert a list of bullets to an HTML list

Matthew Ford matthew.ford at forward.com.au
Thu Jun 2 15:02:05 PDT 2005


Is the list actually the following character sequence?
\n
\t-\tbullet\n
\t-\tbullet\n
\t-\tbullet\n
\t-\tbullet\n

What distinguishes a list from other text that happens to contain \t-\t?
matthew

You may need to use infinite lookahead to decide whether you are processing a list, like
(list) => list
See Syntactic Predicates in the docs.
matthew
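
As a rough illustration of what a predicate like (list) => list does, here is a small Python sketch (not ANTLR). It looks ahead over as many tokens as needed to see whether a run of LIST PARA pairs starts at the current position, and only then commits to emitting a list. The token tuples and the names match_list and to_html are made up for this example; only the LIST and PARA token names come from the grammar quoted below. Unlike ANTLR, which rewinds and matches the rule again after the predicate succeeds, the sketch simply reuses the end position found by the lookahead.

```python
from typing import List, Optional, Tuple

# Hypothetical token representation: (type, text).
Token = Tuple[str, str]


def match_list(tokens: List[Token], pos: int) -> Optional[int]:
    """Speculatively match (LIST PARA)+ starting at pos.

    Returns the position just past the match, or None if no list
    starts here. Nothing is consumed: this is the lookahead step.
    """
    end = pos
    while (end + 1 < len(tokens)
           and tokens[end][0] == "LIST"
           and tokens[end + 1][0] == "PARA"):
        end += 2
    return end if end > pos else None


def to_html(tokens: List[Token]) -> str:
    out = []
    pos = 0
    while pos < len(tokens):
        end = match_list(tokens, pos)  # the "(list) =>" decision
        if end is not None:
            # One <ul> for the whole run, one <li> per bullet.
            items = [tokens[i + 1][1] for i in range(pos, end, 2)]
            out.append("<ul>" + "".join("<li>%s</li>" % t for t in items) + "</ul>")
            pos = end
        else:
            # Anything that does not start a list is treated as a paragraph.
            out.append("<p>%s</p>" % tokens[pos][1])
            pos += 1
    return "".join(out)
```

Calling to_html on a token list such as PARA, LIST, PARA, LIST, PARA yields one paragraph followed by a single list with two items, which is the grouping the original question asks for.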
  ----- Original Message ----- 
  From: Matthew Pearce 
  To: antlr-interest at antlr.org 
  Sent: Friday, June 03, 2005 1:19 AM
  Subject: [antlr-interest] Newbie! how can I convert a list of bullets to an HTML list


  I'd like to convert a list of bullets to an HTML list, i.e.:

  From:

  -          bullet
  -          bullet
  -          bullet

  To:

  <ul><li>bullet</li><li>bullet</li><li>bullet</li></ul>



  I thought over a few different options:

  1. Have the lexer produce a LIST token when it matches:

   - bullet

  But I don't know how to get the parser to find the <ul> tags, because I cannot add a special case.

  2. Have the lexer produce a LIST token when it matches:

  -          bullet
  -          bullet
  -          bullet

  But I don't know how to get the parser to insert the <li> tags, because it hasn't tokenized each bullet.



  3. Have the parser match a list rule such as:

  list:       LIST^  PARA (LIST! PARA)+

  Which would give me an AST node like the following, which could support nested lists:

  LIST ----+----PARA
           +----PARA
           +----LIST----+----PARA
                        +----PARA

  But this gives me non-determinism between matching a plain paragraph (PARA) and a bulleted line (LIST PARA).





  Can anyone suggest an approach?  





  class CourseTreeWalker extends TreeParser;

  tree2html returns [String s]
  { s = ""; }
      :
        (#(t:TTL (p:PARA | l:list)+ {
              s+="<h4>" +t+ "</h4>\n";
              s+= "<p>" +p+ "</p>\n";
              s+= "<ul>"+l+"</ul>"; } ))+   // this doesn't do what I want
      ;

  list        // this doesn't do what I want
  { String l = ""; }
   :
        (#(LIST (p2:PARA) {
              l+="<ul><li>" +p2+ "</li></ul>\n";
               } ))
  ;
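
For comparison, here is a minimal Python sketch (not ANTLR) of the kind of tree walk that would produce the desired output. It assumes the parser has already built one LIST node per run of bullets, with one PARA child per bullet, as in the tree drawn above. The (type, text, children) node shape and the name render are invented for this example. The point it illustrates is that <ul> is emitted once per LIST node while <li> is emitted once per child.

```python
from html import escape

# Hypothetical AST shape: a node is (type, text, children).
# A LIST node holds one PARA child per bullet and may hold nested LIST nodes.


def render(node):
    kind, text, children = node
    if kind == "TTL":
        # Section title followed by its paragraphs and lists.
        return "<h4>%s</h4>\n" % escape(text) + "".join(render(c) for c in children)
    if kind == "PARA":
        return "<p>%s</p>\n" % escape(text)
    if kind == "LIST":
        items = []
        for child in children:
            if child[0] == "PARA":
                items.append("<li>%s</li>" % escape(child[1]))
            else:
                # A nested LIST is rendered inside its own <li>.
                items.append("<li>%s</li>" % render(child).rstrip("\n"))
        return "<ul>" + "".join(items) + "</ul>\n"
    raise ValueError("unexpected node type: %r" % kind)
```

In the walker above, by contrast, both tree2html and list add their own <ul> wrapper, and list wraps every single PARA in a complete <ul><li>...</li></ul>.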



  class CourseParser extends Parser;

  options {
      buildAST = true;
  }

  file :  (section)+ EOF! ;

  section : TTL^ (listexpr)+;

  listexpr : (LIST^)? paraexpr;   // this just matches each bullet, instead of treating bullets as a group

  paraexpr: (PARA);





  class CourseLexer extends Lexer;

  options {
      k = 3;
      charVocabulary = '\3'..'\377';
  }

  PARA  : ("LZU") =>
          ("LZU" (LETTER | DIGIT | ' ' | '/')+)  { $setType(TTL); }
          |
          ("Des") =>
          ("Description:")   { $setType(TTL); }
          |
          ("Lea") =>
          ("Learning objectives:")   { $setType(TTL); }
          |
          ("Tar") =>
          ("Target audience:")   { $setType(TTL); }
          |
          ("Pre") =>
          ("Prerequisites:")   { $setType(TTL); }
          |
           (CHAR | ' ' )+
        ;

  LIST   : ('-' | '·') ;

  NEWLINE : (
                    ('\r''\n')=> '\r''\n' //DOS
                    | '\r' //MAC
                    | '\n' //UNIX
                    )
                    { $setType(Token.SKIP); newline();  }
              ;

  protected
  DIGIT
        : '0'..'9'
        ;

  protected
  LETTER
        : ('a'..'z' | 'A'..'Z')
        ;

  protected
  CHAR
        : ~( '\n' | '\r' | ' ' | '\t' | '\f' | '-' | '·' )
        ;

        

      

      