[antlr-interest] Newbie! how can I convert a list of bullets to an HTML list

Thu Jun 2 15:02:05 PDT 2005

Is the list actually the following character sequence?
\n
\t-\tbullet\n
\t-\tbullet\n
\t-\tbullet\n
\t-\tbullet\n

What makes a list different from other text like \t-\t?
matthew
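
One way to make the question above concrete is to write down what distinguishes a bullet line from ordinary text. The sketch below is plain Java rather than ANTLR grammar. It assumes a bullet line is optional leading whitespace, then '-' or '·', then at least one space or tab, then the item text. The class name BulletLine and its methods are hypothetical and are not part of the posted grammar.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of the posted grammar.
class BulletLine {
    // Assumption: optional leading spaces/tabs, then '-' or the middle dot (U+00B7),
    // then at least one space or tab, then non-empty item text.
    private static final Pattern BULLET =
        Pattern.compile("^[ \\t]*[-\\u00B7][ \\t]+(.*\\S)[ \\t]*$");

    // True if the whole line has the assumed bullet shape.
    static boolean isBullet(String line) {
        return BULLET.matcher(line).matches();
    }

    // Returns the item text without the marker, or null if the line is not a bullet.
    static String text(String line) {
        Matcher m = BULLET.matcher(line);
        return m.matches() ? m.group(1) : null;
    }
}
```

Under these assumptions a dash in the middle of a paragraph is not treated as a bullet, because the marker has to be the first non-blank character on the line. If the real input uses a different shape, the pattern would need to change accordingly.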

You may need to use infinite lookahead to decide that you are processing a list,
 like 
(list) => list
See Syntactic Predicates in the docs.
matthew
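
Whether or not a predicate turns out to be necessary, the grouping that the question asks for can be sketched outside ANTLR. The Java below is only an illustration, assuming the lexer already emits a LIST token in front of each bullet's PARA, as the posted lexer appears to do. Here the string "-" stands in for the LIST token and every other string stands in for a PARA token. The class BulletGrouper and the method toHtml are hypothetical names.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: groups consecutive bullet items into a single <ul>.
class BulletGrouper {
    // Assumption: "-" stands in for the LIST token; any other string is a PARA token.
    static final String LIST = "-";

    static String toHtml(List<String> tokens) {
        StringBuilder out = new StringBuilder();
        boolean inList = false;   // true while a <ul> is open
        int i = 0;
        while (i < tokens.size()) {
            String tok = tokens.get(i);
            if (tok.equals(LIST) && i + 1 < tokens.size()) {
                // LIST followed by its text: open the list once, then add an item.
                if (!inList) { out.append("<ul>"); inList = true; }
                out.append("<li>").append(tokens.get(i + 1)).append("</li>");
                i += 2;
            } else {
                // Anything else ends the current run of bullets.
                if (inList) { out.append("</ul>"); inList = false; }
                // A trailing LIST marker with no text is dropped.
                if (!tok.equals(LIST)) out.append("<p>").append(tok).append("</p>");
                i++;
            }
        }
        if (inList) out.append("</ul>");
        return out.toString();
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("-", "bullet", "-", "bullet", "-", "bullet");
        System.out.println(toHtml(tokens));
    }
}
```

The point of the sketch is that the <ul> is opened once per run of bullets and the <li> once per bullet, which corresponds to a parser rule that matches the whole run rather than one bullet at a time. It does not escape HTML and does not handle nested lists.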
  ----- Original Message ----- 
  From: Matthew Pearce 
  To: antlr-interest at antlr.org 
  Sent: Friday, June 03, 2005 1:19 AM
  Subject: [antlr-interest] Newbie! how can I convert a list of bullets to an HTML list

  I'd like to convert a list of bullets to an HTML list, i.e.:

  From:

  -          bullet

  -          bullet

  -          bullet

  To:

  <ul><li>bullet</li><li>bullet</li><li>bullet</li></ul>

  I thought over a few different options:

  1. Have the lexer produce a LIST token when it matches:

   - bullet

  But I don't know how to get the parser to find the <ul> tags, because I cannot add a special case

  2. Have the lexer produce a LIST token when it matches:

  -          bullet

  -          bullet

  -          bullet

  But I don't know how to get the parser to insert the <li> tags, because it hasn't tokenized each bullet

  3. Have the parser match a rule for list that matches like:

  list:       LIST^  PARA (LIST! PARA)+

  Which would give me an AST like the following, which could support nested lists:

                          LIST ----+----PARA

                                      +----PARA

                                      +----LIST--------+-PARA

                                       +---PARA         

  But this gives me non-determinism between matching a plain paragraph (PARA) and a bulleted line (LIST PARA).

  Can anyone suggest an approach?  

  class CourseTreeWalker extends TreeParser;

  tree2html returns [String s]

  { s = ""; }

      :

        (#(t:TTL (p:PARA | l:list)+ { 

              s+="<h4>" +t+ "</h4>\n";

              s+= "<p>" +p+ "</p>\n";

              s+= "<ul>"+l+"</ul>"; } ))+   // this doesn't do what I want

      ;

  list        // this doesn't do what I want

  { String l = ""; }

   :

        (#(LIST (p2:PARA) { 

              l+="<ul><li>" +p2+ "</li></ul>\n";

               } ))

  ;

  class CourseParser extends Parser;

  options {

      buildAST = true;

  }

  file :  (section)+ EOF! ;

  section : TTL^ (listexpr)+;

  listexpr : (LIST^)? paraexpr;   // this just matches each bullet, instead of treating bullets as a group

  paraexpr: (PARA);

  class CourseLexer extends Lexer;

  options {

      k = 3; 

      charVocabulary = '\3'..'\377';

  }

  PARA  : ("LZU") =>

          ("LZU" (LETTER | DIGIT | ' ' | '/')+)  { $setType(TTL); }

          |

          ("Des") =>

          ("Description:")   { $setType(TTL); }

          |

          ("Lea") =>

          ("Learning objectives:")   { $setType(TTL); }

          |

          ("Tar") =>

          ("Target audience:")   { $setType(TTL); }

          |

          ("Pre") =>

          ("Prerequisites:")   { $setType(TTL); }

          |

           (CHAR | ' ' )+ 

        ;

  LIST   : ('-' | '·') ;

  NEWLINE : (

                    ('\r''\n')=> '\r''\n' //DOS

                    | '\r' //MAC 

                    | '\n' //UNIX

                    )

                    { $setType(Token.SKIP); newline();  }

              ;

  protected

  DIGIT

        : '0'..'9'

        ;

  protected

  LETTER

        : ('a'..'z' | 'A'..'Z')

        ;

  protected

  CHAR

        : ~( '\n' | '\r' | ' ' | '\t' | '\f' | '-' | '·' )

        ;
