[antlr-interest] Newbie! how can I convert a list of bullets to an HTML list
Matthew Pearce
mpearce at digitas.com
Thu Jun 2 08:19:45 PDT 2005
I'd like to convert a list of bullets to an HTML list, i.e.:
From:
- bullet
- bullet
- bullet
To:
<ul><li>bullet</li><li>bullet</li><li>bullet</li></ul>
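To pin down the target, here is a minimal plain-Java sketch (no ANTLR involved) of the conversion I'm after. The class name BulletsToHtml and its convert method are made up for illustration and are not part of my grammar; the sketch also assumes that every bullet sits on its own line and starts with '-' or '*'.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the grammar: it only pins down the
// intended output for a run of "- bullet" lines.
class BulletsToHtml {

    // Converts lines that start with '-' or '*' into a single <ul> element.
    // Lines that are not bullets are ignored in this sketch.
    static String convert(String text) {
        List<String> items = new ArrayList<String>();
        for (String line : text.split("\\r?\\n|\\r")) {
            String trimmed = line.trim();
            if (trimmed.startsWith("-") || trimmed.startsWith("*")) {
                // Drop the bullet marker and the whitespace around the text.
                items.add(trimmed.substring(1).trim());
            }
        }
        if (items.isEmpty()) {
            return "";
        }
        StringBuilder html = new StringBuilder("<ul>");
        for (String item : items) {
            html.append("<li>").append(item).append("</li>");
        }
        return html.append("</ul>").toString();
    }

    public static void main(String[] args) {
        // prints <ul><li>bullet</li><li>bullet</li><li>bullet</li></ul>
        System.out.println(convert("- bullet\n- bullet\n- bullet"));
    }
}
```

The point is that the <ul> wraps the whole run of bullets once, while each bullet gets its own <li>.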
I thought over a few different options:
1. Have the lexer produce a LIST token when it matches:
- bullet
But I don't know how to get the parser to find the <ul> tags, because I
cannot add a special case
2. Have the lexer produce a LIST token when it matches:
- bullet
- bullet
- bullet
But I don't know how to get the parser to insert the <li> tags, because
it hasn't tokenized each bullet
3. Have the parser match a rule for list that matches like:
list: LIST^ PARA (LIST! PARA)+
Which would give me an AST like the following, which could also support nested lists:
LIST ----+----PARA
         +----PARA
         +----LIST--------+---PARA
                          +---PARA
But this gives me non-determinism between matching a plain paragraph
(PARA) and a bulleted line (LIST PARA).
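For what it's worth, the grouping I have in mind for option 3 can be sketched outside ANTLR as a walk over a flat token list. This is only an illustration of the intended behaviour, not a fix for the grammar: the GroupBullets class, its tok helper, and the {type, text} string pairs standing in for tokens are all hypothetical, and nested lists are not handled.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the token stream: each token is {type, text},
// with the type names borrowed from the grammar (TTL, LIST, PARA).
class GroupBullets {

    static String[] tok(String type, String text) {
        return new String[] { type, text };
    }

    static String toHtml(List<String[]> tokens) {
        StringBuilder html = new StringBuilder();
        boolean inList = false;
        int i = 0;
        while (i < tokens.size()) {
            String type = tokens.get(i)[0];
            String text = tokens.get(i)[1];
            if (type.equals("LIST") && i + 1 < tokens.size()
                    && tokens.get(i + 1)[0].equals("PARA")) {
                // LIST followed by PARA is one bullet; <ul> opens only on the first.
                if (!inList) {
                    html.append("<ul>");
                    inList = true;
                }
                html.append("<li>").append(tokens.get(i + 1)[1]).append("</li>");
                i += 2;
                continue;
            }
            // Any other token ends the current run of bullets.
            if (inList) {
                html.append("</ul>\n");
                inList = false;
            }
            if (type.equals("TTL")) {
                html.append("<h4>").append(text).append("</h4>\n");
            } else if (type.equals("PARA")) {
                html.append("<p>").append(text).append("</p>\n");
            }
            i++;
        }
        if (inList) {
            html.append("</ul>\n");
        }
        return html.toString();
    }

    public static void main(String[] args) {
        List<String[]> tokens = Arrays.asList(
                tok("TTL", "Description:"), tok("PARA", "intro"),
                tok("LIST", "-"), tok("PARA", "a"),
                tok("LIST", "-"), tok("PARA", "b"));
        System.out.print(toHtml(tokens));
    }
}
```

In this sketch a PARA that is not preceded by LIST becomes a <p>, and consecutive LIST PARA pairs share one <ul>, which is the grouping I would like the parser or tree walker to produce.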
Can anyone suggest an approach? My current grammar is below.
class CourseTreeWalker extends TreeParser;

tree2html returns [String s]
{ s = ""; }
    :
    (#(t:TTL (p:PARA | l:list)+ {
        s+="<h4>" +t+ "</h4>\n";
        s+= "<p>" +p+ "</p>\n";
        s+= "<ul>"+l+"</ul>"; } ))+ // this doesn't do what I want
    ;

list // this doesn't do what I want
{ String l = ""; }
    :
    (#(LIST (p2:PARA) {
        l+="<ul><li>" +p2+ "</li></ul>\n";
    } ))
    ;

class CourseParser extends Parser;
options {
    buildAST = true;
}

file : (section)+ EOF! ;
section : TTL^ (listexpr)+;
listexpr : (LIST^)? paraexpr; // this just matches each bullet,
                              // instead of treating bullets as a group
paraexpr: (PARA);

class CourseLexer extends Lexer;
options {
    k = 3;
    charVocabulary = '\3'..'\377';
}

PARA : ("LZU") =>
       ("LZU" (LETTER | DIGIT | ' ' | '/')+) { $setType(TTL); }
     |
       ("Des") =>
       ("Description:") { $setType(TTL); }
     |
       ("Lea") =>
       ("Learning objectives:") { $setType(TTL); }
     |
       ("Tar") =>
       ("Target audience:") { $setType(TTL); }
     |
       ("Pre") =>
       ("Prerequisites:") { $setType(TTL); }
     |
       (CHAR | ' ' )+
     ;

LIST : ('-' | '*') ;

NEWLINE : (
            ('\r''\n')=> '\r''\n' //DOS
          | '\r' //MAC
          | '\n' //UNIX
          )
          { $setType(Token.SKIP); newline(); }
        ;

protected
DIGIT
    : '0'..'9'
    ;

protected
LETTER
    : ('a'..'z' | 'A'..'Z')
    ;

protected
CHAR
    : ~( '\n' | '\r' | ' ' | '\t' | '\f' | '-' | '*' )
    ;