[antlr-interest] Newbie! how can I convert a list of bullets to an HTML list
Matthew Ford
matthew.ford at forward.com.au
Thu Jun 2 15:02:05 PDT 2005
Is the list actually the following character sequence?
\n
\t-\tbullet\n
\t-\tbullet\n
\t-\tbullet\n
\t-\tbullet\n
What makes a list different from other text like \t-\t?
matthew
You may need infinite lookahead to decide whether you are processing a list,
using a syntactic predicate like
(list) => list
See "Syntactic Predicates" in the docs.
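As a rough illustration of what a syntactic predicate does (this is not ANTLR's generated code), here is a hand-written Java sketch: the parser tries the list rule in "guess" mode, rewinds, and only then matches it for real. The class PredicateSketch, the Kind enum and the method names are made up for this example; the token kinds are simply named after the LIST, PARA and TTL tokens in the grammar below.

```java
import java.util.List;

public class PredicateSketch {
    // Hypothetical token kinds, named after the tokens in the posted grammar.
    public enum Kind { LIST, PARA, TTL }

    private final List<Kind> tokens;
    private int pos = 0;

    public PredicateSketch(List<Kind> tokens) {
        this.tokens = tokens;
    }

    // Consumes one token of the given kind; returns false if the next token differs.
    private boolean match(Kind kind) {
        if (pos < tokens.size() && tokens.get(pos) == kind) {
            pos++;
            return true;
        }
        return false;
    }

    // list : (LIST PARA)+ ;  returns the number of items matched, 0 if none.
    private int list() {
        int items = 0;
        while (true) {
            int mark = pos;
            if (match(Kind.LIST) && match(Kind.PARA)) {
                items++;
            } else {
                pos = mark; // undo a partial LIST-without-PARA match
                break;
            }
        }
        return items;
    }

    // (list) => list : guess first, rewind, then match for real if the guess succeeded.
    public int listIfPredicted() {
        int mark = pos;
        boolean predicted = list() > 0;
        pos = mark; // rewind after guessing
        return predicted ? list() : 0;
    }

    public int position() {
        return pos;
    }
}
```

The point of the rewind is that the guess may look arbitrarily far ahead without consuming anything; if it fails, the parser is free to try another alternative (for example a plain PARA) from the same position.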
matthew
----- Original Message -----
From: Matthew Pearce
To: antlr-interest at antlr.org
Sent: Friday, June 03, 2005 1:19 AM
Subject: [antlr-interest] Newbie! how can I convert a list of bullets to an HTML list
I'd like to convert a list of bullets to an HTML list, i.e.:
From:
- bullet
- bullet
- bullet
To:
<ul><li>bullet</li><li>bullet</li><li>bullet</li></ul>
I thought over a few different options:
1. Have the lexer produce a LIST token when it matches:
- bullet
But then I don't know how to get the parser to work out where the <ul> tags go, because I cannot add a special case.
2. Have the lexer produce a LIST token when it matches:
- bullet
- bullet
- bullet
But then I don't know how to get the parser to insert the <li> tags, because the individual bullets have not been tokenized.
3. Have the parser match a list rule such as:
list: LIST^ PARA (LIST! PARA)+ ;
This would give me an AST like the following, which could also support nested lists:
LIST ----+----PARA
         +----PARA
         +----LIST--------+----PARA
                          +----PARA
But this gives me non-determinism between matching a plain paragraph (PARA) and a bulleted line (LIST PARA).
Can anyone suggest an approach?
class CourseTreeWalker extends TreeParser;

tree2html returns [String s]
{ s = ""; }
    :
    (#(t:TTL (p:PARA | l:list)+ {
        s+="<h4>" +t+ "</h4>\n";
        s+= "<p>" +p+ "</p>\n";
        s+= "<ul>"+l+"</ul>"; } ))+ // this doesn't do what I want
    ;

list // this doesn't do what I want
{ String l = ""; }
    :
    (#(LIST (p2:PARA) {
        l+="<ul><li>" +p2+ "</li></ul>\n";
    } ))
    ;
class CourseParser extends Parser;
options {
    buildAST = true;
}

file : (section)+ EOF! ;
section : TTL^ (listexpr)+;
listexpr : (LIST^)? paraexpr; // this just matches each bullet, instead of treating bullets as a group
paraexpr: (PARA);
class CourseLexer extends Lexer;
options {
    k = 3;
    charVocabulary = '\3'..'\377';
}

PARA : ("LZU") =>
       ("LZU" (LETTER | DIGIT | ' ' | '/')+) { $setType(TTL); }
     |
       ("Des") =>
       ("Description:") { $setType(TTL); }
     |
       ("Lea") =>
       ("Learning objectives:") { $setType(TTL); }
     |
       ("Tar") =>
       ("Target audience:") { $setType(TTL); }
     |
       ("Pre") =>
       ("Prerequisites:") { $setType(TTL); }
     |
       (CHAR | ' ' )+
     ;

LIST : ('-' | '·') ;

NEWLINE : (
        ('\r''\n')=> '\r''\n' //DOS
        | '\r' //MAC
        | '\n' //UNIX
        )
        { $setType(Token.SKIP); newline(); }
    ;

protected
DIGIT
    : '0'..'9'
    ;

protected
LETTER
    : ('a'..'z' | 'A'..'Z')
    ;

protected
CHAR
    : ~( '\n' | '\r' | ' ' | '\t' | '\f' | '-' | '·' )
    ;
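
Independently of how the grammar ends up, the grouping asked for above is a small state machine: open <ul> at the first bullet of a run, emit one <li> per bullet, and close </ul> when the run ends. Below is a minimal Java sketch of that idea, working on plain lines rather than ANTLR tokens. BulletsToHtml and its methods are hypothetical names; the sketch assumes a bullet line starts with '-' or '·' (written as \u00B7) after optional whitespace, and it does no HTML escaping or nesting.

```java
import java.util.List;

public class BulletsToHtml {
    // Assumption: a bullet line starts with '-' or the middle dot (U+00B7).
    static boolean isBullet(String line) {
        String t = line.trim();
        return !t.isEmpty() && (t.charAt(0) == '-' || t.charAt(0) == '\u00B7');
    }

    // Wraps each run of consecutive bullet lines in one <ul>, other lines in <p>.
    public static String convert(List<String> lines) {
        StringBuilder out = new StringBuilder();
        boolean inList = false;
        for (String line : lines) {
            if (isBullet(line)) {
                if (!inList) {          // first bullet of a run opens the list
                    out.append("<ul>");
                    inList = true;
                }
                String text = line.trim().substring(1).trim();
                out.append("<li>").append(text).append("</li>");
            } else {
                if (inList) {           // a non-bullet line ends the run
                    out.append("</ul>\n");
                    inList = false;
                }
                if (!line.trim().isEmpty()) {
                    out.append("<p>").append(line.trim()).append("</p>\n");
                }
            }
        }
        if (inList) {                   // input ended while still inside a list
            out.append("</ul>\n");
        }
        return out.toString();
    }
}
```

For the three-bullet example from the message, convert(Arrays.asList("- bullet", "- bullet", "- bullet")) produces <ul><li>bullet</li><li>bullet</li><li>bullet</li></ul> followed by a newline. The same open-on-first, close-on-last bookkeeping is what a parser rule that groups (LIST PARA)+ under a single node would give the tree walker for free.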