[antlr-interest] Parsing XML
Lucas Ontivero
lucasontivero at hotmail.com
Thu Aug 28 13:34:07 PDT 2008
Hi all,
I am building an article processor that loads technical articles from .txt files and converts them to HTML, DOC, and other formats. The articles contain tags such as [link][/link] and [strong][/strong]. The markup is very similar to XML, so I am reusing the grammar from "Parsing XML" (http://www.antlr.org/wiki/display/ANTLR3/Parsing+XML).
The problem is that the generated ArticleProcessorLexer.cs is very large (2.08 MB). My project requires high performance because the articles can be large and my component is part of a web application that may serve several requests at the same time. I need a better way to write this rule: ( PCDATA : {!tagMode}?=> (~'[')+ ; ).
I am a newbie with ANTLR, so I may be confused, but is my grammar OK?
Thank you.
/* Begin Grammar ---------------------------------------------------------------------------------------------------------------------------------------------------------------/
grammar ArticleProcessor;
options {
    language = CSharp;
    output = AST;
    ASTLabelType = CommonTree;
}
@header {
    using System.Collections;
}
@lexer::namespace { ArticleProcessor.Lexer }
@parser::namespace { ArticleProcessor.Parser }
@lexer::members { bool tagMode = false; }
article : element | EOF ;
element
    : TAG_START_OPEN NAME (NAME ATTR_EQ ATTRVALUE)* TAG_CLOSE
      ( element
      | PCDATA
      )*
      TAG_END_OPEN NAME TAG_CLOSE
    ;
TAG_START_OPEN : '[' { tagMode = true; } ;
TAG_END_OPEN : '[/' { tagMode = true; } ;
TAG_CLOSE : {tagMode}?=> ']' { tagMode = false; } ;
PCDATA : {!tagMode}?=> (~'[')+ ;
NAME : {tagMode}?=> ( LETTER | '_' | ':') (NAMECHAR)* ;
ATTR_EQ : { tagMode }?=> '=' ;
ATTRVALUE : { tagMode }?=>
    ( '"' (~'"')* '"'
    | '\'' (~'\'')* '\''
    )
    ;
fragment NAMECHAR : LETTER | DIGIT | '.' | '-' | '_' | ':' ;
fragment DIGIT : '0'..'9' ;
fragment LC : 'a'..'z' ;
fragment UC : 'A'..'Z' ;
fragment LETTER : LC|UC ;
WS : {tagMode}?=> (' '|'\r'|'\t'|'\u000C'|'\n')+ {$channel=HIDDEN;} ;
/* End Grammar -----------------------------------------------------------------------------------------------------------------------------------------------------------------/
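For illustration only, the tokenization that the tagMode rules above are meant to produce can be sketched by hand outside ANTLR. The Python below is a rough approximation and not generated code: the function name `tokenize` and the regular expressions are assumptions derived from the lexer rules above, and it says nothing about how ANTLR implements gated predicates internally.

```python
import re

# Rules that are only valid while tag_mode is true (the {tagMode}?=> rules).
# Order matters: the first alternative that matches wins.
TAG_RULES = [
    ("WS", r"[ \r\t\f\n]+"),
    ("TAG_CLOSE", r"\]"),
    ("NAME", r"[A-Za-z_:][A-Za-z0-9._:\-]*"),
    ("ATTR_EQ", r"="),
    ("ATTRVALUE", '"[^"]*"|\'[^\']*\''),
]
TAG_TOKENS = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TAG_RULES))


def tokenize(text):
    """Split text into (token_type, token_text) pairs, tracking tag_mode."""
    tokens = []
    pos = 0
    tag_mode = False
    while pos < len(text):
        # TAG_END_OPEN and TAG_START_OPEN are not gated, so try them in any mode.
        if text.startswith("[/", pos):
            tokens.append(("TAG_END_OPEN", "[/"))
            pos += 2
            tag_mode = True
        elif text[pos] == "[":
            tokens.append(("TAG_START_OPEN", "["))
            pos += 1
            tag_mode = True
        elif not tag_mode:
            # PCDATA: everything up to the next '[' (or the end of input).
            end = text.find("[", pos)
            if end == -1:
                end = len(text)
            tokens.append(("PCDATA", text[pos:end]))
            pos = end
        else:
            match = TAG_TOKENS.match(text, pos)
            if match is None:
                raise ValueError(f"unexpected character {text[pos]!r} at {pos}")
            kind = match.lastgroup
            if kind == "TAG_CLOSE":
                tag_mode = False
            if kind != "WS":  # WS goes to the hidden channel in the grammar
                tokens.append((kind, match.group()))
            pos = match.end()
    return tokens


if __name__ == "__main__":
    for token in tokenize('[link href="x"]hello[/link]'):
        print(token)
```

In this sketch PCDATA is found with a single scan for the next '[', which is the behaviour the (~'[')+ rule describes; whether a hand-written scan like this is preferable to the generated lexer is a separate question.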