[antlr-interest] D'oh (re HTML grammar problem)
lansdownsoftware
dave-Johnson at btclick.com
Thu Oct 10 10:07:24 PDT 2002
--- In antlr-interest at y..., "lansdownsoftware" <dave-Johnson at b...>
wrote:
> Hi,
>
> I'm trying to build an application around an ANTLR-generated HTML
> grammar recogniser.
> I used the example as a starting point.
> Unfortunately it falls over where it defines:
> script
> : OSCRIPT (~CSCRIPT)+ CSCRIPT
> ;
> If the script contains '<' it decides it's got something to match,
> and fails.
> I only want to keep up with line numbers within scripts so tried:
> script
> : OSCRIPT
> (SCDATA)*
> CSCRIPT
> ;
> where..
> SCDATA
> : ( options{generateAmbigWarnings=false;}
> : // allow < if not part of </SCRIPT>
> {LA(2)=='/' || LA(3)=='s' || LA(4)=='c' || LA(5)=='r'}? '<'
> | '\r' '\n' {newline();}
> | '\r' {newline();}
> | '\n' {newline();}
> | ~('<'|'\n'|'\r')
> )*
> ;
> but this bursts with lexical nondeterminism as you'd expect.
> Any ideas how I can grab everything within the script without
> distractions?
> Is there anything in ANTLR equivalent to PCCTS's #lexclass?
> (This provided a simple way for me to hive off a particular set of
> tests required within certain bounds)
> Has anyone worked the grammar file into a completely working
> concern?
> (the current example file misses quite a few element types)
>
> Thanks
> Dave
MAKE IT PROTECTED, i.e. declare SCDATA as a protected lexer rule.
Having stated the problem in public, the solution appeared soon after.
Dave
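
For later readers, a rough sketch of what "make it protected" could look
like in ANTLR 2 syntax. It is untested and rests on assumptions: the lexer
class name, the SCRIPT token rule, the option values, and the tightened
predicate are all illustrative and are not taken from the example grammar.
The idea is that a protected rule is never tried by nextToken() on its own,
so it only runs when another lexer rule calls it and no longer competes
with the other token definitions.

```antlr
class ScriptSketchLexer extends Lexer;   // hypothetical name
options {
    k = 4;                               // assumed lookahead depth
    charVocabulary = '\3'..'\377';       // gives ~(...) a defined range
    caseSensitive = false;               // assumed, so <SCRIPT> also matches
}

// Hypothetical token rule: swallows the whole element, so the parser
// would see one SCRIPT token instead of OSCRIPT ... CSCRIPT.
// Attributes are skipped; newlines inside the open tag are not counted.
SCRIPT
    :   "<script" (~'>')* '>'
        SCDATA
        "</script>"
    ;

// Protected: only reachable through SCRIPT, never matched by itself.
protected
SCDATA
    :   (   options { generateAmbigWarnings=false; }
        :   // allow '<' unless it begins "</s", taken here as the
            // start of </script> (an assumption, not a full check)
            { !(LA(2)=='/' && LA(3)=='s') }? '<'
        |   '\r' '\n'   { newline(); }
        |   '\r'        { newline(); }
        |   '\n'        { newline(); }
        |   ~('<'|'\n'|'\r')
        )*
    ;
```

With this shape the parser rule would match a single SCRIPT token. Whether
that suits the rest of the grammar depends on how the other rules that
start with '<' are written.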