[antlr-interest] D'oh (re HTML grammar problem)

Thu Oct 10 10:07:24 PDT 2002

--- In antlr-interest at y..., "lansdownsoftware" <dave-Johnson at b...> 
wrote:
> Hi,
> 
> I'm trying to build an application around an ANTLR-generated HTML 
> grammar recogniser.
> I used the example as a starting point.
> Unfortunately it falls over where it defines:
> script
> 	:	OSCRIPT (~CSCRIPT)+ CSCRIPT
> 	;
> If the script contains '<' it decides it's got something to match, 
> and fails.
> I only want to keep up with line numbers  within scripts so tried:
> script
> 	:	OSCRIPT
> 		(SCDATA)*
> 		CSCRIPT
> 	;
> where..
> SCDATA
> 	:	(	options{generateAmbigWarnings=false;}
> 		:       // allow < if not part of </SCRIPT>
> 			{LA(2)=='/' || LA(3)=='s' || LA(4)=='c' || LA(5)=='r'}? '<'
> 		|	'\r' '\n'		{newline();}
> 		|	'\r'			{newline();}
> 		|	'\n'			{newline();}
> 		|	~('<'|'\n'|'\r')
> 		)*
> 	;
> but this bursts with lexical nondeterminism as you'd expect.
> Any ideas how I can grab everything within the script without 
> distractions?
> Is there anything in ANTLR equivalent to PCCTS's #lexclass?
> (This provided a simple way for me to hive off a particular set of 
> tests required within certain bounds)
> Has anyone worked the grammar file into a completely working 
> concern?
> (the current example file misses quite a few element types)
> 
> Thanks
> Dave

MAKE IT PROTECTED. Having stated the problem in public, I found the 
solution soon after.
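
A minimal, untested sketch of what "protected" could look like here, 
assuming ANTLR 2 lexer syntax. In ANTLR 2 a lexer rule marked protected 
is left out of the lexer's top-level token prediction and can only be 
invoked from other lexer rules, so SCDATA no longer competes with the 
other rules that start with '<'. The SCRIPT rule below is hypothetical 
(it stands in for the OSCRIPT ... CSCRIPT sequence and is not taken 
from the example grammar), and the lower-case literals assume the lexer 
sets caseSensitive=false. The predicate is the negation of the quoted 
one, which as written accepts '<' when it is followed by '/'.

// Hypothetical rule: one token for the whole element.
// Newlines inside the opening tag are not tracked in this sketch.
SCRIPT
	:	"<script" (~'>')* '>'
		SCDATA
		"</script>"
	;

// Only reachable from other lexer rules, never matched on its own.
protected
SCDATA
	:	(	options{generateAmbigWarnings=false;}
		:	// take '<' only when it does not begin "</scr"
			{!(LA(2)=='/' && LA(3)=='s' && LA(4)=='c' && LA(5)=='r')}? '<'
		|	'\r' '\n'		{newline();}
		|	'\r'			{newline();}
		|	'\n'			{newline();}
		|	~('<'|'\n'|'\r')
		)*
	;

With this shape the parser would see a single SCRIPT token, and line 
numbers stay correct because newline() is still called inside the 
script body.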

Dave
