[antlr-interest] (unknown)

Thu Oct 10 08:23:06 PDT 2002

Hi,

I'm trying to build an application around an ANTLR-generated HTML 
grammar recogniser.
I used the example HTML grammar as a starting point.
Unfortunately it falls over where it defines:
script
	:	OSCRIPT (~CSCRIPT)+ CSCRIPT
	;
If the script body contains a '<', the lexer decides it has the start 
of a tag to match, and fails.
I only want to keep track of line numbers within scripts, so I tried:
script
	:	OSCRIPT
		(SCDATA)*
		CSCRIPT
	;
where:
SCDATA
	:	(	options{generateAmbigWarnings=false;}
		:       // allow < if not part of </SCRIPT>
			{LA(2)=='/' || LA(3)=='s' || LA(4)=='c' || LA(5)=='r'}? '<'
		|	'\r' '\n'		{newline();}
		|	'\r'			{newline();}
		|	'\n'			{newline();}
		|	~('<'|'\n'|'\r')
		)*
	;
but this produces lexical nondeterminism warnings, as you'd expect.
Any ideas how I can grab everything within the script body, up to the 
closing tag, without the lexer being distracted by its contents?
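
To make clear what I'm after, here is a rough hand-written sketch in 
plain Java of the behaviour I want from SCDATA. It is not ANTLR output 
and uses no ANTLR API; the class, method and field names are made up 
purely for illustration. It assumes the closing tag starts with 
"</script" in any letter case, and it counts "\r\n", a lone "\r" and a 
lone "\n" as one line break each.

```java
public class ScriptBodyScanner {

    /** Hypothetical result holder: the raw script text and the line breaks seen. */
    public static final class Result {
        public final String body;   // everything before the closing tag
        public final int newlines;  // number of line breaks inside the body
        public final int endIndex;  // index of the '<' that starts the closing tag

        Result(String body, int newlines, int endIndex) {
            this.body = body;
            this.newlines = newlines;
            this.endIndex = endIndex;
        }
    }

    /** Scans from 'start' (just after the opening tag) up to the closing tag. */
    public static Result scan(String input, int start) {
        final String close = "</script";
        int lines = 0;
        int i = start;
        while (i < input.length()) {
            char c = input.charAt(i);
            // A '<' only ends the body when it begins the closing tag.
            if (c == '<' && input.regionMatches(true, i, close, 0, close.length())) {
                return new Result(input.substring(start, i), lines, i);
            }
            if (c == '\r') {
                lines++;
                // Treat "\r\n" as a single line break.
                if (i + 1 < input.length() && input.charAt(i + 1) == '\n') {
                    i++;
                }
            } else if (c == '\n') {
                lines++;
            }
            i++;
        }
        throw new IllegalArgumentException("no closing </script> tag found");
    }
}
```

So a body such as "if (a < b) {...}" should pass straight through, 
with only the line count being updated along the way.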
Is there anything in ANTLR equivalent to PCCTS's #lexclass?
(This provided a simple way for me to hive off a particular set of 
lexical rules that apply only within certain bounds.)
Has anyone turned the example grammar file into a complete, working 
grammar? (The current example file is missing quite a few element types.)

Thanks
Dave
