[antlr-interest] antlr works ignoring whitespace

Ron AF Greve antlrlist at moonlit.xs4all.nl
Sun Jan 7 12:53:00 PST 2007


Hi,

I am trying to use ANTLRWorks to develop my grammar. It looks like a great 
tool, and I have been using it to test some input files. However, I cannot 
make it ignore whitespace. I tried the v2 and v3 approaches (skip, 
channel=99, channel=HIDDEN, filter), but they all fail: as soon as the input 
hits whitespace, parsing fails. Does anyone know how to do this, or is it 
not (yet) possible?

Below is my grammar, with my latest attempt using the filter option.

Any help is greatly appreciated.

Regards, Ron AF Greve

http://moonlit.xs4all.nl

Grammar
----------------------------------
grammar HTML;

options {

filter=WHITESPACE;

}

htmllist: ( bodylist )?;

bodylist: ( bodyblock )+;

blocklist: ( block )+;

block : ulblock

| hblock

| imgblock

| ablock

| pblock

| BR

;

bodyblock

: BODY ctag (blocklist)? CBODY

;

ulblock : UL ctag (lilist)? CUL

;

lilist : LI ctag (licontents)? CLI

;

pblock : P ctag (blocklist)? CP

;

hblock : ( H1 ctag | H2 ctag | H3 ctag | H4 ctag ) (textitems)? ( CH1 | CH2 | CH3 | CH4 )

;

imgblock: IMG ctag CIMG

;

licontents

: (textitems)+

;

textitems

: TEXT

| pblock

| ablock

;

ablock : A ctag TEXT CA

;

ctag : (optionlist)? CLOSETAG

;

optionlist

: (property)+

;

property: ID EQUAL value

;

value : string1

| integer1

| float1

;


string1 : STRING

;

integer1 : INTEGER

;

float1 : FLOAT

;

FLOAT : INTEGER '.' INTEGER ;

protected

WHITESPACE : ( '\t' | ' ' | '\r' | '\n'| '\u000C' )+

;

ID : ( 'A'..'Z' | 'a'..'z' ) ( 'A'..'Z' | 'a'..'z' | '0'..'9' )+

;

STRING : '"' ( 'A'..'Z' | 'a'..'z' | '0'..'9' )* '"'

;

INTEGER : ( '0'..'9' )+

;

TEXT : ( 'A'..'Z'|'a'..'z'|'0'..'9'|' '|'.'|'?'|'!'|'@' )+

;

EQUAL : '=';

CLOSETAG: '>';

BODY : '<BODY';

P : '<P';

UL : '<UL';

LI : '<LI';

H1 : '<H1';

H2 : '<H2';

H3 : '<H3';

H4 : '<H4';

IMG : '<IMG';

A : '<A';

BR : '<BR>';

CBODY : '</BODY>';

CP : '</P>';

CUL : '</UL';

CLI : '</LI>';

CH1 : '</H1>';

CH2 : '</H2>';

CH3 : '</H3>';

CH4 : '</H4>';

CIMG : '</IMG>';

CA : '</CA>';
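
For comparison, here is a minimal sketch of how whitespace is usually hidden 
in ANTLR v3, as far as I understand it. The grammar name WsDemo and the rule 
idlist are hypothetical and exist only for illustration. The action syntax 
shown is the ANTLR 3.0 release form; the betas may spell it differently. 
The sketch assumes two differences from the grammar above: the whitespace 
rule is not marked protected (fragment in v3), so the lexer emits it as a 
token of its own, and the filter option is left out, since in v3 it is a 
boolean (filter=true) for lexer grammars rather than the name of a rule.

Sketch
----------------------------------
grammar WsDemo;

// Hypothetical parser rule: one or more identifiers.
// Whitespace between them never reaches the parser.
idlist : ( ID )+ EOF ;

ID : ( 'A'..'Z' | 'a'..'z' ) ( 'A'..'Z' | 'a'..'z' | '0'..'9' )* ;

// A normal (non-fragment) lexer rule whose action routes the token to the
// hidden channel. Using { skip(); } instead should discard it entirely.
WHITESPACE : ( '\t' | ' ' | '\r' | '\n' | '\u000C' )+ { $channel = HIDDEN; } ;

Under the same assumptions, a rule such as TEXT above, which also matches 
' ', would still compete with WHITESPACE for the same input characters.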




