[antlr-interest] Ignore Whitespace
Anakreon Mejdi
amejdi at ertonline.gr
Tue Nov 5 03:20:25 PST 2002
If both whitespace and ':' should be ignored, then:
class CSVLexer extends Lexer;
options{filter=IGNORE;}
protected IGNORE
:
'\t'
| ' '
| '\n' {newline();}
| '\r' '\n' {newline();}
| ':'
;
There is no need to set the token type to Token.SKIP by hand.
The lexer discards whatever the filter rule matches, so the parser never learns that the whitespace, the tabs or the ':' existed.
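For comparison, the fallback mentioned in the question (split on comma and newline, then trim each field outside the lexer) could look roughly like the plain Java below. This is only a sketch and does not use ANTLR at all; the class name FieldSplitter and the method splitFields are made up for illustration. It strips leading and trailing whitespace from each field but keeps interior spaces such as the one in "Alphascreen 384".

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of ANTLR or of the grammar above.
class FieldSplitter {

    // Split one input line on commas and trim the whitespace around each field.
    // Interior spaces inside a field are left untouched.
    static List<String> splitFields(String line) {
        List<String> fields = new ArrayList<>();
        // A limit of -1 keeps trailing empty fields, as in "Description: ,".
        for (String raw : line.split(",", -1)) {
            fields.add(raw.trim());
        }
        return fields;
    }

    public static void main(String[] args) {
        // Print the trimmed fields for two sample lines from the question.
        System.out.println(splitFields("Assay: , std"));
        System.out.println(splitFields("Alphascreen 384 ,"));
    }
}
```

The ':' is left in place here; dropping it as well would be one more replace on each field.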
Neil Benn wrote:
> Hello,
>
> I'm sorry to post another newbie question but I'm stumped! I'm
> looking at the example to ignore whitespace. The text I'm trying to
> tokenise is:-
>
> Assay: , std
> Alphascreen 384 ,
> Description: ,
> Software: , Fusion
> 3.50 , Instrument
> Serial: , ---------
> Sample Map: ,
> demo ,
> Description: ,
> Detection Mode: ,
> Alpha ,
> Shaking: , Disabled
> Plate Type: , Packard
> OptiPlate 384 , Temperature
> Control: , Off
>
> If I tokenize this on comma and newline then I will get the tokens I
> wish. However this will also include the whitespace trailing each
> comment. I can get rid of this by calling a trim in the parser but I'm
> trying to learn how to do this in the lexer. I looked at the ignore
> whitespace section in the docs but it doesn't seem to ignore the
> trailing whitespace. The code is something like :-
>
> -----------------------------------------------------------
>
> class CSVLexer extends Lexer;
>
> options{filter=IGNORE;}
>
> DISCARD: ( '\t'
> | ','
> | '\n' {newline();}
> | '\r' '\n' {newline();}
> )+
> {$setType(Token.SKIP);}
> ;
>
> KEEP
> options { ignore=WS; }
> : ( '\u0020' .. '\u002B'
> | '\u002D' .. '\u0039'
> | '\u003B' .. '\u00FF')+
> ;
>
> protected
>
> IGNORE: (':');
> WS: (' ' | '\t');
>
> ------------------------------------------------
>
> The code compiles OK but the trailing whitespace doesn't get
> removed. Is this issue something I'm best dealing with in the parser or
> is there a way I can deal with it in the lexer?
>
>
> Thanks in advance for your assistance.
>
> Cheers,
>
> Neil Benn
> Senior Automation Informatics Scientist
>
> Cambridge Antibody Technology
> The Science Park
> Melbourn
> Cambridgeshire
> SG8 6JJ, UK
>
> Telephone: + 44 (0) 1763 263233
> Facsimile + 44 (0) 1763 263413
> Email: mailto:neil.benn at cambridgeantibody.com
> http://www.cambridgeantibody.com
>
> Cambridge Antibody Technology Limited *
> Registered Office: The Science Park, Melbourn, Cambridgeshire, SG8 6JJ, UK
> Registered in England and Wales number 2451177
> (* Cambridge Antibody Technology Limited is a member of the Cambridge
> Antibody Technology Group of Companies)
>
> Confidentiality Note: This information and any attachments is confidential
> and only for use by the individual or entity to whom it has been sent. Any
> unauthorised dissemination, distribution or copying of this message is
> strictly prohibited. If you are not the intended recipient please inform the
> sender immediately by reply e-mail and delete this message from your system.
> Thank you for your co-operation.
>
>
> Your use of Yahoo! Groups is subject to the Yahoo! Terms of Service
> <http://docs.yahoo.com/info/terms/>.