[antlr-interest] Whitespace question

Nicola Musatti Nicola.Musatti at objectway.it
Mon Oct 12 01:14:24 PDT 2009


Reid Rivenburgh wrote:
[...]
> In my grammar, I've defined 
> a searchTerms parser rule, which is one or more searchTerm:
> 
> 	searchTerm+;
> 
> searchTerm matches a SEARCH_TERM token, which can be a number or word 
> (with some special characters like '*' allowed).  The number is the 
> usual definition for a floating point number:
> 
> 	('-'|'+')?((DIGIT+)|(DIGIT*'.'DIGIT+));
> 
> which I hope is correct.  (DIGIT is the fragment 0..9.)  I'm also 
> sending whitespace to the HIDDEN channel, as is often recommended.  It 
> seems like a side effect of this is that this input:
> 
> 4.66.34
> 
> which isn't a valid number, gets parsed as two different terms: 4.66 and 
> .34.  Is there some way to require whitespace between my search terms so 
> that input isn't allowed?  When I was parsing words, this wasn't a 
> problem.  I wouldn't be surprised if my design is a bit wrong still, and 
> that's what's put me in this position.

In most languages this problem doesn't arise because the syntax doesn't
allow two numbers in a row. If that is also true of your language, you
should express it in your grammar. Otherwise you may want the lexer to
recognize numbers with more than one decimal point as a single token and
mark them as errors. For example, assuming you have declared a boolean
error variable in an appropriate place:

('-'|'+')?((DIGIT+)|(DIGIT*'.'DIGIT+(('.'DIGIT+) { error = true; })*));

Here's a larger example of this approach, also due to Jim Idle: 
http://www.antlr.org/wiki/display/ANTLR3/Lexer+grammar+for+floating+point%2C+dot%2C+range%2C+time+specs
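
To make the effect of the rule above concrete outside of ANTLR, here is a
rough sketch in Python that emulates it with a regular expression. It is
not ANTLR code and not the generated lexer: the names NUMBER and tokenize
are hypothetical, and skipping whitespace stands in for sending it to the
HIDDEN channel.

```python
import re

# Emulation of the lexer rule with a regular expression (an assumption for
# illustration, not ANTLR syntax): optional sign, then either a decimal
# number with optional extra '.DIGIT+' groups, or plain digits.
NUMBER = re.compile(r"[-+]?(?:\d*\.\d+(?P<extra>(?:\.\d+)+)?|\d+)")


def tokenize(text):
    """Return (lexeme, error) pairs for the numbers found in text."""
    tokens = []
    pos = 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1  # whitespace is skipped, much like the HIDDEN channel
            continue
        match = NUMBER.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character at {pos}: {text[pos]!r}")
        # The token is flagged when the extra '.DIGIT+' groups matched,
        # which mirrors the { error = true; } action in the rule.
        tokens.append((match.group(0), match.group("extra") is not None))
        pos = match.end()
    return tokens
```

With this sketch, tokenize("4.66.34") gives a single token flagged as an
error, while tokenize("4.66 .34") gives two unflagged tokens.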

Hope this helps.

Cheers,
Nicola Musatti



