[antlr-interest] Re: identifier with space
Thomas Brandon
tom at psy.unsw.edu.au
Thu Oct 30 16:01:05 PST 2003
As Loring suggests, a hidden token stream would probably do the
trick. Or, to avoid attaching the hidden tokens (which means using
the hidden-token class, plus either the matching AST class or a pass
between lexer and parser that drops the hidden tokens again), you
could keep the whitespace as ordinary tokens in the lexer, then put
a token stream filter between lexer and parser that rolls IDENT WS+
combinations together and strips any WS not following an IDENT.
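A minimal sketch of that filter idea, written as standalone Python
rather than against ANTLR's TokenStream interface. The (type, text)
tuple shape, the type names and the function name are all made up for
illustration, and it assumes "rolling together" means merging each
IDENT (WS+ IDENT)* run into one IDENT with the inner whitespace kept
verbatim:

```python
from typing import Iterable, Iterator, List, Tuple

# Hypothetical token shape: (type, text). Not ANTLR's Token class.
Token = Tuple[str, str]


def merge_spaced_idents(tokens: Iterable[Token]) -> Iterator[Token]:
    """Merge IDENT (WS+ IDENT)* runs into one IDENT, keeping the inner
    whitespace verbatim, and drop every other WS token."""
    parts: List[str] = []       # pieces of the identifier being built
    pending_ws: List[str] = []  # WS seen since the last IDENT piece
    for kind, text in tokens:
        if kind == "IDENT":
            # WS between two IDENTs becomes part of the merged text.
            parts.extend(pending_ws)
            parts.append(text)
            pending_ws = []
        elif kind == "WS":
            if parts:
                pending_ws.append(text)
            # WS not following an IDENT is simply dropped.
        else:
            # Any other token ends the run; trailing WS is discarded.
            if parts:
                yield ("IDENT", "".join(parts))
                parts = []
            pending_ws = []
            yield (kind, text)
    if parts:
        yield ("IDENT", "".join(parts))
```

Fed the tokens for "SELECT A  Field FROM ATable" (two spaces after
"A"), this yields SELECT, IDENT "A  Field", FROM, IDENT "ATable", so
one space and two spaces stay distinguishable. It relies on the lexer
having already typed SELECT and FROM as keywords rather than IDENT.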
Or could you perhaps recognise the spaces as part of the
identifiers? Something like this (assuming for the example that
idents are only lowercase letters):
IDENT_WITH_SPACES
    :   ('a'..'z')+
        {
            $setType(testLiterals($getText, IDENT_WITH_SPACES));
        }
        (
            // Don't do it if it's a keyword
            { $getType == IDENT_WITH_SPACES }?
            (WS)+
        )?
    ;
That's just off the top of my head, and I'm probably neglecting
something. The code is almost certainly wrong (especially the
testLiterals part) but should give the idea: allow whitespace on
non-literal idents. You probably need to make the WS+ greedy to
avoid ambiguity, though from memory ANTLR should match early and
work anyway.
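To make the intended behaviour concrete without depending on ANTLR
syntax, here is a rough hand-written scanner in Python that does the
same thing: match a lowercase word, check it against a literals
table, and only if it is not a keyword absorb the whitespace that
follows. The KEYWORDS set, the token type names and the lex function
are assumptions for the example, not anything ANTLR generates:

```python
import re
from typing import Iterator, Tuple

# Hypothetical literals table standing in for the grammar's keywords.
KEYWORDS = {"select", "from"}

_WORD_OR_WS = re.compile(r"(?P<word>[a-z]+)|(?P<ws>[ \t]+)")
_WS = re.compile(r"[ \t]+")


def lex(text: str) -> Iterator[Tuple[str, str]]:
    """Yield (type, text) pairs; non-keyword idents keep trailing WS."""
    pos = 0
    while pos < len(text):
        m = _WORD_OR_WS.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character at {pos}: {text[pos]!r}")
        pos = m.end()
        if m.lastgroup == "ws":
            continue  # whitespace not following an ident is skipped
        word = m.group("word")
        if word in KEYWORDS:
            yield (word.upper(), word)  # keywords never absorb whitespace
            continue
        # Greedily absorb all whitespace after a non-keyword ident.
        ws = _WS.match(text, pos)
        if ws:
            pos = ws.end()
            word += ws.group()
        yield ("IDENT_WITH_SPACES", word)
```

On "select a  field from atable" this gives SELECT, then
IDENT_WITH_SPACES tokens "a  " and "field ", then FROM, then
"atable". Each ident still arrives as its own token with the spacing
attached, so the parser would have to concatenate them.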
Tom.
--- In antlr-interest at yahoogroups.com, "lgcraymer" <lgc at m...> wrote:
> Lloyd--
>
> Check out the "Token Streams" part of the ANTLR manual. I think that
> you can capture the whitespace as hidden tokens and then access that
> for reconstructing the input. I've not had occasion to use this
> feature, but it was put in for just this purpose.
>
> --Loring
>
>
> --- In antlr-interest at yahoogroups.com, "lloyd_from_far" <ld at g...> wrote:
> > Hi Loring,
> >
> > I do want to separate "A Field" (1 space) from "A  Field" (2
> > spaces). It's not my fault if I have to write an ADO.NET driver
> > for a stupid & so-called "database".
> >
> > Anyway, managing 1 space (and only one) is certainly better than
> > no space at all!!
> > Obviously I've hit a limitation of ANTLR here; I guess I have to
> > do as you suggested, which would be better than nothing.
> >
> > thanks for the feedback ;-)
> >
> > --- In antlr-interest at yahoogroups.com, "lgcraymer" <lgc at m...> wrote:
> > > --- In antlr-interest at yahoogroups.com, "lloyd_from_far" <ld at g...> wrote:
> > > > sorry, my example was bad.
> > > > let's parse this:
> > > >
> > > > SELECT A Field With Name FROM ATable
> > > >
> > >
> > > Lloyd--
> > >
> > > You're trying to do too much in the lexer--spaces are significant
> > > for separating tokens in your example. If you really want "A Field
> > > With Name" as a single AST node, you are probably better off
> > > reconstructing it:
> > >
> > > select
> > > :
> > > "SELECT" text "FROM" text
> > > ;
> > >
> > > text
> > > { String foo; }
> > > :
> > > a:IDENTIFIER { foo = $a.getText(); }
> > > ( b:IDENTIFIER! { foo += " " + $b.getText(); } )*
> > > { $a.setText(foo); }
> > > ;
> > >
> > > That also has the advantage of converting text to a canonical form
> > > with single spaces--you really don't want "A field" to be different
> > > than "A  field", do you?
> > >
> > > --Loring