[antlr-interest] Re: File spec grammar
idontwantanidwith2000init
idontwantanidwith2000init at yahoo.com
Sun Apr 11 05:47:04 PDT 2004
You said you have a problem with DIV and PATH_PART.
Could you give an example where the conflict actually matters semantically?
For instance, can 123/123.12 be considered a file name?
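
To make the question concrete, here is a small Python sketch (not ANTLR) of why the lexer complains. The two regular expressions are only my approximations of DIV and PATH_PART as quoted below, and candidate_tokens is a hypothetical helper, not anything from your grammar:

```python
import re

# Approximations of the two lexer rules under discussion (assumed, not
# generated by ANTLR): DIV is '/', PATH_PART is a separator followed by
# any number of characters that are legal in a file name.
DIV = re.compile(r"/")
PATH_PART = re.compile(r"[\\/][^\\/:*?<>|]*")


def candidate_tokens(text, pos):
    """Return the names of the rules that can start a match at text[pos]."""
    rules = {"DIV": DIV, "PATH_PART": PATH_PART}
    return sorted(name for name, rx in rules.items() if rx.match(text, pos))


# At the slash in "123/123.12" both rules are viable.
print(candidate_tokens("123/123.12", 3))
```

Both rules can start at the slash, so from the characters alone the input reads equally well as a division or as the start of a path part. Whether that matters depends on where a file name may appear in your language.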
--- In antlr-interest at yahoogroups.com, "Mike Lischke" <lists at l...>
wrote:
> Hi John,
>
> > I haven't actually tried this using Antlr but how about:
>
> Thank you for your example. I came up with something similar, but the
> problem is that with that grammar I don't get all parts (e.g. the
> extension, if there is one). I know the file spec is ambiguous, because
> just from looking at:
>
> /abc
>
> You cannot tell if this is a file name or a directory. However, one can
> say that the last part not terminated by a path separator is a priori a
> file name unless proved wrong in the following semantic phase. This is
> not a serious problem in my eyes. My current grammar is similar to yours
> but a bit more general, as it allows both path separators and Unicode
> file names:
>
> DRIVE_LETTER: 'a'..'z';
> protected
> FILE_NAME_LETTER: ~('\\' | '/' | ':' | '*' | '?' | '<' | '>' | '|');
> protected
> FILE_NAME_SEPARATOR: '\\' | '/';
> PATH_PART: FILE_NAME_SEPARATOR (FILE_NAME_LETTER)*;
>
> file_name:
> (drive)? (PATH_PART)*
> ;
> drive:
> DRIVE_LETTER COLON
> ;
>
> This grammar suffers from the same limitations, though, and causes
> warning messages about lexical nondeterminisms, e.g. for DIV (defined
> as '/') and PATH_PART. I'm not sure how to solve that problem. And I
> really would like to have the file name already split in my AST (drive,
> path, name, extension) instead of adding another parse stage.
>
> My earlier attempt was this:
>
> FILE_NAME_LETTER: ~('\\' | '/' | ':' | '*' | '?' | '<' | '>' | '|');
> EXTENSION_NAME_LETTER: ~('\\' | '/' | ':' | '*' | '?' | '<' | '>' | '|' | '.');
> FILE_NAME_SEPARATOR: '\\' | '/';
>
> // -- file specification
> file_name:
> (drive)? (FILE_NAME_SEPARATOR)? (directory)* filename
> ;
>
> drive:
> "a".."z" COLON
> | "~"
> ;
>
> directory:
> basename FILE_NAME_SEPARATOR
> ;
>
> filename:
> basename ("." extension)?
> ;
>
> basename:
> (FILE_NAME_LETTER)+
> ;
>
> extension:
> (EXTENSION_NAME_LETTER)+
> ;
>
> If this worked, I would get my file names nicely split. Unfortunately,
> it throws several nondeterminism warnings because the file name letters
> conflict with other definitions in my grammar, and additionally I get a
> Java error for the "a".."z" range, which uses matchRange(String, String),
> an ANTLR function that is not accessible by the resulting parser.
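
For comparison, the split you are after (drive, path, name, extension) can be sketched outside the grammar. This is a rough Python illustration, not ANTLR, and split_file_spec is a hypothetical name. It assumes a drive is either a lowercase letter plus colon or "~", as in your drive rule, and it treats whatever follows the last separator as the file name, per your "a priori a file name" convention:

```python
import re


def split_file_spec(spec):
    """Split spec into (drive, directories, basename, extension).

    Sketch only: mirrors the shape of the quoted file_name rule, with the
    text after the last separator taken to be the file name.
    """
    m = re.match(r"[a-z]:|~", spec)
    drive = m.group(0) if m else None
    rest = spec[m.end():] if m else spec

    parts = re.split(r"[\\/]", rest)   # both separators are accepted
    last = parts.pop()                 # text after the final separator
    directories = [p for p in parts if p]

    # The extension is whatever follows the last dot, if there is one.
    basename, dot, extension = last.rpartition(".")
    if not dot:
        basename, extension = last, None
    return drive, directories, basename or None, extension


print(split_file_spec("c:\\dir\\sub/file.tar.gz"))
print(split_file_spec("/abc"))
```

A trailing separator leaves the name empty, so "/abc/" comes back as a directory only, while "/abc" comes back as a file name that a later semantic check could still reclassify.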
>
> > and you did mean unix filenames, right?
>
> I hoped to get both worlds into one grammar :-)
>
> Mike
> --
> www.soft-gems.net