[antlr-interest] File spec grammar
Mike Lischke
lists at lischke-online.de
Sun Apr 11 03:12:12 PDT 2004
Hi John,
> I haven't actually tried this using Antlr but how about:
Thank you for your example. I came up with something similar, but the problem is that with that grammar I don't get all
the parts (e.g. the extension, if there is one). I know the file spec is ambiguous, because just from looking at:
/abc
you cannot tell whether this is a file name or a directory. However, one can say that the last part not terminated by a
path separator is a priori a file name unless proven wrong in the subsequent semantic phase. This is not a serious
problem in my eyes. My current grammar is similar to yours but a bit more general, as it allows both path separators and
Unicode file names:
DRIVE_LETTER: 'a'..'z';
protected
FILE_NAME_LETTER: ~('\\' | '/' | ':' | '*' | '?' | '<' | '>' | '|');
protected
FILE_NAME_SEPARATOR: '\\' | '/';
PATH_PART: FILE_NAME_SEPARATOR (FILE_NAME_LETTER)*;
file_name:
(drive)? (PATH_PART)*
;
drive:
DRIVE_LETTER COLON
;
This grammar suffers from the same limitations, though, and causes warnings about lexical nondeterminisms, e.g.
between DIV (defined as '/') and PATH_PART. I'm not sure how to solve that problem. And I would really like to have the
file name already split in my AST (drive, path, name, extension) instead of adding another parsing step.
My earlier attempt was this:
FILE_NAME_LETTER: ~('\\' | '/' | ':' | '*' | '?' | '<' | '>' | '|');
EXTENSION_NAME_LETTER: ~('\\' | '/' | ':' | '*' | '?' | '<' | '>' | '|' | '.');
FILE_NAME_SEPARATOR: '\\' | '/';
// -- file specification
file_name:
(drive)? (FILE_NAME_SEPARATOR)? (directory)* filename
;
drive:
"a".."z" COLON
| "~"
;
directory:
basename FILE_NAME_SEPARATOR
;
filename:
basename ("." extension)?
;
basename:
(FILE_NAME_LETTER)+
;
extension:
(EXTENSION_NAME_LETTER)+
;
If this worked, I would get my file names nicely split. Unfortunately, it produces several nondeterminism warnings
because the file name letters conflict with other definitions in my grammar. In addition, I get a Java error for the
"a".."z" range: the generated code calls matchRange(String, String), an ANTLR function that is not accessible from the
resulting parser.
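To make the intended result concrete, here is a minimal plain-Java sketch of the split the grammar above is aiming for
(drive, directories, base name, extension). It does not use ANTLR and is not a fix for the warnings. The class FileSpec,
its fields and the parse method are hypothetical names made up for this illustration. It assumes a lowercase drive
letter or "~" as in the drive rule, treats the last part not terminated by a separator as the file name, takes the
extension to be the text after the last '.', and (an extra assumption) treats a leading dot as part of the base name.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical holder for the parts of a file spec; not part of ANTLR. */
final class FileSpec {
    final String drive;              // "c", "~", or "" if absent
    final List<String> directories;  // components terminated by a separator
    final String baseName;           // last component without its extension
    final String extension;          // text after the last '.', or ""

    private FileSpec(String drive, List<String> directories,
                     String baseName, String extension) {
        this.drive = drive;
        this.directories = directories;
        this.baseName = baseName;
        this.extension = extension;
    }

    static FileSpec parse(String spec) {
        String rest = spec;
        String drive = "";
        // drive: 'a'..'z' followed by ':', or "~" (mirrors the drive rule)
        if (rest.length() >= 2 && rest.charAt(1) == ':'
                && rest.charAt(0) >= 'a' && rest.charAt(0) <= 'z') {
            drive = rest.substring(0, 1);
            rest = rest.substring(2);
        } else if (rest.startsWith("~")) {
            drive = "~";
            rest = rest.substring(1);
        }

        // Accept both '\' and '/' as separators; empty components
        // (leading or doubled separators) are skipped in this sketch.
        List<String> dirs = new ArrayList<>();
        int start = 0;
        for (int i = 0; i < rest.length(); i++) {
            char c = rest.charAt(i);
            if (c == '\\' || c == '/') {
                if (i > start) {
                    dirs.add(rest.substring(start, i));
                }
                start = i + 1;
            }
        }

        // The part not terminated by a separator is a priori the file name.
        String last = rest.substring(start);
        String base = last;
        String ext = "";
        int dot = last.lastIndexOf('.');
        if (dot > 0) { // assumption: a leading dot belongs to the base name
            base = last.substring(0, dot);
            ext = last.substring(dot + 1);
        }
        return new FileSpec(drive, dirs, base, ext);
    }
}
```

Calling FileSpec.parse("c:\\temp\\readme.txt") yields the drive "c", the single directory "temp", the base name "readme"
and the extension "txt". For "/abc" the name "abc" ends up as the base name, and a later semantic phase would still have
to decide whether it is really a directory.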
> and you did mean unix filenames, right?
I hoped to get both worlds into one grammar :-)
Mike
--
www.soft-gems.net