[antlr-interest] Re: File spec grammar

idontwantanidwith2000init idontwantanidwith2000init at yahoo.com
Sun Apr 11 05:47:04 PDT 2004


You said you have a problem with DIV and PATH_PART.
Could you give an example of input where the two actually clash
semantically?

For instance, can 123/123.12 be considered a file name?
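
To make the question concrete, here is a rough Python sketch (plain
regular expressions, not ANTLR). Both patterns and all names in it are
my own assumptions standing in for the two readings; only the excluded
character set is copied from FILE_NAME_LETTER in your grammar. It just
checks which reading accepts a whole input string:

```python
import re

# Hypothetical stand-in for an arithmetic reading: NUMBER '/' NUMBER.
NUMBER = r"\d+(?:\.\d+)?"
DIVISION = re.compile(rf"{NUMBER}/{NUMBER}")

# Hypothetical stand-in for a file spec. NAME excludes the same characters
# as FILE_NAME_LETTER in the quoted grammar.
NAME = r"[^\\/:*?<>|]+"
FILE_SPEC = re.compile(rf"(?:[a-z]:)?[\\/]?(?:{NAME}[\\/])*{NAME}")


def readings(text):
    """Return the names of all readings that accept the entire input."""
    found = []
    if DIVISION.fullmatch(text):
        found.append("division")
    if FILE_SPEC.fullmatch(text):
        found.append("file spec")
    return found


print(readings("123/123.12"))
print(readings("/abc"))
```

If both readings accept the same text, as I would expect for 123/123.12,
then the lexer alone cannot decide, and the choice has to come from the
parser context.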



--- In antlr-interest at yahoogroups.com, "Mike Lischke" <lists at l...> 
wrote:
> Hi John, 
> 
> > I haven't actually tried this using Antlr but how about:
> 
> Thank you for your example. I came up with something similar, but the
> problem is that with that grammar I don't get all parts (e.g. the
> extension, if there is one). I know the file spec is ambiguous, because
> just from looking at:
> 
> /abc
> 
> you cannot tell whether this is a file name or a directory. However, one
> can say that the last part not terminated by a path separator is a priori
> a file name unless proved wrong in the following semantic phase. This is
> not a serious problem in my eyes. My current grammar is similar to yours
> but a bit more general, as it allows both path separators and Unicode
> file names:
> 
>   DRIVE_LETTER:        'a'..'z';
> protected
>   FILE_NAME_LETTER:    ~('\\' | '/' | ':' | '*' | '?' | '<' | '>' | '|');
> protected
>   FILE_NAME_SEPARATOR: '\\' | '/';
>   PATH_PART:           FILE_NAME_SEPARATOR (FILE_NAME_LETTER)*;
> 
> file_name:
>   (drive)? (PATH_PART)*
> ;
> drive:
>   DRIVE_LETTER COLON
> ;
> 
> This grammar suffers from the same limitations, though, and causes
> warning messages about lexical nondeterminisms, e.g. between DIV (defined
> as '/') and PATH_PART. I'm not sure how to solve that problem. And I
> really would like to have the file name already split in my AST (drive,
> path, name, extension) instead of adding another parse state.
>  
> My earlier attempt was this:
> 
>   FILE_NAME_LETTER:    ~('\\' | '/' | ':' | '*' | '?' | '<' | '>' | '|');
>   EXTENSION_NAME_LETTER:    ~('\\' | '/' | ':' | '*' | '?' | '<' | '>' | '|' | '.');
>   FILE_NAME_SEPARATOR: '\\' | '/';
> 
> // -- file specification
> file_name:
>   (drive)? (FILE_NAME_SEPARATOR)? (directory)* filename
> ;
> 	
>   drive:
>     "a".."z" COLON
>     | "~"
>   ;
>   
>   directory:
>     basename FILE_NAME_SEPARATOR
>   ;
>   
>   filename:
>     basename ("." extension)?
>   ;
>   
>   basename:
>     (FILE_NAME_LETTER)+
>   ;
>   
>   extension:
>     (EXTENSION_NAME_LETTER)+
>   ;
> 
> If this worked, I would get my file names nicely split. Unfortunately,
> it throws several nondeterminism warnings because the file name letters
> conflict with other definitions in my grammar. Additionally, I get a
> Java error for the "a".."z" range, which uses matchRange(String, String),
> an ANTLR function that is not accessible to the resulting parser.
> 
> > and you did mean unix filenames, right?
> 
> I hoped to get both worlds into one grammar :-)
> 
> Mike
> --
> www.soft-gems.net



 