[antlr-interest] Problem with Pascal grammar written in Java.

Mon Jun 21 08:29:54 PDT 2004

Hi grp,

I tried to run the Pascal grammar that I found in my ANTLR folder
(/examples/java/pascal), but I am unable to compile the symbol table grammar
(symtab.g). I get an error that says "grammar PascalTreeParserSuper not
defined". I compiled the "pascal.g" grammar first, followed by the
"pascal.tree.g" grammar. I have the following files in my Java package:

1. PascalLexer.java, PascalParser.java, PascalTokenTypes.java [generated
from pascal.g] and
2. PascalTreeParserSuper.java, PascalTreeParserSuperTokenTypes.java
[generated from pascal.tree.g]

Outside of this package, under my Pascal project, I have my grammar files,
including "symtab.g". When I add an import statement with the name of my
package to "symtab.g", the PascalTreeParserSuper class is still not visible,
so I am unable to extend it.

I believe that "pascal.g" builds the AST, "pascal.tree.g" creates a generic
AST walker, and "symtab.g" extends that walker to perform the symbol table
checks. Please correct me if I'm wrong.

Thanks for your time.

Bharath.

-----Original Message-----
From: Mark Lentczner [mailto:markl at glyphic.com] 
Sent: Saturday, June 19, 2004 5:37 PM
To: antlr-interest at yahoogroups.com
Subject: Re: [antlr-interest] unicode 16bit versus new 21bit stuff

Seems to me that you can still encode chars and tokens in the same 32-bit
int:
	any value <= 0x10FFFF is Unicode
	any value >  0x10FFFF is a Token type

Or am I missing something?
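
The split described above can be sketched in Java roughly as follows. This is only an illustration of the idea, not ANTLR code: the class name CharOrToken and its methods are made up for the example, and treating negative values as neither a character nor a token (leaving them free for sentinels such as EOF) is an added assumption.

```java
// Hypothetical helper, not part of ANTLR: one int carries either a
// Unicode code point or a token type.
final class CharOrToken {
    // Highest Unicode code point, 0x10FFFF.
    static final int MAX_CODE_POINT = Character.MAX_CODE_POINT;

    // Assumption: negative values are left for sentinels such as EOF.
    static boolean isUnicode(int value) {
        return value >= 0 && value <= MAX_CODE_POINT;
    }

    // Anything above the last code point is read as a token type.
    static boolean isTokenType(int value) {
        return value > MAX_CODE_POINT;
    }

    // Map a zero-based token type into the range above the code points.
    static int encodeTokenType(int tokenType) {
        if (tokenType < 0 || tokenType > Integer.MAX_VALUE - MAX_CODE_POINT - 1) {
            throw new IllegalArgumentException("token type out of range: " + tokenType);
        }
        return MAX_CODE_POINT + 1 + tokenType;
    }

    // Recover the zero-based token type from an encoded value.
    static int decodeTokenType(int value) {
        if (!isTokenType(value)) {
            throw new IllegalArgumentException("not a token type: " + value);
        }
        return value - MAX_CODE_POINT - 1;
    }
}
```

With this layout a 32-bit int still leaves over two billion values for token types, so the 21-bit code point range does not force a move to 64-bit ints.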

As for doing Unicode "right" - Yes, you have to do all 2^21 characters 
- it's not just the stuff people make fun of: There are real languages 
whose characters ended up stuck above 16 bits.

As for 64-bit ints: My code is going to have to run on legions of aging 
32-bit hardware for years to come - I'd avoid 64-bit ints if you can.

> The new system will be cool.  You'll be able to use
> Character.UnicodeBlock stuff such as vocabulary=BENGALI;
I doubt this will be useful to anyone.  You should check if anyone 
would use it.  The Unicode blocks rarely correspond to semantically 
useful subsets for parsing.  It is highly unlikely that any grammar 
would want to have "vocabulary=BENGALI" - there would be no punctuation 
in such a language.  As for character class tests - one usually can't 
include whole blocks.  Many, if not all, blocks have characters that 
for parsing would need to be excepted out of any grammar character 
class.

Much more useful would be access to the Unicode character classes and 
character property sets like identifier_start, identifier_extend, L, Nd 
and such.
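
As a rough illustration of property-based tests, here is a sketch using the code point methods on java.lang.Character (Java 5 and later). The class UnicodeProps and its method names are invented for the example, and mapping identifier_start and identifier_extend onto isUnicodeIdentifierStart and isUnicodeIdentifierPart is an assumption about what is meant above, not something ANTLR provides.

```java
// Hypothetical helper built only on java.lang.Character.
final class UnicodeProps {
    // True if s is non-empty, starts with an identifier-start code point,
    // and continues with identifier-part code points only.
    static boolean isIdentifier(String s) {
        if (s.isEmpty()) {
            return false;
        }
        int first = s.codePointAt(0);
        if (!Character.isUnicodeIdentifierStart(first)) {
            return false;
        }
        // Step by code point so characters above 16 bits count as one unit.
        for (int i = Character.charCount(first); i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (!Character.isUnicodeIdentifierPart(cp)) {
                return false;
            }
            i += Character.charCount(cp);
        }
        return true;
    }

    // General category Nd (decimal digit number).
    static boolean isDecimalDigit(int codePoint) {
        return Character.getType(codePoint) == Character.DECIMAL_DIGIT_NUMBER;
    }

    // True if the code point lies in the Bengali block, whatever its category.
    static boolean inBengaliBlock(int codePoint) {
        return Character.UnicodeBlock.of(codePoint) == Character.UnicodeBlock.BENGALI;
    }
}
```

For example, U+09F3 (BENGALI RUPEE SIGN) is inside the Bengali block, yet it is a currency symbol and is rejected by isIdentifier, while the letter U+0995 followed by the digit U+09E7 is accepted. A block test alone would not make that distinction.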

	- Mark
