[antlr-interest] "An Introduction to ANTLR" presentation slides

Terence Parr parrt at cs.usfca.edu
Thu Feb 28 13:32:20 PST 2008


On Feb 28, 2008, at 12:33 PM, Andy Tripp wrote:

> Terence Parr wrote:
>>
>>
>> syntax is the grammatical structure. Semantics deals with the  
>> symbols (IDs).
>>
>> int x = "foo";
>>
>> is syntactically ok but semantically wrong.
> That example illustrates the problem well.
> You mean that to the *parser*, the input is invalid.

Andy, you're killing me. ;) Syntax means the structure of the symbols.
That is what the parser does.  The lexer only creates vocab symbols  
for the parser to apply structure to.
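
To make the division of labor concrete, here is a minimal sketch of a lexer that only produces vocabulary symbols. It assumes a toy token set (INT_KW, ID, STRING, NUMBER, ASSIGN, SEMI); the names TOKEN_SPEC and tokenize are hypothetical and are not ANTLR's API.

```python
import re

# Hypothetical token vocabulary for a toy language; not ANTLR's API.
TOKEN_SPEC = [
    ("INT_KW", r"int\b"),
    ("ID", r"[A-Za-z_]\w*"),
    ("STRING", r'"[^"]*"'),
    ("NUMBER", r"\d+"),
    ("ASSIGN", r"="),
    ("SEMI", r";"),
    ("WS", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(text):
    """Turn characters into (type, text) vocabulary symbols, skipping whitespace."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            # The only thing the lexer can reject is a character sequence
            # that forms no valid token.
            raise ValueError(f"invalid character at offset {pos}: {text[pos]!r}")
        if match.lastgroup != "WS":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens
```

Calling tokenize('int x = "foo";') yields five tokens and no complaint: the lexer has no opinion about how the tokens are arranged, let alone what they mean.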

> To the *lexer*, it's semantically right.

Semantics are all about the *meaning*, which is derived from  
grammatical structure.  Forget the lexer, bro!

> So I think the "syntactic"/"semantic" distinction is orthogonal to the
> issue of whether input (to a lexer, parser, or treewalker) is valid.

nope.  If you have no actions, it's all syntax.  It's a 3-level  
recognition (i.e., syntax) problem.  All recognizers are applying  
structure, but syntax shows how to make a valid sentence.  You've  
heard of *syntax* diagrams no doubt...are you really saying those are  
the lexer rules?

> Whether the example you gave (or any example) is syntactically or
> semantically valid all depends on the lexer, parser, or treewalker.

no, the parser will say if it's valid syntactically...if you have  
actions you can do the semantics.
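
A minimal sketch of that split, using the int x = "foo"; example from earlier in the thread. It assumes tokens arrive as (type, text) pairs and a grammar with one rule, INT_KW ID ASSIGN (NUMBER | STRING) SEMI; parse_declaration and check_semantics are hypothetical names, and the semantic check stands in for what an action might do.

```python
def parse_declaration(tokens):
    """Syntax only: do the tokens form INT_KW ID ASSIGN (NUMBER | STRING) SEMI?"""
    expected = ["INT_KW", "ID", "ASSIGN", ("NUMBER", "STRING"), "SEMI"]
    if len(tokens) != len(expected):
        raise SyntaxError("expected: int ID = value ;")
    for (kind, _), want in zip(tokens, expected):
        allowed = want if isinstance(want, tuple) else (want,)
        if kind not in allowed:
            raise SyntaxError(f"unexpected token {kind}, wanted one of {allowed}")
    # Record what was recognized so a later step can reason about meaning.
    return {"name": tokens[1][1], "declared": "int", "initializer": tokens[3][0]}


def check_semantics(decl):
    """Semantics: does the initializer make sense for the declared type?"""
    if decl["declared"] == "int" and decl["initializer"] != "NUMBER":
        raise TypeError(
            f"cannot initialize int {decl['name']} with a {decl['initializer']}"
        )


# int x = "foo";  parses fine, then fails the semantic check.
tokens = [("INT_KW", "int"), ("ID", "x"), ("ASSIGN", "="),
          ("STRING", '"foo"'), ("SEMI", ";")]
decl = parse_declaration(tokens)
```

Here parse_declaration accepts the token sequence because its structure is valid; only check_semantics(decl) objects, because a string cannot initialize an int.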

>>> Terence has this general mechanism that he's calling "predicates"
>>> which checks the structure of the input. That input can be a stream
>>> of characters (for lexer), tokens (for parser), or ASTs for  
>>> treewalker.
>>>
>>> Now that I think about it, maybe a better name for "syntactic
>>> predicate" would be "input pattern predicate" or something like that.
>>> The term "syntactic", to me, is a bit misleading because it makes
>>> me think of input characters.
>>
>> why?  i've never seen nor heard this way of thinking about it.
> http://dictionary.reference.com/browse/syntax:
> 4.Computers. the grammatical rules and structural patterns  
> governing the ordered use of appropriate words and symbols for  
> issuing commands, writing code, etc., in a particular software  
> application or programming language.

Yes, how did you get syntax == lexer out of that?  Syntax defines the  
set of valid sentences; i.e., the language.

> http://en.wikipedia.org/wiki/Syntax:
> ...study of the rules that govern the structure of sentences...
>
> Every time I've ever heard anyone talking about "syntax" they were
> talking about the input string itself.

Structure == syntax.

>>> Saying "my treewalker has a
>>> syntactic predicate, which of course checks the shape of the input
>>> AST" seems a bit odd.
>>
>> Not sure why.
> Because most people (including ANTLR users, I think) would not say
> that a treewalker is doing any syntactic checking.

sure it is: on the tree structure.

> They'd say it's checking the structure
> of the AST.

structure == syntax
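
As an illustration of a recognizer applying structure to a tree rather than to a token stream, here is a minimal sketch. It assumes trees are nested tuples of the form (node_type, child, ...) and that the pattern "_" is a wildcard; matches is a hypothetical helper, not how ANTLR tree grammars are implemented.

```python
def matches(tree, pattern):
    """Return True if `tree` has the shape described by `pattern`.

    Trees and patterns are nested tuples: (node_type, child, child, ...).
    The pattern string "_" matches any subtree.
    """
    if pattern == "_":
        return True
    if isinstance(pattern, str) or isinstance(tree, str):
        # Leaf comparison: node types must be identical.
        return tree == pattern
    # Interior node: same arity, and every position must match recursively.
    return len(tree) == len(pattern) and all(
        matches(t, p) for t, p in zip(tree, pattern)
    )


# AST for something like: x = 1 + 2
ast = ("ASSIGN", "ID", ("PLUS", "INT", "INT"))
is_assignment = matches(ast, ("ASSIGN", "ID", "_"))
is_assigned_product = matches(ast, ("ASSIGN", "ID", ("MULT", "_", "_")))
```

The check never looks at characters or tokens, yet it is still purely structural: it asks whether the tree is a valid "sentence" in a tree language.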

>>> I may just be stuck in an old way of thinking,
>>> but I just checked dictionary.com and wikipedia, and they're
>>> agreeing with me :)
>>
>> not possible.  syntax is grammatical structure.  i wrote the sem  
>> pred wikipedia things so they must agree with me ;)
> Looks like you wrote the syntactic predicate wikipedia entry, but the
> semantic predicate entry doesn't exist.
> I guess you coined the term "syntactic predicate", so you can have  
> it mean whatever you want it to.
> I just think your definition goes way beyond the dictionary  
> definition and common usage of "syntax".

I'm pretty sure you'll find my usage is the common one; all my papers  
get past the reviewers at least in that area.  Seriously, this is the  
most clear thing in my mind and everybody else's in the formal  
language community.

> The sentence: "Go!" could cause either valid or invalid input to  
> either a lexer, parser, or treewalker.
> If you want to consider each one's input to be its "syntax", then  
> we have:
>
> lexer syntax is whether the chars are valid (whether an output  
> Tokenstream can be created)

its syntax is whether it's a valid token.

Ter