[antlr-interest] Antlr-works/antlr bugs

Mark Whitis whitis at freelabs.com
Sun Dec 30 16:01:41 PST 2007


OS: Linux debian unstable
Antlrworks version: 1.1.5
classpath is empty.

I like some of the features of antlr works but there are some bugs in
antlrworks and antlr:

Bug #1
------
Check grammar reports success and the internal grammar error display shows 
clean even though a number of errors appear in the console and on stderr.
In addition, there are some java exception dumps.   The errors, 
themselves, appear to be bogus, they are complaints about missing rules
that are not actually missing (grammar below).

Switching "Desktop mode" off and restarting the program fixes
some of these bugs, turning it back on and restarting
causes them to reappear.    With desktop mode off, antlr still complains
about the false errors which antlrworks itself does not find but
it at least reports that there were errors instead of claiming
success.   The exceptions went away when turning of desktop mode
and did not come back when it was reeenabled.

[16:06:24] Checking Grammar...
[16:06:24] error(106): bsdl.g:28:44: reference to undefined rule: exponent
[16:06:24] error(106): bsdl.g:12:3: reference to undefined rule: 
use_statement
[16:06:24] error(106): bsdl.g:10:3: reference to undefined rule: 
generic_parameter
[16:06:24] error(106): bsdl.g:14:3: reference to undefined rule: 
scan_port_identification
[16:06:24] error(106): bsdl.g:15:3: reference to undefined rule: 
tap_description
[16:06:24] error(106): bsdl.g:16:3: reference to undefined rule: 
boundary_register_description
[16:06:24] error(106): bsdl.g:13:3: reference to undefined rule: 
package_pin_mapping
[16:06:24] error(106): bsdl.g:11:3: reference to undefined rule: 
logical_port_description

Ok, maybe there is some context that I am not supplying in the
grammar file but antlrworks and antlr should at least be consistent.
Some of the trivial examples have some stuff I am not doing.
I haven't separated parser and lexar rules, but that doesn't seem
to be a requirement (indeed the ANSI C grammar doesn't do that) and there 
is no markup for doing that, anyway; examples use comments.  I haven't
identified named tokens and don't ever intend to; that was a bug in lex/yacc
and if there weren't sample grammars that demonstated that this was
unnecessary, I wouldn't be using antlr in the first place.   I didn't
provide main(), not relevent at this stage.   I have been using
real grammars as an example, not the trivial stuff.

Bug #2
------
Desktop mode has dubious value, wastes screenspace, is buggy, and is
on by default.   In most cases, tabs for multiple files would be of
far greater value and that doesn't appear to be supported.   When
more than one file needs to be seen at the same time, split window or
multiple independent windows are almost always preferable to creating
a bogus desktop.

Bug #3
------
Program has some of the usual java bugs of not being executable from path,
starting in the wrong directory, etc.   It does do a better job than
many, however, of finding files specified on command line and running
from an arbitrary directory (though it does this by putting everything
in the jar file).

http://www.freelabs.com/~whitis/java/example_wrapper_script

Bug #4
------
path to dot is initially blank.   "/usr/bin/dot" would be a good default
value.   Won't work in many cases on proprietary OSes but at least it
will work on most of the free ones.

Bug #5
------
If I use generate -> show parser code, I get additional errors that
look bogus.

This:

[16:35:51] warning(200): /home/whitis/src/bsdl_parser/bsdl.g:30:50: 
Decision can match input such as "EOF" using multiple alternatives: 1, 2
As a result, alternative(s) 2 were disabled for that input
[16:35:52] error(106): /home/whitis/src/bsdl_parser/bsdl.g:28:44: 
reference to undefined rule: exponent

Is not a valid response to:
    FLOAT_LITERAL: ('0'..'9')+ '.' ('0'..'9')+ exponent?;
    fragment exponent: ('e'|'E') ('+'|'-')? ('0'..'9')+;
The problem seems to be the use of "exponent" instead of
"Exponent" for a fragment/token.
Note: this is at least a documentation bug.   Case sensitivity is
only alluded to in the Rule elements table, not the text of the
documentation or the tutorials.   Case sensitivity in languages
is a flaw.   You should be able to, at least optionally,  mark tokens with 
a keyword "token" and rules with "rule" and/or by section "lexar { }" and
"parser { }".

I also wonder why the user has to tell the program which is a token
and which is a gramatical rule.   The difference,
aside from some subtleties of whitespace handling, is 
basically whether it is more efficient to handle
with a lexer or a parser and ANTLR should know the answer to
that better than the user.  Let us override, of course, but
lexar/parser distinction are really an implementation detail and
a historical anachronism from the days when separate tools were used.
I know the difference but users new to grammars have enough else
to worry about.  A more important distinction comes when you build
parse trees or programs instead of a language recognizer.   Tokens
are a single entity in the parse tree.   But, in terms of optimization,
a smart program might use a lexar for some things the user thinks of
as parser rules and vice versa.   Thus, 'attribute' 'INSTRUCTION_LENGTH' 
'of' might be implemented in a lexer state machine that generates
three AST tokens.  And with the appropriate configuration of whitespace
handling, 'attribute INSTRUCTION_LENGTH of' could be a single
AST token as well.   Some thought needs to be given to the potential
for reserved words being recognized as identifiers in other contexts
which may be either a good thing or a bad one depending on application.0
All the user should be concerned with is how many AST nodes are
generated from a rule and how whitespace is handled, and they don't
need to be concerned about nodes if they are building a recognizer.

BTW, the main docs doesn't even have the word "whitespace" or
"white space" even though there is a "WS" example lexar rules.
And since whitespace could, in the future, get special handling inside 
tokens whitespace rules are special and there needs to be a keyword that 
identifies them.

MISC
----
Line number display should be on the view menu, not burried in 
preferences.   This is something that the user may change frequently.
It also should default to on.

Auto indent ":" involves non-consensual activity and should default to 
off.

Autosave should default to on.   And it should autosave to a different
file name and not display a dialog box (other common autosave bugs,
not tested yet).

Create backup file should default to on.

The three window manipulation buttins in the upper right corner are 
non-standard and substandard.   The maximize button is an empty box,
and the minimize button has an underscore that overlaps with the
edge of the box and thus also looks like an empty box.   Only a problem
in desktop mode, though.

GRAMMAR
-------
grammar bsdl;

options {
    // language=C;
    language=Java;
}
//  note: IDENTIFIER at start and end should match

BSDL_main: 'entity' IDENTIFIER 'is'
   generic_parameter?
   logical_port_description?
   use_statement*
   package_pin_mapping?
   scan_port_identification?
   tap_description?
   boundary_register_description?
   'end' IDENTIFIER;

WHITESPACE:  ( '\t' | ' ' | '\r' | '\n'| '\u000C' )+ 	{ $channel = HIDDEN; } ;

IDENTIFIER: ( 'A'..'Z' | 'a'..'z' ) ( 'A'..'Z' | 'a'..'z' | '0'..'9' | '_' )*;

// must concatenate somehow
STRING_LITERAL: ('"' ~('"')* '"' ) ('&' ('"' ~('"')* '"' ) )* ;

INTEGER_LITERAL: ('0'..'9')+;

FLOAT_LITERAL: ('0'..'9')+ '.' ('0'..'9')+ exponent?;

fragment exponent: ('e'|'E') ('+'|'-')? ('0'..'9')+;

COMMENT: '--' ~('\r'|'\n')* '\r'? '\n' {$channel=HIDDEN;};

generic_parameter: 'generic' '(' 'PHYSICAL_PIN_MAP' ':' 'string' ':='  STRING_LITERAL ')' ';';

logical_port_description: 'port' '(' (pin_id ';')+ ')' ';';

pin_id:  identifier_list ':' mode ':' pin_type;

// Note: out is a tristate output and buffer is a two state output
mode:  'linkage' | 'in' | 'out' | 'inout' | 'buffer';
identifier_list: IDENTIFIER (',' IDENTIFIER)*;

pin_type: pin_scalar | pin_vector;

pin_scalar: IDENTIFIER;

pin_vector:
    IDENTIFIER '(' range ')';

range:
    INTEGER_LITERAL 'to' INTEGER_LITERAL
    | INTEGER_LITERAL 'downto' INTEGER_LITERAL;

use_statement:
    'use' vhdl_package_name;

vhdl_package_name:
   ( 'A'..'Z' | 'a'..'z' ) ( 'A'..'Z' | 'a'..'z' | '0'..'9' | '_' | '.')*;

package_pin_mapping:
   'attribute' 'PIN_MAP' 'of' IDENTIFIER ':' 'entity' 'is' 'PHYSICAL_PIN_MAP' ';'
   | 'constant' IDENTIFIER ':' 'PIN_MAP_STRING' ':=' map_string ';' ;

// must invoke a subordinate parser on map_string;
map_string: STRING_LITERAL;

scan_port_identification:
    'attribute' 'TAP_SCAN_IN' 'of' IDENTIFIER ':' 'signal' 'is' 'true' ';'
    | 'attribute' 'TAP_SCAN_OUT' 'of' IDENTIFIER ':' 'signal' 'is' 'true' ';'
    | 'attribute' 'TAP_SCAN_MODE' 'of' IDENTIFIER ':' 'signal' 'is' 'true' ';'
    | 'attribute' 'TAP_SCAN_RESET' 'of' IDENTIFIER ':' 'signal' 'is' 'true' ';'
    | 'attribute' 'TAP_SCAN_CLOCK' 'of' IDENTIFIER ':' 'signal' 'is' '(' frequency ','  ('LOW' | 'BOTH' ) ';'
    ;

frequency: FLOAT_LITERAL;

// must invoke a subordinate parser on map_string;


tap_description:
    'attribute' 'INSTRUCTION_LENGTH' 'of' IDENTIFIER ':' 'entity' 'is' INTEGER_LITERAL ';'
   | 'attribute' 'INSTRUCTION_OPCODE' 'of' IDENTIFIER ':' 'entity' 'is' opcode_table ';'
   | 'attribute' 'INSTRUCTION_CAPTURE' 'of' IDENTIFIER ':' 'entity' 'is' capture_pattern ';'
   | 'attribute' 'INSTRUCTION_DISABLE' 'of' IDENTIFIER ':' 'entity' 'is' opcode_name ';'
   | 'attribute' 'INSTRUCTION_PRIVATE' 'of' IDENTIFIER ':' 'entity' 'is' opcode_list ';'
   | 'attribute' 'INSTRUCION_USAGE' 'of' IDENTIFIER ':' 'entity' 'is' usage_string';'
   ;

opcode_table: 'todo_opcode_table';
capture_pattern: 'todo_capture_pattern';
opcode_name: 'todo_opcode_name';
opcode_list: 'todo_opcode_list';
usage_string: 'todo_usage_string';
boundary_register_description: 'todo_boundary_register_description' ;

EXCEPTIONS
----------
Sample1:
Exception in thread "Thread-20" java.lang.NullPointerException
         at org.antlr.works.menu.MenuGrammar.checkGrammarDidEnd(Unknown 
Source)
         at org.antlr.works.grammar.CheckGrammar.run(Unknown Source)
         at java.lang.Thread.run(Thread.java:620)


Sample2:
error(100): bsdl.g:9:10: syntax error: antlr: bsdl.g:9:10: unexpected 
token: :

bsdl.g:19:44: expecting '"', found 'u'
         at org.antlr.tool.ANTLRLexer.nextToken(ANTLRLexer.java:321)
         at 
antlr.TokenStreamRewriteEngine.nextToken(TokenStreamRewriteEngine.java:161)
         at antlr.TokenBuffer.fill(TokenBuffer.java:69)
         at antlr.TokenBuffer.LT(TokenBuffer.java:86)
         at antlr.LLkParser.LT(LLkParser.java:56)
         at org.antlr.tool.ANTLRParser.alternative(ANTLRParser.java:1755)
         at org.antlr.tool.ANTLRParser.block(ANTLRParser.java:1713)
         at org.antlr.tool.ANTLRParser.ebnf(ANTLRParser.java:2522)
         at 
org.antlr.tool.ANTLRParser.elementNoOptionSpec(ANTLRParser.java:2005)
         at org.antlr.tool.ANTLRParser.element(ANTLRParser.java:1925)
         at org.antlr.tool.ANTLRParser.alternative(ANTLRParser.java:1777)
         at org.antlr.tool.ANTLRParser.altList(ANTLRParser.java:1448)
         at org.antlr.tool.ANTLRParser.rule(ANTLRParser.java:1236)
         at org.antlr.tool.ANTLRParser.rules(ANTLRParser.java:655)
         at org.antlr.tool.ANTLRParser.grammar(ANTLRParser.java:389)
         at org.antlr.tool.Grammar.setGrammarContent(Grammar.java:521)
         at org.antlr.tool.Grammar.setGrammarContent(Grammar.java:497)
         at org.antlr.works.grammar.EngineGrammar.createNewGrammar(Unknown 
Source)
         at 
org.antlr.works.grammar.EngineGrammar.createCombinedGrammar(Unknown 
Source)
         at org.antlr.works.grammar.EngineGrammar.createGrammars(Unknown 
Source)
         at 
org.antlr.works.generate.CodeGenerate.getGrammarLanguage(Unknown Source)
         at org.antlr.works.menu.MenuGenerate.isKnownLanguage(Unknown 
Source)
         at org.antlr.works.menu.MenuGenerate.checkLanguage(Unknown Source)
         at 
org.antlr.works.menu.MenuGenerate.checkAndShowGeneratedCode(Unknown 
Source)
         at org.antlr.works.menu.MenuGenerate.showGeneratedCode(Unknown 
Source)
         at org.antlr.works.editor.EditorMenu.handleMenuGenerate(Unknown 
Source)
         at org.antlr.works.editor.EditorMenu.handleMenuEvent(Unknown 
Source)
         at 
org.antlr.xjlib.appkit.menu.XJMenuItem$MenuActionListener.actionPerformed(Unknown 
Source)
         at 
javax.swing.AbstractButton.fireActionPerformed(AbstractButton.java:1957)
         at 
javax.swing.AbstractButton$Handler.actionPerformed(AbstractButton.java:2278)
         at 
javax.swing.DefaultButtonModel.fireActionPerformed(DefaultButtonModel.java:377)
         at 
javax.swing.DefaultButtonModel.setPressed(DefaultButtonModel.java:232)
         at javax.swing.AbstractButton.doClick(AbstractButton.java:357)
         at 
javax.swing.plaf.basic.BasicMenuItemUI.doClick(BasicMenuItemUI.java:1137)
         at 
javax.swing.plaf.basic.BasicMenuItemUI$Handler.menuDragMouseReleased(BasicMenuItemUI.java:1241)
         at 
javax.swing.JMenuItem.fireMenuDragMouseReleased(JMenuItem.java:548)
         at 
javax.swing.JMenuItem.processMenuDragMouseEvent(JMenuItem.java:445)
         at javax.swing.JMenuItem.processMouseEvent(JMenuItem.java:391)
         at 
javax.swing.MenuSelectionManager.processMouseEvent(MenuSelectionManager.java:292)
         at 
javax.swing.plaf.basic.BasicPopupMenuUI$MouseGrabber.eventDispatched(BasicPopupMenuUI.java:783)
         at 
java.awt.Toolkit$SelectiveAWTEventListener.eventDispatched(Toolkit.java:2339)
         at 
java.awt.Toolkit$ToolkitEventMulticaster.eventDispatched(Toolkit.java:2232)
         at java.awt.Toolkit.notifyAWTEventListeners(Toolkit.java:2190)
         at java.awt.Component.dispatchEventImpl(Component.java:4263)
         at java.awt.Container.dispatchEventImpl(Container.java:2018)
         at java.awt.Component.dispatchEvent(Component.java:4195)
         at 
java.awt.LightweightDispatcher.retargetMouseEvent(Container.java:4222)
         at 
java.awt.LightweightDispatcher.processMouseEvent(Container.java:3886)
         at 
java.awt.LightweightDispatcher.dispatchEvent(Container.java:3816)
         at java.awt.Container.dispatchEventImpl(Container.java:2004)
         at java.awt.Window.dispatchEventImpl(Window.java:2210)
         at java.awt.Component.dispatchEvent(Component.java:4195)
         at java.awt.EventQueue.dispatchEvent(EventQueue.java:599)
         at 
java.awt.EventDispatchThread.pumpOneEventForFilters(EventDispatchThread.java:273)
         at 
java.awt.EventDispatchThread.pumpEventsForFilter(EventDispatchThread.java:183)
         at 
java.awt.EventDispatchThread.pumpEventsForHierarchy(EventDispatchThread.java:173)
         at 
java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:168)
         at 
java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:160)
         at java.awt.EventDispatchThread.run(EventDispatchThread.java:121)




More information about the antlr-interest mailing list