[antlr-interest] Freemarker grammer w/ JavaScript target

Sam Harwell sam at tunnelvisionlabs.com
Thu Nov 15 07:10:33 PST 2012


I don't have a publicly available example of the way I currently do things in my ANTLR 3-based IDEs. An old solution that is ugly but mostly works is available at the following link. Note that ANTLR 3 only became a workable tool for building an IDE after I understood the internals in detail and was able to hack it into cooperating. These IDEs (especially Pixel Mine nFringe) use a heavily modified build of ANTLR 3 and grammars which are carefully crafted with IDEs in mind.
http://blog.280z28.org/archives/2008/10/21/

ANTLR 4 is better than ANTLR 3 in almost every way, *especially* when it comes to building IDEs (this was my primary motivation for becoming so involved with the project). However the only target available for ANTLR 4 right now is Java.

--
Sam Harwell
Owner, Lead Developer
http://tunnelvisionlabs.com

From: Roded [mailto:rodedb at gmail.com]
Sent: Thursday, November 15, 2012 3:13 AM
To: Gerald Rosenberg
Cc: Sam Harwell; ANTLR-Interest Interest
Subject: Re: [antlr-interest] Freemarker grammer w/ JavaScript target

Thanks for the opinions, will report my subjective conclusions..
A couple more related questions:
1. Considering I'm starting a new ANTLR project, is it viable to start with ANTLR4? (in relation to both ANTLR4's general stability at the moment and the state of its JavaScript target).
2. Sam, regarding tracking the mode information and overriding nextToken in ANTLR3, As I'm not too familiar w/ the ANTLR's code so I understand the idea only generally. Can you perhaps point to an example of such parser usage in a different environment? Any code will be of much assistance.
Thanks again,
Roded

On Thu, Nov 15, 2012<tel:2012> at 1:45 AM, Gerald Rosenberg <gerald at certiv.net<mailto:gerald at certiv.net>> wrote:
Interesting.  Not my experience at all.  And, that is even with Eclipse.


On 11/14/2012 1:21 PM, Sam Harwell wrote:
This naïve approach is not scalable, and will introduce the following limitations:

1. Typing characters within a large token such as a block comment spanning many lines will be "laggy".
No reason that any particular token would take any more or less time to parse -- matching a .* is fast.  As long as the parser is kept warm, the incremental time required to parse an in-memory stream is quite small, particularly in comparison to keystrokes.  This is for source files of 10s to 100s of KB.  Perhaps what you are seeing is particular to your IDE.

2. As the document grows in size, the editor will progressively slow down.
This is entirely dependent on the IDE implementation.  Highlighting and similar features should run in a separate thread and never affect keyboard performance.  A common strategy is that if the highlighting thread ever falls behind, just discard new highlighting changes.  Even in a heavyweight IDE like Eclipse, discards rarely if ever happen and, if they do, the effect is imperceptible.

A very useful (and common) strategy is to minimize UI updates.  Diff the results of the parse with an image of the UI content and apply only the changes.  For SWT and highlighting, the changes are just a series of attribute changes, typically just one or two, which are set without necessarily invoking a UI update.  Keystrokes do cause UI updates, so highlighting is synchronous.

For CodeMirror, it looks like highlighting is implemented by tweaking the DOM class of a span.  The time required to do DOM and UI updates will likely far outweigh the Antlr parse time.  Run the parser in a separate Worker thread and, with some attention to keeping the parser warm and managing the application of updates, I think you will be quite pleasantly surprised at the performance.


For even medium sized documents, running *just* the lexer on the entire document in response to keystrokes will be noticeably slow. Not all editors treat syntax highlighting as a line-at-a-time problem. Even in those editors I use the line-at-a-time approach to greatly improve performance of my IDEs.

The new lexer modes in ANTLR 4 make it much easier to break up tokens which would otherwise span multiple lines. It can be done in ANTLR 3 by manually tracking the mode information and using an override of nextToken that calls a fragment rule for the current mode instead of always calling mTokens. I haven't used the JavaScript target or worked with CodeMirror so I don't have any examples of how to implement this strategy in that environment.

--
Sam Harwell
Owner, Lead Developer
http://tunnelvisionlabs.com

-----Original Message-----
From: antlr-interest-bounces at antlr.org<mailto:antlr-interest-bounces at antlr.org> [mailto:antlr-interest-bounces at antlr.org<mailto:antlr-interest-bounces at antlr.org>] On Behalf Of Gerald Rosenberg
Sent: Wednesday, November 14, 2012<tel:2012> 12:29 PM
To: rodedb at gmail.com<mailto:rodedb at gmail.com>
Cc: ANTLR-Interest Interest
Subject: Re: [antlr-interest] Freemarker grammer w/ JavaScript target

Although your editor's approach is line at a time, no reason to try and force Antlr to do the same.  Antlr is more than fast and light enough to re-parse the entire source file between each keystroke and walk the AST to provide highlighting info (and walk the AST to adjust error markers and to collect code assist hints and ... ).

On 11/14/2012 12:26 AM, Roded wrote:
Hi list,
I'm planning on using ANTLR 3.3's JavaScript target for creating a
Freemarker <http://freemarker.sourceforge.net/> parser for the sake of
syntax highlighting (and auto-completion at a later stage) in a
web-based editor. Considering my lacking experience in ANTLR, I
thought I'd ask for any input or tips on accomplishing my goal.
As for highlighting, using a generated AST is simple enough, however
my editor component's (CodeMirror <http://codemirror.net/>) syntax
highlighting mechanism works on individual lines of the source. Is
there a way to use the ANTLR parser in an interruptible mode so it
could be called for every line separately while retaining its state?
and perhaps even remedying the last parsing error in view of the new
input (as not all source lines pass parsing by themselves)?
Any help and points in the right direction (whether in regards to the
JS target or ANTLR in general) would be much appreciated.
Many thanks,
Roded

P.S. anyone encountered a Freemarker grammar?

List: http://www.antlr.org/mailman/listinfo/antlr-interest
Unsubscribe:
http://www.antlr.org/mailman/options/antlr-interest/your-email-address


List: http://www.antlr.org/mailman/listinfo/antlr-interest
Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-email-address






More information about the antlr-interest mailing list