[antlr-interest] Has anybody ever tried to integrate with VS?

Thu Apr 6 17:26:33 PDT 2006

Hi Don

Thanks for this great overview. It seems that the whole process is a lot 
harder than I would have thought :-(. I'll see if I can find any simpler 
ways of achieving my goal and if not I'll use your hints and tips.

Again, thanks heaps.

Patrick

Don Caton wrote:
> Patrick:
>
> I've built a language package for VS that uses Antlr.  It is a commercial
> product so I can't share the code, but I can give you a few tips.
>
> You'll need a lexer for syntax highlighting.  There's no need for a parser;
> the language service just needs a sequence of tokens with certain
> information such as the starting column and length, the text type (e.g.
> text or comment), and some other things.
>
> The lexer is called once for each line of source code, so it should be as
> simple and efficient as possible (e.g. don't use semantic predicates), and
> you should reuse the same lexer instance for the entire time your package is
> loaded.  You don't want to create a new instance of the lexer each time it
> is needed: the initialization costs are very high (especially the init of
> the tokens table), and an inefficient lexer will slow down typing and
> scrolling in the editor.
>
> I use a static stringstream object to feed the lexer and just reinitialize
> it each time the lexer is called.  In addition, you need to call
> lex->setColumn(0) and lex->getInputBuffer().reset() to "reset" the lexer
> each time you need to parse a new line.  You don't need to worry about the
> line counter, since you will never be parsing more than one line at a time.
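A minimal sketch of the "one static stream, one long-lived lexer" pattern described above, in standard C++ only. WordLexer is a hypothetical stand-in for a generated lexer (it just splits on whitespace), and lexLine is an assumed helper name; neither comes from Don's package or from the ANTLR runtime. The detail worth noting is that the stream's error flags have to be cleared before it is refilled, because the previous line left it at end-of-file.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for a generated lexer: yields whitespace-separated words.
class WordLexer {
public:
    explicit WordLexer(std::istream& in) : in_(in) {}

    // Stores the next token in `out`; returns false once the input is exhausted.
    bool nextToken(std::string& out) { return static_cast<bool>(in_ >> out); }

private:
    std::istream& in_;
};

// Tokenizes a single line, reusing the same stream and lexer on every call.
// Not safe to call from several threads at once, since the statics are shared.
std::vector<std::string> lexLine(const std::string& line) {
    static std::istringstream input;   // constructed once, on first use
    static WordLexer lexer(input);     // constructed once, bound to `input`

    input.clear();     // drop the eof/fail bits left over from the previous line
    input.str(line);   // replace the buffer; reading restarts at the beginning

    std::vector<std::string> tokens;
    std::string tok;
    while (lexer.nextToken(tok)) {
        tokens.push_back(tok);
    }
    return tokens;
}
```

With a real ANTLR 2 lexer, the calls Don mentions (setColumn(0) and getInputBuffer().reset()) would go alongside the two stream calls, since the lexer keeps its own buffered view of the stream.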
>
> You also do not want to ignore comments in this lexer (since they must be
> colorized too), and you must maintain state for multiline comments.  You
> cannot assume that when you encounter the beginning of a ML-comment you will
> find the end.  The end might be on a different line and you only get to lex
> one line at a time, so you must return a state code to the language service,
> which it will then pass back to you when it asks you to tokenize the next
> line.  You must structure your lexer so that it can be entered in a state
> where it is already within a multiline comment.
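The state hand-off described above could be sketched roughly as follows. This is a hand-written illustration in plain C++, not ANTLR-generated code and not the Visual Studio API: LineState, Kind, Span and colorizeLine are all hypothetical names, and the lexer only distinguishes C-style block comments from everything else (it does not, for instance, notice "/*" inside a string literal).

```cpp
#include <cstddef>
#include <string>
#include <vector>

enum class LineState { Normal, InBlockComment };  // hypothetical state codes
enum class Kind { Text, Comment };

struct Span {
    std::size_t start;   // starting column
    std::size_t length;  // number of characters
    Kind kind;
};

// Lexes one line. `state` is the state at the end of the preceding line on
// entry, and the state at the end of this line on return.
std::vector<Span> colorizeLine(const std::string& line, LineState& state) {
    std::vector<Span> spans;
    std::size_t pos = 0;
    while (pos < line.size()) {
        if (state == LineState::InBlockComment) {
            // Entered (or still) inside a comment: look for its end.
            std::size_t close = line.find("*/", pos);
            std::size_t stop = (close == std::string::npos) ? line.size() : close + 2;
            spans.push_back({pos, stop - pos, Kind::Comment});
            if (close != std::string::npos) state = LineState::Normal;
            pos = stop;
        } else {
            std::size_t open = line.find("/*", pos);
            if (open == std::string::npos) {
                spans.push_back({pos, line.size() - pos, Kind::Text});
                break;
            }
            if (open > pos) spans.push_back({pos, open - pos, Kind::Text});
            // Search past the opener so that "/*/" is not treated as closed.
            std::size_t close = line.find("*/", open + 2);
            std::size_t stop = (close == std::string::npos) ? line.size() : close + 2;
            spans.push_back({open, stop - open, Kind::Comment});
            if (close == std::string::npos) state = LineState::InBlockComment;
            pos = stop;
        }
    }
    return spans;
}
```

The caller owns the state value between calls, which mirrors the arrangement Don describes: the lexer itself remembers nothing about earlier lines.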
>
> If your language has any other constructs that can span multiple lines (like
> literal strings in C/C++) then you must maintain state for that as well.
> You cannot assume that you will be lexing each line in sequence, or that you
> will be lexing any particular line at all.  The language service maintains
> your returned state code for each line that has been colorized, and you will
> always be given the state code for the preceding line when you are asked to
> lex another line.
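To make the bookkeeping above concrete, here is a small simulation of the service's side of the contract, assuming integer state codes (0 = normal, 1 = inside a block comment). StateCache and endState are hypothetical names invented for this sketch; the real language service does this internally and its interfaces look different.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical lexer entry point: returns the state code at the end of `line`,
// given the state code at the end of the preceding line.
int endState(const std::string& line, int startState) {
    int state = startState;
    std::size_t pos = 0;
    while (pos < line.size()) {
        // In state 0 look for a comment opener, in state 1 for a closer.
        std::size_t hit = line.find(state == 0 ? "/*" : "*/", pos);
        if (hit == std::string::npos) break;
        state = 1 - state;
        pos = hit + 2;
    }
    return state;
}

// Stand-in for the language service: remembers one end state per line.
class StateCache {
public:
    explicit StateCache(std::size_t lineCount) : end_(lineCount, 0) {}

    // The state handed to the lexer for `line` is the preceding line's end state.
    int startStateFor(std::size_t line) const {
        return line == 0 ? 0 : end_[line - 1];
    }

    // Records the state code the lexer returned for `line`.
    void lexed(std::size_t line, int state) { end_[line] = state; }

private:
    std::vector<int> end_;
};
```

Because the start state always comes from the cache, any single line can be lexed again on its own (after an edit or a scroll, say) without touching the lines before it.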
>
> To get tokens without using a parser, do something like this:
>
>    RefToken tok = lex->nextToken();
>    while ( tok->getType() != Token::EOF_TYPE )
>    {
>       // ...
>       tok = lex->nextToken();
>    }
>
>
> Next, you will need another lexer and a parser to support code collapsing,
> intellisense and various other things that the VS editor supports.  This is
> more like a traditional parser, where you want to ignore comments and
> whitespace.  You will be asked to parse all, or part of the source file.
> Rules in your parser will have to return the proper information to the
> language service, depending on the reason for the parse.  There are numerous
> reasons for invoking a full or partial parse (code collapsing, intellisense,
> etc.).  
>
> If your parser is efficient enough, it is easier to parse the entire source
> file each time.  The language service will also give you the coordinates of
> just the text that has changed, so you can instead maintain state
> information and reparse only the changed source.
> Although colorizing and intellisense operations are generally called on
> background threads, you want them to be as efficient as possible so there is
> no noticeable delay to the end user.
>
> This lexer and parser generally do not need to be as thorough as those
> used in a compiler.  You only need to parse things to a point that you can
> supply the needed information back to the language service.  So, you don't
> really need to be concerned with things like operator precedence and things
> of that nature that are only significant if you are actually going to
> generate code.  Again, you want to make this as efficient as possible so
> there is no noticeable delay to the end user.
>
> Some language services like C# actually attempt to do a background compile
> (minus codegen) while you type, so that syntax and semantic errors can be
> displayed in real time in the error list pane.  That requires a more thorough
> parser that does semantic analysis, type resolution and so forth, and you
> may want to generate and persist an AST if you attempt this.  Some of C#'s
> advanced intellisense features also require this level of parsing.  
>
> Writing a Visual Studio language (or project) package is not a trivial thing
> to undertake, especially given the quality of the sample code and
> documentation that's provided, so be prepared for a bit of work.
>
> Don
>  
>
>   
>> -----Original Message-----
>> From: antlr-interest-bounces at antlr.org 
>> [mailto:antlr-interest-bounces at antlr.org] On Behalf Of P. van 
>> der Velde
>> Sent: Wednesday, April 05, 2006 1:23 AM
>> To: antlr-interest at antlr.org
>> Subject: [antlr-interest] Has anybody ever tried to integrate with VS?
>>
>> Hi All
>>
>> I want to build a new language integration tool for VS2005. I 
>> want to integrate LaTeX into VS (I'm lazy, I'm spoiled and I 
>> need my code complete thingies ;-). However to do that I need 
>> a parser and a lexer. 
>> The documents assume you use Flex and Bison, however I was 
>> thinking about Antlr. So now my questions are:
>>
>> 1) Has anybody ever built a language package for VS with antlr?
>> 2) Has anybody ever created a LaTeX grammar?
>> 3) Has anybody ever tried to create those two things?
>>
>> Also if anybody has any hints or tips I would love to hear those
>>
>> Regards
>>
>> Patrick
>>     