[antlr-interest] Problems with Pre-processing instructions of C#
David-Sarah Hopwood
david-sarah at jacaranda.org
Sun Sep 20 16:54:25 PDT 2009
Eduard Ralph wrote:
> Hi community,
>
> I'm fighting with the processing of pre-processing instructions according to C# specs. The BNF is:
>
> Whitespace(opt) '#' Whitespace(opt) 'error' input-characters
> Whitespace(opt) '#' Whitespace(opt) 'warning' input-characters
> Whitespace(opt) '#' Whitespace(opt) 'line' ...
>
> where
> Whitespace(opt) can be optionally one or more spaces ('\u0020','\u00A0', and a few more)
> Input-characters is anything except newline ('\n', and a few more)
>
> I wrote in the Lexer, where the other rules are fragments
>
>
> PP_DIAGNOSTIC : (WHITESPACE* HASH WHITESPACE* 'error')=>WHITESPACE* HASH WHITESPACE* ERROR INPUT_CHARACTER*
> | (WHITESPACE* HASH WHITESPACE* 'warning')=>WHITESPACE* HASH WHITESPACE* WARNING INPUT_CHARACTER*
> ;
These probably need NEWLINEs at the end.
> PP_LINE : (WHITESPACE* HASH WHITESPACE* 'line')=> WHITESPACE* HASH WHITESPACE* LINE PP_LINE_INDICATOR NEWLINE
> ;
This will not skip whitespace between LINE and PP_LINE_INDICATOR or
between PP_LINE_INDICATOR and NEWLINE.
I think you probably want

  ... => WHITESPACE* HASH WHITESPACE* LINE WHITESPACE* PP_LINE_INDICATOR
         WHITESPACE* NEWLINE

but that is likely independent of your problem with the lexer not
recognising which rule applies.
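
To make the whitespace point concrete, here is a rough Python sketch that uses regular expressions as a stand-in for the lexer rule. The patterns are simplified assumptions of mine (whitespace is only space and tab, the line indicator is only a decimal integer); they are not the ANTLR rules themselves, and the names strict, relaxed and matches are hypothetical.

```python
import re

# Simplified stand-ins (assumptions): whitespace is space/tab only,
# and the line indicator is just a decimal integer.
WS = r"[ \t]*"
INDICATOR = r"[0-9]+"

# Mirrors the original rule: nothing is allowed between 'line' and the indicator.
strict = re.compile(WS + "#" + WS + "line" + INDICATOR + r"\n")

# Mirrors the suggested rule: optional whitespace around the indicator.
relaxed = re.compile(WS + "#" + WS + "line" + WS + INDICATOR + WS + r"\n")


def matches(pattern, text):
    """Return True if the whole of text matches pattern."""
    return pattern.fullmatch(text) is not None
```

With these definitions, matches(strict, "#line 42\n") is False because of the space after "line", while matches(relaxed, "#line 42\n") is True.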
> fragment PP_LINE_INDICATOR : INTEGER_LITERAL PP_FILE_NAME?
> | IDENTIFIER_OR_KEYWORD
> ;
>
> fragment PP_FILE_NAME : STRING_LITERAL
> ;
>
> fragment HASH : '#';
I would suggest left-factoring and using actions to change the token type:

  fragment PP_DIAGNOSTIC : ;
  fragment PP_LINE : ;

  PP_UNRECOGNIZED
    : WHITESPACE* HASH WHITESPACE*
      ( (ERROR | WARNING)=> INPUT_CHARACTER* { $type = PP_DIAGNOSTIC; }
      | (LINE)=> LINE WHITESPACE* PP_LINE_INDICATOR WHITESPACE*
          { $type = PP_LINE; }
      | INPUT_CHARACTER*   // leave as type PP_UNRECOGNIZED [1]
      )? NEWLINE
    ;

[1] omit this line if you want an unrecognized instruction to be a lexer
mismatch, but I would suggest leaving it for better error recovery.
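
The control flow behind the left-factored rule can be sketched outside ANTLR as well. The following Python is only an illustration under stated assumptions, not what ANTLR generates: the shared prefix is consumed once, and the token type is chosen afterwards from the keyword that follows. The function name classify, the string token types, and the simplified patterns (space/tab whitespace, an integer with an optional quoted file name, or a bare identifier) are all hypothetical.

```python
import re

# Hypothetical token-type names, chosen to mirror the grammar above.
PP_DIAGNOSTIC = "PP_DIAGNOSTIC"
PP_LINE = "PP_LINE"
PP_UNRECOGNIZED = "PP_UNRECOGNIZED"

# Common prefix, matched exactly once: WHITESPACE* HASH WHITESPACE*.
# Assumption: whitespace is simplified to space and tab.
PREFIX = re.compile(r"[ \t]*#[ \t]*")

# Simplified line indicator (assumption): an integer with an optional
# double-quoted file name, or a bare identifier such as 'default'.
LINE_BODY = re.compile(
    r'line[ \t]*(?:[0-9]+(?:[ \t]*"[^"\n]*")?|[A-Za-z_]\w*)[ \t]*'
)


def classify(line):
    """Return the token type, or None when the line does not match the rule."""
    if not line.endswith("\n"):
        return None  # the rule requires a trailing NEWLINE
    body = line[:-1]
    prefix = PREFIX.match(body)
    if prefix is None:
        return None  # no '#': not a pre-processing instruction
    rest = body[prefix.end():]
    # Decide the type only after the shared prefix has been consumed.
    if rest.startswith(("error", "warning")):
        return PP_DIAGNOSTIC
    if rest.startswith("line"):
        # Once 'line' is seen we are committed to this alternative,
        # so a malformed indicator is a mismatch rather than a fallback.
        return PP_LINE if LINE_BODY.fullmatch(rest) else None
    return PP_UNRECOGNIZED  # fallback alternative, kept for error recovery
```

For example, classify("#pragma once\n") falls through to PP_UNRECOGNIZED, which corresponds to keeping the alternative marked [1]; dropping that final return in favour of None would correspond to omitting it.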
--
David-Sarah Hopwood ⚥ http://davidsarah.livejournal.com