[antlr-interest] ab? b?a

Fri Jun 27 03:30:02 PDT 2008

Hi,

Try removing the "$channel = HIDDEN" action (so the lexer rule just
emits WHITESPACE tokens), and use those tokens explicitly in your parser rules.
I don't know anything about Shogi, but I think your WS acts just like
a ';' separator in C or Java. Maybe a top-level rule:

moves: move (WHITESPACE* move)*;

The WS makes sense separating move no. 1 from move no. 2, and may be
ignored in the parser actions.
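
To make that concrete, here is a rough, untested sketch of your grammar with the HIDDEN action removed and WHITESPACE used as an explicit separator. Making the whitespace mandatory between the two moveData parts and optional after the dot are my assumptions, not something taken from your game-record format:

```antlr
grammar problem;
tokens {DOT = '.'; PAWN_SYMBOL = 'P'; PROMOTION_SIGN = '+';}

// PARSER RULES ---------------------------------
// WHITESPACE is now an ordinary token: mandatory between the two
// moveData parts (assumption), optional after the dot (assumption).
move      : DOT WHITESPACE? moveData WHITESPACE moveData;
moveData  : pieceRank PROMOTION_SIGN?;
pieceRank : PROMOTION_SIGN? PAWN_SYMBOL;

// LEXER RULES ---------------------------------
// No "$channel = HIDDEN" action, so the parser sees these tokens.
WHITESPACE : ( '\t' | ' ' | '\r' | '\n' | '\u000C' )+ ;
```

With a mandatory separator, a '+' directly after the first piece can only be the promotion suffix of the first moveData, and a '+' after the whitespace can only be the prefix of the second pieceRank. Input without any whitespace between the two parts, such as ".P+P", should then be reported as an error instead of being resolved arbitrarily.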

Thomas

On Fri, Jun 27, 2008 at 12:22 PM, fat bold cyclop
<fat.bold.cyclop at gmail.com> wrote:
> I am working on a grammar for Shogi (Japanese chess) game records. I am
> not very experienced but I do my best ;-)
>
> First of all, there are many points in the game record where
> whitespace (WS) could occur. Because I wanted to keep the grammar
> human readable I chose to ignore WS.
>
> Secondly, I want the parser to distinguish between:
> "ab a", "a ba" and "ab ba"
>
> (In "real life" the game record will have information about piece
> rank (b). In Shogi a piece can be promoted,
> which is indicated by prefixing its name with a plus sign (a). So a
> pawn's symbol is "P" and a promoted pawn is "+P".
> An unpromoted piece can (under certain circumstances) be promoted
> after its move, which is,
> unfortunately for me, also indicated by '+'. A move in which a pawn
> gets promoted is postfixed with '+'.
> The game record can contain the entry "P4c+ +P1a". This means: the first pawn
> moves to the 4c square and gets promoted.
> The second pawn, a promoted pawn, goes to 1a.)
>
> So I have the following part in my grammar (simplified):
> ab? b?a
>
> Ignoring WS introduces nondeterminism. The parser can't distinguish
> "ab a" from "a ba".
> I know I could stop ignoring WS, but that forces me to put WS in
> many places of my grammar,
> making it less readable.
>
> Can I use a predicate to hint the parser toward the right choice?
> For example, for my ab? b?a: "if you find b and the next symbol is a,
> then you have found ba", or
> "if you find b and the previous token was WS, then it cannot belong to ab".
>
> Or maybe there is a way of grouping/rearranging the grammar?
>
> Any help is welcome.
>
> Best regards,
> fbc
>
>
>
> grammar problem;
> tokens {DOT = '.'; PAWN_SYMBOL = 'P'; PROMOTION_SIGN = '+';}
>
> // PARSER RULES ---------------------------------
> move     : DOT moveData moveData;
> moveData : pieceRank PROMOTION_SIGN?;
> pieceRank  : PROMOTION_SIGN? PAWN_SYMBOL;
>
> // LEXER RULES ---------------------------------
> WHITESPACE  : ( '\t' | ' ' | '\r' | '\n'| '\u000C' )+  { $channel = HIDDEN; } ;
>

-- 
..........................................................
Thomas VIAL
OCTO Technology
..........................................................