[antlr-interest] Parsing Question

Bart Kiers bkiers at gmail.com
Fri Jul 8 00:16:23 PDT 2011


Hi Chris,

Very original! :)

Try to do more in the lexer rules. Some of your keywords will probably also
show up inside the free-form "instruction phrase", so the parser rules have
to allow for that. How about something like this:

grammar KnittingGrammer;

parse
  :  instruction+ EOF
  ;

instruction
  :  section FullStop
  |  castOn FullStop
  ;

section
  :  NumberDecoration Section
  |  Section NumberDecoration
  ;

castOn
  :  CastOn Number Stitches+ anyWordExceptStitches anyWord*
  ;

anyWordExceptStitches
  :  NumberDecoration
  |  Number
  |  Section
  |  Word
  ;

anyWord
  :  NumberDecoration
  |  Number
  |  Section
  |  Stitches
  |  Word
  ;

NumberDecoration
  :  Digit+ ('st' | 'nd' | 'rd' | 'th')
  ;

Number
  :  Digit+
  ;

FullStop
  :  ':'
  |  ';'
  |  ','
  |  '.'
  |  '\n'
  |  '\r'
  ;

Section
  :  'section'
  ;

CastOn
  :  'cast' Space+ 'on'
  |  'co'
  ;

Stitches
  :  'stitch' 'es'?
  |  '(sts)'
  |  'sts'
  ;

Space
  :  (' ' | '\t') {skip();}
  ;

Word
  :  ('a'..'z' | 'A'..'Z')+
  ;

fragment Digit : '0'..'9';


which parses input like this properly:

1st section: cast on 63 stitches (sts) and work in pattern section as
follows:
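
To make the token stream easier to picture, here is a rough sketch in plain
Python (not ANTLR, and not generated by it) that imitates the lexer rules
above. It assumes longest-match tokenizing with ties going to the rule
declared first, which only approximates what the ANTLR lexer does. The names
RULES, SAMPLE and tokenize are made up for this illustration, and the sample
is written on one line, assuming the line break above is just mail wrapping.

```python
import re

# Hand-written approximations of the lexer rules, in grammar order.
# Assumption: longest match wins, ties go to the earlier rule.
RULES = [(name, re.compile(pattern)) for name, pattern in [
    ("NumberDecoration", r"[0-9]+(?:st|nd|rd|th)"),
    ("Number",           r"[0-9]+"),
    ("FullStop",         r"[:;,.\n\r]"),
    ("Section",          r"section"),
    ("CastOn",           r"cast[ \t]+on|co"),
    ("Stitches",         r"stitch(?:es)?|\(sts\)|sts"),
    ("Space",            r"[ \t]"),
    ("Word",             r"[a-zA-Z]+"),
]]

SAMPLE = ("1st section: cast on 63 stitches (sts) and work in pattern "
          "section as follows:")


def tokenize(text):
    """Return (type, text) pairs, dropping Space tokens like skip() would."""
    tokens = []
    pos = 0
    while pos < len(text):
        best_name, best_end = None, pos
        for name, regex in RULES:
            match = regex.match(text, pos)
            # Strict '>' keeps the earlier rule when two matches tie in length.
            if match and match.end() > best_end:
                best_name, best_end = name, match.end()
        if best_name is None:
            raise ValueError("no rule matches at offset %d" % pos)
        if best_name != "Space":
            tokens.append((best_name, text[pos:best_end]))
        pos = best_end
    return tokens


if __name__ == "__main__":
    for token_type, token_text in tokenize(SAMPLE):
        print(token_type, repr(token_text))
```

In this sketch the second "section" comes out as a Section token rather than
a Word, which is why the anyWord rules list the keyword tokens explicitly.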


Regards,

Bart.


On Fri, Jul 8, 2011 at 2:15 AM, Chris Wegener
<chris at wegenerconsulting.com> wrote:

> Dear Friends-
>
>
>
> I am attempting to define a language that will let me parse knitting
> instructions.  (Don't ask.)
>
>
>
> By and large it is a well understood convention with standard abbreviations
> and phrases.  Occasionally the originator will insert a phrase in the
> instructions that is not directly relevant.  What I would like is to parse
> out those words and deal with them separately from reading the
> instructions.  I have tried:
>
>
>
> text :    (letter)+;
>
> letter :  ('a'..'z' | 'A'..'Z');
>
> WS   :    (' ' | '\n' | '\r');
>
>
>
> And it doesn't work at all.  I changed it to:
>
>
>
> text :   (letter)+;
>
> letter :~('"' | '\\');
>
> WS   :   (' ' | '\n' | '\r');
>
>
> That works, but becomes unwieldy very quickly when I start including all of
> the things I do know to scan for. I have attached the KnittingGrammer.g file
> with my rules.
>
> For example:
>
> "1st Section: Cast on 63 stitches (sts) and work in pattern as follows:" is
> parsed into '1st Section:' and 'Cast on 63 stitches (sts)' which leaves the
> text until the colon which is the stop character.  I would like to parse the
> 'and work in pattern as follows' into the parse tree under text so I can
> inspect it or lex it separately or even display it to the user.
>
> What am I missing or doing wrong?
>
> My thanks for your help in advance.
>
> Regards,
>
> Chris