[antlr-interest] Parsing Question
Bart Kiers
bkiers at gmail.com
Fri Jul 8 00:16:23 PDT 2011
Hi Chris,
Very original! :)
Try to do more in the lexer rules. Some of your keywords may also appear as
part of the "instruction phrase", so you need to be aware of that.
How about something like this:
grammar KnittingGrammer;

parse
  : instruction+ EOF
  ;

instruction
  : section FullStop
  | castOn FullStop
  ;

section
  : NumberDecoration Section
  | Section NumberDecoration
  ;

castOn
  : CastOn Number Stitches+ anyWordExceptStitches anyWord*
  ;

anyWordExceptStitches
  : NumberDecoration
  | Number
  | Section
  | Word
  ;

anyWord
  : NumberDecoration
  | Number
  | Section
  | Stitches
  | Word
  ;

NumberDecoration
  : Digit+ ('st' | 'nd' | 'rd' | 'th')
  ;

Number
  : Digit+
  ;

FullStop
  : ':'
  | ';'
  | ','
  | '.'
  | '\n'
  | '\r'
  ;

Section
  : 'section'
  ;

CastOn
  : 'cast' Space+ 'on'
  | 'co'
  ;

Stitches
  : 'stitch' 'es'?
  | '(sts)'
  | 'sts'
  ;

Space
  : (' ' | '\t') {skip();}
  ;

Word
  : ('a'..'z' | 'A'..'Z')+
  ;

fragment Digit : '0'..'9';
which parses input like this properly:
1st section: cast on 63 stitches (sts) and work in pattern section as
follows:
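
For anyone who wants a feel for the token stream those lexer rules are aiming at without generating the lexer first, here is a rough Python approximation. It is not ANTLR: the regexes are a hand translation of the lexer rules above, and the names RULES and tokenize are made up for this sketch. It takes the longest match at each position and breaks ties by rule order. That is only an approximation of how an ANTLR 3 lexer chooses between rules, so edge cases may behave differently.

```python
import re

# Hand-translated approximations of the grammar's lexer rules, in grammar order.
# (Assumption: these regexes mirror the ANTLR rules closely enough for a demo.)
RULES = [
    ("NumberDecoration", r"[0-9]+(?:st|nd|rd|th)"),
    ("Number", r"[0-9]+"),
    ("FullStop", r"[:;,.\n\r]"),
    ("Section", r"section"),
    ("CastOn", r"cast[ \t]+on|co"),
    ("Stitches", r"stitch(?:es)?|\(sts\)|sts"),
    ("Space", r"[ \t]"),
    ("Word", r"[a-zA-Z]+"),
]
COMPILED = [(name, re.compile(pattern)) for name, pattern in RULES]


def tokenize(text):
    """Return (rule_name, lexeme) pairs; Space tokens are dropped, like skip()."""
    tokens = []
    pos = 0
    while pos < len(text):
        best_name, best_end = None, pos
        for name, regex in COMPILED:
            match = regex.match(text, pos)
            # Strict '>' keeps the earlier rule when two rules match the same length.
            if match and match.end() > best_end:
                best_name, best_end = name, match.end()
        if best_name is None:
            raise ValueError(f"no rule matches at offset {pos}: {text[pos]!r}")
        if best_name != "Space":
            tokens.append((best_name, text[pos:best_end]))
        pos = best_end
    return tokens


if __name__ == "__main__":
    sample = ("1st section: cast on 63 stitches (sts) and work in "
              "pattern section as follows:")
    for name, lexeme in tokenize(sample):
        print(f"{name:18} {lexeme!r}")
```

In this sketch the tie-break on rule order is why "section" and "stitches" come out as keyword tokens, while longer words such as "sections" or "color" fall through to Word.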
Regards,
Bart.
On Fri, Jul 8, 2011 at 2:15 AM, Chris Wegener
<chris at wegenerconsulting.com> wrote:
> Dear Friends-
>
>
>
> I am attempting to define a language that will let me parse knitting
> instructions. (Don't ask.)
>
>
>
> By and large it is a well understood convention with standard abbreviations
> and phrases. Occasionally the originator will insert a phrase in the
> instructions that is not directly relevant. What I would like is to parse
> out those words and deal with them around the issue of reading the
> instructions. I have tried:
>
>
>
> text : (letter)+;
>
> letter : ('a'..'z' | 'A'..'Z');
>
> WS : (' ' | '\n' | '\r');
>
>
>
> And it doesn't work at all. I changed it to:
>
>
>
> text : (letter)+;
>
> letter :~('"' | '\\');
>
> WS : (' ' | '\n' | '\r');
>
>
> That works, but becomes unwieldy very quickly when I start including all of
> the things I do know to scan for. I have attached the KnittingGrammer.g file
> with my rules.
>
> For example:
>
> "1st Section: Cast on 63 stitches (sts) and work in pattern as follows:" is
> parsed into '1st Section:' and 'Cast on 63 stitches (sts)' which leaves the
> text until the colon which is the stop character. I would like to parse the
> 'and work in pattern as follows' into the parse tree under text so I can
> inspect it or lex it separately or even display it to the user.
>
> What am I missing or doing wrong?
>
> My thanks for your help in advance.
>
> Regards,
>
> Chris