[antlr-interest] Fwd: keywords and arbitrary text

Peter Hull peterhull90 at gmail.com
Thu Jul 8 00:31:39 PDT 2010


Jim, Darin,
Thanks, that helped. I have another question about how it works,
though. Here is my grammar (just a subset of the actual language I
need to parse):
=== Sys.g ===
grammar Sys;

tokens {ARM='Arm'; SETGATE='SetGate'; MONITOR='Monitor'; TITLE='Title';}

program : lines;
lines   : line*;
line    : arm|setgate|monitor|Title;
arm     :       ARM INT;
setgate :       SETGATE INT FLOAT;
monitor :       MONITOR INT;
Title   :       TITLE (~('\n'|'\r'))* ;

INT :   '0'..'9'+
   ;

FLOAT
   :   ('0'..'9')+ '.' ('0'..'9')* EXPONENT?
   |   '.' ('0'..'9')+ EXPONENT?
   |   ('0'..'9')+ EXPONENT
   ;

fragment
EXPONENT : ('e'|'E') ('+'|'-')? ('0'..'9')+ ;
---

Here is the test file:
=== test.txt ===
Title Test Example
Arm 29
SetGate 3 1.3
Monitor 1
Monitor 2
---

This seems to work OK. Originally I had 'Title' as 'title', i.e. a
non-terminal (parser rule), and it gave these errors:
line 1:5 no viable alternative at character ' '
line 1:7 mismatched character 'e' expecting 'i'
line 1:8 no viable alternative at character 's'
line 1:9 no viable alternative at character 't'
line 1:10 no viable alternative at character ' '
line 1:11 no viable alternative at character 'E'
line 1:12 no viable alternative at character 'x'
line 1:13 no viable alternative at character 'a'
line 1:14 no viable alternative at character 'm'
line 1:15 no viable alternative at character 'p'
line 1:16 no viable alternative at character 'l'
line 1:17 no viable alternative at character 'e'
line 2:3 no viable alternative at character ' '
line 3:7 no viable alternative at character ' '
line 3:9 no viable alternative at character ' '
line 4:7 no viable alternative at character ' '
line 5:7 no viable alternative at character ' '
I assume it matched 'Title', then saw the 'T' of 'Test' (at line 1:6)
and started looking for 'Title' again (hence expecting 'i' at line 1:7).
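
For reference, the earlier version differed from Sys.g above only in
the two rules below. This is reconstructed from the description rather
than copied from the old file, so the exact text is an assumption:

```antlr
// Earlier variant (reconstruction): 'title' starts with a lower-case
// letter, so ANTLR treats it as a parser rule rather than a lexer rule.
line    : arm|setgate|monitor|title;
title   :       TITLE (~('\n'|'\r'))* ;
```

Everything else (the tokens section, INT, FLOAT and EXPONENT) is
assumed to be the same as in Sys.g above.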

Why is that?

Pete

On Wed, Jul 7, 2010 at 9:42 PM, Jim Idle <jimi at temporal-wave.com> wrote:
> You just need:
>
>
> TITLE : 'Title' (~('\n'|'\r'))* ;
>
> Then look for TITLE in your parser.
>
> Jim
>
>> -----Original Message-----
>> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-bounces at antlr.org] On Behalf Of Peter Hull
>> Sent: Wednesday, July 07, 2010 12:58 PM
>> To: antlr-interest at antlr.org
>> Subject: [antlr-interest] keywords and arbitrary text
>>
>> Hi all, here is a quick question from a new user.
>> I have a simple language I want to parse. Each line is a separate
>> command and each command looks something like
>> KEYWORD param param...
>> e.g.
>> SetGate 10 4.5
>> However there's a title command that can take any text up to the end
>> of line, e.g.
>> Title Configuration 1
>> or even
>> Title This is a Title
>> I had a rule
>> TEXT: ~('\n'|'\r')+ ;
>> but this (I think) matched all the keywords too, even if I used tokens
>> {...}
>> Is there a way to say that the TEXT token is only to be used after
>> Title? I saw something on island grammars but I couldn't understand it
>> to be honest.
>>
>> Pete
>>
>> List: http://www.antlr.org/mailman/listinfo/antlr-interest
>> Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-email-address

