[antlr-interest] No viable for alternative with ISO-LATIN-1 non-breaking space character
Jim Idle
jimi at temporal-wave.com
Mon Feb 18 12:53:11 PST 2008
Are you sure that that is actually character 0xa0? Print the hex value
of it.
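One way to check this is to dump the raw bytes of the input. The sketch below uses only the JDK; the class name `HexDump` and the helper `toHex` are made up for illustration and are not part of ANTLR. A non-breaking space stored as ISO-8859-1 should show up as the single byte a0, while the same character stored as UTF-8 should show up as the pair c2 a0.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

public class HexDump {
    // Format each byte as two lowercase hex digits, separated by spaces.
    static String toHex(byte[] data) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < data.length; i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", data[i] & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        if (args.length > 0) {
            // Dump the bytes of the file named on the command line.
            byte[] raw = Files.readAllBytes(Paths.get(args[0]));
            System.out.println(toHex(raw));
        } else {
            // No file given: show how U+00A0 is encoded in two charsets.
            String sample = "-\u00a0-";
            System.out.println(toHex(sample.getBytes(StandardCharsets.ISO_8859_1)));
            System.out.println(toHex(sample.getBytes(StandardCharsets.UTF_8)));
        }
    }
}
```

Running it with the path of the test file as the only argument prints the file's bytes, so you can see which of the two forms is actually on disk.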
However, I think that perhaps you need to add the “UTF8” encoding
option to your input stream?
new ANTLRFileStream("/tmp/nbsp.txt", "UTF8")
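To see why the encoding matters, here is a rough JDK-only sketch (the class `DecodeCheck` and method `codePoints` are hypothetical names, not ANTLR API) that decodes the same bytes under different charsets and prints the characters a lexer would end up seeing. The encoding passed to the stream has to match how the file was written; if it does not, the lexer never sees U+00A0 at all. The last, guarded case is only a guess about the poster's environment, not something the thread confirms: in Mac OS Roman the byte 0xA0 is the dagger character.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class DecodeCheck {
    // Decode raw bytes with the given charset and list the resulting chars.
    static String codePoints(byte[] raw, Charset cs) {
        String text = new String(raw, cs);
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", (int) text.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] latin1Nbsp = { (byte) 0xA0 };             // NBSP as ISO-8859-1
        byte[] utf8Nbsp = { (byte) 0xC2, (byte) 0xA0 };  // NBSP as UTF-8

        // Matching charset: the lexer would see U+00A0.
        System.out.println(codePoints(latin1Nbsp, StandardCharsets.ISO_8859_1));
        System.out.println(codePoints(utf8Nbsp, StandardCharsets.UTF_8));

        // Mismatched charset: the lexer would see something else.
        System.out.println(codePoints(latin1Nbsp, StandardCharsets.UTF_8));
        System.out.println(codePoints(utf8Nbsp, StandardCharsets.ISO_8859_1));

        // Assumption: a Mac OS Roman default charset would turn 0xA0 into
        // a dagger (U+2020). Guarded because not every JVM ships this charset.
        if (Charset.isSupported("x-MacRoman")) {
            System.out.println(codePoints(latin1Nbsp, Charset.forName("x-MacRoman")));
        }
    }
}
```

If the file turns out to hold a single a0 byte, then "ISO-8859-1" rather than "UTF8" would be the encoding name to try.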
Jim
From: Darach Ennis [mailto:darach at gmail.com]
Sent: Monday, February 18, 2008 8:59 AM
To: antlr-interest at antlr.org
Subject: [antlr-interest] No viable for alternative with ISO-LATIN-1
non-breaking space character
Hi guys,
I'm not sure if this is a case of user error or a bug. I have replicated
the issue in a testcase as follows:
grammar Test;

@parser::header {
    import java.io.FileInputStream;
}

@parser::members {
    public static void main(String args[]) throws Throwable {
        final ANTLRInputStream cs =
            new ANTLRInputStream(new FileInputStream("/tmp/nbsp.txt"));
        final TestLexer sl = new TestLexer(cs);
        final CommonTokenStream cts = new CommonTokenStream(sl);
        final TestParser sp = new TestParser(cts);
        sp.rules();
    }
}

rules: anything+;
anything: Other | Directive ;

Other: '-' ( ('directive') => ('directive') { $type = Directive; }
           | /* empty */ );

WS : (' ' | '\t' | '\f' | '\r' | '\n' | '\u00a0') { $channel=HIDDEN; };
Despite listing the non-breaking space (ISO-Latin-1, '\u00a0') in the
whitespace-hiding lexer rule 'WS', test input containing this character
fails to parse as expected. Here is some test input:
-directive †-directive †-directive †-directive - - -
Here is some example output:
line 1:11 no viable alternative at character '†'
line 1:24 no viable alternative at character '†'
line 1:37 no viable alternative at character '†'
Given the above grammar I would have expected the non-breaking space
(\u00a0) to be ignored.
Is this a bug or user error? If user error, can anyone suggest a grammar
fix?
Regards,
Darach.