[antlr-interest] Grammar Problem
johnclarke72
johnclarke at hotmail.com
Tue Jun 4 04:20:58 PDT 2002
Hi,
I am having problems with an HTML grammar that I am writing.
The grammar is included at the end of this e-mail.
When I enter HTML comments (<!-- This is a Comment -->) and end tags
(</endTag>), it handles them correctly.
However, if I enter <test> or anything similar, I get
a "line 1: unexpected token: test" error message.
How can I get this to work?
I would be grateful for any advice on this.
John
HTMLTagLexer.g
==============
// Import the required Classes
header
{
import java.util.*;
import antlr.*;
}
// Define the class
class HTMLTagLexer extends Lexer;
// Set the options for the Lexer
options
{
k=3; // Set the lookahead to 3 characters
caseSensitive = false; // Set case sensitivity to false
charVocabulary = '\1' .. '\377'; // Set the lexer character vocabulary
testLiterals = false; // Don't test against the literals table
exportVocab = HTMLTagLexer; // The Grammar to export
}
// The routines that will enable us to switch between lexer states
{
// The Current Selector
TokenStreamSelector selector;
// The method that will enable us to switch between lexer states
public void setSelector(TokenStreamSelector tokenStreamSelector)
{
selector = tokenStreamSelector;
}
}
// Define the Tokens required for the Grammar
// Various HTML Marker Tags
INITSTARTTAG : "<";
FINISHSTARTTAG : ">";
EQUALS : "=";
// HTML Comments
HTMLCOMMENT : "!--"! (options {greedy=false;} : .)* "-->"!
{selector.pop();}
;
// Main HTML Tags Section. This defines the Tag names,
// attributes and attribute values
// TAGDATA is used to define the Tag Name and names of
// attributes used within the tag
TAGDATA : (~(' ' | '\r' | '\n' | '\t' | '<' | '>' | '/' | '!' | '='))+;
// TAGVALUE is used to define the values for attributes
// used within the tags
// Definition of an End Tag
ENDTAG : '/'! ( 'a'..'z' )+ ">"! {selector.pop();};
// Ignore all White Space
WS : ( ' '
| '\t'
| '\r' '\n' { newline(); }
| '\n' { newline(); }
)
{$setType(Token.SKIP);} //ignore this token
;
HTMLTagParser.g
===============
// Define the class
class HTMLTagParser extends Parser;
// Set the options for the Parser
options
{
importVocab = HTMLTagLexer; // The Grammar to import
}
// The Parser Rules
processHTML:
htmlComment:HTMLCOMMENT
{System.out.println("COMMENT : "+htmlComment.getText().trim());}
| startHTMLTag
| endTag:ENDTAG
{System.out.println("ENDTAG : "+endTag.getText());};
startHTMLTag : INITSTARTTAG tagName:TAGDATA
{System.out.println("STARTTAG : "+tagName.getText());}
FINISHSTARTTAG;