[antlr-interest] Recognising XML in a grammar

Timothy Washington timothyjwashington at yahoo.ca
Thu Aug 31 10:28:07 PDT 2006


Hello there. I am new to ANTLR and parser generators
in general, so I hope you'll forgive what might seem a
simple question. I want to know how my parser can
recognise an XML block inside of my grammar. 

GRAMMAR
As an example, take the XML grammar file
'$ANTLR_2.7.6/examples/java/xml/xml.g' that ships with
ANTLR. I'm writing a grammar in which a command can
contain XML (with namespaces and declarations) as a
token. For example, a command could look like this:
create
	(entry
		(
			<?xml version='1.0' encoding='UTF-8'?>
			<debit xmlns='com/interrupt/bookkeeping/account' amount='100.00'>,
			<?xml version='1.0' encoding='UTF-8'?>
			<credit xmlns='com/interrupt/bookkeeping/account' amount='100.00'>
		)
	)


IMPORTING .g FILES
I want to write a grammar that recognises all the
tokens in this command, including the raw XML. How
can I use the grammar definitions from 'xml.g' in my
own grammar file? For starters, I believe you use the
'importVocab' grammar option:
class MyParser extends Parser;
options { importVocab=V; }


RECOGNISING XML BLOCKS
But what I really want to know is how my parser can
recognise a block of XML inside my command. With the
'xml.g' grammar I can recognise start tags, end tags,
CDATA and so on individually. What I want instead is
to recognise an entire XML block and pass it as a
single token to some command. My first guess was to
write a rule that matches a start tag, an end tag and
everything nested between them (with or without a
leading XML declaration). So I wrote the grammar in
fig.1, but ran into the errors in fig.2.

class BookkeepingExprLexer extends Lexer;
options {
	importVocab=XMLLexer; 
}

TOKEN_LITERAL:	(
	STARTTAG
		( PI | COMMENT | STARTTAG | ENDTAG | PCDATA | CDATABLOCK )*
	ENDTAG
) { System.out.println("TOKEN LITERAL"); };
fig.1 

$ java antlr.Tool grammar/bookkeeping.g
ANTLR Parser Generator   Version 2.7.6 (2005-12-22)   1989-2005
error: Lexer rule STARTTAG is not defined
error: Lexer rule PI is not defined
error: Lexer rule COMMENT is not defined
error: Lexer rule ENDTAG is not defined
error: Lexer rule PCDATA is not defined
error: Lexer rule CDATABLOCK is not defined
grammar/bookkeeping.g:15:41: no definition of rule mSTARTTAG
grammar/bookkeeping.g:0:0: warning:Alternate omitted due to empty prediction set
grammar/bookkeeping.g:15:41: no definition of rule mSTARTTAG
grammar/bookkeeping.g:15:41: Rule 'mSTARTTAG' is not defined
grammar/bookkeeping.g:16:51: no definition of rule mPI
grammar/bookkeeping.g:16:56: no definition of rule mCOMMENT
grammar/bookkeeping.g:16:66: no definition of rule mSTARTTAG
grammar/bookkeeping.g:16:77: no definition of rule mENDTAG
grammar/bookkeeping.g:16:86: no definition of rule mPCDATA
grammar/bookkeeping.g:16:95: no definition of rule mCDATABLOCK
grammar/bookkeeping.g:17:41: no definition of rule mENDTAG
grammar/bookkeeping.g:17:41: no definition of rule mENDTAG
grammar/bookkeeping.g:17:41: no definition of rule mENDTAG
grammar/bookkeeping.g:17:41: no definition of rule mENDTAG
grammar/bookkeeping.g:17:41: no definition of rule mENDTAG
grammar/bookkeeping.g:17:41: no definition of rule mENDTAG
grammar/bookkeeping.g:16:51: warning:Alternate omitted due to empty prediction set
grammar/bookkeeping.g:16:56: warning:Alternate omitted due to empty prediction set
grammar/bookkeeping.g:16:66: warning:Alternate omitted due to empty prediction set
grammar/bookkeeping.g:16:77: warning:Alternate omitted due to empty prediction set
grammar/bookkeeping.g:16:86: warning:Alternate omitted due to empty prediction set
grammar/bookkeeping.g:16:95: warning:Alternate omitted due to empty prediction set
grammar/bookkeeping.g:17:41: Rule 'mENDTAG' is not defined
Exiting due to errors.
fig.2 


THE PROBLEM
For this specific problem, is my grammar incorrect, or
have I not correctly pulled in the XML grammar
definitions from 'xml.g'? Also, is there a better way
of recognising an XML block inside my grammar?
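
To make the goal concrete, here is a rough sketch in
plain Java (not ANTLR) of what I mean by recognising a
whole XML block as one token. It skips declarations,
comments and CDATA sections, counts how deeply tags are
nested, and stops once the root element is closed. The
names XmlBlockScanner and scan are made up for this
illustration; neither comes from 'xml.g' or the ANTLR
runtime. It is only an assumption about the behaviour I
want, and it does not check that start and end tag
names match.

```java
// Hypothetical helper for illustration only; not part of ANTLR or xml.g.
final class XmlBlockScanner {

    /**
     * Returns the index just past the XML block beginning at 'start',
     * or -1 if no balanced block is found there.
     */
    static int scan(String s, int start) {
        int i = start;
        int depth = 0;              // number of currently open elements
        boolean sawElement = false; // true once the root element has started
        while (i < s.length()) {
            if (s.charAt(i) != '<') {
                // Non-whitespace text outside any element is rejected.
                if (depth == 0 && !Character.isWhitespace(s.charAt(i))) return -1;
                i++;
                continue;
            }
            if (s.startsWith("<?", i)) {                // XML declaration or PI
                i = skipPast(s, i + 2, "?>");
            } else if (s.startsWith("<!--", i)) {       // comment
                i = skipPast(s, i + 4, "-->");
            } else if (s.startsWith("<![CDATA[", i)) {  // CDATA section
                i = skipPast(s, i + 9, "]]>");
            } else if (s.startsWith("</", i)) {         // end tag
                i = skipPast(s, i + 2, ">");
                depth--;
            } else {                                    // start or empty tag
                i = endOfTag(s, i + 1);
                if (i < 0) return -1;
                boolean selfClosing = s.charAt(i - 2) == '/';
                if (!selfClosing) depth++;
                sawElement = true;
            }
            if (i < 0 || depth < 0) return -1;          // unterminated or unbalanced
            if (sawElement && depth == 0) return i;     // root element is closed
        }
        return -1; // ran out of input before the block was balanced
    }

    // Index just past 'terminator', searching from 'from'; -1 if absent.
    private static int skipPast(String s, int from, String terminator) {
        int at = s.indexOf(terminator, from);
        return at < 0 ? -1 : at + terminator.length();
    }

    // Index just past the '>' that closes a tag, ignoring '>' inside quotes.
    private static int endOfTag(String s, int from) {
        char quote = 0;
        for (int i = from; i < s.length(); i++) {
            char c = s.charAt(i);
            if (quote != 0) {
                if (c == quote) quote = 0;
            } else if (c == '\'' || c == '"') {
                quote = c;
            } else if (c == '>') {
                return i + 1;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        String in = "(<?xml version='1.0'?><debit amount='100.00'></debit>, x)";
        int end = scan(in, 1);
        System.out.println(end < 0 ? "no XML block" : in.substring(1, end));
    }
}
```

In the example in main, the text from the XML
declaration through the closing </debit> tag would be
handed on as a single token, and the ", x)" after it
is left for the rest of the grammar.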


Thanks for any help. 
Tim

