[antlr-interest] lexer: embedded quotes assistance

Edwards, Waverly Waverly.Edwards at genesys.com
Thu Aug 23 12:39:26 PDT 2007


Well, after a lot more reading it is still not easy. Below is what I've done to capture the quoted text, and I'm hoping someone can assist me. My grammar is very short because at this point I just need to get past capturing quoted strings with embedded quotes. Below the grammar is the Java code I used to test that what I was doing was correct (or seemingly correct); that code works just fine on its own. I decided that, instead of making an exception to allow the continuation across lines, I would throw an error if there was not one, thereby saving myself more headaches.

I'm hoping someone has some ideas on how I can get over this hurdle.


Thanks,


W.


grammar QUOTETEST;


prog	:	start+ ;

start	:	NEWLINE
	|	DBLQUOTE
	;
	
NEWLINE	:	'\r'? '\n';
   
DBLQUOTE :   ('"' ( options {  greedy=false; }: . )  '"') 
//DBLQUOTE  : '"' ( options {greedy=false;} : . )* '"'
	  {
		int where, lastCharPos;
		String quote, dblDblQuote;
		StringBuffer txt;
		char quoteChr = 34;   // character code of the double-quote

		dblDblQuote = "\"\"";
		txt = new StringBuffer(getText());
		lastCharPos = txt.length() - 1;

		// Remove last and first double-quote if they exist.  The last one is
		// removed first so that lastCharPos is still a valid index when used.
		if ( txt.charAt( lastCharPos ) == quoteChr ) txt.deleteCharAt( lastCharPos );
		if ( txt.length() > 0 && txt.charAt(0) == quoteChr ) txt.deleteCharAt(0);

		// -------------
		// DO SOMETHING HERE TO HANDLE UNTERMINATED STRING
		// -------------

		// Collapse doubled double-quotes; note that a longer run of
		// quotes also collapses to a single one with this loop.
		while (( where = txt.lastIndexOf( dblDblQuote ) ) >= 0) {
			txt.deleteCharAt(where);
		}

		setText(txt.toString());
		System.out.println(txt.toString());
	  };
	

// ----------------------------------------------------------------


package quotedstrings;

public class Main {
    
    public static void showQuotedStringWithWrap(String testStr ) {
        int where, lastCharPos;
        String quote, dblDblQuote;
        StringBuffer txt;
        char quoteChr = 34;
        
//        quote = Character.toString(quoteChr);
//        dblDblQuote = Character.toString(quoteChr) + Character.toString(quoteChr);
        dblDblQuote = "\"\"";
        txt = new StringBuffer( testStr );
        lastCharPos = txt.length( )-1;
        
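        // Remove last and first double-quote if they exist.  The last one is
        // removed first so that lastCharPos is still a valid index when used.
        if ( txt.charAt( lastCharPos ) == quoteChr ) txt.deleteCharAt( lastCharPos );
        if ( txt.length() > 0 && txt.charAt(0) == quoteChr ) txt.deleteCharAt(0);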
        
        // -------------
        // DO SOMETHING HERE TO HANDLE UNTERMINATED STRING
        // -------------
        
        while (( where = txt.lastIndexOf( dblDblQuote ) ) >= 0) {
            txt.deleteCharAt(where);
        }
        
        System.out.println(txt.toString());
        
    }
    
    
    public static void showDoubleQuotedString(String testStr ) {
        int where, lastCharPos;
        String quote, dblDblQuote;
        StringBuffer txt;
        char quoteChr = 34;
        
//        quote = Character.toString(quoteChr);
//        dblDblQuote = Character.toString(quoteChr) + Character.toString(quoteChr);
        dblDblQuote = "\"\"";
        txt = new StringBuffer( testStr );
        lastCharPos = txt.length( )-1;
        
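        // Remove last and first double-quote if they exist.  The last one is
        // removed first so that lastCharPos is still a valid index when used.
        if ( txt.charAt( lastCharPos ) == quoteChr ) txt.deleteCharAt( lastCharPos );
        if ( txt.length() > 0 && txt.charAt(0) == quoteChr ) txt.deleteCharAt(0);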
        
        // -------------
        // DO SOMETHING HERE TO HANDLE UNTERMINATED STRING
        // -------------
        
        while (( where = txt.lastIndexOf( dblDblQuote ) ) >= 0) {
            txt.deleteCharAt(where);
        }
        
        System.out.println(txt.toString());
        
    }
    
    /** Creates a new instance of Main */
    public Main() {
    }
    
    public static void main(String[] args) {
        char cr = 34;
        String testStr, quote, dblDblQuote;
        quote = "\"";
//        String dblDblQuote = Character.toString(cr) + Character.toString(cr);
        dblDblQuote = "\"\"";
        testStr = "Hello, is it me you're looking for?";
        showDoubleQuotedString( testStr );
        testStr = "I said, " + dblDblQuote + "Hello." + dblDblQuote;
        showDoubleQuotedString( testStr );
        testStr = "I said, " + quote + "Hello." + quote;
        showDoubleQuotedString( testStr );
    }
}
 



-----Original Message-----
From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-bounces at antlr.org] On Behalf Of Edwards, Waverly
Sent: Wednesday, August 22, 2007 5:30 PM
To: antlr
Subject: [antlr-interest] lexer: embedded quotes assistance

This is probably pretty simple but I'm not getting it.
Would you mind assisting me with a way to create a lexer rule that turns my original quoted string into the two versions below?

History...
I am replicating an existing language that uses embedded (doubled) quotes to indicate a quote character. In addition to embedded quotes, the quoted material may span multiple lines by using the ¬ character followed by a CR. A rule for treating ¬CR as whitespace is not an issue, as I just create a rule

CONTINUATION
     :     '¬' CR
     ;

because I need it for the language anyway, but inside of a quote is another matter.


original text: "I said, ""Hello."""

quote rule 1: "I said, "Hello.""

quote rule 2: I said, "Hello."
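
The two target forms above can be sketched in plain Java, outside the lexer. This is only an illustration under assumptions: the class and method names (QuoteForms, collapseDoubled, rule1, rule2) are hypothetical, and the token is assumed to arrive with its opening and closing double-quote still attached. It scans left to right so that each doubled quote yields exactly one quote, even inside a longer run of quotes.

```java
// Hypothetical helper, not part of the original grammar or test code.
class QuoteForms {

    /** Collapses each doubled quote ("") into one quote, scanning left to right. */
    static String collapseDoubled(String body) {
        StringBuilder out = new StringBuilder(body.length());
        for (int i = 0; i < body.length(); i++) {
            char c = body.charAt(i);
            out.append(c);
            // Skip the second quote of a "" pair so the pair yields one quote.
            if (c == '"' && i + 1 < body.length() && body.charAt(i + 1) == '"') {
                i++;
            }
        }
        return out.toString();
    }

    /** Quote rule 2: drop the delimiters and collapse the embedded quotes. */
    static String rule2(String token) {
        int last = token.length() - 1;
        // Assumption: a well-formed token starts and ends with a double-quote.
        if (last < 1 || token.charAt(0) != '"' || token.charAt(last) != '"') {
            throw new IllegalArgumentException("unterminated string: " + token);
        }
        return collapseDoubled(token.substring(1, last));
    }

    /** Quote rule 1: collapse the embedded quotes but keep the delimiters. */
    static String rule1(String token) {
        return '"' + rule2(token) + '"';
    }

    public static void main(String[] args) {
        String original = "\"I said, \"\"Hello.\"\"\"";
        System.out.println(rule1(original)); // prints: "I said, "Hello.""
        System.out.println(rule2(original)); // prints: I said, "Hello."
    }
}
```

Working on the text between the delimiters keeps the closing quote from being mistaken for half of a doubled pair.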

------------

My second problem is that the character ¬ followed immediately by a CR is used for line continuation:

myString = "line ¬
""continuation."""

print myString

result is: line "continuation."

How would I create a rule that would return the resulting string?
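
As a rough sketch of the string that should come back, here is the continuation handling written as plain Java post-processing of the matched text rather than as a lexer rule. The names ContinuedString and unquote are hypothetical. It also assumes that an LF directly after the CR belongs to the same line break, and that a line break without a preceding ¬ should be rejected; both are assumptions, not something the language definition above states.

```java
// Hypothetical helper; class and method names are illustrative only.
class ContinuedString {

    static final char CONT = '\u00AC'; // the ¬ continuation character

    /** Returns the value of a quoted token, joining ¬-continued lines. */
    static String unquote(String token) {
        int last = token.length() - 1;
        if (last < 1 || token.charAt(0) != '"' || token.charAt(last) != '"') {
            throw new IllegalArgumentException("unterminated string");
        }
        StringBuilder out = new StringBuilder();
        for (int i = 1; i < last; i++) {
            char c = token.charAt(i);
            if (c == CONT && i + 1 < last && token.charAt(i + 1) == '\r') {
                i++; // drop the CR that follows the continuation character
                if (i + 1 < last && token.charAt(i + 1) == '\n') {
                    i++; // assumption: an LF right after the CR is dropped too
                }
            } else if (c == '"' && i + 1 < last && token.charAt(i + 1) == '"') {
                out.append('"'); // a doubled quote stands for one quote
                i++;
            } else if (c == '\r' || c == '\n') {
                // assumption: a bare line break inside the quotes is an error
                throw new IllegalArgumentException("line break without continuation");
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String myString = "\"line \u00AC\r\"\"continuation.\"\"\"";
        System.out.println(unquote(myString)); // prints: line "continuation."
    }
}
```

Handling the continuation and the doubled quotes in the same pass means the text is scanned only once, and the error for a bare line break falls out of the same loop.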



Suggestions are not just welcome they are highly sought after.


Thank you,


W.



