[antlr-interest] detecting transitions in stanza-based files

Tue May 10 08:08:58 PDT 2005

Bryan Ewbank wrote:

>The core problem is that shortLine and longLine have the same
>left-match.  If this is true, it's perhaps best to simply /parse/
>everything (assume longLine?), then use a tree walker to break out
>stanzas using a semantic predicate.
>  
>
shortLine and longLine do have the same left-match, but I thought a
large enough value of k for the lookahead would take care of that, and
I don't understand why it doesn't. My current structure has three
parts: a lexer that just generates FIELDs, DELIMs and NEWLINEs; a
parser that turns this token stream into an AST with the stanzas
separated out using marker tokens and such; and a tree parser that
walks this tree, fetching out FIELDs and arranging them into a data
structure. Is this a fundamentally incorrect approach? Since my lexer
and tree parser already work (and still work fine on many input files),
I was hoping to do the fix in parser space, perhaps using syntactic
predicates.

I've never used syntactic or semantic predicates and am reading up on
them now. Shouldn't there be a way to handle the longLine/shortLine
distinction with syntactic predicates, such as:
line: (FIELD DELIM FIELD DELIM FIELD NEWLINE) => shortLine
     | (FIELD DELIM FIELD DELIM FIELD DELIM FIELD) => longLine
     ;

Would this work, given a sufficient lookahead?
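
To make the question concrete, here is a minimal Python sketch (not
ANTLR) of the decision the two alternatives above ask for. It assumes
that every line starts with FIELD DELIM FIELD DELIM FIELD and that a
long line always has at least four fields, so the sixth token alone
tells the two apart. The function name classify_line and the string
token names are made up for illustration.

```python
# Hypothetical token names mirroring the lexer described above.
FIELD, DELIM, NEWLINE = "FIELD", "DELIM", "NEWLINE"


def classify_line(tokens):
    """Return 'short' or 'long' by inspecting the sixth token of a line.

    Assumption: a long line has at least four fields, so after the shared
    five-token prefix a short line shows NEWLINE and a long line shows DELIM.
    """
    prefix = [FIELD, DELIM, FIELD, DELIM, FIELD]
    if len(tokens) < 6 or tokens[:5] != prefix:
        raise ValueError("line does not start with three delimited fields")
    if tokens[5] == NEWLINE:
        return "short"
    if tokens[5] == DELIM:
        return "long"
    raise ValueError("unexpected sixth token: %r" % (tokens[5],))


# Example token lines (illustrative data only).
short_tokens = [FIELD, DELIM, FIELD, DELIM, FIELD, NEWLINE]
long_tokens = [FIELD, DELIM, FIELD, DELIM, FIELD, DELIM, FIELD, NEWLINE]
```

Under that assumption, six tokens of lookahead are enough to separate
the two kinds of line. If a long line may also have exactly three
fields, this check can no longer tell it from a short line.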

Thanks again,
Chris

>On 5/10/05, Chris Black <chris at lotuscat.com> wrote:
>  
>
>>multStanzas: (stanza)+ ;
>>stanza: shortLine (longLine)+ ;
>>
>>shortLine: FIELD DELIM FIELD DELIM FIELD NEWLINE ;
>>longLine: FIELD DELIM FIELD (DELIM FIELD)+ NEWLINE ;
>>    
>>
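
One possible reason that raising k does not help, assuming the quoted
rules are exactly what the grammar contains: a three-field line
satisfies both shortLine and longLine, because (DELIM FIELD)+ can match
just once. The Python sketch below checks this by writing the two rules
as regular expressions over one-letter token codes (F = FIELD,
D = DELIM, N = NEWLINE). The letter codes and the function name
matching_rules are illustrative only.

```python
import re

# The two quoted rules as regular expressions over one-letter token codes.
SHORT_LINE = re.compile(r"FDFDFN")      # FIELD DELIM FIELD DELIM FIELD NEWLINE
LONG_LINE = re.compile(r"FDF(DF)+N")    # FIELD DELIM FIELD (DELIM FIELD)+ NEWLINE


def matching_rules(line):
    """Return the names of the rules that match the whole encoded line."""
    names = []
    if SHORT_LINE.fullmatch(line):
        names.append("shortLine")
    if LONG_LINE.fullmatch(line):
        names.append("longLine")
    return names


three_fields = "FDFDFN"    # a line with three fields
four_fields = "FDFDFDFN"   # a line with four fields

print(matching_rules(three_fields))  # ['shortLine', 'longLine']
print(matching_rules(four_fields))   # ['longLine']
```

If long lines always have at least four fields, tightening longLine to
require a fourth field (for example FIELD DELIM FIELD DELIM FIELD
(DELIM FIELD)+ NEWLINE) would make the two rules differ at the sixth
token. Whether that matches the real input files is an assumption to
verify.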