[antlr-interest] Grammars with too many keywords?

Vadim Tropashko vadimtro at yahoo.com
Fri Mar 2 10:21:03 PST 2007


>Given that ANTLR is a top-down parser generator which
>generates one method per rule from the parser, rule
>splitting should be the answer to your problem in case
>of Java too.

I don't understand how rule splitting might help
decrease the size of the lexer's mTokens() method.

Here is a test case with just 4 rules:
////////////////////////////////////// 

grammar Test;

id: IDENTIFIER | KEYWORD | DQUOTED_STRING;

IDENTIFIER: 
        ('a' .. 'z'|'A' .. 'Z') ( 'a' .. 'z'|'A' .. 'Z'| '0' .. '9' | '_' | '$' | '#' )*;

DQUOTED_STRING: 
      '\"' ( ~'\"' )* '\"'; 

KEYWORD	:	
  'A'
| 'ADD'
| 'AGENT'
| 'AGGREGATE'
| 'ALL'
| 'ALTER'
| 'AND'
| 'ANY'
| 'ARRAY'
| 'AS'
| 'ASC'
| 'AT'
| 'ATTRIBUTE'
| 'AUTHID'
| 'AVG'
| 'BEGIN'
| 'BETWEEN'
| 'BFILE_BASE'
| 'BINARY'
| 'BLOB_BASE'
| 'BLOCK'
| 'BODY'
| 'BOTH'
| 'BOUND'
| 'BULK'
| 'BY'
| 'BYTE'
| 'C'
| 'CALL'
| 'CALLING'
| 'CASCADE'
| 'CASE'
| 'CHAR'
| 'CHARACTER'
| 'CHARSET'
| 'CHARSETFORM'
| 'CHARSETID'
| 'CHAR_BASE'
| 'CHECK'
| 'CLOB_BASE'
| 'CLOSE'
| 'CLUSTER'
| 'COLLECT'
| 'COMMENT'
| 'COMMIT'
| 'COMMITTED'
| 'COMPILED'
| 'COMPRESS'
| 'CONNECT'
| 'CONSTANT'
| 'CONSTRUCTOR'
| 'CONTEXT'
| 'CONVERT'
| 'COUNT'
| 'CURRENT'
| 'CURSOR'
| 'CUSTOMDATUM'
| 'DANGLING'
| 'DATA'
| 'DATE'
| 'DATE_BASE'
| 'DAY'
| 'DECIMAL'
| 'DECLARE'
| 'DEFAULT'
| 'DEFINE'
| 'DELETE'
| 'DESC'
| 'DETERMINISTIC'
| 'DISTINCT'
| 'DOUBLE'
| 'DROP'
| 'DURATION'
| 'ELEMENT'
| 'ELLIPSIS'
| 'ELSE'
| 'ELSIF'
| 'EMPTY'
| 'END'
| 'ESCAPE'
| 'EXCEPT'
| 'EXCEPTION'
| 'EXCEPTIONS'
| 'EXCLUSIVE'
| 'EXECUTE'
| 'EXISTS'
| 'EXIT'
| 'EXTERNAL'
| 'FETCH'
| 'FINAL'
| 'FIXED'
| 'FLOAT'
| 'FOR'
| 'FORALL'
| 'FORCE'
| 'FORM'
| 'FROM'
| 'FUNCTION'
| 'GENERAL'
| 'GOTO'
| 'GROUP'
| 'HASH'
| 'HAVING'
| 'HEAP'
| 'HIDDEN'
| 'HOUR'
| 'IF'
| 'IMMEDIATE'
| 'IN'
| 'INCLUDING'
| 'INDEX'
| 'INDICATOR'
| 'INDICES'
| 'INFINITE'
| 'INSERT'
| 'INSTANTIABLE'
| 'INT'
| 'INTERFACE'
| 'INTERSECT'
| 'INTERVAL'
| 'INTO'
| 'INVALIDATE'
| 'IS'
| 'ISOLATION'
| 'JAVA'
| 'LANGUAGE'
| 'LARGE'
| 'LEADING'
| 'LENGTH'
| 'LEVEL'
| 'LIBRARY'
| 'LIKE'
| 'LIKE2'
| 'LIKE4'
| 'LIKEC'
| 'LIMIT'
| 'LIMITED'
| 'LOCAL'
| 'LOCK'
| 'LONG'
| 'LOOP'
| 'MAP'
| 'MAX'
| 'MAXLEN'
| 'MEMBER'
| 'MERGE'
| 'MIN'
| 'MINUS'
| 'MINUTE'
| 'MOD'
| 'MODE'
| 'MODIFY'
| 'MONTH'
| 'MULTISET'
| 'NAME'
| 'NAN'
| 'NATIONAL'
| 'NATIVE'
| 'NCHAR'
| 'NEW'
| 'NOCOPY'
| 'NOT'
| 'NOWAIT'
| 'NULL'
| 'NUMBER_BASE'
| 'OBJECT'
| 'OCICOLL'
| 'OCIDATE'
| 'OCIDATETIME'
| 'OCIDURATION'
| 'OCIINTERVAL'
| 'OCILOBLOCATOR'
| 'OCINUMBER'
| 'OCIRAW'
| 'OCIREF'
| 'OCIREFCURSOR'
| 'OCIROWID'
| 'OCISTRING'
| 'OCITYPE'
| 'OF'
| 'ONLY'
| 'OPAQUE'
| 'OPEN'
| 'OPERATOR'
| 'OPTION'
| 'OR'
| 'ORACLE'
| 'ORADATA'
| 'ORDER'
| 'ORGANIZATION'
| 'ORLANY'
| 'ORLVARY'
| 'OTHERS'
| 'OUT'
| 'OVERLAPS'
| 'OVERRIDING'
| 'PACKAGE'
| 'PARALLEL_ENABLE'
| 'PARAMETER'
| 'PARAMETERS'
| 'PARTITION'
| 'PASCAL'
| 'PIPE'
| 'PIPELINED'
| 'PRAGMA'
| 'PRECISION'
| 'PRIOR'
| 'PRIVATE'
| 'PROCEDURE'
| 'PUBLIC'
| 'RAISE'
| 'RANGE'
| 'RAW'
| 'READ'
| 'RECORD'
| 'REF'
| 'REFERENCE'
| 'REM'
| 'REMAINDER'
| 'RENAME'
| 'RESULT'
| 'RETURN'
| 'RETURNING'
| 'REVERSE'
| 'ROLLBACK'
| 'ROW'
| 'SAMPLE'
| 'SAVE'
| 'SAVEPOINT'
| 'SB1'
| 'SB2'
| 'SB4'
| 'SECOND'
| 'SEGMENT'
| 'SELECT'
| 'SELF'
| 'SEPARATE'
| 'SEQUENCE'
| 'SERIALIZABLE'
| 'SET'
| 'SHARE'
| 'SHORT'
| 'SIZE_T'
| 'SOME'
| 'SPARSE'
| 'SQL'
| 'SQLCODE'
| 'SQLDATA'
| 'SQLNAME'
| 'SQLSTATE'
| 'STANDARD'
| 'START'
| 'STATIC'
| 'STDDEV'
| 'STORED'
| 'STRING'
| 'STRUCT'
| 'STYLE'
| 'SUBMULTISET'
| 'SUBPARTITION'
| 'SUBSTITUTABLE'
| 'SUBTYPE'
| 'SUM'
| 'SYNONYM'
| 'TABLE'
| 'TDO'
| 'THE'
| 'THEN'
| 'TIME'
| 'TIMESTAMP'
| 'TIMEZONE_ABBR'
| 'TIMEZONE_HOUR'
| 'TIMEZONE_MINUTE'
| 'TIMEZONE_REGION'
| 'TO'
| 'TRAILING'
| 'TRANSAC'
| 'TRANSACTIONAL'
| 'TRUSTED'
| 'TYPE'
| 'UB1'
| 'UB2'
| 'UB4'
| 'UNDER'
| 'UNION'
| 'UNIQUE'
| 'UNSIGNED'
| 'UNTRUSTED'
| 'UPDATE'
| 'USE'
| 'USING'
| 'VALIST'
| 'VALUE'
| 'VALUES'
| 'VARIABLE'
| 'VARIANCE'
| 'VARRAY'
| 'VARYING'
| 'VIEW'
| 'VOID'
| 'WHEN'
| 'WHERE'
| 'WHILE'
| 'WITH'
| 'WORK'
| 'WRAPPED'
| 'WRITE'
| 'YEAR'
| 'ZONE'
	
| 'A1'
| 'A1DD'
| 'A1GENT'
| 'AG2GREGATE'
| 'A1LL'
| 'AL2TER'
| 'AN1D'
| 'AN1Y'
| 'A1RRAY'
| 'A1S'
| 'A1SC'
| 'A1T'
| 'A1TTRIBUTE'
| 'A1UTHID'
| 'A1VG'
| 'BEG1IN'
| 'BE1TWEEN'
| 'BF1ILE_BASE'
| 'BI1NARY'
| 'BL1OB_BASE'
| 'BL1OCK'
| 'BO1DY'
| 'BO1TH'
| 'B1O1UND'
| 'BU1LK'
| 'B1Y'
| 'BY1TE'
| 'C1'
| 'C1ALL'
| 'CA1LLING'
| 'CAS1CADE'
| 'CA1SE'
| 'CHA1R'
| 'CH1ARACTER'
| 'C1HAR1SET'
| 'CHAR1SETFORM'
| 'CHAR1SETID'
| 'CH1AR_BASE'
| 'CHE1CK'
| 'CL1OB_BASE'
| 'CLOS1E'
| 'CL1USTER'
| 'COLLE1CT'
| 'COMME1NT'
| 'COM1MIT'
| 'CO1MMITTED'
| 'COMP1ILED'
| 'C1OMPRESS'
| 'CON1NECT'
| 'CONS1TANT'
| 'CO1NSTRUCTOR'
| 'CONT1EXT'
| 'CON1VERT'
| 'CO1UNT'
| 'CU1RRE1NT'
| 'CUR1SOR'
| 'CUST1OMDATUM'
| 'DAN1GLING'
| 'DA1TA'
| 'DA1TE'
| 'DA1TE_BASE'
| 'DA1Y'
| 'DE1CIMAL'
| 'D1ECLARE'
| 'D1EFAULT'
| 'DE1FINE'
| 'DE1L1ETE'
| 'DE1SC'
| 'DET1ERMINISTIC'
| 'DIS1TINCT'
| 'DOU1BLE'
| 'D1ROP'
| 'DU1RATION'
| 'EL1EMENT'
| 'EL1LIPSIS'
| 'EL1SE'
| 'ELS1IF'
| 'EM1PTY'
| 'EN1D'
| 'ESC1APE'
| 'EXCE1PT'
| 'EX1CEPTION'
| 'EXC1EPTIONS'
| 'EXCLU1SIVE'
| 'EXE1CUTE'
| 'EX1ISTS'
| 'EX1IT'
| 'EX1TERNAL'
| 'FET1CH'
| 'FI1NAL'
| 'FIX1ED'
| 'FL1OAT'
| 'FO1R'
| 'FO1RALL'
| 'FO1RCE'
| 'FO1RM'
| 'FR1OM'
| 'FU1NCTION'
| 'GEN1ERAL'
| 'GOT1O'
| 'GR1OUP'
| 'HA1SH'
| 'HA1VING'
| 'HEA1P'
| 'HI1DDEN'
| 'HOU1R'
| 'IF1'
| 'IM1MEDIATE'
| 'IN1'
| 'INC1LUDING'
| 'IND1EX'
| 'IND1ICATOR'
| 'INDI1CES'
| 'INF1INITE'
| 'INS1ERT'
| 'INS1TANTIABLE'
| 'IN1T'
| 'IN1TERFACE'
| 'INTE1RSECT'
| 'INT1ERVAL'
| 'INT1O'
| 'INV1ALIDATE'
| 'IS1'
| 'ISO1LATION'
| 'JAVA1'
| 'LA1NGUAGE'
| 'LA1RGE'
| 'LEA1DING'
| 'LEN1GTH'
| 'LEV1EL'
| 'LIB1RARY'
| 'LI1KE'
| 'LI1KE2'
| 'LIK1E4'
| 'LIK1EC'
| 'LIM1IT'
| 'LI1MITED'
| 'LO1CAL'
| 'LO1CK'
| 'LO1NG'
| 'LO1OP'
| 'MA1P'
| 'MA1X'
| 'MA1XLEN'
| 'MEM1BER'
| 'ME1RGE'
| 'M1IN'
| 'MI1NUS'
| 'MI1NUTE'
| 'MO1D'
| 'MO1DE'
| 'MODI1FY'
| 'MON1TH'
| 'MU1LTISET'
| 'NA1ME'
| 'NA1N'
| 'NAT1IONAL'
| 'NA1TIVE'
| 'NCH1AR'
| 'NE1W'
| 'N1OCOPY'
| 'NOT1'
| 'NO1WAIT'
| 'NU1LL'
| 'NUM1BER_BASE'
| 'OBJE1CT'
| 'OCICO1LL'
| 'OCID1ATE'
| 'OC1ID1ATETIME'
| 'OC1IDURATION'
| 'OCII1NTERVAL'
| 'O1CILOBLOCATOR'
| 'OCIN1UMBER'
| 'OCI1RAW'
| 'OC1IREF'
| 'OCIRE1FCURSOR'
| 'OCIRO1WID'
| 'OCI1STRING'
| 'O1CITYPE'
| 'OF1'
| 'ON1LY'
| 'OPA1QUE'
| 'OP1EN'
| 'OPE1RATOR'
| 'OP1TION'
| 'O1R'
| 'ORA1CLE'
| 'ORADAT1A'
| 'ORD1ER'
| 'OR1GANIZATION'
| 'ORL1ANY'
| 'ORLV1ARY'
| 'OTH1ERS'
| 'OU1T'
| 'OVER1LAPS'
| 'OVER1RIDING'
| 'PACKA1GE'
| 'PAR1ALLEL_ENABLE'
| 'PARAM1ETER'
| 'PARA1METERS'
| 'PART1ITION'
| 'PAS1CAL'
| 'PIP1E'
| 'PIP1ELINED'
| 'PRA1GMA'
| 'PRE1CISION'
| 'PRI1OR'
| 'PRI1VATE'
| 'PRO1CEDURE'
| 'PUB1LIC'
| 'RA1ISE'
| 'RAN1GE'
| 'RA1W'
| 'RE1AD'
| 'RE1CORD'
| 'RE1F'
| 'REF1ERENCE'
| 'RE1M'
| 'REM1AINDER'
| 'REN1AME'
| 'RES1ULT'
| 'RET1URN'
| 'RET1URNING'
| 'REV1ERSE'
| 'RO1LLBACK'
| 'ROW1'
| 'SA1MPLE'
| 'SA1VE'
| 'S1AVEPOINT'
| 'S1B1'
| 'SB12'
| 'SB14'
| 'SEC1OND'
| 'SEGME1NT'
| 'SE1LECT'
| 'SE1LF'
| 'SEP1ARATE'
| 'SEQ1UENCE'
| 'SERI1ALIZABLE'
| 'SE1T'
| 'SHA1RE'
| 'SH1ORT'
| 'SI1ZE_T'
| 'SOM1E'
| 'SPA1RSE'
| 'SQ1L'
| 'SQ1LCODE'
| 'SQL1DATA'
| 'SQLN1AME'
| 'S1QLSTATE'
| 'ST1ANDARD'
| 'STA1RT'
| 'STA1TIC'
| 'STDD1EV'
| 'STO1RED'
| 'ST1RING'
| 'STR1UCT'
| 'STY1LE'
| 'SUBM1ULTISET'
| 'SUB1PARTITION'
| 'SUBST1ITUTABLE'
| 'SU1BTYPE'
| 'SU1M'
| 'SYN1ONYM'
| 'TA1BLE'
| 'TD1O'
| 'T1HE'
| 'TH1EN'
| 'TIM1E'
| 'TI1MESTAMP'
| 'TIME1ZONE_ABBR'
| 'TIMEZ1ONE_HOUR'
| 'TIMEZO1NE_MINUTE'
| 'TIMEZO1NE_REGION'
| 'TO1'
| 'TR1AILING'
| 'TRAN1SAC'
| 'TRAN1SACTIONAL'
| 'TR1USTED'
| 'TY1PE'
| 'UB11'
| 'UB12'
| 'U1B4'
| 'UN1DER'
| 'UNI1ON'
| 'UNI1QUE'
| 'UNSI1GNED'
| 'UNTR1USTED'
| 'UPD1ATE'
| 'US1E'
| 'US1ING'
| 'VA1LIST'
| 'VA1LUE'
| 'VA1LUES'
| 'VAR1IABLE'
| 'VAR1IANCE'
| 'VAR1RAY'
| 'VAR1YING'
| 'VI1EW'
| 'VO1ID'
| 'WH1EN'
| 'WH1ERE'
| 'WH1ILE'
| 'WIT1H'
| 'WO1RK'
| 'W1RAPPED'
| 'WR1ITE'
| 'YEA1R'
| 'ZON1E'
;
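
For comparison, one workaround that might shrink mTokens() is to keep the keyword literals out of the lexer altogether: match every word as IDENTIFIER and decide afterwards, by table lookup, whether it is a keyword. The following is only an untested sketch of that lookup in Java. The class name KeywordTable and the method classify are hypothetical, and the table holds just a handful of the alternatives listed above.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class KeywordTable {
    // Hypothetical subset of the KEYWORD alternatives from the grammar above;
    // a real table would list all of them.
    private static final Set<String> KEYWORDS = new HashSet<String>(Arrays.asList(
        "A", "ADD", "ALTER", "BEGIN", "SELECT", "WHERE", "ZONE"));

    // Returns the name of the token type that the matched text should get.
    // The comparison is exact (case-sensitive), like the literals in the grammar.
    public static String classify(String text) {
        return KEYWORDS.contains(text) ? "KEYWORD" : "IDENTIFIER";
    }

    public static void main(String[] args) {
        System.out.println(classify("SELECT"));
        System.out.println(classify("SELECTED"));
    }
}
```

With something like this, the lexer would only need the IDENTIFIER and DQUOTED_STRING rules, and an action on IDENTIFIER would change the token type based on the lookup. Whether that is acceptable depends on how the parser rules refer to the keywords.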



 