【Problem】
An ANTLR v3 grammar, debugged in ANTLRWorks.
The core part of the grammar is:
fragment
ID : ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_')*
;
//singleInclude : '#include' BLANKS '"' ID '"' '.h';
singleInclude : '#include' '"' ID '"' '.h';
//include : singleInclude WS* -> singleInclude;
include : singleInclude WS*;
//startParse : include* identification+;
//startParse : include+ identification+;
//startParse : identification+;
//startParse : manufacture deviceType deviceRevison ddRevision;
The input being parsed is:
/*
**********************************************************************
** Includes
**********************************************************************
*/
#include "std_defs.h"
#include "com_tbls.h"
#include "rev_defs.h"
#include "fbk_hm.h"
#include "fdiag_FBK2_Start.h"
#include "blk_err.h"
/*
**********************************************************************
********** DEVICE SECTION ********************************************
**********************************************************************
*/
MANUFACTURER 0x1E6D11,
DEVICE_TYPE 0x00FF,
DEVICE_REVISION 5,
DD_REVISION 1
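Debugging this input fails with an error.
【Solution process】
1. Clearly the double quote is not being recognized, which leads to MismatchedTokenException(0!=0).
2. Reference:
It explains things clearly, but unfortunately did not help with this problem.
3. Reference:
[antlr-interest] MismatchedTokenException
I did not really follow it, and it did not help solve the problem.
4. Reference:
Antlr.Runtime.MismatchedTokenException from Envers with generic entities
Not useful.
5. Then I searched for:
antlr MismatchedTokenException(0!=0) double quote
and found:
ANTLR grammar how to capture all characters to end of line
What it describes is somewhat similar to my case:
it looks as if the definition of a rule such as COMMENT conflicts with matching the double quote here.
So I took the original grammar: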
grammar DDParserDemo;
options {
output = AST;
ASTLabelType = CommonTree; // type of $stat.tree ref etc...
}
//NEWLINE : '\r'? '\n' ;
//NEWLINE : '\r' '\n' ;
fragment
NEWLINE : '\r'? '\n' ;
fragment
ID : ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_')*
;
fragment
FLOAT
: ('0'..'9')+ '.' ('0'..'9')* EXPONENT?
| '.' ('0'..'9')+ EXPONENT?
| ('0'..'9')+ EXPONENT
;
COMMENT
: '//' ~('\n'|'\r')* '\r'? '\n' {$channel=HIDDEN;}
| '/*' ( options {greedy=false;} : . )* '*/' {$channel=HIDDEN;}
;
//fragment WS : ( ' ' | '\t' | '\r' | '\n') {skip();};
//fragment WS : ( ' ' | '\t' | '\r' | '\n') {$channel=HIDDEN;};
WS : ( ' ' | '\t' | '\r' | '\n') {$channel=HIDDEN;};
STRING
: '"' ( ESC_SEQ | ~('\\'|'"') )* '"'
;
CHAR: '\'' ( ESC_SEQ | ~('\''|'\\') ) '\''
;
fragment
EXPONENT : ('e'|'E') ('+'|'-')? ('0'..'9')+ ;
ESC_SEQ
: '\\' ('b'|'t'|'n'|'f'|'r'|'\"'|'\''|'\\')
| UNICODE_ESC
| OCTAL_ESC
;
fragment
OCTAL_ESC
: '\\' ('0'..'3') ('0'..'7') ('0'..'7')
| '\\' ('0'..'7') ('0'..'7')
| '\\' ('0'..'7')
;
fragment
UNICODE_ESC
: '\\' 'u' HEX_DIGIT HEX_DIGIT HEX_DIGIT HEX_DIGIT
;
fragment
DIGIT
: '0'..'9';
//FAKE_TOKEN : '1' '2' '3';
/*
DECIMAL_VALUE
: '1'..'9' DIGIT*;
*/
//DECIMAL_VALUE : DIGIT*;
DECIMAL_VALUE : DIGIT+;
//HEX_DIGIT : ('0'..'9'|'a'..'f'|'A'..'F') ;
HEX_DIGIT : (DIGIT|'a'..'f'|'A'..'F') ;
HEX_VALUE
: '0x' HEX_DIGIT+;
fragment
HEADER_FILENAME
: ('a'..'z'|'A'..'Z') ('a'..'z'|'A'..'Z'|'_')*;
/*
BLANKSPACE_TAB
// : (' ' | '\t'){skip();};
: (' ' | '\t')
{$channel=HIDDEN;};
*/
//fragment BLANK : (' '|'\t')+ {skip();};
//BLANK : (' '|'\t') {skip();};
//BLANK : (' '|'\t');
//BLANK : (' '|'\t') {$channel=HIDDEN;};
//BLANKS : (' '|'\t')+ {$channel=HIDDEN;};
//BLANKS : (' '|'\t')+ {$channel=HIDDEN;};
//BLANKS : (' '|'\t')+;
//BLANK : (' '|'\t') {$channel=HIDDEN;};
//BLANK : (' '|'\t') {skip();};
BLANKS : (' '|'\t')+;
//BLANKS : (' '|'\t')+ {skip();};
//BLANKS : ' '+ {$channel=HIDDEN;};
//singleInclude : '#include' ' '+ '"' ID '.h"' ;
//singleInclude : '#include' ' '+ '"' ID+ '.h"' ;
//singleInclude : '#include' ' '+ '"' HEADER_FILENAME '.h"';
//singleInclude : '#include' ' ' '"' HEADER_FILENAME '.h"';
//singleInclude : '#include "' HEADER_FILENAME '.h"';
//fragment singleInclude : '#include' (' ')+ '"' ID '.h"';
//singleInclude : '#include' (' '|'\t')+ '""' ID '.h"';
//singleInclude : '#include' (' '|'\t')+ '"std_defs.h"';
//singleInclude : '#include' BLANKS '"' ID '"' '.h';
singleInclude : '#include' '"' ID '"' '.h';
//include : singleInclude WS* -> singleInclude;
include : singleInclude WS*;
//startParse : include* identification+;
//startParse : include+ identification+;
//startParse : identification+;
//startParse : manufacture deviceType deviceRevison ddRevision;
startParse : include+ manufacture deviceType deviceRevison ddRevision;
//manufacture : 'MANUFACTURER'^ BLANKS (DECIMAL_VALUE | HEX_VALUE) (','?)! WS*;
//manufacture : 'MANUFACTURER'^ (BLANK+! (DECIMAL_VALUE | HEX_VALUE) (','?)! WS*;
//manufacture : 'MANUFACTURER'^ BLANKS (DECIMAL_VALUE | HEX_VALUE) ','? WS*;
manufacture : 'MANUFACTURER'^ BLANKS (HEX_VALUE | DECIMAL_VALUE) ','? WS*;
deviceType : 'DEVICE_TYPE'^ BLANKS (DECIMAL_VALUE | HEX_VALUE) (','?)! WS*;
deviceRevison : 'DEVICE_REVISION'^ BLANKS (DECIMAL_VALUE | HEX_VALUE)(','?)! WS*;
ddRevision : 'DD_REVISION'^ BLANKS (DECIMAL_VALUE | HEX_VALUE)(','?)! WS*;
//identification : definiton WS* (','?)! WS* -> definiton;
//definiton : (ID)^ ('\t'!|' '!)+ (DECIMAL_VALUE | HEX_VALUE)
//definiton : (ID)^ BLANKSPACE_TAB+ (DECIMAL_VALUE | HEX_VALUE)
//definiton : ID ('\t'!|' '!)+ (DECIMAL_VALUE | HEX_VALUE);
and commented out the STRING rule in it:
/*
STRING
: '"' ( ESC_SEQ | ~('\\'|'"') )* '"'
;
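*/
and debugged again. Sure enough, the first double quote is now recognized, but then another
MismatchedTokenException(0!=0)
appears.
Still, this was a big step toward solving the problem.
It made clear why the first double quote had not been matched: I had left a STRING rule defined without ever using it. The lexer most likely turned the whole quoted text, such as "std_defs.h", into a single STRING token, so the parser never received the standalone '"' token that singleInclude expects.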
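The effect of an unused STRING rule can be illustrated with a small model. The sketch below is a toy longest-match lexer written in Python; it is not ANTLR's actual lexer (which uses LL(*) prediction), and the regexes and the names BASE_RULES, STRING_RULE and tokenize are assumptions made up for this illustration. It only shows why a STRING rule can take the quote away from the parser.

```python
import re

# Toy token rules. The names mirror the grammar, but the regexes are
# simplified stand-ins chosen for this sketch, not the ANTLR rules themselves.
BASE_RULES = [
    ("INCLUDE", r"#include"),
    ("QUOTE", r'"'),
    ("DOT_H", r"\.h"),
    ("ID", r"[A-Za-z_][A-Za-z0-9_]*"),
    ("WS", r"[ \t\r\n]+"),
]
STRING_RULE = ("STRING", r'"(?:\\.|[^\\"])*"')


def tokenize(text, rules):
    """Longest-match lexer: at each position the longest matching rule wins."""
    tokens = []
    pos = 0
    while pos < len(text):
        best_name, best_len = None, 0
        for name, pattern in rules:
            match = re.compile(pattern).match(text, pos)
            if match and match.end() - pos > best_len:
                best_name, best_len = name, match.end() - pos
        if best_name is None:
            raise ValueError(f"no rule matches at position {pos}")
        if best_name != "WS":  # whitespace is dropped, similar to a hidden channel
            tokens.append((best_name, text[pos:pos + best_len]))
        pos += best_len
    return tokens


line = '#include "std_defs.h"'
with_string = tokenize(line, BASE_RULES + [STRING_RULE])
without_string = tokenize(line, BASE_RULES)
print(with_string)     # the quoted text arrives as one STRING token
print(without_string)  # QUOTE, ID, DOT_H, QUOTE arrive separately
```

In this model, once STRING is present the token stream contains no QUOTE token at all, which is roughly the situation a parser rule like singleInclude faces when it asks for '"'.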
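6. This time the error is at the ID position. I suspected another unnecessary rule that I had defined myself:
fragment
HEADER_FILENAME
: ('a'..'z'|'A'..'Z') ('a'..'z'|'A'..'Z'|'_')*;
So I removed it:
/*
fragment
HEADER_FILENAME
: ('a'..'z'|'A'..'Z') ('a'..'z'|'A'..'Z'|'_')*;
*/
and tried again. The error remained.
7. Along the way I ran into a problem that looked like a duplicate definition; see: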
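【Summary】
1. Do not carelessly reuse the rules that ANTLRWorks generates by default when it creates a new .g file,
such as ID, STRING and so on.
Otherwise they may later conflict with the input you actually need to handle.
That is what happened here: the template-generated STRING rule conflicted with recognizing the double quote, which produced
MismatchedTokenException(0!=0)
and stopped the parse from continuing.
2. The earlier ID definition is actually fine, i.e.:
ID : ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_')*
;
works as expected.
3. However, ID must not be marked as fragment, i.e. do not use:
fragment
ID : ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_')*
;
Otherwise MismatchedTokenException(0!=0) is reported. A fragment rule is only a helper for other lexer rules and never yields a token by itself, so the parser rule singleInclude can never receive an ID token.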
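The fragment behaviour can be modelled in the same spirit. The following Python sketch is a toy lexer, not ANTLR itself; the names RULES and tokenize and the fragments parameter are hypothetical and exist only for this illustration. Rules listed in fragments are never allowed to start a token, which is a rough analogue of marking a lexer rule as fragment.

```python
import re

# Simplified stand-ins for the tokens that singleInclude needs.
RULES = [
    ("QUOTE", r'"'),
    ("DOT_H", r"\.h"),
    ("ID", r"[A-Za-z_][A-Za-z0-9_]*"),
]


def tokenize(text, rules, fragments=frozenset()):
    """Longest-match lexer that never emits the rules named in `fragments`."""
    emitting = [(name, re.compile(p)) for name, p in rules if name not in fragments]
    tokens, pos = [], 0
    while pos < len(text):
        best = None  # (rule name, end position) of the longest match so far
        for name, regex in emitting:
            match = regex.match(text, pos)
            if match and (best is None or match.end() > best[1]):
                best = (name, match.end())
        if best is None:
            raise ValueError(f"no token can start at position {pos}: {text[pos]!r}")
        tokens.append((best[0], text[pos:best[1]]))
        pos = best[1]
    return tokens


text = '"std_defs.h"'
print(tokenize(text, RULES))  # ID is emitted between the quotes

try:
    tokenize(text, RULES, fragments={"ID"})
except ValueError as err:
    # With ID treated as a fragment, nothing can tokenize "std_defs".
    print("lexing failed:", err)
```

Real ANTLR does not stop with a Python exception, of course; the point of the sketch is only that with ID as a fragment no ID token ever reaches the parser, so a rule that expects ID cannot match.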
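4. A double quote is indeed written the ordinary way, as a single-quoted literal:
'"'
Nothing more is needed.
5. The MissingTokenException still occurs here; for now I suspect it is a bug.
See:
【Mostly solved】antlr v3: a grammar using {$channel=HIDDEN;} fails to parse with MissingTokenException
6. The grammar I am currently using is: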
grammar DDParserDemo;
options {
output = AST;
ASTLabelType = CommonTree; // type of $stat.tree ref etc...
}
//NEWLINE : '\r'? '\n' ;
//NEWLINE : '\r' '\n' ;
fragment
NEWLINE : '\r'? '\n' ;
fragment
FLOAT
: ('0'..'9')+ '.' ('0'..'9')* EXPONENT?
| '.' ('0'..'9')+ EXPONENT?
| ('0'..'9')+ EXPONENT
;
COMMENT
: '//' ~('\n'|'\r')* '\r'? '\n' {$channel=HIDDEN;}
| '/*' ( options {greedy=false;} : . )* '*/' {$channel=HIDDEN;}
;
//fragment WS : ( ' ' | '\t' | '\r' | '\n') {skip();};
//fragment WS : ( ' ' | '\t' | '\r' | '\n') {$channel=HIDDEN;};
WS : ( ' ' | '\t' | '\r' | '\n') {$channel=HIDDEN;};
/*
STRING
: '"' ( ESC_SEQ | ~('\\'|'"') )* '"'
;
*/
CHAR: '\'' ( ESC_SEQ | ~('\''|'\\') ) '\''
;
fragment
EXPONENT : ('e'|'E') ('+'|'-')? ('0'..'9')+ ;
ESC_SEQ
: '\\' ('b'|'t'|'n'|'f'|'r'|'\"'|'\''|'\\')
| UNICODE_ESC
| OCTAL_ESC
;
fragment
OCTAL_ESC
: '\\' ('0'..'3') ('0'..'7') ('0'..'7')
| '\\' ('0'..'7') ('0'..'7')
| '\\' ('0'..'7')
;
fragment
UNICODE_ESC
: '\\' 'u' HEX_DIGIT HEX_DIGIT HEX_DIGIT HEX_DIGIT
;
//fragment
DIGIT
: '0'..'9';
//FAKE_TOKEN : '1' '2' '3';
/*
DECIMAL_VALUE
: '1'..'9' DIGIT*;
*/
//DECIMAL_VALUE : DIGIT*;
DECIMAL_VALUE : DIGIT+;
//HEX_DIGIT : ('0'..'9'|'a'..'f'|'A'..'F') ;
HEX_DIGIT : (DIGIT|'a'..'f'|'A'..'F') ;
HEX_VALUE
: '0x' HEX_DIGIT+;
/*
fragment
HEADER_FILENAME
: ('a'..'z'|'A'..'Z') ('a'..'z'|'A'..'Z'|'_')*;
*/
/*
BLANKSPACE_TAB
// : (' ' | '\t'){skip();};
: (' ' | '\t')
{$channel=HIDDEN;};
*/
//fragment BLANK : (' '|'\t')+ {skip();};
//BLANK : (' '|'\t') {skip();};
//BLANK : (' '|'\t');
//BLANK : (' '|'\t') {$channel=HIDDEN;};
//BLANKS : (' '|'\t')+ {$channel=HIDDEN;};
//BLANKS : (' '|'\t')+ {$channel=HIDDEN;};
//BLANKS : (' '|'\t')+;
//BLANK : (' '|'\t') {$channel=HIDDEN;};
//BLANK : (' '|'\t') {skip();};
BLANKS : (' '|'\t')+;
//BLANKS : (' '|'\t')+ {skip();};
//BLANKS : ' '+ {$channel=HIDDEN;};
//singleInclude : '#include' ' '+ '"' ID '.h"' ;
//singleInclude : '#include' ' '+ '"' ID+ '.h"' ;
//singleInclude : '#include' ' '+ '"' HEADER_FILENAME '.h"';
//singleInclude : '#include' ' ' '"' HEADER_FILENAME '.h"';
//singleInclude : '#include "' HEADER_FILENAME '.h"';
//fragment singleInclude : '#include' (' ')+ '"' ID '.h"';
//singleInclude : '#include' (' '|'\t')+ '""' ID '.h"';
//singleInclude : '#include' (' '|'\t')+ '"std_defs.h"';
//singleInclude : '#include' BLANKS '"' ID '"' '.h';
//singleInclude : '#include' '"' ID '"' '.h';
//singleInclude : '#include' BLANKS '"' ID '"' '.h';
//singleInclude : '#include' BLANKS '"' ID '.h' '"';
//singleInclude : '#include' BLANKS '"' ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_')* '.h' '"';
//ID_START : 'a'..'z'|'A'..'Z'|'_';
//fragment ID_START : 'a'..'z'|'A'..'Z'|'_';
//WHOLE_ID : ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'_'|'0'..'9')*;
//WHOLE_ID : ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'_'| DIGIT)*;
//WHOLE_ID : ('a'..'z'|'A'..'Z'|'_') (HEX_DIGIT|'_')*;
//fragment
ID : ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_')*
;
//ID_START : 'a'..'z'|'A'..'Z'|'_';
//WHOLE_ID : (ID_START) (ID_START | DIGIT)*;
//ID_MIDDLE_END : ID_START | DIGIT;
//ID_MIDDLE_END : HEX_DIGIT | '_';
//singleInclude : '#include' BLANKS '"' ID_START ID_MIDDLE_END* '.h' '"';
//singleInclude : '#include' BLANKS '"' ID_START (ID_START | DIGIT)* '.h' '"';
//singleInclude : '#include' BLANKS '"' ID_START (ID_START | DIGIT)+ '.h' '"';
//singleInclude : '#include' BLANKS '"' ID_START '.h' '"';
//singleInclude : '#include' BLANKS '"' WHOLE_ID '.h' '"';
singleInclude : '#include' BLANKS '"' ID '.h' '"';
//include : singleInclude WS* -> singleInclude;
include : singleInclude WS*;
//startParse : include* identification+;
//startParse : include+ identification+;
//startParse : identification+;
//startParse : manufacture deviceType deviceRevison ddRevision;
startParse : include+ manufacture deviceType deviceRevison ddRevision;
//manufacture : 'MANUFACTURER'^ BLANKS (DECIMAL_VALUE | HEX_VALUE) (','?)! WS*;
//manufacture : 'MANUFACTURER'^ (BLANK+! (DECIMAL_VALUE | HEX_VALUE) (','?)! WS*;
//manufacture : 'MANUFACTURER'^ BLANKS (DECIMAL_VALUE | HEX_VALUE) ','? WS*;
manufacture : 'MANUFACTURER'^ BLANKS (HEX_VALUE | DECIMAL_VALUE) ','? WS*;
deviceType : 'DEVICE_TYPE'^ BLANKS (DECIMAL_VALUE | HEX_VALUE) (','?)! WS*;
deviceRevison : 'DEVICE_REVISION'^ BLANKS (DECIMAL_VALUE | HEX_VALUE)(','?)! WS*;
ddRevision : 'DD_REVISION'^ BLANKS (DECIMAL_VALUE | HEX_VALUE)(','?)! WS*;
//identification : definiton WS* (','?)! WS* -> definiton;
//definiton : (ID)^ ('\t'!|' '!)+ (DECIMAL_VALUE | HEX_VALUE)
//definiton : (ID)^ BLANKSPACE_TAB+ (DECIMAL_VALUE | HEX_VALUE)
//definiton : ID ('\t'!|' '!)+ (DECIMAL_VALUE | HEX_VALUE);
to parse:
/*
**********************************************************************
** Includes
**********************************************************************
*/
#include "std_defs.h"
#include "com_tbls.h"
#include "rev_defs.h"
#include "fbk_hm.h"
#include "fdiag_FBK2_Start.h"
#include "blk_err.h"
/*
**********************************************************************
********** DEVICE SECTION ********************************************
**********************************************************************
*/
MANUFACTURER 0x1E6D11,
DEVICE_TYPE 0x00FF,
DEVICE_REVISION 5,
DD_REVISION 1
The corresponding screenshot is:
Please credit the source when reposting: 在路上 » 【Solved】ANTLR fails to parse a double quote: MismatchedTokenException(0!=0)