2011-07-08 38 views

ANTLR v3: NoViableAltException does not show up

Consider the following extract from my grammar:

definition 
    : '(' 'define' 
        ('(' variable def_formals ')' body ')' 
        | variable expression ')' 
        )  
    ; 

def_formals 
    : variable* ('.' variable)? 
    ; 

body 
    : ((definition)=> definition)* expression+ 
    ; 

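Here variable is an identifier and expression is some Scheme expression (such as a literal or a lambda expression). The complete grammar can be found in some of my other questions.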

So I tested the whole thing and ran into a problem with NoViableAltException.

So far, everything that should work does work. For example,

(define x 5) 

is recognized.

Now I am testing input that the parser should not recognize.

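For example, an extra closing parenthesis, as in

(define x 5)) 

is reported: the parser complains about the ")" at the end of the line.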

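But when I leave something out, for example

(define x) 

(define) 

the parser does not complain. When I check in the interpreter, the NoViableAltException is displayed correctly. But I cannot figure out how to make this error surface in an external program (such as a Java test class).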

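I tried making the parser bail out at the first syntax error it sees, as described in the book by Terence Parr (page 252), but that did not help either. I also tried something like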

private List<String> errors = new LinkedList<String>(); 
    public void displayRecognitionError(String[] tokenNames, 
             RecognitionException e) { 
     String hdr = getErrorHeader(e); 
     String msg = getErrorMessage(e, tokenNames); 
     errors.add(hdr + " " + msg); 
    } 
    public List<String> getErrors() { 
     return errors; 
    } 

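but when I call it, the method does not return any errors.

So how do I get ANTLR to show me these errors when they are obviously being thrown internally?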
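One thing worth keeping in mind with an override like the one above: the list only fills up if the parser actually reports an error. In the ANTLR 3 runtime, reportError hands the exception to displayRecognitionError, which by default formats it and passes the text to emitErrorMessage. The following plain-Java sketch models that chain so it can be run without ANTLR. MiniRecognizer and CollectingRecognizer are hypothetical stand-ins, and the methods take plain strings instead of a RecognitionException, so this is an illustration of the call order, not the runtime's real signatures.

```java
import java.util.ArrayList;
import java.util.List;

public class ErrorChainSketch {

    /** Hypothetical stand-in for the runtime's recognizer base class. */
    static class MiniRecognizer {
        // In generated parsers, a rule's catch block ends up here.
        void reportError(String header, String message) {
            displayRecognitionError(header, message);
        }

        // Default behaviour: format the error and forward it.
        void displayRecognitionError(String header, String message) {
            emitErrorMessage(header + " " + message);
        }

        // Default behaviour: print to standard error.
        void emitErrorMessage(String message) {
            System.err.println(message);
        }
    }

    /** Collects errors instead of printing them, like the override in the question. */
    static class CollectingRecognizer extends MiniRecognizer {
        final List<String> errors = new ArrayList<>();

        @Override
        void displayRecognitionError(String header, String message) {
            errors.add(header + " " + message);
        }
    }

    public static void main(String[] args) {
        CollectingRecognizer recognizer = new CollectingRecognizer();
        // Nothing has been reported yet, so the list is empty.
        System.out.println(recognizer.errors);
        // Only a reported error reaches the override.
        recognizer.reportError("line 1:7", "no viable alternative at input ')'");
        System.out.println(recognizer.errors);
    }
}
```

If the list stays empty after a real parse, the likely explanation under this model is that no error was reported at all, for instance because the input matched some other rule.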

Edit: here is the whole grammar:

grammar R5RS; 

options { 
    language = Java; 
    output=AST; 
} 

@header{ 
    package r5rsgrammar; 
    import r5rsgrammar.scope.*; 
    import java.util.LinkedList; 
} 

@lexer::header{ 
    package r5rsgrammar; 
    import r5rsgrammar.scope.*; 
    import java.util.LinkedList; 
} 

@members{ 

    // variable which is used to distinguish between top-level and inner definitions 
    private boolean topLevel; 


    // the toplevel scope of a file, whose parent is null 
    private IScope scope; 

    @Override 
    public void emitErrorMessage(String message) { 
     throw new RuntimeException(message); 
    } 
} 

// PROGRAMS AND DEFINITIONS 

parse 
@init{ 
    this.topLevel = true; 
    this.scope = new Scope(); 
} 
    : command_or_definition* EOF 
    ; 

command_or_definition 
    : (syntax_definition)=>    syntax_definition 
    | (definition)=>      definition 
    | ('('BEGIN command_or_definition)=> 
      '('BEGIN 
       { this.topLevel = false; 
        this.scope = this.scope.push(); 
       } 
       command_or_definition+ 
       { this.scope = this.scope.pop(); 
        this.topLevel = true; 
       }')' 
    | command 
    ; 

command 
    : expression 
    ; 

definition 
    : '(' DEFINE ('(' var=variable 
         { this.topLevel = false; 
          this.scope.bind($var.text); 
          this.scope = this.scope.push(); 
         } 
          def_formals ')' body 
         { this.topLevel = true; 
          this.scope = this.scope.pop(); 
         }')' 
        | var=variable 
         { this.topLevel = false; 
          this.scope.bind($var.text); 
          this.scope = this.scope.push(); 
         } 
          expression 
         { this.topLevel = true; 
          this.scope = this.scope.pop(); 
         }')' 
        ) 
    | '(' BEGIN 
      {this.scope = this.scope.push();} 
      definition* 
      {this.scope = this.scope.pop();}')' 
    ; 

def_formals 
    : vars+=variable* ('.' vars+=variable)? 
     {for (int i = 0; i \less $vars.size(); i++){ 
      String name = ((CommonTree)$vars.get(i)).getText(); 
      this.scope.bind(name); 
      } 
     } 
    ; 


syntax_definition 
    : '(' DEFINE_SYNTAX var=variable 
         { this.scope.bind($var.text); 
          this.scope = this.scope.push();} 
          transformer_spec 
          {this.scope = this.scope.pop();}')' 
    ; 

// EXPRESSIONS 

expression 
    : (variable)=>   var=variable 
           { 
            if(!this.scope.isBound($var.text)) 
            System.err.println($var.text + " not bound"); 
           } 
    | (literal)=>    literal 
    | (lambda_expression)=> lambda_expression 
    | (conditional)=>   conditional 
    | (assignment)=>   assignment 
    | (derived_expression)=> derived_expression 
    | (procedure_call)=>  procedure_call 
    | (macro_use)=>   macro_use 
    |       macro_block 
    ; 

keyword 
    : identifier 
    ; 

literal 
    : quotation 
    | self_evaluating 
    ; 

self_evaluating 
    : bool 
    | number 
    | CHARACTER 
    | STRING 
    ; 

quotation 
    : '\'' datum 
    | '(' QUOTE datum ')' 
    ; 

lambda_expression 
    : '(' LAMBDA {this.scope = this.scope.push();} 
        formals body 
        {this.scope = this.scope.pop();}')' 
    ; 

formals 
    : '(' (vars+=variable+ ('.' vars+=variable)?)? ')' 
     {for (int i = 0; i \less $vars.size(); i++){ 
      String name = ((CommonTree)$vars.get(i)).getText(); 
      this.scope.bind(name); 
      } 
     } 
    | var=variable 
     {this.scope.bind($var.text);} 
    ; 

body 
    : ((definition)=> definition)* sequence 
    ; 

sequence 
    : expression+ 
    ; 


conditional 
    : '(' IF test consequent alternate? ')' 
    ; 

test 
    : expression 
    ; 
consequent 
    : expression 
    ; 
alternate 
    : expression 
    ; 

assignment 
    : '(' SET_BANG variable expression ')' 
    ; 


derived_expression 
    : quasiquotation 
    | '(' (COND ('(' ELSE sequence ')' 
        | cond_clause+ ('(' ELSE sequence ')')? 
        ) 
      | CASE expression (case_clause+ ('(' ELSE sequence ')')? 
           | '(' ELSE sequence ')' 
          )  
      | AND test* 
      | OR test* 
      | LET variable? '(' {this.scope = this.scope.push();} 
           binding_spec[false] ')' body 
           {this.scope = this.scope.pop();} 
      | LET_STAR '(' {this.scope = this.scope.push();} 
          binding_spec[true] ')' body 
          {this.scope = this.scope.pop();} 
      | LETREC '(' {this.scope = this.scope.push();} 
          binding_spec[true] ')' body 
          {this.scope = this.scope.pop();} 
      | BEGIN sequence 
      | DO '(' iteration_spec* ')' '(' test do_result? ')' command* 
      | DELAY expression 
      ) 
     ')' 

    ; 

cond_clause 
    : '(' test (sequence | FOLLOWS recipient)? ')' 
    ; 

recipient 
    : expression 
    ; 

case_clause 
    : '(' '(' datum* ')' sequence ')' 
    ; 

binding_spec[boolean sequential] 
    : {sequential}? // let* or letrec: bind the var immediately 
     ('(' var=variable 
     {this.scope.bind($var.text);} 
      expression ')')* 

    | {!sequential}? // normal let: bind all vars at the end 
     ('(' vars+=variable expression ')')* 
     {for (int i = 0; i \less $vars.size(); i++){ 
      String name = ((CommonTree)$vars.get(i)).getText(); 
      this.scope.bind(name); 
      } 
     } 
    ; 

iteration_spec 
    : '(' variable init step ')' 
    ; 

init 
    : expression 
    ; 

step 
    : expression 
    ; 

do_result 
    : sequence 
    ; 

procedure_call 
    : '(' operator operand* ')' 
    ; 

operator 
    : expression 
    ; 

operand 
    : expression 
    ; 

macro_use 
    : '(' keyword datum* ')' 
    ; 

macro_block 
    : '(' (LET_SYNTAX | LETREC_SYNTAX) '(' syntax_spec*')' body ')' 
    ; 

syntax_spec 
    : '(' keyword transformer_spec')' 
    ; 


// TRANSFORMERS 

transformer_spec 
    : '(' SYNTAX_RULES '(' identifier* ')' syntax_rule* ')' 
    ; 

syntax_rule 
    : '(' pattern template ')' 
    ; 

pattern 
    : pattern_identifier 
    | '(' (pattern+ ('.' pattern)?)? ')' 
    | '#(' (pattern+ ELLIPSIS?)? ')' 
    | pattern_datum 
    ; 

pattern_datum 
    : bool 
    | number 
    | CHARACTER 
    | STRING 
    ; 

template 
    : pattern_identifier 
    | '(' (template_element+ ('.' template)?)? ')' 
    | '#('template_element* ')' 
    | template_datum 
    ; 

template_element 
    : template ELLIPSIS? 
    ; 

template_datum 
    : pattern_datum 
    ; 

pattern_identifier 
    : syntactic_keyword 
    | VARIABLE 
    ; 

// external representations 
// a Datum is what the _read_ procedure successfully parses. 
// Note that any string that parses as an expression will also parse as a datum. 
datum 
    : simple_datum 
    | compound_datum 
    ; 

simple_datum 
    : bool 
    | number 
    | CHARACTER 
    | STRING 
    | identifier 
    ; 

compound_datum 
    : list 
    | vector 
    ; 

list 
    : '(' (datum+ ('.' datum)?)? ')' 
    | abbreviation 
    ; 

abbreviation 
    : abbrev_prefix datum 
    ; 

abbrev_prefix 
    : ('\'' | '`' | ',' | ',@') 
    ; 

vector 
    : '#(' datum* ')' 
    ; 

// QUASIQUOTATIONS 
// CONTEXT-SENSITIVE 

quasiquotation 
    : quasiquotation_D[1] 
    ; 

quasiquotation_D[int d] 
    : '`' qq_template[d] 
    | '(' QUASIQUOTE qq_template[d] ')' 
    ; 

qq_template[int d] 
    : (expression)=> expression 
    | ('(' UNQUOTE)=> unquotation[d] 
    |     simple_datum 
    |     vectorQQ_template[d] 
    |     listQQ_template[d] 
    ; 

vectorQQ_template[int d] 
    : '#(' qq_template_or_slice[d]* ')' 
    ; 

listQQ_template[int d] 
    :      '\'' qq_template[d] 
    | ('(' QUASIQUOTE)=> quasiquotation_D[d+1] 
    |      '(' (qq_template_or_slice[d]+ ('.' qq_template[d])?)? ')' 
    ; 

unquotation[int d] 
    : ',' qq_template[d-1] 
    | '(' UNQUOTE qq_template[d-1] ')' 
    ; 

qq_template_or_slice[int d] 
    : ('(' UNQUOTE_SPLICING)=> splicing_unquotation[d] 
    |       qq_template[d] 
    ; 

splicing_unquotation[int d] 
    : ',@' qq_template[d-1] 
    | '(' UNQUOTE_SPLICING qq_template[d-1] ')' 
    ; 



// values 

bool: TRUE | FALSE; 
number: NUM_2 | NUM_8 | NUM_10 | NUM_16; 
identifier: syntactic_keyword | variable; 
variable : VARIABLE | ELLIPSIS; 

// KEYWORDS 

syntactic_keyword 
    : expression_keyword 
    | ELSE 
    | FOLLOWS 
    | DEFINE 
    | UNQUOTE 
    | UNQUOTE_SPLICING; 
expression_keyword 
    : QUOTE 
    | LAMBDA 
    | IF 
    | SET_BANG 
    | BEGIN 
    | COND 
    | AND 
    | OR 
    | CASE 
    | LET 
    | LET_STAR 
    | LETREC 
    | DO 
    | DELAY 
    | QUASIQUOTE; 

// syntactic keywords 
ELSE : 'else'; 
FOLLOWS : '=>'; 
DEFINE : 'define'; 
UNQUOTE : 'unquote'; 
UNQUOTE_SPLICING : 'unquote-splicing'; 

// expression keywords 
QUOTE : 'quote'; 
LAMBDA : 'lambda'; 
IF : 'if'; 
SET_BANG : 'set!'; 
BEGIN : 'begin'; 
COND : 'cond'; 
AND : 'and'; 
OR : 'or'; 
CASE : 'case'; 
LET : 'let'; 
LET_STAR : 'let*'; 
LETREC : 'letrec'; 
DO : 'do'; 
DELAY : 'delay'; 
QUASIQUOTE : 'quasiquote'; 

// macro keywords 
LETREC_SYNTAX : 'letrec-syntax'; 
LET_SYNTAX : 'let-syntax'; 
SYNTAX_RULES : 'syntax-rules'; 
DEFINE_SYNTAX : 'define-syntax'; 

ELLIPSIS : '...'; 

//RESERVED_CHAR : '{'| '}' | '[' | ']' | '|'; 

STRING : '"' STRING_ELEMENT* '"'; 

TRUE : '#' ('T' | 't'); 
FALSE : '#' ('f' | 'F'); 

CHARACTER : '#\\' (~(' ' | '\n') | CHARACTER_NAME); 

VARIABLE : INITIAL SUBSEQUENT* | PECULIAR_IDENTIFIER; 

// space and comments are ignored 
SPACE : (' ' | '\n' | '\t' | '\r') {$channel = HIDDEN;}; 
COMMENT : ';' ~('\r' | '\n')* {$channel = HIDDEN;}; 


fragment INITIAL : LETTER | SPECIAL_INITIAL; 
fragment LETTER : 'a'..'z' | 'A'..'Z'; 
fragment SPECIAL_INITIAL : '!' | '$' | '%' | '&' | '*' | '/' | ':' | '\less' | '=' | '>' | '?' | '^' | '_' | '~'; 
fragment SUBSEQUENT : INITIAL | DIGIT | SPECIAL_SUBSEQUENT; 
fragment SPECIAL_SUBSEQUENT : '+' | '-' | '.' | '@'; 
fragment PECULIAR_IDENTIFIER : '+' | '-'; 
fragment STRING_ELEMENT : ~('"' | '\\') | '\\' ('"' | '\\'); 
fragment CHARACTER_NAME : 'space' | 'newline'; 



// NUMBERS 

fragment SUFFIX : EXPONENT_MARKER SIGN? DIGIT+; 
fragment EXPONENT_MARKER : 'e' | 'E' | 's' | 'S' | 'f' | 'F' | 'd' | 'D' | 'l' |'L'; 
fragment SIGN : '+' | '-'; 
fragment EXACTNESS : '#' ('i' | 'I' | 'e' | 'E'); 
fragment IMAGINARY : 'i' | 'I'; 
fragment DIGIT : '0'..'9'; 

// BINARY NUMBERS 

NUM_2 : PREFIX_2 COMPLEX_2; 

fragment COMPLEX_2 
    : REAL_2 ('@' REAL_2)? 
    | REAL_2? ('+' | '-') UREAL_2? IMAGINARY 
    ;   
fragment REAL_2 : SIGN? UREAL_2; 
fragment UREAL_2 : UINTEGER_2 ('/' UINTEGER_2)?;  
fragment UINTEGER_2 : DIGIT_2+ '#'*; 

fragment PREFIX_2 
    : RADIX_2 EXACTNESS? // #b #i 
    | EXACTNESS RADIX_2 // #i #b 
    ; 

fragment RADIX_2 : '#' ('b' | 'B'); 
fragment DIGIT_2 : '0' | '1'; 

// OCTAL NUMBERS 

NUM_8 : PREFIX_8 COMPLEX_8; 

fragment COMPLEX_8 
    : REAL_8 ('@' REAL_8)? 
    | REAL_8? ('+' | '-') UREAL_8? IMAGINARY 
    ; 

fragment REAL_8 : SIGN? UREAL_8; 

fragment UREAL_8 
    : UINTEGER_8 ('/' UINTEGER_8)?; 

fragment UINTEGER_8 : DIGIT_8+ '#'*; 

fragment PREFIX_8 
    : RADIX_8 EXACTNESS? // #o #i 
    | EXACTNESS RADIX_8; // #i #o 

fragment RADIX_8 : '#' ('o' | 'O'); 
fragment DIGIT_8 : '0' .. '7'; 

// DECIMAL NUMBERS 

NUM_10 : PREFIX_10? COMPLEX_10; 

fragment COMPLEX_10 
    : REAL_10 ('@' REAL_10)? 
    | REAL_10? ('+' | '-') UREAL_10? IMAGINARY 
    ; 

fragment REAL_10 : SIGN? UREAL_10; 
fragment UREAL_10 : UINTEGER_10 ('/' UINTEGER_10)? | DECIMAL_10;  
fragment UINTEGER_10 : DIGIT+ '#'*; 

fragment DECIMAL_10 
    : UINTEGER_10 SUFFIX 
    | '.' DIGIT+ '#'* SUFFIX? 
    | DIGIT+ '.' DIGIT* '#'* SUFFIX? 
    | DIGIT+ '#'+ '.' '#'* SUFFIX?; 

fragment PREFIX_10 
    : RADIX_10 EXACTNESS? // #d #i 
    | EXACTNESS RADIX_10; // #i #d 

fragment RADIX_10 : '#' ('d' | 'D'); 

// HEXADECIMAL NUMBERS 

NUM_16 : PREFIX_16 COMPLEX_16; 

fragment COMPLEX_16 
    : REAL_16 ('@' REAL_16)? 
    | REAL_16? ('+' | '-') UREAL_16? IMAGINARY 
    ; 

fragment REAL_16 : SIGN? UREAL_16; 

fragment UREAL_16 
    : UINTEGER_16 ('/' UINTEGER_16)?; 

fragment UINTEGER_16 : DIGIT_16+ '#'*; 

fragment PREFIX_16 
    : RADIX_16 EXACTNESS? // #x #i 
    | EXACTNESS RADIX_16; // #i #x 

fragment RADIX_16 : '#' ('x' | 'X'); 
fragment DIGIT_16 : DIGIT | 'a'.. 'f' | 'A' .. 'F';

(To get the formatting to work, I had to replace "<" with "\less".)

Edit: the solution to this problem is much simpler: (define x) is, surprisingly, valid in R5RS (see the last comment).

Answer


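There are many ways to improve error reporting. A quick fix is to override emitErrorMessage(String message) in the parser class and simply throw an exception with the supplied message: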

grammar T; 

@members { 
    @Override 
    public void emitErrorMessage(String message) { 
    throw new RuntimeException(message); 
    } 
} 

definition 
    : '(' 'define' ('(' variable def_formals ')' body ')' 
        | variable expression ')' 
       )  
    ; 

def_formals 
    : variable* ('.' variable)? 
    ; 

body 
    : ((definition)=> definition)* expression+ 
    ; 

expression 
    : INT 
    ; 

variable 
    : ID 
    ; 

ID : 'a'..'z'+; 
INT : '0'..'9'+; 
SPACE : ' ' {skip();}; 

which you can test with the following class:

import org.antlr.runtime.*; 

public class Main { 
    public static void main(String[] args) { 
        String[] tests = { 
            "(define x 5)", 
            "(define x 5))", 
            "(define x)", 
            "(define)" 
        }; 
        for (String input : tests) { 
            TLexer lexer = new TLexer(new ANTLRStringStream(input)); 
            TParser parser = new TParser(new CommonTokenStream(lexer)); 
            System.out.println("\nParsing : " + input); 
            try { 
                parser.definition(); 
            } catch (Exception e) { 
                // the overridden emitErrorMessage rethrows syntax errors as RuntimeException 
                System.out.println(" exception -> " + e.getMessage()); 
            } 
        } 
    } 
} 

After running the class above, you will see the following:

$ java -cp antlr-3.3.jar org.antlr.Tool T.g 
$ javac -cp antlr-3.3.jar *.java 
$ java -cp .:antlr-3.3.jar Main 

Parsing : (define x 5) 

Parsing : (define x 5)) 

Parsing : (define x) 
    exception -> line 1:9 missing INT at ')' 

Parsing : (define) 
    exception -> line 1:7 no viable alternative at input ')' 

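As you can see, the input (define x 5)) produces no exception! That is because the lexer has no problem with it (every part of it is a valid token) and the parser is simply instructed to consume the definition rule: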

definition 
    : '(' 'define' ('(' variable def_formals ')' body ')' 
        | variable expression ')' 
       )  
    ; 

which it does. If you want an error because of the dangling ')', then you have to add the EOF token at the end of that rule:

definition 
    : '(' 'define' ('(' variable def_formals ')' body ')' 
        | variable expression ')' 
       ) 
        EOF 
    ; 
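The effect of the trailing EOF can be illustrated without ANTLR. The following is a minimal hand-written sketch, not generated parser code: EofSketch, tokenize, matches and parseDefinition are hypothetical names, and only the "variable expression" alternative of definition is modeled, with ID and INT simplified to lowercase words and digit runs.

```java
import java.util.ArrayList;
import java.util.List;

public class EofSketch {

    /** Splits the input into "(", ")" and whitespace-separated words. */
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        StringBuilder word = new StringBuilder();
        for (char c : input.toCharArray()) {
            boolean paren = c == '(' || c == ')';
            if (paren || Character.isWhitespace(c)) {
                if (word.length() > 0) {
                    tokens.add(word.toString());
                    word.setLength(0);
                }
                if (paren) {
                    tokens.add(String.valueOf(c));
                }
            } else {
                word.append(c);
            }
        }
        if (word.length() > 0) {
            tokens.add(word.toString());
        }
        return tokens;
    }

    /** True if the token belongs to the given (simplified) token type. */
    static boolean matches(String type, String token) {
        switch (type) {
            case "ID":  return token.matches("[a-z]+") && !token.equals("define");
            case "INT": return token.matches("[0-9]+");
            default:    return type.equals(token); // literal such as "(" or "define"
        }
    }

    /**
     * Models  '(' 'define' ID INT ')'  and checks for end of input only if requireEof is set.
     * Returns null on success, otherwise a description of the first mismatch.
     */
    static String parseDefinition(String input, boolean requireEof) {
        List<String> tokens = tokenize(input);
        String[] rule = {"(", "define", "ID", "INT", ")"};
        int pos = 0;
        for (String expected : rule) {
            String actual = pos < tokens.size() ? tokens.get(pos) : "<EOF>";
            if (!matches(expected, actual)) {
                return "expected " + expected + " but found " + actual;
            }
            pos++;
        }
        if (requireEof && pos < tokens.size()) {
            return "expected <EOF> but found " + tokens.get(pos);
        }
        return null; // without requireEof, leftover tokens are never looked at
    }

    public static void main(String[] args) {
        System.out.println(parseDefinition("(define x 5))", false)); // null
        System.out.println(parseDefinition("(define x 5))", true));  // expected <EOF> but found )
        System.out.println(parseDefinition("(define x)", false));    // expected INT but found )
    }
}
```

The point of the sketch is that a rule which stops after its last token has no reason to look at what follows; only an explicit end-of-input check turns the dangling ')' into an error.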

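I reproduced this with your example, but when I apply it to my own grammar, again nothing happens. I have posted the whole grammar, as it stands now, above. – Sebastian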


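I tried to trace it by adding some print statements for when the parser enters the rule and when it leaves it. What I get is: it enters the rule, but it does not leave it. Could there be some kind of loop that it cannot get out of? – Sebastian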


OK, I got it: in my grammar, (define x) is not an error, because the parser treats it as a macro use (rule macro_use); even (define) matches that rule. Anyway, thanks a lot! – Sebastian
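The observation in the last comment can be checked with a small plain-Java sketch of the rule macro_use : '(' keyword datum* ')' from the grammar above, where keyword is any identifier and identifiers include syntactic keywords such as define. MacroUseSketch and its methods are hypothetical names, the keyword set is only a subset, variables are simplified to lowercase words, and nested lists are left out.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class MacroUseSketch {

    // A few of the grammar's syntactic keywords; the real list is longer.
    static final Set<String> SYNTACTIC_KEYWORDS =
            new HashSet<>(Arrays.asList("define", "else", "lambda", "if", "quote"));

    /** identifier : syntactic_keyword | variable (variables simplified to lowercase words). */
    static boolean isIdentifier(String token) {
        return SYNTACTIC_KEYWORDS.contains(token) || token.matches("[a-z]+");
    }

    /** macro_use : '(' keyword datum* ')' with keyword : identifier and flat datums only. */
    static boolean matchesMacroUse(List<String> tokens) {
        int n = tokens.size();
        if (n < 3 || !tokens.get(0).equals("(") || !tokens.get(n - 1).equals(")")) {
            return false;
        }
        if (!isIdentifier(tokens.get(1))) {
            return false;
        }
        for (String datum : tokens.subList(2, n - 1)) {
            if (datum.equals("(") || datum.equals(")")) {
                return false; // nested lists are outside this sketch
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(matchesMacroUse(Arrays.asList("(", "define", "x", ")"))); // true
        System.out.println(matchesMacroUse(Arrays.asList("(", "define", ")")));      // true
        System.out.println(matchesMacroUse(Arrays.asList("(", "5", "x", ")")));      // false
    }
}
```

Because datum* may be empty and define counts as a keyword, both (define x) and (define) fit the rule, which is consistent with the parser accepting them silently.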