Dealing with Errors

Quoting from the Dragon Book (https://www.amazon.com/Compilers-Principles-Techniques-Tools-2nd/dp/0321486811):

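Most programming language specifications do not describe how a compiler should respond to errors; error handling is left to the compiler designer. Planning the error handling right from the start can both simplify the structure of a compiler and improve its handling of errors.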

A fully recoverable parser can construct an AST no matter what we throw at it. Tools such as linters and formatters benefit from a fully recoverable parser, because they can still act on the parts of the program that parsed correctly.

A panicking parser aborts on any grammar mismatch, whereas a partially recoverable parser can recover where the grammar is deterministic.

For example, given the grammatically incorrect while statement while true {}, we know it is missing round brackets, and round brackets are the only punctuation it can have at that position, so we can still return a valid AST and report the missing brackets.
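A minimal sketch of that recovery, under stated assumptions: the `Kind` variants, the `WhileStatement` node, the `errors` list, and the `expect_or_recover` helper are all hypothetical names made up for illustration, not the parser built in this guide. When an expected token is missing, the parser records an error and continues as if the token had been there.

```rust
/// Hypothetical token kinds, just enough for `while (true) {}`.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Kind { While, LParen, RParen, True, LCurly, RCurly, Eof }

/// Hypothetical AST node for a `while` statement with an empty body.
#[derive(Debug, PartialEq)]
pub struct WhileStatement { pub test: Kind }

pub struct Parser<'a> {
    tokens: &'a [Kind],
    pos: usize,
    /// Errors collected while recovering, instead of aborting on the first one.
    pub errors: Vec<String>,
}

impl<'a> Parser<'a> {
    pub fn new(tokens: &'a [Kind]) -> Self {
        Self { tokens, pos: 0, errors: Vec::new() }
    }

    fn cur_kind(&self) -> Kind {
        self.tokens.get(self.pos).copied().unwrap_or(Kind::Eof)
    }

    fn advance(&mut self) {
        self.pos += 1;
    }

    /// Consume `kind` if present; otherwise record an error and carry on
    /// as if the token had been there.
    fn expect_or_recover(&mut self, kind: Kind) {
        if self.cur_kind() == kind {
            self.advance();
        } else {
            self.errors.push(format!("expected {:?} but found {:?}", kind, self.cur_kind()));
        }
    }

    pub fn parse_while_statement(&mut self) -> WhileStatement {
        self.expect_or_recover(Kind::While);
        self.expect_or_recover(Kind::LParen);
        // The test expression is a single token in this sketch.
        let test = self.cur_kind();
        self.advance();
        self.expect_or_recover(Kind::RParen);
        self.expect_or_recover(Kind::LCurly);
        self.expect_or_recover(Kind::RCurly);
        WhileStatement { test }
    }
}
```

Parsing the tokens for `while true {}` with this sketch still yields a `WhileStatement`, and `errors` holds one entry for each missing bracket.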

Most JavaScript parsers out there are partially recoverable, so we'll do the same and build a partially recoverable parser.

INFO

The Biome parser is a fully recoverable parser.

Rust has the Result type for returning and propagating errors. Combined with the ? syntax, it keeps the parse functions simple and clean.

It is common to wrap the Result type in an alias so we can replace the error type later:

rust

pub type Result<T> = std::result::Result<T, ()>;

Our parse functions will return a Result, for example:

rust

pub fn parse_binding_pattern(&mut self, ctx: Context) -> Result<BindingPattern<'a>> {
    match self.cur_kind() {
        Kind::LCurly => self.parse_object_binding_pattern(ctx),
        Kind::LBrack => self.parse_array_binding_pattern(ctx),
        kind if kind.is_binding_identifier() => {
          // ... code omitted
        }
        _ => Err(()), 
    }
}

We can add an expect function that returns an error if the current token does not match the grammar:

rust

/// Expect a `Kind` or return error
pub fn expect(&mut self, kind: Kind) -> Result<()> {
    if !self.at(kind) {
        return Err(())
    }
    self.advance();
    Ok(())
}

And use it like this:

rust

pub fn parse_paren_expression(&mut self, ctx: Context) -> Result<Expression> {
    self.expect(Kind::LParen)?;
    let expression = self.parse_expression(ctx)?;
    self.expect(Kind::RParen)?;
    Ok(expression)
}

INFO

For completeness, the lexer function read_next_token should also return a Result when it encounters an unexpected char.
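As a rough sketch of that idea, assuming a toy lexer that only knows round brackets: the `Token`, `LexError`, and `Lexer` types below are hypothetical and exist only to show the shape of a `read_next_token` that reports an unexpected char instead of panicking.

```rust
/// Hypothetical token and error types, for illustration only.
#[derive(Debug, PartialEq)]
pub enum Token { LParen, RParen, Eof }

#[derive(Debug, PartialEq)]
pub enum LexError { UnexpectedChar(char) }

pub struct Lexer<'a> {
    chars: std::str::Chars<'a>,
}

impl<'a> Lexer<'a> {
    pub fn new(source: &'a str) -> Self {
        Self { chars: source.chars() }
    }

    /// Return the next token, or an error for a character the lexer does not know.
    pub fn read_next_token(&mut self) -> Result<Token, LexError> {
        for c in self.chars.by_ref() {
            match c {
                c if c.is_whitespace() => continue,
                '(' => return Ok(Token::LParen),
                ')' => return Ok(Token::RParen),
                other => return Err(LexError::UnexpectedChar(other)),
            }
        }
        // The iterator is exhausted: no more input.
        Ok(Token::Eof)
    }
}
```

Because the error is an ordinary value, the caller decides whether to stop at the first bad character or keep lexing and collect further errors.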

The Error Trait

To return specific errors, we need to fill in the Err part of Result:

rust

// The `Err` part is now `SyntaxError` instead of `()`.
pub type Result<T> = std::result::Result<T, SyntaxError>;

#[derive(Debug)]
pub enum SyntaxError {
    UnexpectedToken(String),
    AutoSemicolonInsertion(String),
    UnterminatedMultiLineComment(String),
}

We call it SyntaxError because all "early errors" defined in the grammar section of the ECMAScript specification are syntax errors.

To make this a proper Error, it needs to implement the Error trait. For cleaner code, we can use the derive macro from the thiserror crate:

rust

use thiserror::Error;

#[derive(Debug, Error)]
pub enum SyntaxError {
    #[error("Unexpected Token")]
    UnexpectedToken,

    #[error("Expected a semicolon or an implicit semicolon after a statement, but found none")]
    AutoSemicolonInsertion,

    #[error("Unterminated multi-line comment")]
    UnterminatedMultiLineComment,
}
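For comparison, here is a hand-written version using only the standard library. It is a sketch of roughly what the derive saves us from writing, not the macro's exact expansion: a `Display` impl that supplies each message, plus an empty `Error` impl.

```rust
use std::error::Error;
use std::fmt;

#[derive(Debug)]
pub enum SyntaxError {
    UnexpectedToken,
    AutoSemicolonInsertion,
    UnterminatedMultiLineComment,
}

// `Display` supplies the message that `#[error("...")]` would otherwise generate.
impl fmt::Display for SyntaxError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let message = match self {
            Self::UnexpectedToken => "Unexpected Token",
            Self::AutoSemicolonInsertion => {
                "Expected a semicolon or an implicit semicolon after a statement, but found none"
            }
            Self::UnterminatedMultiLineComment => "Unterminated multi-line comment",
        };
        f.write_str(message)
    }
}

// All methods of the `Error` trait have defaults, so an empty impl is enough.
impl Error for SyntaxError {}
```

With both impls in place, a `SyntaxError` can be turned into a message with `to_string()` or boxed as a `Box<dyn Error>`.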

We can then update the expect helper function to return a SyntaxError if the token does not match:

rust

/// Expect a `Kind` or return error
pub fn expect(&mut self, kind: Kind) -> Result<()> {
    // Return an error when the current token is NOT the expected kind.
    if !self.at(kind) {
        return Err(SyntaxError::UnexpectedToken);
    }
    self.advance();
    Ok(())
}

The parse_debugger_statement function can now use expect for proper error management:

rust

fn parse_debugger_statement(&mut self) -> Result<Statement> {
    let node = self.start_node();
    self.expect(Kind::Debugger)?;
    Ok(Statement::DebuggerStatement {
        node: self.finish_node(node),
    })
}

Notice the ? after the expect call. It is syntactic sugar called the "question mark operator", which makes the function return early if expect returns an Err.
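To make the early return concrete, the sketch below writes the same function twice: once with `?` and once with an explicit `match`. The `expect` here is a hypothetical stand-in that takes a boolean instead of inspecting a token, and the `match` version is only an approximation, since the real `?` also converts the error with `From::from`.

```rust
#[derive(Debug, PartialEq)]
pub enum SyntaxError { UnexpectedToken }

/// Hypothetical stand-in for `expect`: succeeds only when `matches` is true.
fn expect(matches: bool) -> Result<(), SyntaxError> {
    if matches { Ok(()) } else { Err(SyntaxError::UnexpectedToken) }
}

/// Using the question mark operator.
pub fn with_question_mark(matches: bool) -> Result<&'static str, SyntaxError> {
    expect(matches)?;
    Ok("parsed")
}

/// Roughly the same function with the early return spelled out.
pub fn with_match(matches: bool) -> Result<&'static str, SyntaxError> {
    match expect(matches) {
        Ok(()) => {}
        Err(error) => return Err(error),
    }
    Ok("parsed")
}
```

Both functions return the same value for the same input; `?` just removes the boilerplate.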

Fancy Error Report

miette is one of the nicest error reporting crates out there; it provides fancy colored output.


Add miette to your Cargo.toml:

toml

[dependencies]
miette = { version = "5", features = ["fancy"] }

We can wrap our Error with miette without modifying the Result type defined in our parser:

rust

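pub fn main() -> miette::Result<()> {
    let source_code = "".to_string();
    let file_path = "test.js".to_string();
    let mut parser = Parser::new(&source_code);
    // Assumption: `SyntaxError` also implements `miette::Diagnostic`
    // (for example via `#[derive(Diagnostic)]`), which `miette::Error::new` requires.
    parser.parse().map_err(|error| {
        // Attach the file name and source text so the report can render code snippets.
        miette::Error::new(error)
            .with_source_code(miette::NamedSource::new(file_path, source_code.clone()))
    })?;
    Ok(())
}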
