docs: Improve error handling docs (#1919)

* docs: Improve error handling docs and add more TL;DRs and new sections about handling specific errors, and logging errors. * contributing: Add more coding tips
2022-11-09 18:01:36 +00:00 · 2022-11-09 18:01:36 +00:00 · 49b831c032
commit 49b831c032
parent d4beabfeb2
2 changed files with 200 additions and 1 deletions
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@ -124,8 +124,23 @@ something interesting to work on and guide through.
 - Generate ad-hoc errors with `anyhow!(<SOME MESSAGE>)`
 - *Further reading*: [See here][error-docs-result]
 ### Logging errors
 - When there's a `Result` type around, use `.non_fatal()` on that instead of `log::error!`
 - When there's a `Err` type around, use `Err::<(), _>(err).non_fatal()`
 - Also attach context before logging!
 - *Further reading*: [See here][error-docs-logging]
 ### Adding Concrete Errors, Handling Specific Errors
 - Add a new variant to `zellij_utils::errors::ZellijError`, if needed
 - Use `anyhow::Error::downcast_ref::<ZellijError>()` to recover underlying errors
 - *Further reading*: [See here][error-docs-zellijerror]
 [error-docs-context]: https://github.com/zellij-org/zellij/blob/main/docs/ERROR_HANDLING.md#attaching-context
 [error-docs-result]: https://github.com/zellij-org/zellij/blob/main/docs/ERROR_HANDLING.md#converting-a-function-to-return-a-result-type
 [error-docs-logging]: https://github.com/zellij-org/zellij/blob/main/docs/ERROR_HANDLING.md#logging-errors
 [error-docs-zellijerror]: https://github.com/zellij-org/zellij/blob/main/docs/ERROR_HANDLING.md#adding-concrete-errors-handling-specific-errors
 ## Filing Issues
--- a/docs/ERROR_HANDLING.md
+++ b/docs/ERROR_HANDLING.md
@ -92,12 +92,16 @@ to handle it. Instead of `panic`ing the application, the error is written to
 the application log and execution continues. Please use this sparingly, as an
 error usually calls for actions to be taken rather than ignoring it.
 # Examples of applied error handling
 You can have a look at the commit that introduced error handling to the
 `zellij_server::screen` module [right here][1] (look at the changes in
 `zellij-server/src/screen.rs`). We'll use this to demonstrate a few things in
-the following text.
+the following text. You can find countless other examples in the [tracking
 issue for error handling][3]
 ## Converting a function to return a `Result` type
@ -184,6 +188,7 @@ time to do something about the error.
 In general, any function calling `unwrap` or `expect` is a good candidate to be
 rewritten to return a `Result` type instead.
 ## Attaching context
 [Anyhow][anyhow]s [`Context`][context] trait gives us two methods to attach
@ -243,8 +248,13 @@ Feel free to move context string/closure to a variable to avoid copy-pasting:
 When there is only a single `Result` to be returned from your function, use
 `context` as shown above for `resize_to_screen`.
 ## Choosing helpful context messages
 > **TL;DR**
 > - Don't repeat what the error message in the `Result` says
 > - Describe what you were doing, ideally include the current functions name
 When attaching context to an error, usually you want to express what you were
 doing when the error occurred, i.e. in what context the error occurred. In the
 `render` method, we could have done something like this instead:
@ -276,8 +286,13 @@ give us a [`NotFound`][2] error), so we don't have to repeat that.
 In case of doubt, look at the name of the function you're currently working in
 and write a context message somehow mentioning this.
 ## Terminating execution
 > **TL;DR**
 > - Terminate execution on errors by adding `.fatal()` to it
 > - First try to pass the error as far up as you can or deem reasonable
 We want to propagate errors as far up as we can. This way, every function along
 the way can at least attach a context message giving us an idea what chain of
 events lead to the error. Where do we terminate execution in `Screen`? If you
@ -330,6 +345,10 @@ resulting output!
 ## Error handling for `Option` types
 > **TL;DR**
 > - Attach a `.context` with a message saying why a `None` here is an error
 > - Attach a regular context message like you would for a `Result` type, too!
 Beyond what's described in "Choosing helpful context messages" above, `Option`
 types benefit from extra handling. That is because a `Option` containing a
 `None` where a value is expected doesn't carry an error message: It doesn't
@ -365,13 +384,178 @@ attach a description manually. The second context:
 then describes what the surrounding function was trying to achieve (See
 descriptions above).
 ## Logging errors
 > **TL;DR**  
 > - When there's a `Result` type around, use `.non_fatal()` on that instead of `log::error!`
 > - When there's a `Err` type around, use `Err::<(), _>(err).non_fatal()`
 > - Also attach context before logging!
 > - For further examples, refer to [PR #1881][pr1881]
 You may encounter situations where you have an error and decide it's safe to
 ignore. Depending on the circumstances, this is a perfectly fine thing to do.
 However, oftentimes it proves to be useful to at least log the error, so in
 case things do go wrong we at least see the logged error message. Also, the
 logged message may hint towards an underlying problem which may require further
 action.
 An obvious thing to do is something like the following:
 ```rust
 log::error!("failed to find tab with index {tab_index}");
 ```
 While an ad-hoc log message is better than silently ignoring the error, we can
 usually do better than that. That is because in large parts of the codebase we
 have a `Result` available in one form or another.
 If the `Result` has been treated as suggested above and context messages have
 been attached to it, it already contains a lot of valuable information. This is
 lost when instead we log a custom error. Hence, the better solution is to log
 the `Result` type including all the context information.
 This is easily achieved like so:
 ```rust
 fs::create_dir_all(&plugin_global_data_dir)
    .context("failed to create plugin asset directory")
    .non_fatal();
 ```
 Here, we try to create a directory, and if we fail to do this we simply log the
 error (with `non_fatal()`) and continue. It's important to note that the
 `non_fatal()` function always returns `()`: It cannot be used on any `Result`
 type whose `Ok` value is different from `()`. This is on purpose: If your
 `Result` carries a type, it is probably required for further
 calculations/actions. Hence, you mustn't ignore it.
 Also note that, even though we log the error and continue, we still attach
 context information to it. Just like with fatal errors, the context information
 allows us to tell what we tried to do before we got the error and logged it.
 This is a simple example, and oftentimes the `Result` you're trying to handle
 doesn't carry a `()`. Your code may look more like this:
 ```rust
 if let Ok(active_tab) = self.get_active_tab(client_id) {
    let active_tab_pos = active_tab.position;
    let new_tab_pos = (active_tab_pos + 1) % self.tabs.len();
    return self.switch_active_tab(new_tab_pos, client_id);
 } else {
    log::error!("Active tab not found for client_id: {:?}", client_id);
 }
 ```
 When we get an `Ok`, we do something with it. When we get an `Err`, we log a
 message. Here you'll notice that the usage of an `if let` statement hides the
 error from us: In the `else` branch, we have no means to access the error we
 got from `get_active_tab()` above. In such a situation, it is helpful to
 rewrite the `if let` to a `match` statement instead:
 ```rust
 match self.get_active_tab(client_id) {
    Ok(active_tab) => {
        let active_tab_pos = active_tab.position;
        let new_tab_pos = (active_tab_pos + 1) % self.tabs.len();
        return self.switch_active_tab(new_tab_pos, client_id);
    },
    Err(err) => Err::<(), _>(err).with_context(err_context).non_fatal(),
 }
 ```
 The interesting part here is what happens after the `Err(err)`. Let's break it down:
 1. `Err(err) =>`: We're making the `Err` value that was inside the `Result`
   accessible in the variable `err`
 2. `Err::<(), _>(err)`: We're wrapping the `err` back into a `Result` with the
   `Err()` function
 3. `.with_context(err_context)`: We're attaching error context (the actual
   context isn't shown here)
 4. `.non_fatal()`: We're logging the error with `non_fatal`
 Notice that in step 2 we must qualify the final `Result` type, which is the job
 of the `::<(), _>` attached to `Err`. This is necessary because at this point
 Rust cannot determine on its own what the `Ok` value of the `Result` we create
 may be. This isn't relevant for our case, because we *know* the `Result` will
 never be an `Ok` variant, but Rust still requires this.
 If you're getting errors about "type annotations needed" or "cannot infer type
 for type parameter `T`" in one of your calls to `Err`, this is most likely
 fixed by adding `::<(), _>`.
 ## Adding Concrete Errors, Handling Specific Errors
 > **TL;DR**
 > - Add a new variant to `zellij_utils::errors::ZellijError`
 > - Use `anyhow::Error::downcast_ref::<ZellijError>()` to recover underlying errors
 Sometimes you'll find yourself in a situation where you want to react to very
 specific errors. For example, the "command panes" feature in zellij has a
 special handling for "command not found" errors. If all the `anyhow::Error`s
 are the same, how can we distinguish between underlying error types?
 This is possible because while `anyhow` can unify a vast amount of errors into
 the `anyhow::Error` type, it also gives us the possibility to recover
 underlying error types. To do this, however, we must first know what error type
 to expect.
 External libraries, such as other crates or even `std` will likely define their
 own error types. These error types have distinct error variants that one can
 distinguish and react upon. But what happens, for example, if we create the
 error we want to react to ourselves?
 For this purpose, there is the `ZellijError`, which is contained in
 `zellij_utils::errors`. It is built with the [`thiserror`][thiserror] crate and
 hence easily extensible. If you need a specific error type to act upon, just
 define a new variant in `ZellijError`. It is automatically available in any
 source file that has the `use zellij_utils::errors::prelude::*;` statement in
 it.
 Once you have created your error instance, as soon as you wrap a `context`
 around it, it is turned into an `anyhow::Error`. This makes it compatible with
 all the other functions in the code that return `anyhow::Result`.
 Recovering the error can look, for example, like this:
 ```rust
 match pty
    .spawn_terminal(terminal_action, client_or_tab_index)
    .with_context(err_context)  // <-- Note how we attach a context, but can
                                //     still recover the error below!
 {
    Ok(_) => {
        // ... Whatever
    },
    Err(err) => match err.downcast_ref::<ZellijError>() {
        Some(ZellijError::CommandNotFound { terminal_id, .. }) => {
            // Do something now that this error occured.
            // We can even access the values stored inside it, "terminal_id" in
            // this case
        },
        // You can check for other error variants here
        _ => {
            // Some other error, which we haven't checked for, occured here.
            // Now we can, for example, log it!
            Err::<(), _>(err).non_fatal(),
        },
    },
 }
 ```
 [tracking_issue]: https://github.com/zellij-org/zellij/issues/1753
 [handle_panic]: https://docs.rs/zellij-utils/latest/zellij_utils/errors/fn.handle_panic.html
 [miette]: https://crates.io/crates/miette
 [anyhow]: https://crates.io/crates/anyhow
 [thiserror]: https://crates.io/crates/thiserror
 [context]: https://docs.rs/anyhow/latest/anyhow/trait.Context.html
 [result]: https://doc.rust-lang.org/stable/std/result/enum.Result.html
 [fatal]: https://docs.rs/zellij-utils/latest/zellij_utils/errors/trait.FatalError.html#tymethod.fatal
 [non_fatal]: https://docs.rs/zellij-utils/latest/zellij_utils/errors/trait.FatalError.html#tymethod.non_fatal
 [1]: https://github.com/zellij-org/zellij/commit/99e2bef8c68bd166cf89e90c8ffe8c02272ab4d3
 [2]: https://doc.rust-lang.org/stable/std/io/enum.ErrorKind.html#variant.NotFound
 [3]: https://github.com/zellij-org/zellij/issues/1753
 [pr1881]: https://github.com/zellij-org/zellij/pull/1881