docs: Improve error handling docs (#1919)

* docs: Improve error handling docs

and add more TL;DRs and new sections about handling specific errors, and
logging errors.

* contributing: Add more coding tips
This commit is contained in:
har7an 2022-11-09 18:01:36 +00:00 committed by GitHub
parent d4beabfeb2
commit 49b831c032
No known key found for this signature in database
GPG key ID: 4AEE18F83AFDEB23
2 changed files with 200 additions and 1 deletions

View file

@ -124,8 +124,23 @@ something interesting to work on and guide through.
- Generate ad-hoc errors with `anyhow!(<SOME MESSAGE>)` - Generate ad-hoc errors with `anyhow!(<SOME MESSAGE>)`
- *Further reading*: [See here][error-docs-result] - *Further reading*: [See here][error-docs-result]
### Logging errors
- When there's a `Result` type around, use `.non_fatal()` on that instead of `log::error!`
- When there's a `Err` type around, use `Err::<(), _>(err).non_fatal()`
- Also attach context before logging!
- *Further reading*: [See here][error-docs-logging]
### Adding Concrete Errors, Handling Specific Errors
- Add a new variant to `zellij_utils::errors::ZellijError`, if needed
- Use `anyhow::Error::downcast_ref::<ZellijError>()` to recover underlying errors
- *Further reading*: [See here][error-docs-zellijerror]
[error-docs-context]: https://github.com/zellij-org/zellij/blob/main/docs/ERROR_HANDLING.md#attaching-context [error-docs-context]: https://github.com/zellij-org/zellij/blob/main/docs/ERROR_HANDLING.md#attaching-context
[error-docs-result]: https://github.com/zellij-org/zellij/blob/main/docs/ERROR_HANDLING.md#converting-a-function-to-return-a-result-type [error-docs-result]: https://github.com/zellij-org/zellij/blob/main/docs/ERROR_HANDLING.md#converting-a-function-to-return-a-result-type
[error-docs-logging]: https://github.com/zellij-org/zellij/blob/main/docs/ERROR_HANDLING.md#logging-errors
[error-docs-zellijerror]: https://github.com/zellij-org/zellij/blob/main/docs/ERROR_HANDLING.md#adding-concrete-errors-handling-specific-errors
## Filing Issues ## Filing Issues

View file

@ -92,12 +92,16 @@ to handle it. Instead of `panic`ing the application, the error is written to
the application log and execution continues. Please use this sparingly, as an the application log and execution continues. Please use this sparingly, as an
error usually calls for actions to be taken rather than ignoring it. error usually calls for actions to be taken rather than ignoring it.
# Examples of applied error handling # Examples of applied error handling
You can have a look at the commit that introduced error handling to the You can have a look at the commit that introduced error handling to the
`zellij_server::screen` module [right here][1] (look at the changes in `zellij_server::screen` module [right here][1] (look at the changes in
`zellij-server/src/screen.rs`). We'll use this to demonstrate a few things in `zellij-server/src/screen.rs`). We'll use this to demonstrate a few things in
the following text. the following text. You can find countless other examples in the [tracking
issue for error handling][3]
## Converting a function to return a `Result` type ## Converting a function to return a `Result` type
@ -184,6 +188,7 @@ time to do something about the error.
In general, any function calling `unwrap` or `expect` is a good candidate to be In general, any function calling `unwrap` or `expect` is a good candidate to be
rewritten to return a `Result` type instead. rewritten to return a `Result` type instead.
## Attaching context ## Attaching context
[Anyhow][anyhow]s [`Context`][context] trait gives us two methods to attach [Anyhow][anyhow]s [`Context`][context] trait gives us two methods to attach
@ -243,8 +248,13 @@ Feel free to move context string/closure to a variable to avoid copy-pasting:
When there is only a single `Result` to be returned from your function, use When there is only a single `Result` to be returned from your function, use
`context` as shown above for `resize_to_screen`. `context` as shown above for `resize_to_screen`.
## Choosing helpful context messages ## Choosing helpful context messages
> **TL;DR**
> - Don't repeat what the error message in the `Result` says
> - Describe what you were doing, ideally include the current functions name
When attaching context to an error, usually you want to express what you were When attaching context to an error, usually you want to express what you were
doing when the error occurred, i.e. in what context the error occurred. In the doing when the error occurred, i.e. in what context the error occurred. In the
`render` method, we could have done something like this instead: `render` method, we could have done something like this instead:
@ -276,8 +286,13 @@ give us a [`NotFound`][2] error), so we don't have to repeat that.
In case of doubt, look at the name of the function you're currently working in In case of doubt, look at the name of the function you're currently working in
and write a context message somehow mentioning this. and write a context message somehow mentioning this.
## Terminating execution ## Terminating execution
> **TL;DR**
> - Terminate execution on errors by adding `.fatal()` to it
> - First try to pass the error as far up as you can or deem reasonable
We want to propagate errors as far up as we can. This way, every function along We want to propagate errors as far up as we can. This way, every function along
the way can at least attach a context message giving us an idea what chain of the way can at least attach a context message giving us an idea what chain of
events lead to the error. Where do we terminate execution in `Screen`? If you events lead to the error. Where do we terminate execution in `Screen`? If you
@ -330,6 +345,10 @@ resulting output!
## Error handling for `Option` types ## Error handling for `Option` types
> **TL;DR**
> - Attach a `.context` with a message saying why a `None` here is an error
> - Attach a regular context message like you would for a `Result` type, too!
Beyond what's described in "Choosing helpful context messages" above, `Option` Beyond what's described in "Choosing helpful context messages" above, `Option`
types benefit from extra handling. That is because a `Option` containing a types benefit from extra handling. That is because a `Option` containing a
`None` where a value is expected doesn't carry an error message: It doesn't `None` where a value is expected doesn't carry an error message: It doesn't
@ -365,13 +384,178 @@ attach a description manually. The second context:
then describes what the surrounding function was trying to achieve (See then describes what the surrounding function was trying to achieve (See
descriptions above). descriptions above).
## Logging errors
> **TL;DR**
> - When there's a `Result` type around, use `.non_fatal()` on that instead of `log::error!`
> - When there's a `Err` type around, use `Err::<(), _>(err).non_fatal()`
> - Also attach context before logging!
> - For further examples, refer to [PR #1881][pr1881]
You may encounter situations where you have an error and decide it's safe to
ignore. Depending on the circumstances, this is a perfectly fine thing to do.
However, oftentimes it proves to be useful to at least log the error, so in
case things do go wrong we at least see the logged error message. Also, the
logged message may hint towards an underlying problem which may require further
action.
An obvious thing to do is something like the following:
```rust
log::error!("failed to find tab with index {tab_index}");
```
While an ad-hoc log message is better than silently ignoring the error, we can
usually do better than that. That is because in large parts of the codebase we
have a `Result` available in one form or another.
If the `Result` has been treated as suggested above and context messages have
been attached to it, it already contains a lot of valuable information. This is
lost when instead we log a custom error. Hence, the better solution is to log
the `Result` type including all the context information.
This is easily achieved like so:
```rust
fs::create_dir_all(&plugin_global_data_dir)
.context("failed to create plugin asset directory")
.non_fatal();
```
Here, we try to create a directory, and if we fail to do this we simply log the
error (with `non_fatal()`) and continue. It's important to note that the
`non_fatal()` function always returns `()`: It cannot be used on any `Result`
type whose `Ok` value is different from `()`. This is on purpose: If your
`Result` carries a type, it is probably required for further
calculations/actions. Hence, you mustn't ignore it.
Also note that, even though we log the error and continue, we still attach
context information to it. Just like with fatal errors, the context information
allows us to tell what we tried to do before we got the error and logged it.
This is a simple example, and oftentimes the `Result` you're trying to handle
doesn't carry a `()`. Your code may look more like this:
```rust
if let Ok(active_tab) = self.get_active_tab(client_id) {
let active_tab_pos = active_tab.position;
let new_tab_pos = (active_tab_pos + 1) % self.tabs.len();
return self.switch_active_tab(new_tab_pos, client_id);
} else {
log::error!("Active tab not found for client_id: {:?}", client_id);
}
```
When we get an `Ok`, we do something with it. When we get an `Err`, we log a
message. Here you'll notice that the usage of an `if let` statement hides the
error from us: In the `else` branch, we have no means to access the error we
got from `get_active_tab()` above. In such a situation, it is helpful to
rewrite the `if let` to a `match` statement instead:
```rust
match self.get_active_tab(client_id) {
Ok(active_tab) => {
let active_tab_pos = active_tab.position;
let new_tab_pos = (active_tab_pos + 1) % self.tabs.len();
return self.switch_active_tab(new_tab_pos, client_id);
},
Err(err) => Err::<(), _>(err).with_context(err_context).non_fatal(),
}
```
The interesting part here is what happens after the `Err(err)`. Let's break it down:
1. `Err(err) =>`: We're making the `Err` value that was inside the `Result`
accessible in the variable `err`
2. `Err::<(), _>(err)`: We're wrapping the `err` back into a `Result` with the
`Err()` function
3. `.with_context(err_context)`: We're attaching error context (the actual
context isn't shown here)
4. `.non_fatal()`: We're logging the error with `non_fatal`
Notice that in step 2 we must qualify the final `Result` type, which is the job
of the `::<(), _>` attached to `Err`. This is necessary because at this point
Rust cannot determine on its own what the `Ok` value of the `Result` we create
may be. This isn't relevant for our case, because we *know* the `Result` will
never be an `Ok` variant, but Rust still requires this.
If you're getting errors about "type annotations needed" or "cannot infer type
for type parameter `T`" in one of your calls to `Err`, this is most likely
fixed by adding `::<(), _>`.
## Adding Concrete Errors, Handling Specific Errors
> **TL;DR**
> - Add a new variant to `zellij_utils::errors::ZellijError`
> - Use `anyhow::Error::downcast_ref::<ZellijError>()` to recover underlying errors
Sometimes you'll find yourself in a situation where you want to react to very
specific errors. For example, the "command panes" feature in zellij has a
special handling for "command not found" errors. If all the `anyhow::Error`s
are the same, how can we distinguish between underlying error types?
This is possible because while `anyhow` can unify a vast amount of errors into
the `anyhow::Error` type, it also gives us the possibility to recover
underlying error types. To do this, however, we must first know what error type
to expect.
External libraries, such as other crates or even `std` will likely define their
own error types. These error types have distinct error variants that one can
distinguish and react upon. But what happens, for example, if we create the
error we want to react to ourselves?
For this purpose, there is the `ZellijError`, which is contained in
`zellij_utils::errors`. It is built with the [`thiserror`][thiserror] crate and
hence easily extensible. If you need a specific error type to act upon, just
define a new variant in `ZellijError`. It is automatically available in any
source file that has the `use zellij_utils::errors::prelude::*;` statement in
it.
Once you have created your error instance, as soon as you wrap a `context`
around it, it is turned into an `anyhow::Error`. This makes it compatible with
all the other functions in the code that return `anyhow::Result`.
Recovering the error can look, for example, like this:
```rust
match pty
.spawn_terminal(terminal_action, client_or_tab_index)
.with_context(err_context) // <-- Note how we attach a context, but can
// still recover the error below!
{
Ok(_) => {
// ... Whatever
},
Err(err) => match err.downcast_ref::<ZellijError>() {
Some(ZellijError::CommandNotFound { terminal_id, .. }) => {
// Do something now that this error occured.
// We can even access the values stored inside it, "terminal_id" in
// this case
},
// You can check for other error variants here
_ => {
// Some other error, which we haven't checked for, occured here.
// Now we can, for example, log it!
Err::<(), _>(err).non_fatal(),
},
},
}
```
[tracking_issue]: https://github.com/zellij-org/zellij/issues/1753 [tracking_issue]: https://github.com/zellij-org/zellij/issues/1753
[handle_panic]: https://docs.rs/zellij-utils/latest/zellij_utils/errors/fn.handle_panic.html [handle_panic]: https://docs.rs/zellij-utils/latest/zellij_utils/errors/fn.handle_panic.html
[miette]: https://crates.io/crates/miette [miette]: https://crates.io/crates/miette
[anyhow]: https://crates.io/crates/anyhow [anyhow]: https://crates.io/crates/anyhow
[thiserror]: https://crates.io/crates/thiserror
[context]: https://docs.rs/anyhow/latest/anyhow/trait.Context.html [context]: https://docs.rs/anyhow/latest/anyhow/trait.Context.html
[result]: https://doc.rust-lang.org/stable/std/result/enum.Result.html [result]: https://doc.rust-lang.org/stable/std/result/enum.Result.html
[fatal]: https://docs.rs/zellij-utils/latest/zellij_utils/errors/trait.FatalError.html#tymethod.fatal [fatal]: https://docs.rs/zellij-utils/latest/zellij_utils/errors/trait.FatalError.html#tymethod.fatal
[non_fatal]: https://docs.rs/zellij-utils/latest/zellij_utils/errors/trait.FatalError.html#tymethod.non_fatal [non_fatal]: https://docs.rs/zellij-utils/latest/zellij_utils/errors/trait.FatalError.html#tymethod.non_fatal
[1]: https://github.com/zellij-org/zellij/commit/99e2bef8c68bd166cf89e90c8ffe8c02272ab4d3 [1]: https://github.com/zellij-org/zellij/commit/99e2bef8c68bd166cf89e90c8ffe8c02272ab4d3
[2]: https://doc.rust-lang.org/stable/std/io/enum.ErrorKind.html#variant.NotFound [2]: https://doc.rust-lang.org/stable/std/io/enum.ErrorKind.html#variant.NotFound
[3]: https://github.com/zellij-org/zellij/issues/1753
[pr1881]: https://github.com/zellij-org/zellij/pull/1881