Writing a Simple Version of safely

R function operators and environments galore!
Published

November 26, 2022

Set up

library(purrr)
library(tidyverse)
library(rlang)

A function operator is defined as a function that takes any number of functions as input and returns a function as output. As a way to better understand how they work and why they work, I’ll write a simple version of purrr::safely(). safely() is a really nice way to guard your functional programming against errors as it will allow your functionals to keep running even when they throw errors. I’ll show it in action with a simple example

tests <- list(
  tibble(a = 1:2, b = 2:3),
  tibble(a = 4:15, b = 1:12),
  tibble(a = 1:5, c = 6:10)
)
map(tests, ~ .x["b"])

# Error in `.x["b"]`:
# ! Can't subset columns that don't exist.
# ✖ Column `b` doesn't exist.
# Backtrace:
#  1. purrr::map(tests, ~.x["b"])
#  2. global .f(.x[[i]], ...)
#  4. tibble:::`[.tbl_df`(.x, "b")

Run on its own, this immediately throws an error and returns no output. But fear not, we can use the function safely:

out <- map(tests, safely( ~ .x["b"]))

This runs without errors and if we inspect the first item in the output and the third item in the output we can see that each output item contains the output and any error that occurs.

out[[1]]
$result
# A tibble: 2 × 1
      b
  <int>
1     2
2     3

$error
NULL
out[[3]]
$result
NULL

$error
<error/vctrs_error_subscript_oob>
Error in `.x["b"]`:
! Can't subset columns that don't exist.
✖ Column `b` doesn't exist.
---
Backtrace:
  1. purrr::map(tests, safely(~.x["b"]))
  2. purrr (local) .f(.x[[i]], ...)
 11. global .f(...)
 13. tibble:::`[.tbl_df`(.x, "b")

Now that we understand the functionality of safely we are ready to write our own version.

Our own version of safely

Essentially all that our simple version will do is that it will return a function that uses tryCatch on our original function and returns a list with the output and the error (if one occurs). We include an option to explicitly return the execution environment of the function for reasons that will become clear later on.

simple_safely <- function(.f, show_exec_env = F) {
  .f <- as_mapper(.f)
  
  if(show_exec_env) print(current_env())
  function(...) {
    out <- tryCatch(
      {
      list(result = .f(...), error = NULL)
      },
      error = function(cond){
        list(result = NULL, error = cond)
      }
    )
  return(out)
  }
}


out2 <- map(tests, simple_safely(~ .x["b"]))

This seems simple enough, but at least for me it was unsatisfying that I didn’t really understand what it was about R as a programming language that allowed function operations to work. Luckily we can dig a little bit into how this works by exploring the various environments associated with this procedure.

Inspecting Environments

We’ll start by creating a function that is a safe version of the function sum()

safe_sum <- simple_safely(sum)

We can start by just looking at safe_sum itself

safe_sum
function(...) {
    out <- tryCatch(
      {
      list(result = .f(...), error = NULL)
      },
      error = function(cond){
        list(result = NULL, error = cond)
      }
    )
  return(out)
  }
<bytecode: 0x55bd1308b170>
<environment: 0x55bd12d6a3a0>

Just looking at this output is fairly confusing because it just looks like the body of simple_safely() and it’s unclear how the function knows how to find .f.

We can gain some insight by using env_print() which prints the label of the environment of its argument, the label of the parent environment of the argument’s environment, as well as any bindings in the argument’s environment.

In this case since our input is a function, we see the label for that function environment and we also see that the parent environment is the global environment. We also see a binding for .f which seems promising…

env_print(safe_sum)
<environment: 0x55bd12d6a3a0>
Parent: <environment: global>
Bindings:
• .f: <fn>
• show_exec_env: <lgl>

A few a-ha! moments

We can see what that binding actually is by extracting it:

fn_env(safe_sum)$`.f`
function (..., na.rm = FALSE) 
.Primitive("sum")(..., na.rm = na.rm)

And there it is, the function environment has a binding between .f and the function sum(). This is cool, but to really understand why this binding happens we can make use of the show_exec_env argument in our simple_safely function:

safe_sum2 <- simple_safely(sum, show_exec_env = T)
<environment: 0x55bd1434aff8>

This time we get to see the label for the execution environment of this instance of simple_safely that created our function safe_sum2. Now remember that label, and look for it in the result of printing the environment for our newly manufactured function.

env_print(safe_sum2)
<environment: 0x55bd1434aff8>
Parent: <environment: global>
Bindings:
• .f: <fn>
• show_exec_env: <lgl>

Ah! We see that the enclosing environment of our safe_sum2 function is the execution environment from the instance of simple_safely that created it! And it’s this little fact that makes these function operators work. When safe_sum2 looks for .f it is really looking in an execution environment of simple_safely which contains a binding for what that function should be!