7  Functions are values

In Section 5.4, you saw that a function is a value: you can assign sqrt to a new name and call it, the same way you’d assign a number. If functions are values, you can pass them to other functions, store them in lists, and build functions that create new functions.

7.1 Passing functions to functions

sapply() takes a vector and a function, and applies the function to every element:

sapply(1:5, sqrt)
#> [1] 1.000000 1.414214 1.732051 2.000000 2.236068

Notice: sqrt has no parentheses. You are not calling sqrt; you are handing it to sapply, which calls it five times, once per element. This is the same distinction from Section 5.8: fact is the definition (a rule), fact(n) is the rule applied to an argument. Here, sqrt is the function itself, and sqrt(4) is the function applied to 4. When you write sapply(1:5, sqrt), you pass the definition; sapply decides when to apply it.

This is the pattern behind much of R’s power: instead of writing a loop that processes each element, you hand a function to something that knows how to apply it. The loop is still happening inside sapply, but you didn’t have to write it. You said what to compute, not how to iterate.

sapply(c(-3, 0, 4, -1, 7), abs)
#> [1] 3 0 4 1 7
sapply(c("hello", "world"), nchar)
#> hello world 
#>     5     5

Any function that can be called with a single argument works here. abs computes absolute values; nchar counts characters. You pass either one to sapply exactly as you passed sqrt.
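To make the hidden loop concrete, here is a minimal sketch that computes the same square roots twice: once with a hand-written for loop and once with sapply. The names xs, by_loop, and by_sapply are made up for this illustration, and the loop is only a rough picture of what sapply does, not its actual implementation.

```r
xs <- 1:5

# Hand-written loop: allocate the result, then fill it one element at a time.
by_loop <- numeric(length(xs))
for (i in seq_along(xs)) {
  by_loop[i] <- sqrt(xs[i])
}

# The same computation, with the iteration left to sapply.
by_sapply <- sapply(xs, sqrt)

identical(by_loop, by_sapply)
#> [1] TRUE
```

The loop version spends four lines on bookkeeping (allocation, index, assignment); the sapply version states only the data and the function.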

In lambda calculus notation (Section 1.2), passing a function to another function is just application:

sapply = λxs. λf. [f(xs₁), f(xs₂), ..., f(xsₙ)]
sapply([1,2,3,4,5])(sqrt) = [sqrt(1), sqrt(2), sqrt(3), sqrt(4), sqrt(5)]

The function sapply takes a list and a function, and produces a new list by applying the function to each element. This is the same term replacement from Section 4.2: replace each f(xsᵢ) with its value, and the expression reduces to a vector of results. A function that takes another function as an argument is called a higher-order function. sapply is one. You will meet many more.

The structure of sapply is the functor pattern from Section 4.4: take a container (a vector or list), apply a function to each element, return a container of the same shape with transformed contents. In Section 4.4, R did this implicitly with vectorization (sqrt(c(1,4,9))). Here, sapply does it explicitly: you choose the function, and sapply handles the mapping. The abstraction is the same; only the interface changes.
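A short sketch of the two interfaces side by side. The helper describe is hypothetical, written only to show a function that handles one number at a time, so it has to be mapped explicitly.

```r
x <- c(1, 4, 9)

implicit <- sqrt(x)          # vectorization: R maps sqrt over x for you
explicit <- sapply(x, sqrt)  # sapply: you hand over the function to map

implicit
#> [1] 1 2 3
identical(implicit, explicit)
#> [1] TRUE

# `describe` (hypothetical) works on a single number, not on a whole vector.
describe <- function(n) if (n > 3) "big" else "small"
sapply(x, describe)
#> [1] "small" "big"   "big"
```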

Exercises

  1. Use sapply() to compute log() of the numbers 1 through 10.
  2. What happens if you write sapply(1:5, sqrt()) with parentheses? Try it and read the error.
  3. Use sapply() and is.na() to check which elements of c(1, NA, 3, NA, 5) are missing.

7.2 Anonymous functions

You can pass a named function like sqrt to sapply. But what if the function you need doesn’t have a name?

sapply(1:5, function(x) x^2)
#> [1]  1  4  9 16 25

function(x) x^2 creates a function right where it’s needed: no name, no assignment. It exists only for this one call to sapply, and then it’s gone. This is an anonymous function, and you already saw a brief example in Section 5.4.

R 4.1 introduced a shorter syntax:

sapply(1:5, \(x) x^2)
#> [1]  1  4  9 16 25

\(x) is shorthand for function(x). The backslash is meant to evoke the Greek letter lambda (λ), which is where the whole idea comes from (Section 1.2). In lambda calculus, this function is written λx. x². In every version of R, you can write it as function(x) x^2. From R 4.1 on, you can also write \(x) x^2. Three notations for the same thing.

Anonymous functions are useful when the operation is short and you only need it once:

sapply(c(10, 20, 30), \(x) x / sum(c(10, 20, 30)))
#> [1] 0.1666667 0.3333333 0.5000000
sapply(c("alice", "bob"), \(name) paste("Hello,", name))
#>          alice            bob 
#> "Hello, alice"   "Hello, bob"

When the function gets longer than one line, or when you need it in more than one place, give it a name. A function called normalize is easier to read than an anonymous function that divides by the sum. A function called greet is easier to maintain than a lambda pasted into three different sapply calls.
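As a rough sketch, here is what the named versions might look like. The bodies of normalize and greet are assumptions based on the two anonymous functions above; note that this normalize takes the whole vector rather than one element at a time.

```r
# Divide every element by the total, so the results sum to 1.
normalize <- function(x) x / sum(x)

# Build a greeting for a single name.
greet <- function(name) paste("Hello,", name)

normalize(c(10, 20, 30))
#> [1] 0.1666667 0.3333333 0.5000000

sapply(c("alice", "bob"), greet)
#>          alice            bob 
#> "Hello, alice"   "Hello, bob" 
```

The call sites now say what is being done, and a later change to the greeting only has to be made in one place.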

Tip: Opinion

If you use a function twice, name it. Anonymous functions are for one-off operations inside sapply, lapply, and similar. If you find yourself copying and pasting the same \(x) ... expression, that’s a sign it deserves a name.

Exercises

  1. Use sapply() with an anonymous function to add 10 to each element of c(1, 5, 9).
  2. Rewrite sapply(1:5, \(x) x^2) using the older function(x) syntax. Verify you get the same result.
  3. Write a named function to_celsius that converts Fahrenheit to Celsius. Then use sapply() to convert c(32, 72, 100, 212).

7.3 Storing functions in lists

If functions are values, you can put them in a list:

transforms <- list(
  double = \(x) x * 2,
  triple = \(x) x * 3,
  negate = \(x) -x
)

transforms$double(5)
#> [1] 10
transforms$triple(5)
#> [1] 15
transforms$negate(5)
#> [1] -5

transforms is a list of three functions, accessed with $ the same way you access any named list element.
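Because the list elements are functions, the list itself can be handed to sapply. In this small sketch the roles from Section 7.1 are reversed: the vector being iterated over contains functions, and the anonymous function calls each one on the same number.

```r
transforms <- list(
  double = \(x) x * 2,
  triple = \(x) x * 3,
  negate = \(x) -x
)

# `f` is bound to each function in turn and applied to 5.
sapply(transforms, \(f) f(5))
#> double triple negate 
#>     10     15     -5 
```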

Why is this useful? Suppose you have data that needs different processing depending on a column type. You can store the processing functions in a named list and look up the right one:

summarizers <- list(
  numeric = \(x) mean(x, na.rm = TRUE),
  character = \(x) length(unique(x)),
  logical = \(x) sum(x, na.rm = TRUE)
)

summarizers$numeric(c(1, 2, 3, NA, 5))
#> [1] 2.75
summarizers$character(c("a", "b", "a", "c"))
#> [1] 3
summarizers$logical(c(TRUE, FALSE, TRUE, TRUE))
#> [1] 3

The list acts as a dispatch table: given a type name, it returns the appropriate function. This pattern appears in real R code more often than you might expect. Package internals, Shiny applications, and testing frameworks all use lists of functions to select behavior at runtime.
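The lookup step can be sketched as follows. The helper summarize_column and the example list columns are hypothetical, and using class() as the key is one possible choice, assumed here because the names in summarizers match R's class names.

```r
summarizers <- list(
  numeric = \(x) mean(x, na.rm = TRUE),
  character = \(x) length(unique(x)),
  logical = \(x) sum(x, na.rm = TRUE)
)

# Hypothetical helper: pick the summarizer whose name matches the class of x.
summarize_column <- function(x) {
  type <- class(x)[1]
  f <- summarizers[[type]]
  if (is.null(f)) stop("no summarizer for type: ", type)
  f(x)
}

# Example data, one element per column type.
columns <- list(
  score = c(1, 2, 3, NA, 5),
  group = c("a", "b", "a", "c"),
  passed = c(TRUE, FALSE, TRUE, TRUE)
)

result <- lapply(columns, summarize_column)
result$score
#> [1] 2.75
result$group
#> [1] 3
result$passed
#> [1] 3
```

Supporting a new type means adding one entry to summarizers; summarize_column itself does not change.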

Exercises

  1. Create a list with two functions: square and cube. Apply both to the number 4.
  2. Write a list of functions where each one converts a temperature from Celsius to a different scale (Fahrenheit, Kelvin). Use them to convert 100 degrees Celsius.

7.4 Functions that return functions

A function can return a number, a string, a vector, or a data frame. It can also return a function.

make_adder <- function(n) {
  function(x) x + n
}

make_adder takes a number n and returns a new function that adds n to its argument. Watch:

add5 <- make_adder(5)
add5(3)
#> [1] 8
add5(100)
#> [1] 105
add10 <- make_adder(10)
add10(3)
#> [1] 13

add5 and add10 are both functions, built by make_adder with different values of n. Each returned function remembers the value of n it was created with. add5 remembers 5; add10 remembers 10.

How does the returned function “remember” n? When make_adder(5) runs, it creates a local environment where n = 5. The returned function function(x) x + n was defined inside that environment. When you later call add5(3), R first looks for n among the variables local to that call (only x is there), doesn’t find it, and then looks in the environment where the function was defined, the one created by make_adder(5). There it finds n = 5. This is the same lexical scoping from Section 5.5, taken one step further.

A function that carries its creation environment with it is called a closure. Every function you write in R is technically a closure (only primitives such as sum are not), but the term is most useful when a function returned by another function captures variables from the enclosing scope. Closures are powerful enough to deserve their own chapter (Chapter 18); for now, the important thing is that each function returned by make_adder keeps its own n.
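You can peek at the captured value directly. This is a small sketch using base R's environment() function, which returns the environment a function was created in; it is meant for inspection here, not as something you need in everyday code.

```r
make_adder <- function(n) {
  function(x) x + n
}

add5 <- make_adder(5)
add10 <- make_adder(10)

# Each returned function carries the environment of the call that made it.
environment(add5)$n
#> [1] 5
environment(add10)$n
#> [1] 10

# Two calls to make_adder create two separate environments.
identical(environment(add5), environment(add10))
#> [1] FALSE
```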

In lambda calculus notation:

make_adder = λn. λx. x + n
make_adder(5) = (λn. λx. x + n)(5) = λx. x + 5

Replacing n with 5 gives a new function, λx. x + 5. That is add5: addition with one argument already filled in, which is called partial application. make_adder itself is the curried form of addition: instead of taking both arguments at once, it takes them one at a time, and the first call returns a function that waits for the second.

In lambda calculus terms, make_adder(5) is a beta reduction: (λn. λx. x + n)(5) reduces to λx. x + 5. The outer lambda absorbs its argument and disappears, leaving a simpler function. Calling a function factory in R plays the same role. R does not literally rewrite the function body (it keeps n in the returned function’s environment), but the result behaves like the reduced term, with one variable already bound.
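The two-step structure of λn. λx. x + n can be seen by comparing make_adder with an ordinary two-argument function. The name add is hypothetical, introduced only for this comparison.

```r
# `add` (hypothetical) takes both inputs in a single call.
add <- function(x, n) x + n

# make_adder takes the same two inputs, one call at a time.
make_adder <- function(n) {
  function(x) x + n
}

add(3, 5)
#> [1] 8
make_adder(5)(3)   # supply n, get a function back, then supply x
#> [1] 8

add5 <- make_adder(5)  # stop after the first step and keep the result
add5(3)
#> [1] 8
```

All three calls compute the same sum; the only difference is when each argument is supplied.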

make_multiplier <- function(factor) {
  function(x) x * factor
}

double <- make_multiplier(2)
triple <- make_multiplier(3)

double(7)
#> [1] 14
triple(7)
#> [1] 21

make_multiplier is another function factory. It produces double and triple from the same template, with different values baked in. You will see function factories used throughout R for creating specialized functions from general patterns: custom plot themes, parameterized statistical tests, configured data transformations. Chapter 20 covers them in depth.
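Since a factory returns an ordinary function, its result can go anywhere a function can go. A brief sketch combining it with sapply (Section 7.1) and a list of functions (Section 7.3); the list name scalers is made up for the example.

```r
make_multiplier <- function(factor) {
  function(x) x * factor
}

# The factory call is evaluated first; sapply receives the function it returns.
sapply(1:3, make_multiplier(10))
#> [1] 10 20 30

# Factory-made functions stored in a list.
scalers <- list(double = make_multiplier(2), triple = make_multiplier(3))
scalers$triple(7)
#> [1] 21
```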

Exercises

  1. Write a function make_power that takes an exponent n and returns a function that raises its argument to the nth power. Create square <- make_power(2) and cube <- make_power(3) and test them.
  2. What does make_adder(0) return? Is it the identity function?
  3. Write a function make_greeter that takes a greeting string (like "Hello" or "Bonjour") and returns a function that takes a name and produces the greeting. Test: greet_en <- make_greeter("Hello"); greet_en("Alice") should return "Hello, Alice".

7.5 Operators are functions

Try this:

`+`(2, 3)
#> [1] 5

+ is a function. It takes two arguments and returns their sum. When you write 2 + 3, R translates it to `+`(2, 3). The infix notation is syntactic sugar; underneath, it’s a function call.

The same is true for every operator:

`*`(4, 5)
#> [1] 20
`>`(10, 3)
#> [1] TRUE
`&`(TRUE, FALSE)
#> [1] FALSE

And it goes further than operators. Indexing with [ is a function call:

x <- c(10, 20, 30)
`[`(x, 2)
#> [1] 20

Accessing a list element with $ is a function call. Even the curly brace { is a function call (it evaluates each expression inside and returns the last one). Assignment <- is a function call.
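A few of these can be checked directly by writing the call form out by hand. This is only a sketch for exploration; person and y are example names, and nobody writes code this way in practice.

```r
person <- list(name = "Ada", age = 36)

`$`(person, name)            # same as person$name
#> [1] "Ada"

`<-`(y, 5)                   # same as y <- 5
y
#> [1] 5

`if`(y > 3, "big", "small")  # same as if (y > 3) "big" else "small"
#> [1] "big"
```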

John Chambers, the creator of S (Section 2.3), summarized R’s design with two principles:

“Everything that exists is an object. Everything that happens is a function call.”

The first principle says that numbers, strings, vectors, functions, and even NULL are all objects. The second says that addition, comparison, indexing, assignment, and control flow are all function calls. These two principles, together with the fact that functions are values (so they are objects too, covered by the first principle), define what it means for R to be a functional language.
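Because operators are functions and functions are values, an operator can be passed or stored like any other function. A minimal sketch, where apply_op and ops are hypothetical names used only for illustration:

```r
# `apply_op` receives an operator as an ordinary argument and calls it.
apply_op <- function(op, a, b) op(a, b)

apply_op(`+`, 2, 3)
#> [1] 5
apply_op(`>`, 10, 3)
#> [1] TRUE

# Operators stored in a list, like the functions in Section 7.3.
ops <- list(add = `+`, multiply = `*`)
ops$multiply(4, 5)
#> [1] 20
```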

You can even override built-in operators. (This is a demonstration, not a recommendation.)

`+` <- function(a, b) a * b
2 + 3
#> [1] 6
rm(`+`)
2 + 3
#> [1] 5

After redefining + to mean multiplication, 2 + 3 returns 6. Your definition never overwrote the built-in +; it only sat in front of it, so R found yours first. rm() removes your definition, and R finds the original again. The fact that you can redefine + shows that it really is just a name bound to a function, like any other name. R’s entire syntax is built on function calls, and those functions are values you can inspect, replace, and pass around.

Tip: Opinion

Never redefine built-in operators in real code. The example above is to show you what R is made of, not to suggest a workflow. Code that redefines + is code that nobody (including you, six months later) will be able to read.

Exercises

  1. Rewrite 10 - 3 as an explicit function call using backticks.
  2. Rewrite x[1] as an explicit function call. (Hint: `[`(x, 1).)
  3. What does `{`(1, 2, 3) return? Why?