23  Lazy evaluation

You write filter(penguins, species == "Adelie") and it works. But species isn’t a variable in your environment; it’s a column buried inside a data frame, invisible to ordinary evaluation rules. So how does R find it?

The answer begins with something R does not do. When you pass an argument to a function, R does not evaluate it. Not yet. Instead, R wraps the argument in an invisible object called a promise, a small bundle containing the expression you wrote and the environment where that expression should eventually be evaluated, and then it waits, holding the unevaluated expression in suspension until the function actually needs the result. This single design choice, the decision to procrastinate, ripples outward into default arguments that reference other arguments, functions that inspect your code before running it, and the entire tidyverse convention of writing column names as if they were variables.

The idea has a lineage. Early programming languages split over a simple question: when should arguments be evaluated? Fortran and C chose eagerness, computing each argument before handing it to the function (call-by-value). Algol 60 tried the opposite, passing the raw expression and re-evaluating it every time the function touched it (call-by-name). If the expression had side effects, they fired on every access; if the expression was expensive, the cost multiplied by the number of times the function touched the argument. The compromise came in 1976: pass the expression unevaluated, but cache the result after the first evaluation. Evaluate at most once. The technique was called lazy evaluation. Haskell made laziness the default for everything when it launched in 1990; R took a narrower path, inheriting through Scheme and S a design where function arguments are lazy but everything else is eager. That middle ground gives you expression capture without pervasive laziness, but it also means that every function argument in R carries a hidden payload most programmers never notice.

The mechanism behind all of this is promises. Once you see how they work, non-standard evaluation, data masking, and {{ }} follow as consequences. Full metaprogramming lives in Chapter 26; here, the goal is understanding the machinery well enough to use it, and well enough to know when it’s working against you.

23.1 Promises

When you call f(x + 1), R does not compute x + 1 and pass the result. It wraps x + 1 in a promise: a bundle containing the expression (x + 1) and the environment where that expression should be evaluated (your calling environment). The promise travels into f unevaluated.

The first time f actually touches that argument, R cracks the promise open: it takes the stored expression, evaluates it in the stored environment, and caches the result. If the argument is never used, the promise is never evaluated, which means side effects inside it never happen. If the argument is used twice, the second access returns the cached value rather than re-evaluating the expression. One evaluation, at most. This sounds like a harmless optimization, but it changes what it means for a function to “receive” an argument.
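A side effect tucked into the argument makes the evaluate-at-most-once behavior visible (twice is an invented name for this sketch):

```r
twice <- function(x) {
  x   # first access: forces the promise, caches the value
  x   # second access: returns the cached value, no re-evaluation
}
twice({ cat("evaluating\n"); 42 })
#> evaluating
#> [1] 42
```

The cat() fires exactly once, even though the function touches x twice.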

A promise has three components:

  1. The expression: what was written (x + 1).
  2. The environment: where to evaluate it (the caller’s environment).
  3. The cached value: filled in after the first evaluation, empty before.

Promises are invisible by design: R has no user-facing “promise” type. You cannot create one manually, inspect one with str(), or test for one with is.promise(). The invisibility is deliberate, because the act of inspecting a promise would force its evaluation, changing program behavior. The abstraction only works because you cannot look behind it.

In the terminology of lambda calculus, this is call-by-need evaluation: the argument is passed unevaluated (like call-by-name) but the result is cached after the first evaluation (unlike call-by-name, which would re-evaluate each time). Lambda calculus distinguishes three evaluation strategies: call-by-value (strict, as in C and Python), call-by-name (non-strict, as in Algol 60), and call-by-need (lazy, as in Haskell). R uses call-by-need for function arguments through promises, though most arguments are forced immediately in practice. The Church–Rosser theorem guarantees that any two terminating reduction sequences reach the same normal form; strategies can differ in whether they terminate at all, and in how much work they do, but never in the value they produce.
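The termination difference is easy to demonstrate in R: an argument that would loop forever under call-by-value is harmless if it is never forced (const42 is an invented name for this sketch):

```r
const42 <- function(x) 42   # never touches x

# Under eager evaluation this call would never return; in R the
# promise for the argument is never forced, so the loop never runs.
const42(while (TRUE) NULL)
#> [1] 42
```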

That difference matters more than it sounds. In an eager language, every argument is fully evaluated before the function sees it — the computation tree grows left to right, arguments resolved before calls. In R, arguments arrive as suspended expressions, unevaluated until the function actually needs them. The tree hangs in place, branches dormant until touched. This is what makes non-standard evaluation possible at all: aes(x = bill_length_mm) works because bill_length_mm stays unevaluated in your environment, captured as a promise, carried into the plotting function, and evaluated later inside the data frame. Data masking, quasiquotation, and formula interfaces all depend on the fact that arguments are not evaluated the moment you pass them. Without lazy evaluation, none of these techniques would work, because the expression would have already been reduced to a value before the function had a chance to inspect it.

What does laziness buy you in practice? Default arguments that would be impossible under eager evaluation:

f <- function(x, y = x * 2) y
f(5)
#> [1] 10

The default for y is a promise containing the expression x * 2. When R needs y’s value, it evaluates x * 2 inside the function body, where x is already bound to 5, yielding 10. Supply y explicitly and the default promise is never touched:

f(5, 99)
#> [1] 99

Defaults can depend on computations that happen inside the function itself, which gets stranger:

g <- function(x, n = length(x)) n
g(c(1, 2, 3))
#> [1] 3

The default n = length(x) is an expression, not a precomputed value; R evaluates it when n is first accessed, by which time x is already bound. If R evaluated defaults at the moment of the call, before the function body had a chance to run, none of this would work. But what happens when laziness interacts with side effects?

Exercises

  1. Predict the output of this code, then run it:

    h <- function(x) {
      cat("about to use x\n")
      x
    }
    h(cat("evaluating argument\n"))

    Which cat() call prints first? Why?

  2. Write a function f(x, y = x + 1) and call f(10). What is y? Now call f(10, 50). What changed?

  3. What happens if you call a function that never uses its argument?

    ignore <- function(x) 42
    ignore(stop("this should error"))

    Does it error? Why or why not?

23.2 Consequences of laziness

R chose a specific point in the design space: function arguments are lazy, everything else is eager. Haskell sits at the other extreme, where everything is lazy (let xs = [1..] creates an infinite list without computing all elements), enabling infinite streams and on-demand computation at the cost of making performance harder to reason about. R’s middle ground gives you NSE and flexible defaults without the mental overhead of tracking which values exist yet and which are still suspended.

Side effects in arguments may never happen. If a function ignores an argument, the promise wrapping it is never forced, which means any side effects buried in the expression (printing, writing files, raising errors) simply vanish:

quiet <- function(x) "I ignore my argument"
quiet(print("you will never see this"))
#> [1] "I ignore my argument"

No output from print(). The promise sat there, inert, and was eventually garbage-collected without ever running. This is not a bug; it’s the logical consequence of evaluate-only-when-needed. But it does mean you cannot rely on argument expressions to produce side effects unless the function actually touches them.

missing() can detect unsupplied arguments. Because arguments start life as unevaluated promises, R can distinguish between “the caller supplied this argument” and “the caller didn’t,” a distinction that would be impossible if all arguments were evaluated before the function saw them:

report <- function(x) {
  if (missing(x)) "not supplied" else "supplied"
}
report()
#> [1] "not supplied"
report(42)
#> [1] "supplied"

This is reliable only in the window before the argument is changed. Once you modify an argument inside the function, missing() no longer gives a trustworthy answer.
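You can watch the window close (probe is an invented name; the behavior matches the caveat documented for missing()):

```r
probe <- function(x) {
  before <- missing(x)
  x <- 0                # modifying the argument clears its "missing" status
  c(before = before, after = missing(x))
}
probe()   # before is TRUE, after is FALSE
```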

Default expressions, not default values. A function like function(x, n = length(x)) does not store a default value for n. It stores a default expression, evaluated lazily inside the function at the moment n is first accessed. Defaults can therefore depend on other arguments, on computations that happen earlier in the function body, or on variables in the function’s enclosing environment. The entire default-argument system is built on promises.
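A default can even reference a variable that only comes into existence partway through the body (halved is an invented, deliberately contrived example):

```r
halved <- function(x, y = total / 2) {
  total <- sum(x)   # total exists before y's default is first forced
  y
}
halved(c(2, 4, 6))
#> [1] 6
```

The default promise for y sits unevaluated until the last line touches it, by which time total is bound.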

The lazy evaluation trap in function factories. If you read Chapter 20, you saw that closures created inside a loop capture the loop variable itself, not its value at creation time:

funs <- list()
for (i in 1:3) {
  funs[[i]] <- function() i
}
funs[[1]]()
#> [1] 3
funs[[2]]()
#> [1] 3

All three functions return 3. Why? Each function() i is a closure over the environment where the loop runs, and all three closures share that single environment; by the time any of them is called, the loop has finished and i is 3. The fix is a function factory: a helper that takes i as an argument, so each closure gets its own execution environment with a private i. But laziness adds a twist. The factory’s argument arrives as a promise, and if the factory never touches it, the promise is not forced until the returned function runs, after the loop has finished. force() evaluates the promise immediately, locking in the current loop value before the next iteration overwrites it:

make_fun <- function(i) {
  force(i)
  function() i
}

funs <- list()
for (i in 1:3) {
  funs[[i]] <- make_fun(i)
}
funs[[1]]()
#> [1] 1
funs[[2]]()
#> [1] 2

The factory gives each closure its own environment with its own i binding; force(i) evaluates the promise immediately, so that binding holds the loop’s current value rather than a suspended reference to the shared loop variable. Without the factory, all three closures share one binding. Without force(), each promise lingers unevaluated until after the loop finishes, and forcing it then finds i already at 3. You need both: the factory for isolation, force() for immediacy.
What is force(x) actually doing? Nothing clever. Its entire definition is function(x) x: accessing the argument triggers evaluation and caching. The name exists purely to communicate intent: “evaluate this now, don’t wait.” Which raises a broader question: if laziness can cause subtle bugs in loops and closures, why does R use it at all?

Exercises

  1. Predict: does quiet(log(-1)) produce a warning? Why or why not? (Use the quiet function defined above.)

  2. Write a function greet(name = "stranger") that returns paste("Hello,", name). Call it with and without an argument. Explain which default mechanism makes this work.

23.3 What is non-standard evaluation?

The previous section ended with a question: if laziness can cause subtle bugs, why does R bother with it? Because arguments travel as unevaluated promises, a function can intercept the expression before R reduces it to a value, and that capability is the foundation of non-standard evaluation.

Standard evaluation is straightforward: R evaluates an expression, produces a value, and hands the value to a function. When you write mean(c(1, 2, 3)), R computes the vector [1] 1 2 3 first, then passes it to mean, and mean never sees the expression c(1, 2, 3) at all.

Now try subset(df, x > 3). If R evaluated x > 3 in your environment before passing it to subset, the call would fail: there is no x in your workspace. Something different is happening here, something that depends on the laziness you’ve been studying. Instead of evaluating the argument, subset reaches into the promise, extracts the expression before evaluation, and then decides for itself where that expression should run:

df <- data.frame(x = 1:5, y = c(10, 20, 30, 40, 50))
subset(df, x > 3)
#>   x  y
#> 4 4 40
#> 5 5 50

x here is a column of df, not a variable in your global environment. subset never evaluates x > 3 in your environment. It captures the expression with substitute(), then evaluates it inside df with eval():

# Simplified subset internals
subset.data.frame <- function(x, subset, ...) {
  expr <- substitute(subset)
  row_mask <- eval(expr, x, parent.frame())
  x[row_mask & !is.na(row_mask), , drop = FALSE]
}

substitute() grabs the unevaluated promise. eval() evaluates it in an environment where the columns of x are visible as variables. That’s why x > 3 finds the column instead of failing with “object ‘x’ not found.”
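The parent.frame() handed to eval() as its enclosure is what lets an expression mix columns with ordinary variables: a name that is not a column falls back to the caller’s environment:

```r
df <- data.frame(x = 1:5, y = c(10, 20, 30, 40, 50))
threshold <- 25          # an ordinary variable, not a column of df
subset(df, y > threshold)
#>   x  y
#> 3 3 30
#> 4 4 40
#> 5 5 50
```

y is found in the data frame, threshold in the calling environment; the two lookups compose because of the enclosure argument.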

This is NSE in its simplest form: intercept the expression, choose where to evaluate it. Compare the two approaches side by side:

# Standard evaluation: explicit, verbose
df[df$x > 3, ]

# Non-standard evaluation: concise, readable
subset(df, x > 3)

NSE is the reason the tidyverse feels like a domain-specific language for data analysis. filter(penguins, species == "Adelie") reads almost like English because you write column names directly, without $ or quotes. The cost is that the rules become less transparent: the meaning of species depends on which function you’re inside, not just on what’s in your environment. That tension between conciseness and predictability runs through everything that follows.

Exercises

  1. Run subset(df, x > 3) with the data frame above, then try df[df$x > 3, ]. Verify they produce the same result. Which is easier to read?

  2. What happens if you define x <- 100 in your global environment and then run subset(df, x > 3) again? Does it use the column or the variable? Why?

23.4 Data masking

Data masking is the tidyverse’s name for NSE applied to data frames, and it follows a specific lookup order. When you write:

library(dplyr)
filter(penguins, species == "Adelie")

R looks for species first in penguins, then in your calling environment. The data masks the environment: if the data frame has a column called species, that column wins over any variable named species in your workspace. Same principle applies to aes(x = bill_length_mm) in ggplot2, which captures the expression and evaluates it against the data at plot time, treating columns as if they were ordinary variables.

The benefit for interactive analysis is enormous. You type column names hundreds of times in a session; not having to write penguins$species or penguins[["species"]] each time keeps your code compact and readable, closer to how you think about the data than to how the computer stores it.

The cost appears the moment you try to program with data-masked functions. Suppose you want to write a function that filters by a column whose name is stored in a variable:

col <- "species"
filter(penguins, col == "Adelie")

This doesn’t do what you want. filter looks for a column named col in penguins, doesn’t find one, then falls back to your environment and finds col there: the string "species". Comparing "species" == "Adelie" is FALSE for every row, so you silently get zero rows back. No error, just wrong results, the most dangerous kind of failure.

To see the mechanism concretely, here is a simplified filter() you could write yourself in about ten lines:

my_filter <- function(.data, expr) {
  e <- substitute(expr)                  # capture the caller's expression
  env <- list2env(.data,                 # columns become bindings ...
                  envir = new.env(parent = parent.frame()))  # ... caller's env as parent
  mask <- eval(e, envir = env)           # evaluate the expression in that environment
  .data[mask & !is.na(mask), , drop = FALSE]
}

df <- data.frame(x = 1:5, y = c(10, 20, 30, 40, 50))
my_filter(df, x > 3)
#>   x  y
#> 4 4 40
#> 5 5 50

substitute(expr) captures the caller’s expression (x > 3) without evaluating it, exactly as a promise holds an unevaluated expression (Section 23.1). list2env() fills a new environment with the columns of the data frame; because that environment’s parent is the caller’s environment (parent.frame()), variables that aren’t columns are still found. eval(e, envir = env) forces the expression in that constructed environment, where x resolves to the column rather than to anything in the global environment. This works when you write my_filter(df, x > 3) directly. Try wrapping it in another function, and the expression gets trapped in the wrong scope. This is what dplyr::filter() does, with more machinery for tidy evaluation, error handling, and grouped data frames layered on top.
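The trap is easy to trigger. Wrap my_filter in another function and substitute() captures the wrapper’s argument name instead of the original expression (a sketch mirroring the simplified my_filter; it assumes no variable named x exists in the calling environment):

```r
my_filter <- function(.data, expr) {
  e <- substitute(expr)
  env <- list2env(.data, envir = new.env(parent = parent.frame()))
  mask <- eval(e, envir = env)
  .data[mask & !is.na(mask), , drop = FALSE]
}

# substitute() now captures the symbol `cond`, not `x > 3`
wrapper <- function(data, cond) my_filter(data, cond)

df <- data.frame(x = 1:5)
try(wrapper(df, x > 3))   # errors: object 'x' not found --
                          # forcing the promise for cond evaluates
                          # x > 3 in the original caller's environment,
                          # where x is not a column
```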

Data masking is optimized for the common case (interactive analysis) at the expense of the less common case (writing reusable functions). The moment you wrap a data-masked function inside another function, the column name is trapped in your wrapper’s scope, and the caller has no way to pass it through. That’s the problem tidy evaluation exists to solve.

Exercises

  1. Define x <- 1000 in your global environment. Create a data frame df <- data.frame(x = 1:5). What does dplyr::filter(df, x > 3) return? Does it use the column or the variable?

  2. Explain in one sentence why aes(x = bill_length_mm) doesn’t need quotes around bill_length_mm.

23.5 Tidy evaluation basics

Data masking works because substitute() captures the promise from the immediate caller. But when you wrap a data-masked function inside your own function, substitute() captures your wrapper’s argument name, not the expression the original caller wrote. Tidy evaluation exists to thread promises through that extra layer.

The embrace operator { } (called “curly-curly”) solves the most common programming-with-NSE problem: passing a column name through your function to a dplyr verb without it getting lost along the way.

library(dplyr)

my_summary <- function(data, var) {
  data |>
    summarise(mean = mean({{ var }}, na.rm = TRUE))
}

{{ var }} says: “take whatever the caller passed as var, capture it as a data-masked expression, and inject it here.” The caller writes column names unquoted, exactly as they would with dplyr directly:

my_summary(penguins, body_mass_g)

Without {{ }}, you’d have to teach your users a different syntax for your wrapper than they use for dplyr itself, which defeats the purpose of wrapping. With it, the wrapper is transparent.

For tidy selection (used in select(), across(), and similar), {{ }} works the same way:

my_select <- function(data, cols) {
  data |> select({{ cols }})
}
my_select(penguins, starts_with("bill"))

The caller passes tidy-select expressions exactly as they would to select() directly. Your function just forwards them.

When {{ }} isn’t enough, there are more tools. .data[["col_name"]] lets you use string column names inside data-masked functions. !! (bang-bang) and enquo() give finer control over quoting and unquoting. These are part of the full tidy evaluation framework in the rlang package, and they matter when you’re building complex programmatic interfaces. But they belong in Chapter 26, not here.
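For instance, a wrapper that receives the column name as a string can use the .data pronoun (a sketch; assumes dplyr is installed, and count_matching is an invented name):

```r
library(dplyr)

count_matching <- function(data, col, value) {
  # .data[[col]] indexes the data mask by string, so col is an
  # ordinary character variable rather than a bare column name
  data |>
    filter(.data[[col]] == value) |>
    nrow()
}

df <- data.frame(species = c("Adelie", "Gentoo", "Adelie"))
count_matching(df, "species", "Adelie")
#> [1] 2
```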

Opinion

For most dplyr wrapper functions, {{ }} is sufficient. The full tidy evaluation system (quosures, enquo(), !!, !!!) exists for cases where you need to build expressions programmatically, splice multiple variables, or mix quoted and unquoted inputs.

Exercises

  1. Write a function my_count(data, group_var) that uses {{ }} to count rows per group. Test it with any data frame.

  2. Write a function my_arrange(data, sort_var) that arranges a data frame by a column passed by the caller. Use {{ }}.

  3. What happens if you forget {{ }} and write summarise(data, mean = mean(var, na.rm = TRUE)) inside a wrapper function? What error do you get?

23.6 The trade-offs of NSE

Promises enable NSE, NSE enables data masking, and {{ }} makes data masking programmable. Each link adds a layer of indirection.

Convenience vs. predictability. filter(df, x > 5) is concise and readable, but if a variable x exists in your environment and a column x exists in your data frame, the column wins silently. When that’s wrong, nothing tells you.

Interactive vs. programmatic. NSE is optimized for the console, where you type select(df, name, age) and it just works. Writing a function that wraps select requires {{ }}, an extra concept that sits between you and the dplyr you already know. The interactive user pays nothing; the programmer pays a tax, and the tax scales with the complexity of the interface you’re building.

Debugging. “Object ‘species’ not found” could be a typo, missing data, or a masking scope problem. The error message is the same in all three cases.

This isn’t a tidyverse invention. Base R uses NSE in subset(), with(), transform(), and the formula interface (lm(y ~ x, data = df)). The tidyverse applies it more systematically and more pervasively, but the idea is as old as S.
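with(), for instance, shows the same promise-capture-and-eval pattern in miniature:

```r
df <- data.frame(x = 1:5, y = c(10, 20, 30, 40, 50))

# x and y resolve to columns of df, not to variables in the workspace
with(df, mean(x + y))
#> [1] 33
```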

Opinion

NSE is the right trade-off for data analysis. You type column names hundreds of times a day, and quoting each one would be painful. The cost falls on the programmer writing reusable functions, not on the analyst exploring data interactively, and that’s a reasonable place for the cost to land since most R users spend more time analyzing than abstracting. Accept the trade-off, learn { } for when you cross the line into programming, and spend your energy on the next question: when you call print(x), how does R decide which print to use?