x <- 42
y <- "hello"
ls()
#> [1] "x" "y"
18 Closures and scope
When you call a function, R needs to find every name you use inside it. The rules for how it searches are simple and general. Section 5.5 introduced the basics: local variables live inside the function, and a function can see variables from the environment where it was defined. This chapter makes those ideas precise and takes them further, to the point where functions can remember things between calls.
18.1 Environments
An environment is R’s internal data structure for keeping track of which names refer to which values. Think of it as a named bag: each name in the bag points to one object. Every environment has a parent, forming a chain that R walks when it needs to resolve a name. The empty environment, at the very top of the chain, has no parent; everything else does.
You can inspect the current environment with environment(), list its contents with ls(), and find its parent with parent.env():
environment()
#> <environment: R_GlobalEnv>
18.1.1 The global environment
The global environment (.GlobalEnv, or equivalently globalenv()) is where your interactive work lives. Every time you type x <- 42 at the console, you create a binding in this environment. It is the bottom of the search path, the starting point for name resolution, and the default enclosing environment for any function you define interactively.
identical(environment(), globalenv())
#> [1] TRUE
The global environment is special in three ways. First, it never gets garbage-collected; it lives for the entire R session. Second, it has no fixed parent: its parent is the last package you attached with library(). Third, it is the only environment you routinely modify by hand. Functions create and destroy execution environments automatically (Section 18.3); the global environment accumulates bindings as you work.
This accumulation is both convenient and dangerous. Convenient because you can explore interactively, building up data objects step by step. Dangerous because any function you define in the global environment can see and depend on those objects. A function that works in your current session may break in a fresh one because it relied on a global variable you forgot to pass as an argument. This is the free variable problem from lambda calculus: a function with unbound names is an open term, and its behavior depends on the context where it runs. Making all dependencies explicit (passing them as arguments) turns it into a closed term, a closure in the formal sense, and makes it portable.
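The difference between an open and a closed function can be sketched in a few lines. The names tax_rate, with_tax_open, and with_tax_closed are hypothetical, and rebinding the enclosing environment is only a rough stand-in for starting a fresh session:

```r
# Hypothetical names for illustration; none of these come from the text.
tax_rate <- 0.2

# Open: tax_rate is a free variable, resolved outside the function.
with_tax_open <- function(price) price * (1 + tax_rate)

# Closed: every dependency arrives as an argument.
with_tax_closed <- function(price, tax_rate) price * (1 + tax_rate)

# Roughly simulate a fresh session: give a copy of the open function an
# enclosing environment that can see base R but not the global environment.
isolated <- with_tax_open
environment(isolated) <- new.env(parent = baseenv())

open_result <- tryCatch(
  isolated(100),
  error = function(e) "error: tax_rate not found"
)
closed_result <- with_tax_closed(100, tax_rate = 0.2)

open_result    # "error: tax_rate not found"
closed_result  # 120
```

The closed version gives the same answer wherever it runs, because nothing it needs lives outside its argument list.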
18.1.2 The search path
When you type x at the console, R looks in the global environment first; if it doesn’t find a match, it moves to the parent, then the parent’s parent, walking up the chain until it either finds the name or runs out of environments. This chain is the search path, and you can see it with search():
search()
#> [1] ".GlobalEnv" "package:stats" "package:graphics"
#> [4] "package:grDevices" "package:utils" "package:datasets"
#> [7] "package:methods" "Autoloads" "package:base"
The global environment sits at the bottom, with attached packages stacked above it and package:base near the top. This is why you can call mean() without writing base::mean(): R walks the chain, finds mean in the base package, and uses it.
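The same chain can be walked by hand with parent.env(). The helper below is a minimal sketch (env_chain is a hypothetical name, not a base R function); it starts at the global environment and follows parents until it reaches the empty environment:

```r
# env_chain() is a hypothetical helper written for this illustration.
env_chain <- function(env = globalenv()) {
  names <- character()
  repeat {
    names <- c(names, environmentName(env))
    # The empty environment has no parent, so stop there.
    if (identical(env, emptyenv())) break
    env <- parent.env(env)
  }
  names
}

chain <- env_chain()
chain[1]              # "R_GlobalEnv"
chain[length(chain)]  # "R_EmptyEnv"
```

The entries in between depend on which packages are attached in your session, so they are not shown here. They follow the same order as search(), with the empty environment as one extra step at the end.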
Exercises
- Run ls() in a fresh R session. What do you see? Now assign a <- 1 and run ls() again.
- Run search() and count how many environments are in the chain. Load a package with library(tools) and run search() again. Where did the new package appear?
- What does parent.env(globalenv()) return? What about parent.env(baseenv())?
18.2 Lexical scoping
R uses lexical scoping: a function looks for names where it was defined, not where it was called. This choice traces back to Scheme (1975), which adopted lexical scoping over Lisp’s dynamic scoping and changed programming history. Dynamic scoping searches the call stack, which is simpler to implement but makes functions unpredictable: the same function returns different results depending on who calls it. R inherited lexical scoping from Scheme, not from S (which resolves free variables among global variables rather than in the environment where the function was created; see Section 2.3), and this one design decision is why closures work in R at all.
x <- 10
f <- function() x
g <- function() { x <- 20; f() }
g()
#> [1] 10
g() returns 10, not 20. Because f was defined in the global environment, it looks for x there, not inside g where it was called. The x <- 20 inside g is invisible to f. What matters is where the function was written in the source code (the definition site), not where it happens to be called at runtime.
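To see what dynamic scoping would have returned, you can imitate it by looking a name up in the caller’s frame with parent.frame(). This is only an emulation for comparison, not how R normally resolves names, and f_lexical, f_dynamic, and caller are hypothetical names:

```r
x <- 10

# Normal R behaviour: x is resolved where the function was defined.
f_lexical <- function() x

# Emulated dynamic scoping: start the lookup in the caller's frame instead.
f_dynamic <- function() get("x", envir = parent.frame())

caller <- function() {
  x <- 20
  lex <- f_lexical()
  dyn <- f_dynamic()
  list(lexical = lex, dynamic = dyn)
}

result <- caller()
result$lexical  # 10
result$dynamic  # 20
```

The emulated version changes its answer depending on who calls it, which is the unpredictability that lexical scoping avoids.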
Four rules govern how R resolves names:
1. Name masking. Local names shadow names in parent environments. If a function defines x, that x hides any x in the global environment.
2. Functions vs variables. R distinguishes function lookups from variable lookups. If you call f(3), R searches for f but skips non-function values. This means you can have a variable c <- 10 and still call c(1, 2, 3) to create a vector, because R knows you want the function.
3. Fresh start. Every function call gets a fresh execution environment. Variables from a previous call don’t carry over.
4. Dynamic lookup. R looks up names when the function runs, not when it’s defined. If a function uses a free variable, its value can change between calls:
multiplier <- 2
scale <- function(x) x * multiplier
scale(5)
#> [1] 10
multiplier <- 10
scale(5)
#> [1] 50
scale doesn’t snapshot the value of multiplier when it’s defined; it looks it up fresh each time it runs. This means the function adapts automatically if multiplier changes, which is occasionally useful but more often a source of bugs: someone modifies a global variable, and a seemingly unrelated function starts returning different results.
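One way to avoid this kind of surprise is to hand the value to a factory function, so that the returned function looks it up in a private environment instead of the global one. The sketch below assumes a hypothetical make_scaler; the mechanism behind it is the subject of Section 18.4:

```r
# make_scaler() is a hypothetical factory written for this illustration.
make_scaler <- function(multiplier) {
  force(multiplier)  # evaluate the argument now, not at first use
  function(x) x * multiplier
}

multiplier <- 2
double_it <- make_scaler(multiplier)

multiplier <- 10  # later changes to the global no longer matter
double_it(5)      # 10
```

Rule 4 still applies: double_it looks up multiplier every time it runs. The difference is that the lookup now succeeds in the factory’s own environment, which nothing else modifies.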
Exercises
1. Predict the output before running:
   x <- 1
   f <- function() {
     x <- 2
     g <- function() x
     g()
   }
   f()
2. Predict the output:
   x <- 1
   f <- function() {
     g <- function() x
     x <- 2
     g()
   }
   f()
   Why is the result different from what you might expect? (Hint: rule 4.)
3. Can you have a variable named mean and still call the function mean()? Try it.
18.3 Execution environments
Every time you call a function, R creates a new environment, the execution environment, with the function’s arguments and local variables as bindings.
f <- function() {
n <- 0
n <- n + 1
n
}
f()
#> [1] 1
f()
#> [1] 1
f()
#> [1] 1
Call f() ten times and you always get 1, because each call gets its own execution environment with its own n. The previous call’s n doesn’t carry over (rule 3).
The parent of this fresh execution environment is not the environment where the function was called; it’s the environment where the function was defined (the enclosing environment). This is what makes scoping lexical: the parent chain is determined by the structure of the source code, not by which function happened to call which at runtime.
x <- "global"
outer <- function() {
x <- "outer"
inner <- function() x
inner()
}
outer()
#> [1] "outer"
inner was defined inside outer, so its enclosing environment is outer’s execution environment. When inner looks for x, it finds "outer", not "global".
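You can check the two relationships directly. In the sketch below (where_am_i and caller are hypothetical names), parent.env() gives the parent of the execution environment, and parent.frame() gives the environment the call came from:

```r
# Hypothetical helper: reports on its own execution environment.
where_am_i <- function() {
  exec_env <- environment()
  list(
    # Is the parent of the execution environment the enclosing environment?
    parent_is_enclosure = identical(parent.env(exec_env),
                                    environment(where_am_i)),
    # The environment this call was made from.
    calling_env = parent.frame()
  )
}

caller <- function() {
  info <- where_am_i()
  list(
    parent_is_enclosure = info$parent_is_enclosure,
    called_from_here = identical(info$calling_env, environment())
  )
}

result <- caller()
result$parent_is_enclosure  # TRUE
result$called_from_here     # TRUE
```

Both are TRUE at once: the call came from inside caller, yet the parent used for name lookup is the environment where where_am_i was defined.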
Normally, local variables live and die with the call: when the function returns, its execution environment gets garbage-collected. But if the function returns another function that was defined inside it, that returned function holds a reference to the execution environment, keeping it alive. This is a closure.
Exercises
1. Write a function fresh() that creates a local variable n <- 0, increments it, and returns it. Call it three times and verify you always get 1.
2. Predict the output:
   make <- function() {
     a <- 1
     function() a
   }
   h <- make()
   a <- 99
   h()
18.4 Closures
All R functions are technically closures (they all have an enclosing environment), but the term is most useful when a function captures variables from a parent that isn’t the global environment. The classic example is a counter:
make_counter <- function() {
n <- 0
function() {
n <<- n + 1
n
}
}
count <- make_counter()
count()
#> [1] 1
count()
#> [1] 2
count()
#> [1] 3
When you call make_counter(), R creates an execution environment with n <- 0. The inner function is defined in that environment, making it the inner function’s enclosing environment. When make_counter returns the inner function, the execution environment would normally be garbage-collected, but the returned function still points to it, so it survives. Each subsequent call to count() creates its own fresh execution environment, but the enclosing environment (the one from make_counter that holds n) is shared across all calls. That’s how count remembers its state between invocations.
<<- is the super-assignment operator. Instead of creating a local binding, it searches parent environments for an existing binding named n and modifies it in place. Without <<-, writing n <- n + 1 would create a local n in the execution environment, shadowing the captured one, and the counter would always return 1.
Two counters are independent. Each call to make_counter() creates a separate execution environment with its own n:
a <- make_counter()
b <- make_counter()
a()
#> [1] 1
a()
#> [1] 2
a()
#> [1] 3
b()
#> [1] 1
a has counted to 3; b has counted to 1. They don’t share state.
Connection to Section 7.4: make_adder was a preview. Now you understand why the returned function remembers n: the mechanism is an environment that stays alive because something still points to it.
This is also exactly what happens in lambda calculus. In make_adder(5), the returned function \(x) x + n has n as a free variable: a name that is used but not defined inside the function. The closure captures n = 5, binding that free variable. In lambda notation, make_adder is (λn. λx. x + n). Applying it to 5 gives λx. x + 5 by substitution (beta reduction). The free variable n gets replaced by a concrete value. Closures are the runtime mechanism that makes this substitution real. The name itself comes from this idea: a closure closes over its free variables, turning an open term (one with unresolved names) into a closed one (where every variable is accounted for).
Every closure you create in R can be read as a beta reduction frozen mid-step. make_adder(5) reduces (λn. λx. x + n) to λx. x + 5, but the returned function doesn’t evaluate further until you give it an x. The closure holds the partially reduced term, waiting. When you finally call add5(3), another beta reduction fires: (λx. x + 5)(3) becomes 3 + 5, and R evaluates it to 8. Two reductions, two calls, one answer. R does not literally rewrite the function body, though: it reaches the same result by looking up the captured binding n = 5 in the enclosing environment each time the function runs.
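The two reductions can be traced in code. The definition of make_adder below is a reconstruction that assumes the Section 7.4 version has this shape; the force() call is an addition for safety and may not appear in the original:

```r
# Assumed shape of make_adder from Section 7.4 (reconstructed here).
make_adder <- function(n) {
  force(n)           # evaluate n now so the captured value is fixed
  function(x) x + n  # n is a free variable of the inner function
}

add5 <- make_adder(5)  # first reduction: n is bound to 5
add5(3)                # second reduction: 3 + 5, which is 8

# The "substituted" value lives in the enclosing environment:
environment(add5)$n    # 5
```

The body of add5 still reads x + n; the 5 is stored in the environment rather than written into the code.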
Exercises
- Create a counter with make_counter. Call it five times. Then inspect the captured n with environment(count)$n.
- Modify make_counter to accept a starting value: make_counter <- function(start = 0) { ... }. Verify that make_counter(10) starts counting from 11.
- Write make_countdown(n) that counts down from n. Each call returns the next value. What happens when it reaches 0?
18.5 <<- and mutable state
<<- is the most direct way to modify captured state from inside a function. It searches parent environments for an existing binding and modifies it. If it doesn’t find the variable in any parent, it creates one in the global environment. That’s almost always a mistake.
rm(oops, envir = globalenv()) # clean slate
#> Warning in rm(oops, envir = globalenv()): object 'oops' not found
f <- function() {
oops <<- "surprise"
}
f()
oops
#> [1] "surprise"
oops was never defined, so <<- created it in the global environment. This is a side effect, invisible at the call site, and exactly the kind of hidden dependency that makes code hard to debug.
Use <<- inside closures, never in scripts. If you’re using <<- to modify a global variable, you’re writing a bug you haven’t found yet. The legitimate use is closures that encapsulate state: counters, caches, accumulators. The state is private to the closure, invisible to the outside world.
Inside a closure, <<- is safe because the variable it modifies lives in the closure’s private environment, not in the global environment. Nobody outside the closure can see it or change it (unless they deliberately reach into the environment with environment(f)$n, which is an explicit choice, not an accident).
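A short sketch makes the search order concrete. Here a global n and a captured n exist at the same time (make_setter and set_n are hypothetical names); <<- stops at the first binding it finds while walking up from the closure, so the global one is never touched:

```r
n <- "global n"

# Hypothetical factory: the returned function overwrites the captured n.
make_setter <- function() {
  n <- "captured n"
  function(value) {
    n <<- value  # finds n in the enclosing environment first
    invisible(NULL)
  }
}

set_n <- make_setter()
set_n("updated")

environment(set_n)$n  # "updated"
n                     # "global n"
```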
rm(oops, envir = globalenv())
Exercises
- Write a closure make_accumulator(start) that returns a function. Each call adds its argument to a running total and returns the new total. Test: acc <- make_accumulator(0); acc(5); acc(3); acc(10) should return 5, 8, 18.
- What happens if you use <- instead of <<- inside the closure? Try it with the counter example.
18.6 Closures as portable state
A closure bundles a function with its data. No global variables, no side effects visible from outside. This makes closures one of the most practical tools in R.
A running mean:
make_running_mean <- function() {
total <- 0
count <- 0
function(x) {
total <<- total + x
count <<- count + 1
total / count
}
}
avg <- make_running_mean()
avg(10)
#> [1] 10
avg(20)
#> [1] 15
avg(6)
#> [1] 12
The pattern is always the same: a factory function creates an environment, returns an inner function that captures it, and <<- lets the inner function modify the captured state. Because the state lives in a private environment that travels with the function, no outside code can see or tamper with it, and it persists across calls.
Compare this to the global-variable approach:
# Don't do this
total <- 0
count <- 0
running_mean <- function(x) {
total <<- total + x
count <<- count + 1
total / count
}
This version pollutes the global environment with total and count. Any other code can read or modify them. You can’t have two independent running means. And if you forget to reset them, your next analysis starts with stale state. The closure version has none of these problems.
Practical uses for closures go beyond counters and accumulators:
- Function factories (Chapter 20): parameterized families of functions. make_adder, make_multiplier, and their real-world cousins in ggplot2 themes, statistical tests, and data transformations.
- Memoization: cache expensive results so they’re computed only once.
- Callbacks and event handlers: carry context without globals, common in Shiny applications.
- Encapsulated modules: a list of closures sharing a private environment.
That last pattern deserves a concrete example. A list of closures sharing an environment is functionally equivalent to an object with methods and private fields:
make_bank_account <- function(balance = 0) {
list(
deposit = function(amount) {
balance <<- balance + amount
invisible(balance)
},
withdraw = function(amount) {
balance <<- balance - amount
invisible(balance)
},
check = function() balance
)
}
acct <- make_bank_account(100)
acct$deposit(50)
acct$withdraw(30)
acct$check()
#> [1] 120
Three closures share a single environment containing one private variable, balance. From the outside, acct behaves like an object with methods; from the inside, it’s just functions closing over a shared environment. An aphorism that circulates in the Scheme community puts it this way: “closures are poor man’s objects, and objects are poor man’s closures.” R6 builds on the same idea: each R6 object is an environment, and its methods are functions that reach the object’s fields through their enclosing environment.
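Memoization, listed among the practical uses above, follows the same factory pattern. The sketch below is deliberately minimal and rests on stated assumptions: memoize is a hypothetical helper, it handles single-argument functions only, it keys the cache on as.character(x), and it cannot cache a NULL result. The misses counter exists only to show that the cache is working:

```r
# memoize() is a hypothetical helper, not a base R function.
memoize <- function(f) {
  cache <- list()  # private to the returned closure
  misses <- 0      # how many times f actually ran
  function(x) {
    key <- as.character(x)
    if (is.null(cache[[key]])) {
      misses <<- misses + 1
      cache[[key]] <<- f(x)  # store the result in the captured list
    }
    cache[[key]]
  }
}

square <- memoize(function(x) x^2)
square(4)  # 16, computed
square(4)  # 16, served from the cache
square(5)  # 25, computed

environment(square)$misses  # 2
```

Three calls, two computations: the cache and the counter both live in the closure’s environment, with no globals involved.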
Exercises
- Build a make_running_mean closure. Feed it the values 4, 8, 12. Verify the running mean is 4, 6, 8.
- Create two independent running means, avg1 and avg2. Feed different values to each and verify they don’t interfere.
- Extend make_bank_account with a statement function that returns a character vector of all transactions (deposit or withdrawal). You’ll need a history variable in the enclosing environment.
18.7 Inspecting environments
Closures are easier to understand when you can look inside them. environment(f) returns the enclosing environment of a function. ls() lists what’s in it. And you can access captured variables directly with $:
count <- make_counter()
count()
#> [1] 1
count()
#> [1] 2
environment(count)
#> <environment: 0x5601590950b8>
ls(environment(count))
#> [1] "n"
environment(count)$n
#> [1] 2
The counter has been called twice, so n is 2. You can watch it change in real time:
count()
#> [1] 3
environment(count)$n
#> [1] 3
Now n is 3, incremented by the call.
For a richer view, rlang::env_print() shows the environment’s contents, parent, and memory address:
rlang::env_print(environment(count))
These tools are not for production code (reaching into a closure’s environment breaks its encapsulation), but for learning they’re invaluable. The cycle of “make a closure, inspect its environment, call it, inspect again” is the fastest way to build intuition for how closures actually work under the hood.
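If rlang is not installed, base R’s mget() can take a snapshot of several captured variables at once. The sketch below repeats make_running_mean from Section 18.6 so that it runs on its own:

```r
# Repeated from Section 18.6 so this block is self-contained.
make_running_mean <- function() {
  total <- 0
  count <- 0
  function(x) {
    total <<- total + x
    count <<- count + 1
    total / count
  }
}

avg <- make_running_mean()
avg(10)
avg(20)

# Copy the captured variables out of the enclosing environment.
state <- mget(c("total", "count"), envir = environment(avg))
state$total  # 30
state$count  # 2
```

The snapshot is a plain list holding copies of the values, so changing it does not affect the closure’s state.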
Exercises
- Create a running mean with make_running_mean. Feed it three values. Then inspect environment(avg)$total and environment(avg)$count to verify the internal state.
- Create two counters. Inspect their environments and confirm they have different n values after calling them different numbers of times.
- What does environment(mean) return? Why is it different from environment(count)?