10 Lists
Vectors are strict: every element must be the same type. If you combine a number and a string, R coerces one to match the other (Section 4.3). This works well for columns of data, but not everything fits into a single type. A penguin has a species (character), a body mass (double), and a vector of measurements (double). You need a container that holds different types together without coercing them. That container is a list.
10.1 Why lists exist
A vector is homogeneous, but a list is heterogeneous: each element can be a number, a string, a vector, a function, or another list.
penguin <- list(
  species = "Adelie",
  mass = 3750,
  measurements = c(39.1, 18.7, 181)
)
penguin
#> $species
#> [1] "Adelie"
#>
#> $mass
#> [1] 3750
#>
#> $measurements
#> [1]  39.1  18.7 181.0
species is a character scalar, mass is a numeric scalar, measurements is a numeric vector of length 3. A vector could not hold these together without flattening and coercing them. A list keeps each element intact.
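To see the flattening and coercion concretely, here is a small sketch that passes the same three values to c() instead of list(). The variable name flat is only for illustration.

```r
# Combine the penguin values with c() instead of list().
flat <- c("Adelie", 3750, c(39.1, 18.7, 181))

typeof(flat)   # every element has been coerced to a single type
#> [1] "character"
length(flat)   # the inner vector was flattened into the outer one
#> [1] 5
```

The numbers survive only as strings, and the boundary between mass and measurements is lost.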
Technically, a list is a recursive vector. “Recursive” because its elements can themselves be lists, allowing arbitrary nesting. R’s documentation calls lists “generic vectors” to distinguish them from atomic vectors (Section 4.2). The type is "list":
typeof(penguin)
#> [1] "list"
You already used a list in Section 7.3 when you stored functions as named elements. Lists are the same structure whether they hold numbers, strings, or functions.
The idea of pairing two values into one object has deep roots. In lambda calculus, a pair is encoded as λa.λb.λf. f a b: a function that captures two values and waits for a selector. Pass it λx.λy. x (select first) and you get a; pass λx.λy. y (select second) and you get b. Lisp’s cons, car, and cdr are exactly these operations with names. R’s list() is the practical descendant of this construction: a container built from the idea that you can bundle values together and retrieve them with selectors ([[1]], [[2]], $name).
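The lambda-calculus pair can be written almost literally with R closures. This is a sketch for illustration only; the names pair, first, and second are hypothetical and not part of base R.

```r
# A pair as a closure: it captures a and b and waits for a selector f.
pair <- function(a, b) {
  force(a); force(b)   # evaluate now, so the closure stores values, not promises
  function(f) f(a, b)
}

# Selectors: each receives both values and returns one of them.
first  <- function(x, y) x
second <- function(x, y) y

p <- pair("Adelie", 3750)
p(first)
#> [1] "Adelie"
p(second)
#> [1] 3750
```

Nothing here is a list: the two values live only in the closure's environment, and the selector decides which one comes back.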
Exercises
- Create a list with your name (character), your age (numeric), and your three favourite colours (a character vector). Print it.
- What does typeof(list(1, "a", TRUE)) return?
- What happens if you use c() instead of list() to combine 1, "a", and TRUE? Why is the result different?
10.2 Creating and accessing lists
10.2.1 Creating lists
list() creates a list. Elements can be named or unnamed:
named <- list(a = 1, b = "hello", c = TRUE)
unnamed <- list(1, "hello", TRUE)
Named elements are easier to work with. Unnamed elements are accessed by position only.
10.2.2 The train analogy
Think of a list as a train. Each carriage holds cargo. There are three ways to access it:
- x[1] returns the first carriage, still attached to the train. The result is a list of length 1.
- x[[1]] opens the first carriage and pulls out what’s inside. The result is the element itself.
- x$name is shorthand for x[["name"]].
x <- list(a = 10, b = c(1, 2, 3), c = "hello")
[ returns a sub-list:
x[1]
#> $a
#> [1] 10
typeof(x[1])
#> [1] "list"
The result is still a list, just a shorter one.
[[ extracts the element:
x[[1]]
#> [1] 10
typeof(x[[1]])
#> [1] "double"
Now it’s the number 10, not a list containing 10. This is the most common source of confusion with lists: [ keeps the container, [[ removes it.
$ works with names:
x$b
#> [1] 1 2 3
x[["b"]]
#> [1] 1 2 3
Both return the same thing. $ is convenient for interactive use; [[ is necessary when the name is stored in a variable:
key <- "b"
x[[key]]
#> [1] 1 2 3
x$key
#> NULL
x$key looks for a literal element named "key", not for the value stored in the variable key. Use [[ when the name is computed.
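A typical case of a computed name is a loop over names(x). The sketch below rebuilds the same x and prints the length of each element; the loop variable nm is just an illustrative name.

```r
x <- list(a = 10, b = c(1, 2, 3), c = "hello")

# nm holds a different name on each iteration, so only [[ can follow it.
for (nm in names(x)) {
  cat(nm, ": ", length(x[[nm]]), "\n", sep = "")
}
#> a: 1
#> b: 3
#> c: 1
```

Writing x$nm inside the loop would look for an element literally called "nm" and return NULL every time.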
You can select multiple elements with [:
x[c(1, 3)]
#> $a
#> [1] 10
#>
#> $c
#> [1] "hello"
x[c("a", "c")]
#> $a
#> [1] 10
#>
#> $c
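#> [1] "hello"
But [[ extracts exactly one element at a time; it cannot select several elements at once. (If you pass it a vector, it indexes recursively: x[[c(2, 3)]] means x[[2]][[3]], not elements 2 and 3.)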
Default to [[ and $ for extracting elements. Use [ only when you need a sub-list. If you find yourself writing x[1][[1]], you wanted x[[1]] all along.
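The difference matters as soon as you compute with the result. A short sketch, rebuilding the same x; the variable result and the "error" placeholder string are only for illustration.

```r
x <- list(a = 10, b = c(1, 2, 3), c = "hello")

# Both expressions reach the same value; the second takes a detour
# through a list of length 1.
identical(x[[1]], x[1][[1]])
#> [1] TRUE

x[[1]] + 1   # arithmetic works on the extracted number
#> [1] 11

# x[1] is still a list, so the same arithmetic raises an error.
result <- tryCatch(x[1] + 1, error = function(e) "error")
result
#> [1] "error"
```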
Exercises
- Given x <- list(a = 10, b = 20, c = 30), predict the output of x[2], x[[2]], and x$b before running them.
- What is typeof(x[1]) versus typeof(x[[1]])? Explain the difference.
- Create a variable name <- "c". Use it to extract the element "c" from the list x. Which accessor works?
10.3 Nested lists
A list element can be another list. This creates nested structures:
study <- list(
  site = "Palmer Station",
  years = 2007:2009,
  species = list(
    list(name = "Adelie", count = 152),
    list(name = "Gentoo", count = 124),
    list(name = "Chinstrap", count = 68)
  )
)
str() shows the tree:
str(study)
#> List of 3
#> $ site : chr "Palmer Station"
#> $ years : int [1:3] 2007 2008 2009
#> $ species:List of 3
#> ..$ :List of 2
#> .. ..$ name : chr "Adelie"
#> .. ..$ count: num 152
#> ..$ :List of 2
#> .. ..$ name : chr "Gentoo"
#> .. ..$ count: num 124
#> ..$ :List of 2
#> .. ..$ name : chr "Chinstrap"
#> .. ..$ count: num 68
str() is the most useful function for inspecting lists. It prints the structure compactly, showing types, lengths, and nesting levels. Use it whenever you receive an unfamiliar object.
Every nested list is, in fact, a tree: the top-level list is the root, each element is a branch, and atomic values are leaves. The same structure appears in file systems (directories containing files and subdirectories), HTML (the DOM), and JSON.
To access nested elements, chain [[ or $:
study$species[[1]]$name
#> [1] "Adelie"
study[["species"]][[2]][["count"]]
#> [1] 124
Each [[ or $ steps one level deeper. study$species is a list of three lists. study$species[[1]] is the first of those lists. study$species[[1]]$name is the string "Adelie".
You will encounter nested lists constantly in practice: JSON parsed from a web API, the output of lm(), and configuration files read into R are all nested lists or objects built on them. The access pattern is always the same: $ or [[ to step down one level, repeated as many times as needed.
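When every entry at one level has the same shape, the same step-down can be repeated across all of them. The sketch below rebuilds the study list and uses vapply() purely as a compact way to repeat one extraction; species_names and counts are illustrative names.

```r
study <- list(
  site = "Palmer Station",
  years = 2007:2009,
  species = list(
    list(name = "Adelie", count = 152),
    list(name = "Gentoo", count = 124),
    list(name = "Chinstrap", count = 68)
  )
)

# Step down to the species list, then pull the same field from each entry.
species_names <- vapply(study$species, function(s) s$name, character(1))
species_names
#> [1] "Adelie"    "Gentoo"    "Chinstrap"

counts <- vapply(study$species, function(s) s$count, numeric(1))
sum(counts)
#> [1] 344
```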
Exercises
- Given the study list above, extract the count for Chinstrap penguins.
- Use str() on the result of lm(mpg ~ wt, data = mtcars). How many top-level elements does the model object have?
- Create a nested list representing a book: title, author, and a list of chapters (each with a number and a title). Extract the title of the second chapter.
10.4 Lists as the backbone of R
Lists are not just a data structure you use directly. They are the foundation that other structures are built on.
A data frame is a list:
df <- data.frame(x = 1:3, y = c("a", "b", "c"))
typeof(df)
#> [1] "list"
is.list(df)
#> [1] TRUE
Each column is one element of the list. The data frame adds a constraint: all columns must have the same length. But underneath, df$x works exactly like list access, because it is list access.
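The list accessors from Section 10.2 therefore carry over unchanged. A brief sketch on the same data frame (stringsAsFactors = FALSE is set explicitly here so the result does not depend on the R version):

```r
df <- data.frame(x = 1:3, y = c("a", "b", "c"), stringsAsFactors = FALSE)

names(df)                    # names() works as on any named list
#> [1] "x" "y"
df[["y"]]                    # [[ extracts a column, like any list element
#> [1] "a" "b" "c"
identical(df$x, df[["x"]])   # $ and [[ reach the same column
#> [1] TRUE

# Apply a function to each element, that is, to each column.
unname(vapply(df, typeof, character(1)))
#> [1] "integer"   "character"
```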
A linear model is a list:
fit <- lm(mpg ~ wt, data = mtcars)
typeof(fit)
#> [1] "list"
names(fit)
#> [1] "coefficients" "residuals" "effects" "rank"
#> [5] "fitted.values" "assign" "qr" "df.residual"
#> [9] "xlevels" "call" "terms" "model"
fit$coefficients, fit$residuals, fit$fitted.values: these are all list extractions. The model object is a list with a class attribute ("lm") that tells R how to print and summarize it, but the data access works the same way.
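A quick check, sketched on the same model, that the class sits on top of an ordinary list and that the coef() helper returns what direct list extraction returns for this fit:

```r
fit <- lm(mpg ~ wt, data = mtcars)

class(fit)      # the class attribute layered on the list
#> [1] "lm"
is.list(fit)
#> [1] TRUE

# The extractor function and plain list access agree.
identical(coef(fit), fit[["coefficients"]])
#> [1] TRUE
names(fit$coefficients)
#> [1] "(Intercept)" "wt"
```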
Understanding lists means understanding everything built on top of them. Data frames (Chapter 11), model objects, and environments are all lists (or list-like structures) with different rules about what they can contain. The next chapter shows how data frames exploit this structure.
Exercises
- Run typeof() on a data frame you create. Then run is.list() on it. What do you conclude?
- Fit a model with lm(Sepal.Length ~ Petal.Length, data = iris). Use names() to see its elements, then extract the R-squared from summary() of the fit. (Hint: str(summary(fit)) will help.)
- What does length() return for a data frame with 5 columns and 100 rows? Why?
10.5 Modifying lists
Add an element by assigning to a new name:
x <- list(a = 1, b = 2)
x$c <- 3
x[["d"]] <- "new"
str(x)
#> List of 4
#> $ a: num 1
#> $ b: num 2
#> $ c: num 3
#> $ d: chr "new"
Remove an element by setting it to NULL:
x$b <- NULL
str(x)
#> List of 3
#> $ a: num 1
#> $ c: num 3
#> $ d: chr "new"
b is gone, not set to NULL. This is a common trap: assigning NULL to a list element deletes it. If you actually want to store NULL as a value, use x["e"] <- list(NULL):
x["e"] <- list(NULL)
str(x)
#> List of 4
#> $ a: num 1
#> $ c: num 3
#> $ d: chr "new"
#> $ e: NULL
e exists and its value is NULL. Single-bracket assignment with list(NULL) is the standard way to store an actual NULL in an existing list. (You can also supply one when the list is created, as in list(e = NULL).)
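One practical consequence: extracting a stored NULL and extracting a name that does not exist both give NULL, so is.null() cannot tell them apart. A small sketch, using a made-up absent name zzz:

```r
x <- list(a = 1)
x["e"] <- list(NULL)   # store NULL under the name "e"

# A stored NULL and an absent element both come back as NULL ...
is.null(x$e)
#> [1] TRUE
is.null(x$zzz)
#> [1] TRUE

# ... so test for the name itself to tell them apart.
"e" %in% names(x)
#> [1] TRUE
"zzz" %in% names(x)
#> [1] FALSE
length(x)
#> [1] 2
```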
Replace an element by assigning to an existing name:
x$a <- 100
x$a
#> [1] 100
R uses copy-on-modify for lists, the same as for vectors. When you modify a list, R copies only the parts that change, not the entire structure. For practical purposes, you can treat assignment as modifying in place; the copying is an implementation detail that rarely affects your code.
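The visible effect of copy-on-modify is that two names never interfere with each other. A minimal sketch, with y as an illustrative second name for the same list:

```r
x <- list(a = 1, b = 2)
y <- x          # y and x refer to the same list at this point
y$a <- 100      # modifying y makes R copy what it needs; x is left alone

x$a
#> [1] 1
y$a
#> [1] 100
```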
Exercises
- Create a list with elements x = 1 and y = 2. Add an element z = 3, then delete x. Print the result.
- What does length() return after you delete an element from a list?
- Try x$a <- NULL on a list where a exists. Then try x["a"] <- list(NULL). What is the difference?
10.6 Linked lists
A linked list is the simplest recursive data structure: each node holds a value and a pointer to the next node. The chain ends with NULL (the empty list). You can build one in R using nested lists:
cons <- function(head, tail) list(head = head, tail = tail)
car <- function(lst) lst$head
cdr <- function(lst) lst$tail
cons constructs a node. car extracts the first element. cdr extracts the rest. These names come from Lisp (1958), where car and cdr referred to the address and decrement parts of a machine word on the IBM 704.
Build a linked list of three elements:
ll <- cons(1, cons(2, cons(3, NULL)))
str(ll)
#> List of 2
#> $ head: num 1
#> $ tail:List of 2
#> ..$ head: num 2
#> ..$ tail:List of 2
#> .. ..$ head: num 3
#> .. ..$ tail: NULL
To traverse it, recurse until you hit NULL:
ll_to_vector <- function(lst) {
  if (is.null(lst)) return(c())
  c(car(lst), ll_to_vector(cdr(lst)))
}
ll_to_vector(ll)
#> [1] 1 2 3
This structure mirrors the Church encoding of pairs. In lambda calculus, a pair is λh.λt.λf. f h t: a function that captures two values and waits for a selector. car passes a selector that returns the first value; cdr passes one that returns the second. A linked list is a chain of such pairs, terminated by a special “nil” value. Lisp's lists follow the same pair-and-nil design, and cons/car/cdr remain the standard names for these operations in the Lisp family of languages.
Why doesn’t R use linked lists for everyday work? R was designed for column-oriented numerical computing, where you need to pass entire columns to C routines like sum() or BLAS. That requires contiguous memory: a single block of doubles that a C function can walk with a pointer. A linked list scatters its elements across the heap, so you would need to copy them into a contiguous buffer before any vectorized operation could touch them. R’s built-in lists are arrays of pointers (VECSXP), giving O(1) random access and contiguous pointer storage. Linked lists are good for prepending (O(1), just cons a new head) and for recursive processing where you always work with the first element and pass the rest along. Lisp used them because Lisp’s model is recursive symbolic processing, not numerical arrays. R chose arrays because its model is vectorized computation.
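The prepend operation can be seen in a small converter from a vector to a linked list. This is a sketch under the definitions above (repeated here so the block runs on its own); vector_to_ll is a hypothetical helper name, not a base R function.

```r
cons <- function(head, tail) list(head = head, tail = tail)
car  <- function(lst) lst$head
cdr  <- function(lst) lst$tail

# Build a linked list by prepending, starting from the last value.
# Each cons() call is a single prepend onto the list built so far.
vector_to_ll <- function(v) {
  out <- NULL
  for (value in rev(v)) out <- cons(value, out)
  out
}

ll <- vector_to_ll(c(1, 2, 3))
car(ll)   # the last value prepended becomes the head
#> [1] 1
identical(ll, cons(1, cons(2, cons(3, NULL))))
#> [1] TRUE
```

Each new node only wraps the existing list as its tail; no existing node is rebuilt along the way.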
Exercises
- Using cons, car, and cdr defined above, build the linked list (10, 20, 30) and extract the second element without converting to a vector.
- Write a function ll_length that counts the number of nodes in a linked list by recursing through cdr until NULL.
- Write a function ll_map that takes a linked list and a function, and returns a new linked list with the function applied to each element. Test it by doubling every element of cons(1, cons(2, cons(3, NULL))).
10.7 Looking ahead
Lists hold anything, including other lists. Data frames are lists where every element is a vector of the same length. That single constraint turns a general-purpose container into a table. Chapter 11 picks up exactly where this chapter leaves off.