6  Numerical computing

Type 0.1 + 0.2 == 0.3 in R. It returns FALSE. Welcome to floating point.

0.1 + 0.2 == 0.3
#> [1] FALSE

This is not a bug but a consequence of how computers store numbers. Every number you type into R is an approximation, and operations on those approximations do not always behave the way real arithmetic does. You have already used numbers in vectors (Section 4.2) and arithmetic (Section 4.4) without worrying about this. Most of the time, the approximation is so good that you never notice it. This chapter shows you the cases where it matters, how to recognize the traps, and how to work with R’s matrix algebra tools. The first four sections (storage, traps, cancellation, integers) are essential for any R user. The rest of the chapter covers numerical stability and matrix algebra; a callout box marks where you can safely skip ahead if you are new to R.

6.1 How R stores numbers

R uses IEEE 754 double-precision floating point for all numeric values by default. Each number occupies 64 bits: 1 for the sign, 11 for the exponent, and 52 for the significand (the significant digits; an implicit leading bit brings the effective precision to 53 bits). This gives roughly 15 to 16 significant decimal digits and a range from about 5e-324 to 1.8e308.

Two machine constants are worth knowing:

.Machine$double.eps
#> [1] 2.220446e-16

This is machine epsilon: the smallest positive number such that 1 + eps is distinguishable from 1. It defines the granularity of floating point near 1. Any relative difference smaller than this disappears.
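You can watch the granularity directly: adding half of eps to 1 rounds away, while adding eps itself survives:

```r
1 + .Machine$double.eps / 2 == 1   # the addition is lost
#> [1] TRUE
1 + .Machine$double.eps == 1       # the addition survives
#> [1] FALSE
```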

.Machine$integer.max
#> [1] 2147483647

This is the largest integer R can represent with the integer type. Beyond this, integer arithmetic produces NA with a warning.

R also has three special floating-point values:

1 / 0
#> [1] Inf
-1 / 0
#> [1] -Inf
0 / 0
#> [1] NaN

Inf and -Inf result from overflow and division by zero. NaN (“Not a Number”) results from undefined operations. log(-1) also produces NaN (with a warning). These values propagate: any arithmetic involving NaN returns NaN, and any comparison with NaN returns NA.
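Because comparing with NaN yields NA rather than TRUE or FALSE, test for these values explicitly with is.nan() and is.finite():

```r
NaN + 1        # arithmetic with NaN stays NaN
#> [1] NaN
NaN == NaN     # comparisons with NaN give NA
#> [1] NA
is.nan(0 / 0)  # test for NaN explicitly
#> [1] TRUE
is.finite(c(1, Inf, -Inf, NaN, NA))
#> [1]  TRUE FALSE FALSE FALSE FALSE
```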

R’s numbers are approximations. For most work, the approximation is excellent. But “excellent” is not “exact”, and the difference matters in specific situations. There is a deep idea here: lambda calculus and mathematical logic operate on exact symbolic terms, but physical computers operate on finite approximations. The gap between 0.1 (the mathematical real number) and 0.1 (the closest IEEE 754 double) is the gap between the abstract model of computation and the physical machine running it.

You can see the stored representation with sprintf():

sprintf("%.20f", 0.1)
#> [1] "0.10000000000000000555"
sprintf("%.20f", 0.2)
#> [1] "0.20000000000000001110"
sprintf("%.20f", 0.3)
#> [1] "0.29999999999999998890"

None of these are the decimal values you typed. They are the closest 64-bit floating-point numbers. The gap between what you write and what R stores is the source of every trap in this chapter.

6.2 Floating-point traps

The classic:

0.1 + 0.2 == 0.3
#> [1] FALSE

Neither 0.1 nor 0.2 nor 0.3 can be represented exactly in binary floating point. The sum of the stored representations of 0.1 and 0.2 differs from the stored representation of 0.3 by a tiny amount (about 5.6e-17). The == operator checks for exact equality, so it returns FALSE. This is a specific instance of a deeper fact: floating-point operations do not obey real arithmetic. Addition is not even associative: (a + b) + c may differ from a + (b + c) because each intermediate result gets rounded independently. David Goldberg’s 1991 paper “What Every Computer Scientist Should Know About Floating-Point Arithmetic” remains the definitive reference on these issues.
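The failure of associativity is easy to demonstrate:

```r
(0.1 + 0.2) + 0.3 == 0.1 + (0.2 + 0.3)
#> [1] FALSE
sprintf("%.17f", (0.1 + 0.2) + 0.3)
#> [1] "0.60000000000000009"
sprintf("%.17f", 0.1 + (0.2 + 0.3))
#> [1] "0.59999999999999998"
```

Each grouping rounds its intermediate sum differently, and the final bits disagree.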

The fix is all.equal(), which compares with a tolerance:

all.equal(0.1 + 0.2, 0.3)
#> [1] TRUE

The default tolerance is sqrt(.Machine$double.eps), about 1.5e-8. For vectorized comparisons, use dplyr::near():

dplyr::near(0.1 + 0.2, 0.3)
#> [1] TRUE
TipOpinion

Never use == to compare computed floating-point numbers. Use all.equal() for tests and dplyr::near() for vectorized comparisons. If you find yourself writing x == 0.3, stop and think about whether x was computed or typed.

Accumulation is subtler. Adding a small number to a large number can lose the small number entirely:

1e16 + 1 - 1e16
#> [1] 0

The result should be 1. It is 0, because 1e16 + 1 cannot be distinguished from 1e16 in 64-bit floating point. The 1 vanishes. When you call sum() on a million numbers, these rounding errors can accumulate with each addition. R’s sum() reduces the drift by accumulating in extended (long double) precision on platforms that support it, though for extreme precision requirements, compensated summation algorithms like Kahan’s (1965) track the accumulated error in a separate variable and correct for it at each step.
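Kahan’s algorithm is short enough to sketch in R. A separate variable c carries the low-order bits that each addition would otherwise discard (an illustration of the idea, not a replacement for sum()):

```r
# Naive left-to-right summation in pure double precision
naive_sum <- function(x) {
  s <- 0
  for (xi in x) s <- s + xi
  s
}

# Kahan (1965) compensated summation
kahan_sum <- function(x) {
  s <- 0   # running sum
  c <- 0   # running compensation: the error lost so far
  for (xi in x) {
    y <- xi - c        # reintroduce the error from the last step
    t <- s + y         # low-order bits of y are lost in this addition
    c <- (t - s) - y   # algebraically zero; numerically, the lost bits
    s <- t
  }
  s
}

x <- c(1e16, 1, 1, -1e16)   # true sum is 2
naive_sum(x)
#> [1] 0
kahan_sum(x)
#> [1] 2
```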

Sequences suffer the same problem:

x <- seq(0, 1, by = 0.1)
sum(x == 0.3)
#> [1] 0

The sequence has 11 elements, and one of them is “close to” 0.3, but none of them is exactly equal to 0.3 as stored in memory.

Subtraction of nearby values is another trap. Computing sqrt(x^2 + 1) - x for large x should give a value near 1/(2*x), but for large enough x, the subtraction loses all significant digits:

x <- 1e15
sqrt(x^2 + 1) - x   # should be tiny and positive
#> [1] 0

The result is 0, because sqrt(x^2 + 1) rounds to x before the subtraction happens. This is called catastrophic cancellation, and it gets its own section below.

Exercises

  1. Check whether sqrt(2)^2 == 2 returns TRUE or FALSE. Then use all.equal() to compare them.
  2. Compute 1e16 + 1 == 1e16 and explain the result.
  3. Use dplyr::near() to find which elements of seq(0, 1, by = 0.1) are close to 0.3.

6.3 Catastrophic cancellation

When you subtract two nearly equal numbers, the significant digits cancel. What remains is dominated by rounding error.

The textbook example is the variance formula. The mathematically correct identity is:

\[\text{Var}(x) = E[x^2] - (E[x])^2\]

In code:

naive_var <- function(x) mean(x^2) - mean(x)^2

For well-separated values, this works fine:

x <- c(1, 2, 3, 4, 5)
naive_var(x)
#> [1] 2
var(x) * (length(x) - 1) / length(x)  # population variance for comparison
#> [1] 2

But add a large offset so the values are large and close together:

y <- x + 1e9
naive_var(y)
#> [1] 0
var(y) * (length(y) - 1) / length(y)
#> [1] 2

The naive formula reports a variance of 0 here, and with other data it can even come out negative, because mean(y^2) and mean(y)^2 are both enormous and nearly equal. Their difference is pure rounding noise.

R’s built-in var() uses a numerically stable algorithm that subtracts the mean before squaring, which avoids this trap. The lesson: mathematically equivalent expressions are not numerically equivalent. Prefer formulations that avoid subtracting nearly equal quantities. When you see unexpected negative values for something that should be positive (a variance, a squared distance), suspect cancellation.
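The stable formulation is easy to write yourself: subtract the mean first, then square. The deviations are small numbers, so nothing cancels (a sketch of the idea; use var() in real code):

```r
stable_var <- function(x) {
  m <- mean(x)
  mean((x - m)^2)   # population variance; deviations are small, no cancellation
}

stable_var(c(1, 2, 3, 4, 5) + 1e9)   # unaffected by the large offset
#> [1] 2
```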

Another classic case: computing sqrt(x^2 + 1) - 1 for small x. As x approaches zero, sqrt(x^2 + 1) rounds to 1, and the subtraction gives 0. The algebraically equivalent form x^2 / (sqrt(x^2 + 1) + 1) avoids the cancellation:

x <- 1e-8
sqrt(x^2 + 1) - 1          # cancellation
#> [1] 0
x^2 / (sqrt(x^2 + 1) + 1)  # stable
#> [1] 5e-17

Exercises

  1. Compute naive_var(x + 1e10) for x <- 1:10. Compare with var(x) * 9/10. What happens?
  2. For x <- 1e-12, compare sqrt(x^2 + 1) - 1 with x^2 / (sqrt(x^2 + 1) + 1). Which is more accurate?

6.4 Integers

R has 32-bit integers, created with the L suffix:

typeof(1L)
#> [1] "integer"
typeof(1)
#> [1] "double"

1L is an integer, while 1 (without the suffix) is a double. The colon operator produces integers:

typeof(1:10)
#> [1] "integer"

But c() coerces to double unless every element has the L suffix:

typeof(c(1, 2, 3))
#> [1] "double"
typeof(c(1L, 2L, 3L))
#> [1] "integer"

Integer arithmetic is exact within its range. No rounding, no floating-point surprises. But that range is limited:

.Machine$integer.max
#> [1] 2147483647
.Machine$integer.max + 1L
#> Warning in .Machine$integer.max + 1L: NAs produced by integer overflow
#> [1] NA

Overflow produces NA with a warning. R does not silently wrap around to negative numbers the way C integer overflow typically does.

When to use integers: counts, indices, anything naturally discrete. Use as.integer() or the L suffix. For most data analysis, doubles are fine. Integer precision matters in package code, algorithms, and large-scale computation where the 15-digit limit of doubles becomes a concern for exact counting.
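The 15-digit limit is concrete: doubles represent every integer exactly only up to 2^53 (about 9.0e15). Beyond that, consecutive integers collapse:

```r
2^53 + 1 == 2^53   # the gap between adjacent doubles here is 2
#> [1] TRUE
2^53 + 2 == 2^53   # 2^53 + 2 is representable
#> [1] FALSE
```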

One subtlety: mixing integers and doubles promotes the result to double:

typeof(1L + 1.0)
#> [1] "double"
typeof(1L + 1L)
#> [1] "integer"

This is usually harmless, but it means you cannot always assume the type of an arithmetic result. If exact integer arithmetic matters, keep all operands as integers.

Exercises

  1. What is typeof(1:10 * 1.5)? Why?
  2. What happens when you compute .Machine$integer.max + 1L? What about .Machine$integer.max + 1 (without the L)?

6.5 Numerical stability in practice

NoteReading guide

The sections above (storage, floating-point traps, cancellation, integers) are essential knowledge for any R user. The sections below (condition numbers, matrix algebra, log-space) are more specialized. If you are working through the book sequentially, you can skip ahead to Chapter 7 and return here when you encounter these topics in practice, for instance when reading Chapter 28 or Chapter 31.

Beyond individual floating-point traps, entire computations can be numerically stable or unstable. The distinction matters whenever you solve linear systems, fit models, or invert matrices.

6.5.1 Condition numbers

The condition number of a matrix measures how sensitive the solution of a linear system is to small perturbations in the input. R computes it with kappa():

# A well-conditioned matrix
A_good <- matrix(c(2, 1, 1, 3), nrow = 2)
kappa(A_good)
#> [1] 3.333333

A condition number near 1 means the system is well-behaved: small changes in the input produce small changes in the output. Large condition numbers mean trouble.

The Hilbert matrix is the classic example of an ill-conditioned matrix. Its entries are \(H_{ij} = 1/(i + j - 1)\):

hilbert <- function(n) {
  i <- 1:n
  1 / outer(i, i, function(a, b) a + b - 1)
}

kappa(hilbert(5))
#> [1] 640356.4
kappa(hilbert(10))
#> [1] 2.416203e+13
kappa(hilbert(15))
#> [1] 2.186291e+17

The condition number grows exponentially with n. For the 15x15 Hilbert matrix it exceeds 1/.Machine$double.eps, meaning that solving \(Hx = b\) with 64-bit floats produces answers that are essentially noise. You can verify this:

H <- hilbert(8)
x_true <- rep(1, 8)
b <- H %*% x_true
x_solved <- solve(H, b)
max(abs(x_solved - x_true))  # should be 0, will not be
#> [1] 1.190966e-07

The error can be many orders of magnitude larger than machine epsilon. The matrix is not singular, the code is correct, but the problem itself amplifies rounding errors.
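Ill-conditioning also means that tiny changes in the input swing the output. Perturbing b by 1e-8 shifts the computed solution by far more than 1e-8 (the exact amount varies by platform, so no output is shown):

```r
H <- hilbert(8)
b <- H %*% rep(1, 8)
x1 <- solve(H, b)
x2 <- solve(H, b + 1e-8)   # nudge every entry of b by 1e-8
max(abs(x2 - x1))          # orders of magnitude larger than 1e-8
```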

6.5.2 Forward and backward stability

Two notions of stability matter in numerical linear algebra:

  • A forward-stable algorithm produces answers close to the true answer.
  • A backward-stable algorithm produces the exact answer to a slightly perturbed problem. That is, the computed solution \(\hat{x}\) satisfies \((A + \delta A)\hat{x} = b + \delta b\) where \(\delta A\) and \(\delta b\) are small.

Most LAPACK routines (which R calls internally for solve(), qr(), svd(), chol()) are backward-stable. This is the best you can ask for: when the problem itself is well-conditioned, backward stability guarantees accurate results. When the problem is ill-conditioned, no algorithm can save you.
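A practical corollary: with a backward-stable solver, the relative forward error is roughly bounded by the condition number times machine epsilon. You can check this against the Hilbert example from above (orders of magnitude, not exact digits, are what to compare):

```r
H <- hilbert(8)
x_solved <- solve(H, H %*% rep(1, 8))

kappa(H) * .Machine$double.eps   # predicted error scale
max(abs(x_solved - rep(1, 8)))   # observed error: at or below that scale
```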

6.5.3 Polynomial fitting: a practical demonstration

Fitting a high-degree polynomial shows how numerical instability appears in practice. The naive approach constructs a Vandermonde matrix (columns \(1, x, x^2, \ldots, x^{15}\)), which becomes badly conditioned as the degree grows:

set.seed(42)
x <- seq(0, 1, length.out = 50)
y <- sin(2 * pi * x) + rnorm(50, sd = 0.1)

# Unstable: raw powers form an ill-conditioned design matrix
X_raw <- outer(x, 0:15, "^")
kappa(X_raw)
#> [1] 201550147573

# Stable: orthogonal polynomials
X_ortho <- model.matrix(~ poly(x, 15))
kappa(X_ortho)
#> [1] 7.014364

The raw Vandermonde matrix has a condition number many orders of magnitude larger than the orthogonal polynomial basis. In practice:

# Unstable fit (may produce wild coefficients)
fit_raw <- lm(y ~ X_raw[, -1])

# Stable fit (well-behaved coefficients)
fit_ortho <- lm(y ~ poly(x, 15))

# Compare fitted values: they should be similar but may diverge
max(abs(fitted(fit_raw) - fitted(fit_ortho)))
#> [1] 0.07816088

The lesson: poly() exists for a reason. Whenever you fit polynomials, use orthogonal bases. More generally, if a computation involves solving a linear system, check the matrix’s condition number before trusting the result.

Exercises

  1. Compute the condition number of hilbert(5), hilbert(10), and hilbert(15). How does it grow?
  2. Create a 20x20 random matrix with matrix(rnorm(400), 20, 20). Compute its condition number. Is it well-conditioned? Now create matrix(rnorm(400), 20, 20) + diag(100, 20) (adding a large diagonal). How does the condition number change?
  3. Fit a degree-12 polynomial to sin(x) on [0, 1] using both poly(x, 12) and raw powers. Compare the fitted values. At what degree do the two approaches start to disagree noticeably?

6.6 Matrix algebra

Matrices in R are vectors with a dim attribute:

A <- matrix(1:6, nrow = 2, ncol = 3)
A
#>      [,1] [,2] [,3]
#> [1,]    1    3    5
#> [2,]    2    4    6

The key operations:

Matrix multiplication uses %*%, not * (which is element-wise):

B <- matrix(c(1, 0, 0, 1, 1, 0), nrow = 3, ncol = 2)
A %*% B
#>      [,1] [,2]
#> [1,]    1    4
#> [2,]    2    6

Transpose with t():

t(A)
#>      [,1] [,2]
#> [1,]    1    2
#> [2,]    3    4
#> [3,]    5    6

Cross products are faster than explicit transpose-multiply:

X <- matrix(rnorm(6), nrow = 3, ncol = 2)
crossprod(X)     # t(X) %*% X, but faster
#>           [,1]      [,2]
#> [1,] 3.2009566 0.5723763
#> [2,] 0.5723763 0.4978568
tcrossprod(X)    # X %*% t(X)
#>            [,1]       [,2]       [,3]
#> [1,]  0.5169554 -0.1946305  0.6850608
#> [2,] -0.1946305  0.6224605 -1.2102932
#> [3,]  0.6850608 -1.2102932  2.5593975

Solving linear systems: solve(A, b) solves \(Ax = b\) for \(x\):

A <- matrix(c(2, 1, 1, 3), nrow = 2)
b <- c(5, 7)
x <- solve(A, b)
x
#> [1] 1.6 1.8
A %*% x  # verify: should equal b
#>      [,1]
#> [1,]    5
#> [2,]    7
TipOpinion

Use solve(A, b) to solve systems, not solve(A) %*% b. Computing the inverse is slower and less numerically stable. The only reason to compute solve(A) explicitly is if you need the inverse itself, which is rare.
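You can see the gap on an ill-conditioned system, reusing hilbert() from Section 6.5 (the exact errors vary by platform; the comparison between the two routes is the point):

```r
H <- hilbert(7)
b <- H %*% rep(1, 7)

max(abs(solve(H, b) - 1))     # solve the system directly
max(abs(solve(H) %*% b - 1))  # invert, then multiply: typically worse
```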

Decompositions: R provides eigen() for eigenvalues, svd() for singular value decomposition, qr() for QR decomposition, and chol() for the Cholesky decomposition of symmetric positive-definite matrices. These are all backed by LAPACK and BLAS, battle-tested Fortran libraries. R’s matrix operations are fast for moderate sizes (thousands of rows and columns).

eigen(A)$values
#> [1] 3.618034 1.381966
svd(A)$d
#> [1] 3.618034 1.381966
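The decompositions obey simple identities you can verify directly. For the symmetric positive-definite A above, chol() returns an upper-triangular factor R with t(R) %*% R equal to A, and qr() factors any matrix into an orthogonal Q times an upper-triangular R:

```r
R <- chol(A)                  # upper-triangular Cholesky factor
all.equal(t(R) %*% R, A)
#> [1] TRUE

qrA <- qr(A)
all.equal(qr.Q(qrA) %*% qr.R(qrA), A)
#> [1] TRUE
```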

A practical note: R fills matrices column by column by default. This catches people who expect row-by-row:

matrix(1:6, nrow = 2)
#>      [,1] [,2] [,3]
#> [1,]    1    3    5
#> [2,]    2    4    6
matrix(1:6, nrow = 2, byrow = TRUE)
#>      [,1] [,2] [,3]
#> [1,]    1    2    3
#> [2,]    4    5    6

The byrow = TRUE argument changes the fill order. Forgetting this is a common source of bugs in matrix code.

Exercises

  1. Create a 3x3 matrix and compute its determinant with det(). Then compute the inverse with solve() and verify that A %*% solve(A) is close to the identity matrix (use round() or all.equal()).
  2. Solve the system: \(2x + y = 5\), \(x + 3y = 7\). Verify the solution by multiplying A %*% x.
  3. Use crossprod(X) and t(X) %*% X on the same matrix X and verify with all.equal() that the results are identical. (Try bench::mark() on a larger matrix to see the speed difference.)

6.7 When R lies and when it is precise

A short guide to where you can trust R’s numbers and where you should be cautious:

Precise: integer arithmetic (within its range), logical operations, string operations, counting with integers. These are exact.

Approximate: any computation involving doubles. Sums, means, variances, regression coefficients. The approximation is excellent (15 digits), but it is an approximation.

Dangerous spots: comparing doubles for equality, subtracting nearly equal values, summing many numbers of very different magnitudes, numbers near overflow or underflow.

Log-space arithmetic: when numbers get very small (probabilities in statistical models, likelihoods), work in log-space. Compute log(a) + log(b) instead of a * b. For log(exp(a) + exp(b)), use a logSumExp function that avoids the overflow of exp(a):

log_sum_exp <- function(a, b) {
  m <- pmax(a, b)
  m + log(exp(a - m) + exp(b - m))
}

# Direct computation overflows
a <- 1000
b <- 1001
# exp(a) + exp(b) would be Inf
log_sum_exp(a, b)
#> [1] 1001.313

Most data analysis is safe. The traps bite in numerical algorithms, simulation, and optimization. Be aware of them, but don’t be paranoid about them. One principled approach to tracking error is interval arithmetic: instead of storing a single float, you track [lower, upper] bounds, and every operation propagates the interval so the result is a proof that the true answer lies within the bounds. R has no built-in interval arithmetic, but the concept connects to dependent types in type theory, where a value is annotated with its error bound, turning “how wrong could this be?” from a question into a quantity the computation tracks automatically.