I like to have control over the objects I create, even when they might be arbitrary.
Consider, then, that all possible $n\times n$ covariance matrices $\Sigma$ can be expressed in the form

$$\Sigma = P^\prime\, \text{Diagonal}(\sigma_1, \sigma_2, \ldots, \sigma_n)\, P$$

where $P$ is an orthogonal matrix and $\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_n \ge 0$.
Geometrically this describes a covariance structure with a range of principal components of sizes $\sigma_i$. These components point in the directions of the rows of $P$. See the figures at Making sense of principal component analysis, eigenvectors & eigenvalues for examples with $n=3$. Setting the $\sigma_i$ fixes the magnitudes of the covariances and their relative sizes, thereby determining any desired ellipsoidal shape. The rows of $P$ orient the axes of that shape as you prefer.
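As a small illustration (the rotation angle and the $\sigma$ values below are arbitrary choices for the sketch, not anything prescribed above), you can pick the directions yourself by writing down an explicit orthogonal matrix, such as a 2-D rotation:

theta <- pi/6                              # orient the major axis 30 degrees from the x-axis
P <- rbind(c( cos(theta), sin(theta)),     # row 1: direction of the largest component
           c(-sin(theta), cos(theta)))     # row 2: the orthogonal direction
Sigma.2d <- t(P) %*% diag(c(4, 1)) %*% P   # Sigma = P' Diagonal(4, 1) P
eigen(Sigma.2d)                            # eigenvalues 4 and 1; eigenvectors are the rows of P, up to sign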
One algebraic and computing benefit of this approach is that when $\sigma_n > 0$, $\Sigma$ is readily inverted (which is a common operation on covariance matrices):

$$\Sigma^{-1} = P^\prime\, \text{Diagonal}(1/\sigma_1, 1/\sigma_2, \ldots, 1/\sigma_n)\, P.$$
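This follows directly from the orthogonality of $P$, which gives $P P^\prime = P^\prime P = \mathbb{I}$: writing $D = \text{Diagonal}(\sigma_1, \ldots, \sigma_n)$,

$$\Sigma\, \Sigma^{-1} = P^\prime D\, (P P^\prime)\, D^{-1} P = P^\prime D D^{-1} P = P^\prime P = \mathbb{I}.$$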
Don't care about the directions, but only about the range of sizes of the $\sigma_i$? That's fine: you can easily generate a random orthogonal matrix. Just wrap $n^2$ iid standard Normal values into a square matrix and then orthogonalize it. It will almost surely work (provided $n$ isn't huge). The QR decomposition will do that, as in this code:
n <- 5
p <- qr.Q(qr(matrix(rnorm(n^2), n)))
This works because the $n$-variate multinormal distribution so generated is spherically symmetric: it is invariant under all rotations and reflections (through the origin). Thus the orthogonal matrices it produces are uniformly distributed (with respect to the natural, Haar, measure on the orthogonal group), as argued at How to generate uniformly distributed points on the surface of the 3-d unit sphere?.
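If you want to reassure yourself that p really is orthogonal (a sanity check, not part of the construction), inspect

zapsmall(crossprod(p))   # t(p) %*% p: should be the n-by-n identity matrix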
A quick way to obtain $\Sigma$ from $P$ and the $\sigma_i$, once you have specified or created them, uses crossprod and exploits R's recycling of arrays in arithmetic operations, as in this example with $\sigma = (\sigma_1, \ldots, \sigma_5) = (5,4,3,2,1)$:
Sigma <- crossprod(p, p*(5:1))
As a check, the singular value decomposition should return both $\sigma$ and $P^\prime$. You may inspect it with the command
svd(Sigma)
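As a quick numerical confirmation (a sketch that assumes p and Sigma from the code above; the name s is just a local convenience), the singular values should match the chosen $\sigma_i$ and the singular vectors should match the rows of p up to sign:

s <- svd(Sigma)
zapsmall(s$d - 5:1)             # zero vector: the singular values equal the sigma_i
zapsmall(abs(s$u) - abs(t(p)))  # zero matrix: the columns of u are the columns of P', up to sign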
The inverse of Sigma of course is obtained merely by changing the multiplication by $\sigma$ into a division:
Tau <- crossprod(p, p/(5:1))
You may verify this by viewing zapsmall(Sigma %*% Tau), which should be the $n\times n$ identity matrix. A generalized inverse (essential for regression calculations) is obtained by replacing any $\sigma_i \ne 0$ by $1/\sigma_i$, exactly as above, but keeping any zeros among the $\sigma_i$ as they were.
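Here is a minimal sketch of that recipe (the names sigma, Sigma0, and Tau0 are illustrative, not from the text above); it reuses the orthogonal matrix p generated earlier and checks the defining property $\Sigma\, \Sigma^{+}\, \Sigma = \Sigma$ of a generalized inverse:

sigma <- c(5, 4, 3, 0, 0)                      # a rank-deficient spectrum: two zero components
Sigma0 <- crossprod(p, p * sigma)              # singular covariance matrix P' Diagonal(sigma) P
sigma.inv <- ifelse(sigma > 0, 1/sigma, 0)     # invert the nonzero sigma_i, keep the zeros
Tau0 <- crossprod(p, p * sigma.inv)            # generalized (Moore-Penrose) inverse
zapsmall(Sigma0 %*% Tau0 %*% Sigma0 - Sigma0)  # should be the zero matrix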