Tukey's ladder of powers / Box-Cox transformations

This document discusses the most important family of transformations used in statistics to promote symmetry and linearity: Tukey's ladder of powers (a particular form of the Box-Cox transformations).

Tukey's ladder of powers

Power   Algebraic             Expression
3       X³                    x^3
2       X²                    x^2
1       X                     x
½       √X                    x^0.5 or sqrt(x)
0       log(X) or ln(X)       log10(x) or log(x)
-½      -1/√X                 -1/sqrt(x)
-1      -1/X (reciprocal)     -1/x
-2      -1/X²                 -1/x^2

Lower, higher and intermediate powers are also consistent with the ladder of powers and share the properties of these transformations.

Since R is a programming language, the transformations can be used directly wherever an expression is accepted, either as arguments to functions or to create derived variables.
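As a minimal sketch of this point, the ladder can be wrapped in a small helper and used to create derived variables. The function name tukey_ladder and the simulated variable are assumptions made for illustration; they do not come from any package or from the data used elsewhere in this document.

```r
# Hypothetical helper (not part of car, HH or LearnEDA): re-express x
# with one rung of the ladder. Negative powers are multiplied by -1 so
# that the ordering of the values is preserved, as in the table above.
tukey_ladder <- function(x, power) {
  stopifnot(is.numeric(x), all(x > 0, na.rm = TRUE))
  if (power > 0) {
    x^power
  } else if (power == 0) {
    log(x)
  } else {
    -(x^power)
  }
}

# Simulated right-skewed variable, for illustration only
set.seed(1)
skewed <- rlnorm(200)

# Derived variables built from the raw values
demo <- data.frame(raw    = skewed,
                   root   = tukey_ladder(skewed, 0.5),
                   logged = tukey_ladder(skewed, 0),
                   recip  = tukey_ladder(skewed, -1))
summary(demo)

# A transformation used directly as a function argument
quantile(log(skewed))
```

For the common powers, the built-in expressions (sqrt(x), log(x), -1/x) are usually all that is needed; a helper like this is only convenient when looping over several powers.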

Package car

The car package contains many functions for finding and graphically representing transformations.

symbox(infmor, powers=c(3,2,1,0.5,0,-0.5,-1,-2))
shows the boxplots for the common powers of the ladder of powers; by default (omitting the powers argument), only powers -1, -0.5, 0, 0.5 and 1 are shown.

Instead of the Box-Cox family of power transformations, you can get Yeo-Johnson power transformations:
symbox(infmor, powers=c(3,2,1,0.5,0,-0.5,-1,-2), trans=yjPower)

For variables with zero or negative values, you might need to add a constant using the start= option.

The package also has many non-graphical tools for power transformations. For instance, powerTransform estimates the transformation parameters by maximum likelihood so that the transformed variables are as close as possible to multivariate normality. It also works for a single variable.
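The idea behind the maximum-likelihood estimate can be sketched in base R for a single variable. This is an illustration of the principle under simple assumptions (one positive variable, no predictors, simulated data), not the code used by powerTransform; the names bc and bc_loglik are hypothetical.

```r
# Box-Cox transformation of a positive vector y for a given lambda
bc <- function(y, lambda) {
  if (abs(lambda) < 1e-8) log(y) else (y^lambda - 1) / lambda
}

# Profile log-likelihood of lambda (up to an additive constant) when the
# transformed values are assumed normal with unknown mean and variance
bc_loglik <- function(lambda, y) {
  z <- bc(y, lambda)
  n <- length(y)
  sigma2 <- mean((z - mean(z))^2)   # ML estimate of the variance
  -n / 2 * log(sigma2) + (lambda - 1) * sum(log(y))
}

# Simulated log-normal data, so the estimated power should be close to 0
set.seed(42)
y <- rlnorm(500)

# Search for the power that maximises the profile log-likelihood
fit <- optimize(bc_loglik, interval = c(-2, 3), y = y, maximum = TRUE)
lambda_hat <- fit$maximum
lambda_hat
```

In practice the estimate is usually rounded to the nearest rung of the ladder (here 0, the logarithm), which keeps the re-expression easy to interpret.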

Package HH

This package offers, among many other functions, a ladder() function that displays a scatterplot matrix showing all transformations using the common powers of the ladder of powers.

 require(HH)
 ladder(urb~infmor, data=world)

All values must be positive and non-missing. If the data= argument is omitted, an error message is issued (the reason is unclear).

Using one of the ancillary functions of the HH package you can also construct a chart that contains boxplots of a variable and its re-expressions.

require(HH)
par(mfrow=c(1,6))
apply(ladder.f(urb),2,boxplot)

These are in fact six separate boxplots showing, from left to right, powers of -1, -0.5, 0, 0.5, 1 and 2. The use of the apply() function does not allow for adding legends and a global title. Some more programming effort is needed to build a nicely documented sequence of boxplots or, for that matter, any other chart or summary statistic.
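One way to get labelled panels and a global title is to loop over the powers explicitly instead of calling apply(). The sketch below uses base graphics only; the helper reexpress and the simulated variable are assumptions for illustration, and a real positive variable such as urb would take the place of the simulated one.

```r
# Hypothetical helper: one rung of the ladder (log for power 0,
# sign flipped for negative powers to keep the ordering of the data)
reexpress <- function(x, p) {
  if (p > 0) x^p else if (p == 0) log(x) else -(x^p)
}

# Simulated positive variable, for illustration only
set.seed(3)
x <- rlnorm(100, meanlog = 3, sdlog = 0.8)

powers <- c(-1, -0.5, 0, 0.5, 1, 2)

# One row of six panels, with room in the outer margin for a global title
old_par <- par(mfrow = c(1, length(powers)), oma = c(0, 0, 2, 0))
for (p in powers) {
  boxplot(reexpress(x, p), main = paste("power", p))
}
mtext("Re-expressions of x along the ladder of powers", outer = TRUE)
par(old_par)   # restore the previous graphical settings
```

The same loop can be reused for histograms or summary statistics by replacing the boxplot() call.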

Package LearnEDA

The LearnEDA package has several functions for working with reexpressions, including two functions that simplify the reexpression of variables (taking care of things like negative numbers and rescaling).

It also has a spread.level.plot function.

LearnEDA also has several functions to transform a variable or a relationship interactively, using a slider that controls the power, namely:

slider.straighten(infmor, urb)      Straighten a relationship (shows a scatterplot with a line, the transformation power, as well as a residual plot)
slider.compare(infmor, continent)   Groupwise boxplot
slider.power                        Histogram with a single variable
slider.match                        Same, but uses matched reexpressions
See also