Tukey's ladder of powers / Box-Cox transformations

This document discusses the most important family of transformations used in statistics to promote symmetry and linearity: Tukey's ladder of powers (a particular form of the Box-Cox transformations).

Tukey's ladder of powers

Direction | Power | Algebraic | R expression |
---|---|---|---|
↑ | 3 | x^{3} | x^3 |
↑ | 2 | x^{2} | x^2 |
… | 1 | x | x |
↓ | ½ | √x | x^0.5 or sqrt(x) |
↓ | 0 | log(x) or ln(x) | log10(x) or log(x) |
↓ | -½ | -1/√x | -1/sqrt(x) |
↓ | -1 | -1/x (reciprocal) | -1/x |
↓ | -2 | -1/x^{2} | -1/x^2 |

Lower and higher powers, as well as intermediate ones (for example 1/3 or 1.5), fit into the same ladder and share the properties of these transformations.
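The position of the logarithm at power 0 can be motivated by the Box-Cox form of the power family. As a sketch (the symbol λ for the power is notation introduced here for illustration):

```latex
x^{(\lambda)} =
\begin{cases}
  \dfrac{x^{\lambda} - 1}{\lambda}, & \lambda \neq 0, \\[1ex]
  \log x, & \lambda = 0,
\end{cases}
\qquad\text{since}\qquad
\lim_{\lambda \to 0} \frac{x^{\lambda} - 1}{\lambda} = \log x \quad (x > 0).
```

Dividing by λ also makes the negative powers increasing in x, which is the role played by the minus signs in the ladder.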

Because R is a programming language, these transformations can be used directly wherever an expression is accepted, either as arguments to functions or to create derived variables.
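As a minimal sketch of both uses, the following base R code applies ladder transformations inside function calls and stores them as new columns. The data frame `world` and its values are invented for illustration; only the variable names `infmor` and `urb` come from this document.

```r
# Hypothetical data standing in for the data set used in this document.
world <- data.frame(
  infmor = c(4, 7, 12, 25, 60, 110, 150),
  urb    = c(90, 85, 75, 60, 40, 25, 15)
)

# Transformations used directly as function arguments.
median(log(world$infmor))
cor(sqrt(world$infmor), world$urb)

# Transformations used to create derived variables.
world$log_infmor   <- log(world$infmor)   # power 0
world$recip_infmor <- -1 / world$infmor   # power -1, order preserved

# Transformations written inside a model formula.
fit <- lm(urb ~ log(infmor), data = world)
coef(fit)
```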

Package car

The car package contains many functions for finding and graphically representing transformations.

require(car)
symbox(infmor, powers=c(3,2,1,0.5,0,-0.5,-1,-2))

This shows boxplots of the variable for the common powers of the ladder of powers; by default (omitting the powers argument), only the powers -1, -0.5, 0, 0.5 and 1 are shown.

Instead of the Box-Cox family of power transformations, you can request Yeo-Johnson power transformations with the trans argument:

symbox(infmor, powers=c(3,2,1,0.5,0,-0.5,-1,-2), trans=yjPower)

For variables with zero or negative values, you might need to add a constant using the start= option so that all values are positive before a Box-Cox power is applied.
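To make the difference between the two families concrete, here is a base R sketch of the standard Box-Cox and Yeo-Johnson formulas. The functions `box_cox` and `yeo_johnson` are hypothetical helpers written for this illustration, not the car implementations (`bcPower`, `yjPower`), and they assume there are no missing values.

```r
# Box-Cox power family: (x^lambda - 1) / lambda, with log(x) at lambda = 0.
# Defined only for strictly positive x.
box_cox <- function(x, lambda) {
  stopifnot(all(x > 0))
  if (lambda == 0) log(x) else (x^lambda - 1) / lambda
}

# Yeo-Johnson family: non-negative and negative values are handled by
# separate formulas, so zeros and negative values are allowed.
yeo_johnson <- function(x, lambda) {
  out <- numeric(length(x))
  pos <- x >= 0
  out[pos] <- if (lambda == 0) log1p(x[pos]) else
    ((x[pos] + 1)^lambda - 1) / lambda
  out[!pos] <- if (lambda == 2) -log1p(-x[!pos]) else
    -((1 - x[!pos])^(2 - lambda) - 1) / (2 - lambda)
  out
}

x <- c(-2, 0, 3)
yeo_johnson(x, 0.5)   # works directly on zero and negative values
box_cox(x + 3, 0.5)   # Box-Cox after adding a start constant of 3
```

The last line mirrors what a start constant does: it shifts the data into the positive range before the power is applied.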

The package also offers many non-graphical tools for power transformations. For instance, powerTransform() estimates the transformation parameters by maximum likelihood so as to bring the variables as close as possible to multivariate normality. It also works for single variables.
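The idea behind a maximum likelihood estimate of the power can be sketched for a single variable in base R. This is an illustration under the usual Box-Cox normality assumption, not the code used by car; the data are simulated, and the function name `profile_loglik` and the search interval are choices made for this example.

```r
# Hypothetical right-skewed, strictly positive data (log-normal), for which
# a power close to 0 (the log) is expected.
set.seed(1)
x <- exp(rnorm(500))

# Profile log-likelihood of the Box-Cox parameter lambda for one variable,
# assuming the transformed values are normal with unknown mean and variance.
profile_loglik <- function(lambda, x) {
  n <- length(x)
  z <- if (abs(lambda) < 1e-8) log(x) else (x^lambda - 1) / lambda
  sigma2 <- mean((z - mean(z))^2)   # ML estimate of the variance
  -n / 2 * log(sigma2) + (lambda - 1) * sum(log(x))
}

# Maximise over a plausible range of powers.
opt <- optimize(profile_loglik, interval = c(-3, 3), x = x, maximum = TRUE)
lambda_hat <- opt$maximum
lambda_hat
```

In practice the estimate is usually rounded to a nearby rung of the ladder (here, 0) so that the transformation stays easy to interpret.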

Package HH

Among many other functions, this package offers a ladder() function that displays a scatterplot matrix showing the two variables under all combinations of the common powers of the ladder of powers.

require(HH)
ladder(urb~infmor, data=world)

All values must be positive and non-missing; for an unknown reason, if data= is not present, an error message is issued.

Using one of the ancillary functions of the HH package you can also construct a chart that contains boxplots of a variable and its re-expressions.

require(HH)
par(mfrow=c(1,6))
apply(ladder.f(urb),2,boxplot)

These are in reality six separate boxplots showing, from left to right, the powers -1, -0.5, 0, 0.5, 1 and 2. The use of the apply() function does not allow for adding legends and a global title. Some more programming effort is needed to build a nicely documented sequence of boxplots or, for that matter, any other chart or summary statistic.
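One possible way to get labelled panels and a global title is to loop over the powers in base R instead of calling apply(). This is a sketch only: the data are simulated, and `tukey_power` is a hypothetical helper, not a function from HH.

```r
# Hypothetical positive data standing in for a variable such as urb.
set.seed(42)
x <- rexp(100, rate = 1 / 40) + 1

# Tukey re-expression for one power; power 0 means log, and negative
# powers get a minus sign so that the order of the values is preserved.
tukey_power <- function(x, p) {
  if (p == 0) log(x) else if (p > 0) x^p else -(x^p)
}

powers <- c(-1, -0.5, 0, 0.5, 1, 2)
reexpressed <- lapply(powers, function(p) tukey_power(x, p))
names(reexpressed) <- paste("power", powers)

# One panel per power, with room in the outer margin for a global title.
old_par <- par(mfrow = c(1, length(powers)), oma = c(0, 0, 2, 0))
for (i in seq_along(powers)) {
  boxplot(reexpressed[[i]], main = names(reexpressed)[i])
}
mtext("Ladder of powers for x", outer = TRUE, cex = 1.2)
par(old_par)
```

The same list can be reused for summary statistics, for example `sapply(reexpressed, median)`.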

Package LearnEDA

The LearnEDA package has several functions for working with re-expressions, including two functions that simplify the re-expression of variables (taking care of things like negative numbers and rescaling):

- power.t(infmor,0): transforms the variable infmor using power 0, i.e. logs.
- mtrans(infmor,2): matched transformation with power 2 (matched transformations preserve the median value).
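To illustrate what "preserving the median" means, here is a base R sketch of one common definition of a matched re-expression, in which the transformed values agree with the raw data in value and slope at the median. The function `matched_power` is a hypothetical helper and is not claimed to reproduce the exact code of mtrans().

```r
# Matched re-expression for power p, matched at the median m of x:
# value and slope agree with the raw data at m.
matched_power <- function(x, p) {
  stopifnot(all(x > 0))
  m <- median(x)
  if (p == 0) {
    m + m * (log(x) - log(m))
  } else {
    m + (x^p - m^p) / (p * m^(p - 1))
  }
}

# Hypothetical positive values (odd count, so the median is a data value).
x <- c(3, 8, 15, 40, 120)
median(x)
median(matched_power(x, 2))
median(matched_power(x, 0))
```

Because every matched version shares the same median and local scale, different powers can be compared on a single axis.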

It also has a spread.level.plot function.

LearnEDA also has several functions to transform a variable or a relationship interactively, using a slider that controls the power, namely:

Function | Description |
---|---|
slider.straighten(infmor, urb) | Straightens a relationship (shows a scatterplot with a line, the transformation power, and a residual plot) |
slider.compare(infmor, continent) | Groupwise boxplots |
slider.power | Histogram of a single variable |
slider.match | Same, but uses matched re-expressions |

See also