Boxplots (box and whisker plots)

Below you will find a series of examples showing how to produce boxplots, using the boxplot() function.


boxplot(), together with the functions boxplot.stats() and bxp() that it calls, lets you produce almost any kind of boxplot, provided you are able to program in R. If you are curious, examine the documentation and examples for these functions.

boxplot() draws outliers but does not label them. Labelling is quite easy to program, however, as boxplot.stats() returns the outlying values in its out component.
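As a minimal sketch of this idea, the code below labels the outliers of a single boxplot with text(). The area vector and its names are made-up illustration data, not the data set used elsewhere on this page.

```r
# Hypothetical data: ten named observations, two of them clearly extreme
area <- c(12, 15, 18, 20, 22, 25, 27, 30, 95, 120)
names(area) <- c("A", "B", "C", "D", "E", "F", "G", "H", "I", "J")

# boxplot.stats() returns the outlying values in its $out component
out <- boxplot.stats(area)$out

# Draw the boxplot, then write a label to the right of each outlier.
# A single boxplot is centred at x = 1.
boxplot(area, ylab = "Area")
is_out <- area %in% out
text(x = 1, y = area[is_out], labels = names(area)[is_out], pos = 4)
```

The same approach should work for grouped boxplots if the outliers are computed per group and x is set to the group's position (1, 2, 3, ...).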

Adding elements to a boxplot

You can add a density plot (barcode plot) to the boxplot.
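One way to do this in base graphics is rug(), which draws one tick per observation along an axis. This is an assumption about what the barcode plot refers to here, and the area values are made-up illustration data.

```r
# Hypothetical data standing in for the area variable
area <- c(12, 15, 18, 20, 22, 25, 27, 30, 95, 120)

# Draw the boxplot and keep the statistics it returns
b <- boxplot(area, ylab = "Area of the country")

# Add one tick per observation along the left (y) axis
rug(area, side = 2)
```

With side = 2 the ticks follow the vertical axis, which matches the default vertical orientation of boxplot().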

Identify outliers interactively

The identify() function can be used to interactively identify observations on a boxplot using the mouse. The function, which is most useful for scatterplots (see this document), requires coordinates in both the x and y directions. Since a single boxplot is drawn at x = 1, the example below supplies a constant x variable, rep(1, length(area)) (the value 1 repeated once per observation), and the plotted variable as the y coordinate.

boxplot(area, ylab="Area of the country")
identify(rep(1, length(area)), area, rownames(world))

To stop the identification, use the Stop button or the context menu.

Boxplot() in package car

Boxplot() (uppercase B!) is built on the base boxplot() function but has more options, notably the ability to label outliers.

The Boxplot() function returns the list of outliers as its result. By default, however, only 10 outliers are shown; in the example below, id.n=Inf has been added to show all outliers (Inf = infinity).

 > Boxplot(area, data=world,labels=rownames(world),id.n=Inf)
 [1] "ALGE" "ARAB" "ARG"  "AUS"  "BRES" "CAN"  "CHIN" "USA"  "GROE" "INDE"

You can take advantage of this to analyze these outliers further, for instance to show the values of urb for the outlying countries, or to display the full data frame for these countries.

outarea=Boxplot(area, labels=rownames(world),id.n=Inf)
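A minimal sketch of that follow-up step is shown below. It uses a small made-up data frame in place of world, and builds outarea with boxplot.stats() as a stand-in for the row names returned by Boxplot(), so that the example runs without the car package.

```r
# Hypothetical stand-in for the world data frame
world <- data.frame(
  area = c(30, 45, 50, 62, 70, 85, 90, 110, 950, 1700),
  urb  = c(55, 60, 48, 72, 66, 80, 58, 63, 77, 81),
  row.names = paste0("C", 1:10)
)

# Stand-in for the result of Boxplot(): the row names of the outliers
out_values <- boxplot.stats(world$area)$out
outarea <- rownames(world)[world$area %in% out_values]

# Values of urb for the outlying countries
world[outarea, "urb"]

# Full data frame rows for the outlying countries
world[outarea, ]
```

Because outarea is a character vector of row names, it can be used directly as a row index.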

Boxplots from ggplot

A few examples, assuming that you are familiar with ggplot:

p <- ggplot(world, aes(x=continent, y=infmor))  # Start by creating an object with the data (by continent)
p + geom_boxplot()  # Add the boxplot layer
p + geom_boxplot(outlier.size=2, outlier.shape=21, width=0.5)  # Same, but change outlier size and shape and box width
p + geom_boxplot(notch=TRUE)  # Same, but with notched boxplots
p + geom_boxplot() + stat_summary(fun="mean", geom="point", shape=23, size=3, fill="blue")  # Add mean diamonds to the boxplot (fun was called fun.y before ggplot2 3.3.0)
ggplot(world, aes(x=1, y=infmor)) + geom_boxplot()  # Simple boxplot: geom_boxplot expects an x variable (a factor), so a constant value forces a single boxplot of y
p + geom_violin()  # Violin plot

Note that with ggplot you cannot (currently) label the outliers directly. You could create a label variable in which every non-outlying observation is set to NA, and then add geom_text(aes(label=labelvar)).
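The workaround could be sketched as follows. The data frame is made up for illustration, the helper is_outlier() is hypothetical, and outliers are defined per continent with the same rule boxplot.stats() uses.

```r
# Hypothetical stand-in for the world data frame
world <- data.frame(
  country   = c("A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K", "L"),
  continent = rep(c("Africa", "Europe"), each = 6),
  infmor    = c(60, 65, 70, 72, 75, 160, 5, 6, 7, 8, 9, 40),
  stringsAsFactors = FALSE
)

# Hypothetical helper: TRUE for values that boxplot.stats() reports as outliers
is_outlier <- function(v) v %in% boxplot.stats(v)$out

# Flag outliers within each continent, then keep a label only for those rows
out_flag <- as.logical(ave(world$infmor, world$continent, FUN = is_outlier))
world$labelvar <- ifelse(out_flag, world$country, NA)

# Plot only if ggplot2 is installed; rows with an NA label are dropped silently
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(world, aes(x = continent, y = infmor)) +
    geom_boxplot() +
    geom_text(aes(label = labelvar), na.rm = TRUE, hjust = -0.3)
  print(p)
}
```

Computing the flag per group matters here: an observation that is extreme within its own continent may not be extreme in the pooled data, and the other way round.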

See also