Tools for categorical variables
Data

The examples below use a different data set, minarets, which can be downloaded or loaded from within R.

• names(minarets) Shows the names of the variables in the data set
• str(minarets) Shows the structure of the data set. Note that there are several factor variables (categorical variables); one of them is ordered (PolInter).
• attach(minarets) Makes variable names directly accessible
Frequency tables and graphics depicting frequency tables
• table(PolInter) Frequency table showing counts
• xtabs(~gender) Same, but uses formula interface
• freq1<-table(PolInter) Create a table for further use
• prop.table(freq1) Table with proportions
• barplot(freq1,main="Count of Political Interest") Barchart of the table
• mosaicplot(freq1) Mosaic plot of the table
• mosaic(freq1) Mosaic plot of the table (requires package {vcd})
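
As a minimal sketch of the frequency-table commands above, the following uses a small made-up data frame (toy, hypothetical, standing in for minarets) so that it runs without the original data. It refers to the variable through toy$ rather than attach(), which is only one possible choice.

```r
# Hypothetical toy data standing in for minarets (values are made up)
toy <- data.frame(
  PolInter = factor(c("--", "-", "+", "++", "+", "-", "+", "++", "-", "+"),
                    levels = c("--", "-", "+", "++"), ordered = TRUE)
)

freq1 <- table(toy$PolInter)      # counts per category, in level order
prop1 <- prop.table(freq1)        # proportions that sum to 1
pct1  <- round(100 * prop1, 1)    # percentages with one decimal

# Counts and percentages side by side
cbind(count = freq1, percent = pct1)
```

Because PolInter is an ordered factor, the table keeps the level order instead of sorting the labels alphabetically.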
Crosstabulations

Several functions produce simple crosstabulations (contingency tables): table, ftable, and xtabs.

Producing tables containing only frequencies (counts):

• table(PolInter,gender) Bivariate table
• ftable(language~PolInter) Bivariate table using the formula interface (column variable ~ row variable)
• xtabs(~gender+language) Same using formula interface
• xtabs(~gender+language+vote) Three variables
• ftable(xtabs(~gender+language+vote)) Same, but alternative presentation
• tab1<- table(PolInter,gender) store table as object for further use
• margin.table(tab1,1) display row margins
• margin.table(tab1,2) display column margins
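
The count tables and margins above can be tried on made-up data. This is a sketch only: toy is a hypothetical data frame, and addmargins() is a base R function not mentioned in the list above that appends the margins to the table itself.

```r
# Hypothetical toy data standing in for minarets (values are made up)
toy <- data.frame(
  PolInter = factor(c("--", "-", "+", "++", "+", "-", "+", "++", "-", "+"),
                    levels = c("--", "-", "+", "++"), ordered = TRUE),
  gender   = factor(c("f", "m", "f", "m", "m", "f", "f", "m", "m", "f"))
)

tab1 <- xtabs(~ PolInter + gender, data = toy)  # rows: PolInter, columns: gender

row_margin <- margin.table(tab1, 1)  # totals per PolInter category
col_margin <- margin.table(tab1, 2)  # totals per gender
with_sums  <- addmargins(tab1)       # table with a "Sum" row and column appended

with_sums
```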

Proportions computed from a table of counts:

• tab1<- table(PolInter,gender) store frequency count table as object for further use
• prop.table(tab1) total proportions
• prop.table(tab1,2) proportions columnwise
• prop.table(tab1,1) proportions rowwise
• ftable(prop.table(tab1)) alternative presentation
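
The second argument of prop.table decides which totals the counts are divided by. A short sketch on hypothetical toy data (made-up values, not the minarets survey) shows the effect:

```r
# Hypothetical toy data standing in for minarets (values are made up)
toy <- data.frame(
  PolInter = factor(c("--", "-", "+", "++", "+", "-", "+", "++", "-", "+"),
                    levels = c("--", "-", "+", "++"), ordered = TRUE),
  gender   = factor(c("f", "m", "f", "m", "m", "f", "f", "m", "m", "f"))
)
tab1 <- table(toy$PolInter, toy$gender, dnn = c("PolInter", "gender"))

total_prop <- prop.table(tab1)      # all cells together sum to 1
row_prop   <- prop.table(tab1, 1)   # each row sums to 1
col_prop   <- prop.table(tab1, 2)   # each column sums to 1

# Column percentages: distribution of PolInter within each gender
round(100 * col_prop, 1)
```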
CrossTable (package {gmodels})

CrossTable produces crosstabs similar to the ones produced by SPSS or SAS. By default, the table cells show counts, chi-square contributions, and row, column, and total proportions (SAS format, the default) or percentages (SPSS format).

• library(gmodels)
• CrossTable(gender,language) crosstabulations with total, row and column proportions
• CrossTable(gender,language, format="SPSS") crosstabulations with total, row and column percentages
• CrossTable(PolInter,language,digits=1,prop.r=F,prop.t=F,prop.chisq=F,format="SPSS") Only column percentages with a single decimal digit.
• CrossTable(vote1,language,missing.include=TRUE) Missing values are included in the table.
• CrossTable(PolInter,language,fisher=TRUE, mcnemar=TRUE) Adds Fisher's exact test and McNemar's test

The following values can be displayed in the cells (shown with their default values):

• prop.r=TRUE row proportions/percentages
• prop.c=TRUE column proportions
• prop.t=TRUE total proportions
• prop.chisq=TRUE contribution to chi-square
• expected=FALSE expected value
• resid=FALSE residual
• sresid=FALSE standardized residual
• asresid=FALSE adjusted standardized residual
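
To see what quantities of this kind mean, they can be computed by hand in base R. The sketch below does not use gmodels and is not a reproduction of CrossTable's output; toy is a hypothetical data frame with made-up values, and the objects come from chisq.test.

```r
# Hypothetical toy data standing in for minarets (values are made up)
toy <- data.frame(
  PolInter = factor(c("--", "-", "+", "++", "+", "-", "+", "++", "-", "+"),
                    levels = c("--", "-", "+", "++"), ordered = TRUE),
  gender   = factor(c("f", "m", "f", "m", "m", "f", "f", "m", "m", "f"))
)
tab <- table(toy$PolInter, toy$gender)

# The toy counts are tiny, so chisq.test warns about the approximation;
# the warning is suppressed here because only the cell quantities are of interest
test <- suppressWarnings(chisq.test(tab))

expected <- test$expected                    # row total * column total / n
contrib  <- (tab - expected)^2 / expected    # each cell's contribution to chi-square
pearson  <- test$residuals                   # (observed - expected) / sqrt(expected)

round(contrib, 3)
sum(contrib)   # equals the chi-square statistic
```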
Summary statistics: tests and association coefficients
• summary(tab1) Chi-square test of independence
• assocstats(tab1) Association coefficients; package {vcd}
• chisq.test(tab1) Chi-square test
• fisher.test(tab1, alternative="greater") Fisher's exact test (the alternative argument is only used for 2x2 tables)
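
A minimal sketch of the base R tests above, run on a made-up table of counts (the numbers are hypothetical and are not results from the minarets data):

```r
# Hypothetical 4x2 table of counts (made-up values)
tab <- as.table(matrix(c(12, 18, 25, 10,    # gender f
                         20, 22, 15,  5),   # gender m
                       nrow = 4,
                       dimnames = list(PolInter = c("--", "-", "+", "++"),
                                       gender   = c("f", "m"))))

ct <- chisq.test(tab)    # Pearson chi-square test of independence
ft <- fisher.test(tab)   # Fisher's exact test (two-sided for tables larger than 2x2)
st <- summary(tab)       # summary() of a table reports the same chi-square test

ct$statistic   # chi-square value
ct$parameter   # degrees of freedom: (4 - 1) * (2 - 1)
ct$p.value
ft$p.value
```

The statistic, degrees of freedom and p-value are stored as list components, which is convenient when the results are needed for further computation rather than just printed.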

For 2x2 tables:

• polinter1 <-recode(PolInter,"c('--','-')='low';c('+','++')='high';else=NA") Recode PolInter into two categories; requires package {car}
• tab2<-table(polinter1,gender) produce a 2x2 table
• oddsratio(tab2, log=FALSE) {vcd} odds ratios
• summary(oddsratio(tab2)) more odds ratios
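
For a 2x2 table the sample odds ratio can also be computed by hand, which shows what the number means. This sketch uses base R only and a made-up table; the Wald confidence interval is one common choice and is an assumption here, so the result need not match the output of the {vcd} functions in every detail.

```r
# Hypothetical 2x2 table of counts (made-up values)
tab2 <- as.table(matrix(c(30, 20,    # gender f: high, low
                          15, 35),   # gender m: high, low
                        nrow = 2,
                        dimnames = list(polinter1 = c("high", "low"),
                                        gender    = c("f", "m"))))

# Cross-product ratio: (a * d) / (b * c)
or <- (tab2[1, 1] * tab2[2, 2]) / (tab2[1, 2] * tab2[2, 1])

# Wald 95% confidence interval, built on the log odds ratio
se_log <- sqrt(sum(1 / tab2))
ci <- exp(log(or) + c(-1, 1) * qnorm(0.975) * se_log)

or   # 3.5
ci
```

In this made-up table, the odds of high rather than low interest are 30/20 = 1.5 for f and 15/35 (about 0.43) for m, and the ratio of the two is 3.5.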

More coefficients:

• Packages polycor, epitools, and rms have functions to produce other association coefficients, namely polychoric and polyserial correlations, Kendall's tau, gamma, Somers' D and others.

descr package

The descr package provides similar functions but with some additional options.

• CrossTable(language, gender) nearly identical to the CrossTable function described above.
• crosstab(language,gender) wrapper function, produces by default a mosaic plot; has a weighting option
• freq(language) Frequency table with a barchart

Note for SPSS users: descr has several functions that help you to read/write SPSS label and missing value commands.

Graphics for categorical variables
• tab1<-table(PolInter,gender) Create a table of counts (object of class table)
• barplot(tab1) Stacked barchart
• barplot(tab1,beside=T) Barchart (not stacked)
• barplot(tab1,horiz=T) Bars are shown horizontally
• dotchart(tab1) Cleveland's dot chart
• mosaicplot(tab1) from the {graphics} package
• mosaic(tab1) in the {vcd} package
• Association plots: assoc() in the {vcd} package

See on-line documentation for titles and legends.
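
As a minimal sketch of titles and legends in base graphics, the following draws a grouped barchart from a made-up table. The data frame toy and the label texts are hypothetical; main, xlab, ylab, legend.text and args.legend are standard barplot arguments.

```r
# Hypothetical toy data standing in for minarets (values are made up)
toy <- data.frame(
  PolInter = factor(c("--", "-", "+", "++", "+", "-", "+", "++", "-", "+"),
                    levels = c("--", "-", "+", "++"), ordered = TRUE),
  gender   = factor(c("f", "m", "f", "m", "m", "f", "f", "m", "m", "f"))
)
tab1 <- table(toy$PolInter, toy$gender)

# In a non-interactive session the plot is written to Rplots.pdf
mids <- barplot(tab1, beside = TRUE,
                main = "Political interest by gender",   # plot title
                xlab = "Gender", ylab = "Count",         # axis titles
                legend.text = TRUE,                      # legend from the row names
                args.legend = list(x = "topright", title = "PolInter"))
```

barplot returns the bar midpoints (here stored in mids), which can be used to add text or further elements on top of the bars.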

You will find most of these graphics in the packages lattice and ggplot2, with many more options and more control over how charts are produced and displayed. See the documentation for details.

Graphics for categorical variables with lattice
• library(lattice)
• barchart(vote~gender+PolInter, data=minarets)
Graphics for categorical variables with package ggplot2
• library(ggplot2)
• ggplot(minarets,aes(x=PolInter)) + geom_bar()
• ggplot(minarets,aes(x=PolInter,fill=gender)) + geom_bar()
• ggplot(na.omit(minarets),aes(x=PolInter,fill=gender)) + geom_bar(position="dodge")
More...
• xtable package: prepare tables for LaTeX or HTML
Related documents