Creating dummy variables

To include a categorical variable in a regression formula with functions like lm(), you do not need to create the dummies yourself, as long as the categorical variable is a factor and its first level is the one you want as the reference category in your regression. (See Regression for an explanation.)
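
As a minimal sketch of this behavior, the example below fits a model on made-up data. The data frame d, the factor region and its values are hypothetical and only serve the illustration; relevel() is one way to choose a different reference category when the first level is not the one you want.

```r
# Hypothetical toy data: names and values are invented for illustration
set.seed(1)
d <- data.frame(
  region = factor(rep(c("Africa", "Asia", "Europe"), each = 10)),
  infmor = c(rnorm(10, 80, 10), rnorm(10, 50, 10), rnorm(10, 10, 3))
)

# lm() expands the factor into dummies on its own;
# the first level ("Africa") serves as the reference category
fit1 <- lm(infmor ~ region, data = d)
names(coef(fit1))  # "(Intercept)" "regionAsia" "regionEurope"

# relevel() moves another level to the front, making it the reference
d$region2 <- relevel(d$region, ref = "Europe")
fit2 <- lm(infmor ~ region2, data = d)
names(coef(fit2))  # "(Intercept)" "region2Africa" "region2Asia"
```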

Creating a dummy variable from a continuous variable
rich <- gnpcap > median(gnpcap)  # Create the logical variable rich: TRUE for countries above the median of gnpcap, FALSE otherwise
lm(infmor ~ urb + rich)  # Use rich in a formula; the dummy variable is named richTRUE in the regression output
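
The two steps above can be tried on simulated data. The sketch below assumes gnpcap, urb and infmor are plain numeric vectors; the values are randomly generated and purely hypothetical, so the estimates themselves carry no meaning.

```r
# Simulated stand-ins for the variables used above (hypothetical values)
set.seed(42)
n <- 40
gnpcap <- runif(n, 200, 20000)
urb <- runif(n, 10, 90)
infmor <- 100 - 0.004 * gnpcap - 0.3 * urb + rnorm(n, sd = 5)

# Logical dummy: TRUE above the median, FALSE otherwise
rich <- gnpcap > median(gnpcap)
table(rich)

# lm() treats the logical like a two-level factor with FALSE as reference
fit <- lm(infmor ~ urb + rich)
names(coef(fit))  # "(Intercept)" "urb" "richTRUE"
```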
ifelse
urbcat <- ifelse(urb > median(urb), "High urb.", "Low urb.")  # Two labeled groups based on urbanization: above the median, and at or below it
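
ifelse() returns a character vector here, so it can help to convert the result to a factor with an explicit level order before using it in a model. A small sketch with made-up urbanization values (the numbers are hypothetical), labeling values above the median as high:

```r
# Hypothetical urbanization percentages
urb <- c(12, 35, 48, 60, 77, 91)

# Values above the median are labeled high, the rest low
urbcat <- ifelse(urb > median(urb), "High urb.", "Low urb.")

# Convert to a factor; the first level listed becomes the reference category
urbcat <- factor(urbcat, levels = c("Low urb.", "High urb."))
table(urbcat)
```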
Other functions