Andrew Bell: Multilevel Analysis

Andrew Bell
Multilevel models: Practical applications

Populations commonly exhibit complex structure with many levels, so that workers (at level 1) work in particular organizational environments (at level 2); while individuals (1) may 'learn' their health-related behaviour in the context of households (2) and local cultures (3). Similar data structures result from multi-stage sample surveys so that respondents (1) are nested within households (2), in neighbourhoods (3), in districts (4), and in regions (5). In many cases, the survey design reflects the population structure, so in a survey of voting intentions the respondents (1) are clustered by communes (2). Multilevel models are currently being applied in a growing number of social science research areas including educational and organisational research, epidemiology, voting behaviour, psychology, sociology, and geography.

These levels in data are often seen as a convenience in the design that has become a nuisance in the analysis. However, by using multilevel models we can model simultaneously at several levels, gaining the potential for improved estimation, valid inference, and a better substantive understanding. In substantive terms, by working simultaneously at the individual and contextual levels, these analytic models begin to reflect the realities of social organisation. By providing estimates of both the average effect of a variable over a number of settings, and the extent to which that effect varies over settings, these models provide a means of 'thick' quantitative description.

The course begins by building on standard single-level models, and we develop the two-level model with continuous predictors and response. Examples include house prices varying over districts, and pupil progress varying by school. These models are subsequently extended to cover complex variation, both within and between levels, three-level models, and models with categorical predictors. We conclude with a consideration of estimators including maximum likelihood (operationalised through iterative generalized least squares) and a full Bayesian approach (operationalised through Markov chain Monte Carlo estimation). Throughout the course, we shall use graphical examples, verbal equations, algebraic formulation, class-based model interpretation, and practical modelling using the software package MLwiN.
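As a rough illustration of the two-level model with a continuous predictor and response, the block below writes out a random-intercept, random-slope specification in commonly used notation. The symbols are assumptions made for this sketch and are not taken from the course materials: i indexes level-1 units (for example pupils), j indexes level-2 units (for example schools), y is the response, and x is the predictor.

```latex
\begin{align}
y_{ij} &= \beta_{0j} + \beta_{1j} x_{ij} + e_{ij}, \\
\beta_{0j} &= \beta_0 + u_{0j}, \\
\beta_{1j} &= \beta_1 + u_{1j}, \\
\begin{pmatrix} u_{0j} \\ u_{1j} \end{pmatrix}
  &\sim N\!\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix},
  \begin{pmatrix} \sigma^2_{u0} & \sigma_{u01} \\ \sigma_{u01} & \sigma^2_{u1} \end{pmatrix} \right),
  \qquad e_{ij} \sim N(0, \sigma^2_e).
\end{align}
```

In this sketch, \beta_1 is the average effect of x over all level-2 units, and \sigma^2_{u1} measures how far that effect varies from one unit to another. Setting u_{1j} = 0 gives the simpler random-intercept model, in which the share of residual variance lying between level-2 units is \sigma^2_{u0} / (\sigma^2_{u0} + \sigma^2_e).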

On completion of the course, participants should be able to recognise a multilevel structure; specify a multilevel model with complex variation at a number of levels; and fit and interpret a range of multilevel models. The course does not explicitly cover multilevel analysis of panel-type data, multivariate responses, or survival data, although the course does provide the groundwork for these extensions. This course is appropriate if you are analysing a survey with complex structure, are interested in the importance of contextual questions, or if you need to undertake a quantitative performance review of an organisation.

Bibliography

Basic texts/overview

(Representative text used during the course)

Jones, K. and Duncan, C. 1998. 'Modelling context and heterogeneity: Applying multilevel models.' In E. Scarbrough and E. Tanenbaum (eds.), Research Strategies in the Social Sciences. Oxford University Press.
A large course pack, provided to participants, will include all necessary reading and transcripts of MLwiN sessions.

For web-based resources, have a look at the Centre for Multilevel Modelling.

Remedial Reading

You must have a good working knowledge of single-level regression modelling, including the handling of categorical predictors by dummy variables. If you do not have this knowledge or experience, do not come on the course. All participants will benefit from taking the first 3 modules of the free online training provided by LEMMA, especially Module 3, as it has been specifically designed to provide the necessary background and links to the software we are going to use.
Jones, K. Multilevel models for geographical research (freely downloadable).

Prerequisites

Participants taking this course should have good familiarity with regression modelling and inferential statistics. The aim of the course is not to cover mathematical derivations and statistical theory, but to provide a conceptual framework and hands-on experience with the interactive package MLwiN. Students should fully understand regression intercepts and slopes, standard errors, t-ratios, residuals, and concepts of variances and covariances. In terms of software, previous exposure to a Windows environment is all that is required. The full range of multilevel models cannot currently be fitted using standard packages such as SPSS. Consequently, full training will be given in MLwiN. To reiterate: if your knowledge of standard (that is, single-level) regression is non-existent or weak, this is not the course for you.

Software

The course will use the MLwiN software because of its ability to fit a very broad range of multilevel models using both maximum likelihood and MCMC estimation. The software is able to read SAS, Stata and SPSS files. It can handle large datasets, has very efficient estimation algorithms, and offers many post-estimation tools, thereby providing an ideal learning environment. A free, time-limited 30-day version is available from http://www.bristol.ac.uk/cmm/software/mlwin/download/.

Two useful free add-ons are runmlwin, a Stata command for fitting multilevel models in MLwiN from within Stata, and R2MLwiN, an equivalent package for the R statistical software environment.
