Swiss Summer School 2019

Thomas Hills
Content Analysis and Natural Language Processing

Thomas Hills is a Professor of Psychology at the University of Warwick. He teaches courses in quantitative approaches to behavioral science, language, and computational social sciences. His publications include work in psychology, communications, education, and economics, and focus on issues associated with large-scale analysis of language. He is currently the Director of the Bridges Doctoral Training Centre in Mathematical and Social Sciences and the Co-Director of the Behavioural Science Global Research Priority at the University of Warwick, both of which aim to provide and develop quantitative approaches to data in the social sciences.

Workshop contents and objectives

The aim of this workshop is to provide participants with an understanding of new methods in content analysis made possible by new digital technology applied to text corpora. This will include new methods for collecting content data (e.g., social media or other text corpora), and computational methods for quantifying changes in content at the word and document level. Applications of this approach include predicting consumer views of brands or political leaders, detecting regional and historical changes in happiness, and using language to predict personality.

The course will begin by providing participants with a broad overview of data science and big data applications to existing problems in content analysis and its advances through natural language processing (computer applications to content analysis). Specific cases will then be taken up for a more detailed analysis of their methodological approach, and participants will work with data to replicate existing findings and investigate novel hypotheses of their own. Finally, participants will receive guidance in developing and answering questions of their own.

On completion of the course, participants will be able to recognize and implement many common approaches to content analysis and take the first steps towards formulating and addressing problems of their own as social data scientists. Participants will also be provided with detailed information about how to follow up and learn more with respect to their particular area of interest.

Prerequisites

Participants taking this course should be familiar with basic statistical ideas and have some experience with computer programming. The course will primarily use R, but I will provide all the code.

Software

Students are advised that prior knowledge of R and Python will help them advance more quickly with their applications, but this knowledge is not necessary to learn from this course.


