ROUSSEAUX Emmanuel

Emmanuel Rousseaux


Teaching and Research Assistant
DSEC IT Manager

Office 5228
Tel. +41 22 379 82 31
E-mail


Seminars

  • Statistical Analysis of Categorical Data (Master)
  • Statistical Inference (Master)

University Degrees

  • Master of Science in Knowledge Discovery in Databases, Nantes-Lyon, France.
  • Engineering Degree in Business Intelligence, Polytech'Nantes, Nantes, France.
  • Master of Science (Maîtrise universitaire) in Mathematics, Nantes, France.

Research Fields

  • Machine Learning and Data Mining methodologies for Life Course analysis
  • Swarm Intelligence
  • Unbalanced Data
  • Cognitive Psychology
  • Health Sociology


Thesis project: A knowledge discovery and management framework for mining rare life course patterns

Increasingly used in the social sciences over the past decades, longitudinal analysis has recently seen new tools emerge, in particular in the field of sequence analysis. This work has shown that data mining tools such as association rules, decision trees and self-organizing maps can successfully be applied to extract knowledge about life trajectories. However, a database and software framework for handling the life course as a whole is currently lacking. A first goal of this thesis project is therefore to provide a high-level tool for manipulating and managing life course data. The software currently in development aims at (1) securing data through automatic tests of data consistency and of the representativeness of the initial population, and at facilitating (2) the manipulation of life courses, (3) the transmission of datasets, (4) interoperability between methods and (5) interoperability between datasets. In this sense, the software aims to provide a rigorous and efficient framework for what we could call "life course mining". We will then design two specific mining methods within this framework. The first will adapt the learning process of entropy-based decision trees to unbalanced data in which some classes occur very rarely. This case arises in particular when studying vulnerable situations (poor health, low income, divorce, etc.), which fortunately are usually rare, or when working with person-period data. The second will extend the association rule method based on the intensity of implication measure to the mining of multi-channel sequences. Special attention will be paid to the treatment of rule redundancy.
Behind all this work lie two thematic goals in health sociology: (1) to better detect and understand how some people fall into a poor health state, how some of them succeed in leaving this state and how some nearly vulnerable people manage to preserve good health, and (2) to gain more insight into the manifestation of the Cumulative Advantage/Disadvantage model in health trajectories.
Poster presented at the NCCR LIVES Site Visit of the Review Panel, November 12, 2012


The Dataset project: handling survey data in R

Designed especially for social scientists, the project aims to provide an efficient and secure way of handling cross-sectional and longitudinal survey data and preparing them for analysis.
The software is freely available on R-Forge; to install and use it, please follow this link.

  • To get started with the package easily, you can ask for introductory material by writing to .
  • You can ask other users for help by subscribing to the Dataset-users mailing list.
  • Get notified of new releases by subscribing to the Dataset-updates mailing list.
  • Feel free to request a feature or report a bug by sending an email to .

Publications
