This data analysis course introduces approaches to regression modeling for a wide variety of response variables. By the end of the course, students should be able to decide which technique is relevant in which situation and apply it successfully, including to new situations.
Students work in groups of 3-4 and are given a set of simulated data. The first year showed that the proposed data set (real data) was too trivial, so in the second year the teachers altered the simulated data beforehand to force students to make methodological choices. Missing values, outliers and inconsistencies were deliberately introduced so that the analysis could not simply be automated. The students' task is to analyze the data provided to answer a scientific question of interest, using ChatGPT as a digital partner playing the role of analyst.
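To make the data alteration step concrete, here is a minimal sketch of how missing values, outliers and inconsistencies might be injected into a simulated data set. It is not the teachers' actual procedure: the variable names (`age`, `outcome`), the simulated relationship, the contamination rates and the helper names `simulate` and `degrade` are all hypothetical. The course works in R, and Python is used here only for illustration.

```python
import random

def simulate(n, seed=0):
    """Simulate a clean data set with a linear signal (hypothetical model)."""
    rng = random.Random(seed)
    rows = []
    for i in range(n):
        age = rng.randint(18, 80)
        # Linear relationship plus Gaussian noise; coefficients are arbitrary.
        outcome = 2.0 + 0.5 * age + rng.gauss(0, 5)
        rows.append({"id": i, "age": age, "outcome": outcome})
    return rows

def degrade(rows, seed=1, p_missing=0.05, p_outlier=0.02, p_inconsistent=0.02):
    """Return a copy of rows with at most one defect injected per row."""
    rng = random.Random(seed)
    out = []
    for row in rows:
        row = dict(row)  # copy so the clean data stays untouched
        u = rng.random()
        if u < p_missing:
            row["outcome"] = None                 # missing value
        elif u < p_missing + p_outlier:
            row["outcome"] = row["outcome"] * 10  # gross outlier
        elif u < p_missing + p_outlier + p_inconsistent:
            row["age"] = -row["age"]              # impossible (inconsistent) value
        out.append(row)
    return out

clean = simulate(500)
messy = degrade(clean)

# Simple audit of the injected defects.
n_missing = sum(r["outcome"] is None for r in messy)
n_negative_age = sum(r["age"] < 0 for r in messy)
print(f"missing outcomes: {n_missing}, negative ages: {n_negative_age}")
```

Because each defect type calls for a different decision (imputing or dropping, keeping or trimming, correcting or excluding), a data set prepared along these lines cannot be analyzed well by running generated code uncritically.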
The project relies on students' active involvement in a critical interaction with the tool, guiding its analysis by integrating the methodological and statistical tools covered in the course. ChatGPT is used to generate code in R and analyze the data provided in relation to the research question. Students then write a critical report.
The expected report must include a rigorous evaluation of the AI's proposals, considering the accuracy and consistency of the statistical choices, the validity of the methodologies applied in relation to the course content, the robustness of the interpretations, and any biases, errors or inconsistencies encountered in the generated responses. All exchanges with ChatGPT must be documented via an accessible link. At the end of the project, the groups present their work orally and are questioned by another group that worked on the same data set.
The project is supplemented by an in-class practical session during which the teaching assistant demonstrates an interaction with ChatGPT on a concrete example. An intermediate feedback phase is also offered, during which groups can request a meeting to discuss their progress.
Whereas in the first year this work counted only as a bonus for students, the project assessment now accounts for 30% of the final course grade, with the remaining 70% allocated to an individual computer-based examination. This exam is based on a new data set, with questions assessing students' understanding and application of the methods taught in class.