Research Seminar in Linguistics

This seminar hosts invited speakers specializing in various areas of linguistics. Members of the Department, students, and interested outside attendees are all cordially invited.

Seminar description

Title Inducing Interpretable Causal Structures in Neural Networks
Speaker Christopher Potts (Stanford University)
Date Tuesday, March 1, 2022
Time 4:15 pm (schedule change)
Room Zoom (Meeting ID: 679 4364 6694, Passcode: 513912) (room change)
Description

Early symbolic NLP models were designed to leverage valuable insights about language and cognition. These insights were expressed directly in hand-designed structures, and this ensured that model behaviors were systematic and interpretable. Unfortunately, these models also tended to be brittle and specialized. By contrast, present-day models are data-driven and can flexibly acquire complex behaviors, which has opened up many new avenues. However, the trade-offs are now evident: these models often find opaque, unsystematic solutions. In this talk, I'll report on our ongoing efforts to combine the best aspects of the old and new using techniques from causal abstraction analysis. In this method, we define high-level causal models, usually in symbolic terms, and then train neural networks to conform to the structure of those models while also learning specific tasks. The central technical piece is interchange intervention training (IIT), in which we swap internal representations in the target neural model in a way that is guided by the input–output behavior of the causal model. Where the IIT objective is minimized, the high-level model is an interpretable, faithful proxy for the underlying neural model. My talk will focus on how and why IIT works, since I am hoping this will help people identify new application areas for it, and I will also briefly review case studies applying IIT to natural language inference, grounded language understanding, and language model distillation.

(Joint work with Atticus Geiger, Zhengxuan Wu, Hanson Lu, Josh Rozner, Elisa Kreiss, Thomas Icard, and Noah Goodman)
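
To make the interchange mechanics concrete, here is a minimal sketch of the IIT objective on a toy arithmetic task, written in PyTorch. The task (y = (x1 + x2) + x3, with one intermediate causal variable S = x1 + x2), the network, the choice of which hidden units realize S, and all names are illustrative assumptions for exposition, not the implementation described in the talk.

import torch
import torch.nn as nn
import torch.nn.functional as F

# Illustrative high-level causal model for the toy task y = (x1 + x2) + x3,
# with one intermediate variable S = x1 + x2. (Assumed example, not from the talk.)
def causal_output(x):
    return (x[:, 0] + x[:, 1] + x[:, 2]).unsqueeze(1)

def causal_counterfactual(base, source):
    # Intervene on S: compute S from the source input, everything else from the base.
    s = source[:, 0] + source[:, 1]
    return (s + base[:, 2]).unsqueeze(1)

class Net(nn.Module):
    def __init__(self, d_hidden=16):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(3, d_hidden), nn.Tanh())
        self.decoder = nn.Linear(d_hidden, 1)
        self.aligned = slice(0, 4)  # hidden units we align with the causal variable S

    def forward(self, x, source=None):
        h = self.encoder(x)
        if source is not None:
            # Interchange intervention: splice the aligned slice of the source
            # run's representation into the base run; gradients flow through both.
            h = h.clone()
            h[:, self.aligned] = self.encoder(source)[:, self.aligned]
        return self.decoder(h)

net = Net()
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
for step in range(2000):
    base, source = torch.randn(64, 3), torch.randn(64, 3)
    # Task loss: match the causal model's ordinary behavior on the base input.
    task_loss = F.mse_loss(net(base), causal_output(base))
    # IIT loss: after the interchange, the network must reproduce the causal
    # model's counterfactual output under the corresponding intervention on S.
    iit_loss = F.mse_loss(net(base, source=source),
                          causal_counterfactual(base, source))
    loss = task_loss + iit_loss
    opt.zero_grad()
    loss.backward()
    opt.step()

Driving the IIT loss to zero means that swapping the aligned hidden units has exactly the effect on the network's output that intervening on S has in the causal model, which is the sense in which the high-level model becomes a faithful, interpretable proxy for the network.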

   