Machine learning (ML) methods have achieved remarkable successes on problems with independent and
identically distributed (IID) data. However, real-world data is typically not IID: environments change,
experimental conditions shift, new measurement devices are used, and selection biases are introduced.
Current ML methods struggle when asked to transfer or adapt quickly to such
out-of-distribution (OOD) data.
Causality [1] provides a principled mathematical framework to describe the distributional
differences that arise from the aforementioned system changes. In particular, it supposes that observed
system changes arise from changes to just a few underlying modules or mechanisms which function
independently [2].
In my PhD studies I am exploring how best to exploit the invariances that are observed across multiple
environments or experimental conditions by viewing them as imprints of (or clues about) the underlying
causal mechanisms. The central hypothesis is that these invariances reveal how the system can
change and thus how best to prepare for shifts that may occur in the future. My two main focuses are
causal representation learning [3]—the discovery of high-level abstract causal variables from low-level
observations—and the learning of invariant predictors [4,5] to enable OOD
generalization. I am also excited by causal discovery, where causal relations are learned from
heterogeneous data, for example to understand cellular processes.
[1] Pearl, J. (2009). Causality. Cambridge University Press.
[2] Peters, J., Janzing, D., & Schölkopf, B. (2017). Elements of causal inference: foundations and learning algorithms. MIT Press.
[3] Schölkopf, B. et al. (2021). Toward causal representation learning. Proceedings of the IEEE, 109(5), 612-634.
[4] Peters, J., Bühlmann, P., & Meinshausen, N. (2016). Causal inference by using invariant prediction: identification and confidence intervals. Journal of the Royal Statistical Society. Series B (Statistical Methodology), 947-1012.
[5] Arjovsky, M., Bottou, L., Gulrajani, I., & Lopez-Paz, D. (2019). Invariant risk minimization. arXiv:1907.02893.