Andreas Kirsch

Bio & research interests

Hi there! 👋 I’m Andreas Kirsch.

I’m currently a Research Scientist at Google DeepMind on the Deep Learning Engineering team. Before this, I was a Research Scientist at Midjourney for a year.

I obtained my PhD (“DPhil”) with Prof Yarin Gal in the OATML group at the University of Oxford, as a student in the AIMS CDT program. You can reach me via email or anonymously via admonymous.

During my DPhil, my interests were in information theory and its applications: information bottlenecks, active learning with Bayesian deep learning, and uncertainty quantification. I also enjoyed thinking about AI ethics and AI safety: in particular, the ML safety course by the Center for AI Safety was a lot of fun.

My thesis focuses on data subset selection: “Advancing Deep Active Learning & Data Subset Selection: Unifying Principles with Information-Theory Intuitions”.

Originally from Romania, I grew up in Southern Germany. After studying Computer Science and Mathematics at the Technical University of Munich (among other things, reading machine learning under JĂŒrgen Schmidhuber 🎉), I spent a couple of years in Zurich as a software engineer at Google (YouTube Monetization) and worked as a performance research engineer at DeepMind for a year in 2016/17, before spending a gap year as a fellow at Newspeak House. I began my DPhil in September 2018 and submitted my thesis in April 2023.

selected publications

  1. ICLR Blogpost ’24
    Bayesian Model Selection: The Marginal Likelihood, Cross-Validation, and Conditional Log Marginal Likelihood
    Kirsch, Andreas
    In The Third Blogpost Track at ICLR 2024
  2. ICLR Blogpost ’24
    Highlight
    Bridging the Data Processing Inequality and Function-Space Variational Inference
    Kirsch, Andreas
    In The Third Blogpost Track at ICLR 2024
  3. PhD Thesis
    Advancing Deep Active Learning & Data Subset Selection: Unifying Principles with Information-Theory Intuitions
    Kirsch, Andreas
    2023
  4. TMLR
    Black-Box Batch Active Learning for Regression
    Kirsch, Andreas
    Transactions on Machine Learning Research 2023
  5. CVPR 2023
    Highlight
    Deterministic Neural Networks with Appropriate Inductive Biases Capture Epistemic and Aleatoric Uncertainty
    Mukhoti*, Jishnu, Kirsch*, Andreas, van Amersfoort, Joost, Torr, Philip H.S., and Gal, Yarin
    Conference on Computer Vision and Pattern Recognition 2023
  6. AISTATS 2023
    Prediction-Oriented Bayesian Active Learning
    Bickford Smith*, Freddie, Kirsch*, Andreas, Farquhar, Sebastian, Gal, Yarin, Foster, Adam, and Rainforth, Tom
    26th International Conference on Artificial Intelligence and Statistics 2023
  7. TMLR
    Repro. Cert.
    Does ’Deep Learning on a Data Diet’ reproduce? Overall yes, but GraNd at Initialization does not
    Kirsch, Andreas
    Transactions on Machine Learning Research (Reproducibility Certification) 2023
  8. TMLR
    Unifying Approaches in Active Learning and Active Sampling via Fisher Information and Information-Theoretic Quantities
    Kirsch, Andreas, and Gal, Yarin
    Transactions on Machine Learning Research 2022
  9. ICML 2022
    Prioritized Training on Points that are Learnable, Worth Learning, and not yet Learnt
    Mindermann*, Sören, Brauner*, Jan M, Razzak*, Muhammed T, Sharma*, Mrinank, Kirsch, Andreas, Xu, Winnie, Höltgen, Benedikt, Gomez, Aidan N, Morisot, Adrien, Farquhar, Sebastian, and Gal, Yarin
    In Proceedings of the 39th International Conference on Machine Learning 2022
  10. UDL 2020
    Learning CIFAR-10 with a Simple Entropy Estimator Using Information Bottleneck Objectives
    Kirsch, Andreas, Lyle, Clare, and Gal, Yarin
    In Uncertainty & Robustness in Deep Learning Workshop at ICML 2020
  11. Preprint
    Unpacking Information Bottlenecks: Unifying Information-Theoretic Objectives in Deep Learning
    Kirsch, Andreas, Lyle, Clare, and Gal, Yarin
    arXiv Preprint 2020
  12. NeurIPS 2019
    BatchBALD: Efficient and Diverse Batch Acquisition for Deep Bayesian Active Learning
    Kirsch*, Andreas, van Amersfoort*, Joost, and Gal, Yarin
    Advances in Neural Information Processing Systems 2019

news

Jun 20, 2024

Another year, another update: I have published two blog posts on the ICLR 2024 blog post track—while working at Midjourney. I’m grateful for the opportunity to work on this open research on the side.

In particular, one of the blog posts was selected as a Highlight of the blog post track:

  • “Bridging the Data Processing Inequality and Function-Space Variational Inference” Highlight (Kirsch, 2024)

  • “Bayesian Model Selection: The Marginal Likelihood, Cross-Validation, and Conditional Log Marginal Likelihood” (Kirsch, 2024)

Both blog posts are available on the ICLR 2024 Blogpost Track website.

Dec 1, 2023

Another year, another set of papers. This year was dominated by writing up my thesis and defending it. I’m also very happy to have joined Midjourney as a Research Scientist at the end of September. Research-wise, this year was mostly about wrapping up loose ends into papers:

  1. “Does ‘Deep Learning on a Data Diet’ reproduce? Overall yes, but GraNd at Initialization does not” (Kirsch, 2023)
  2. “Black-Box Batch Active Learning for Regression” (Kirsch, 2023)
  3. “Stochastic Batch Acquisition: A Simple Baseline for Deep Active Learning” (Kirsch et al., 2023)

And finally my thesis: “Advancing Deep Active Learning & Data Subset Selection: Unifying Principles with Information-Theory Intuitions” (Kirsch, 2023)

Dec 1, 2022

Very happy to have published a few papers at TMLR (and co-authored one presented at ICML) this year, and to have co-authored papers (as joint first author) that we will present at CVPR and AISTATS next year:

  1. “Prioritized Training on Points that are Learnable, Worth Learning, and not yet Learnt”, ICML 2022 (Mindermann* et al., 2022)
  2. “A Note on ‘Assessing Generalization of SGD via Disagreement’”, TMLR (Kirsch & Gal, 2022)
  3. “Unifying Approaches in Active Learning and Active Sampling via Fisher Information and Information-Theoretic Quantities”, TMLR (Kirsch & Gal, 2022)
  4. “Deterministic Neural Networks with Appropriate Inductive Biases Capture Epistemic and Aleatoric Uncertainty”, CVPR 2023 (Highlight) (Mukhoti* et al., 2023)
  5. “Prediction-Oriented Bayesian Active Learning”, AISTATS 2023 (Bickford Smith* et al., 2023)

Several of these can be traced to workshop papers, which we were able to expand and polish into full papers.

Jul 24, 2021

Seven workshop papers at ICML 2021 (five of which are first-author submissions):

Uncertainty & Robustness in Deep Learning

Two papers and posters at the Uncertainty & Robustness in Deep Learning workshop:

SubSetML: Subset Selection in Machine Learning: From Theory to Practice

Four papers (posters, one spotlight) at the SubSetML: Subset Selection in Machine Learning: From Theory to Practice workshop:

Neglected Assumptions In Causal Inference

One paper (poster) at the Neglected Assumptions In Causal Inference workshop:

Feb 23, 2021

Lecture on “Bayesian Deep Learning, Information Theory and Active Learning” for Oxford Global Exchanges. You can download the slides here.

Feb 21, 2021

Deterministic Neural Networks with Appropriate Inductive Biases Capture Epistemic and Aleatoric Uncertainty has been uploaded to arXiv as a preprint. Joint work with Jishnu Mukhoti, together with Joost van Amersfoort, Philip H.S. Torr, and Yarin Gal. We show that a single softmax neural net with minimal changes can beat the uncertainty predictions of Deep Ensembles and other more complex single-forward-pass uncertainty approaches.

Dec 10, 2020

Unpacking Information Bottlenecks: Unifying Information-Theoretic Objectives in Deep Learning was also presented as a poster at the “NeurIPS Europe meetup on Bayesian Deep Learning”.

You can find the poster below (click to open) or download it as a PDF.

Jul 17, 2020

Two workshop papers have been accepted to the Uncertainty & Robustness in Deep Learning Workshop at ICML 2020:

  1. Scalable Training with Information Bottleneck Objectives, and
  2. Learning CIFAR-10 with a Simple Entropy Estimator Using Information Bottleneck Objectives

both together with Clare Lyle and Yarin Gal. The former is based on Unpacking Information Bottlenecks: Unifying Information-Theoretic Objectives in Deep Learning, and the latter is an application of the UIB framework: we can use it to train models that perform well on CIFAR-10 without using a cross-entropy loss at all.

Mar 27, 2020

Unpacking Information Bottlenecks: Unifying Information-Theoretic Objectives in Deep Learning, together with Clare Lyle and Yarin Gal, has been uploaded to arXiv as a preprint. It examines and unifies different Information Bottleneck objectives and shows that we can introduce simple yet effective surrogate objectives without complex derivations.

Sep 4, 2019

BatchBALD: Efficient and Diverse Batch Acquisition for Deep Bayesian Active Learning got accepted into NeurIPS 2019. See you all in Vancouver!

Follow me on Twitter @blackhc