Challenging Complexity: Stochastic Acquisition for Efficient Deep Batch Active Learning
“Stochastic Batch Acquisition: A Simple Baseline for Deep Active Learning” has been published in the Transactions on Machine Learning Research (TMLR). This paper challenges the status quo in deep active learning for medium-to-large acquisition batch sizes (“more than a few, but not that many”: 10–1000 acquisitions, depending on the dataset) and examines an efficient approach to batch acquisition through simple stochastic extensions of standard acquisition functions.
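The core idea can be sketched with the Gumbel top-k trick: instead of deterministically taking the k highest-scoring pool points, perturb the acquisition scores with Gumbel noise and take the top-k of the perturbed scores, which samples a batch according to a softmax over the scores. A minimal sketch (function name and the temperature parameter `beta` are illustrative, not from the paper's code):

```python
import numpy as np

def stochastic_batch_acquisition(scores, batch_size, beta=1.0, rng=None):
    """Sample an acquisition batch by perturbing scores with Gumbel noise.

    Taking the top-k of (beta * scores + Gumbel noise) draws a batch
    according to a softmax distribution over the scores (Gumbel top-k),
    instead of the usual deterministic top-k.
    """
    rng = np.random.default_rng(rng)
    scores = np.asarray(scores, dtype=float)
    gumbel = rng.gumbel(size=scores.shape)
    perturbed = beta * scores + gumbel
    # Indices of the batch_size largest perturbed scores, highest first.
    return np.argsort(perturbed)[-batch_size:][::-1]

# Deterministic top-k would always pick the same near-duplicate points;
# the stochastic variant spreads acquisitions across the pool.
scores = np.array([2.0, 1.9, 1.8, 0.1, 0.05])
batch = stochastic_batch_acquisition(scores, batch_size=2, rng=0)
```

The appeal of this baseline is that it reuses any standard single-point acquisition score unchanged and costs no more than top-k selection.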
Black-Box Batch Active Learning for Regression (B3AL)
Training machine learning models can require massive labeled datasets. Active learning aims to reduce labeling costs by selecting the most informative samples for labeling. But how well do prediction-focused black-box techniques compare to parameter-focused white-box methods?
Unifying Approaches in Active Learning and Active Sampling
Our paper “Unifying Approaches in Active Learning and Active Sampling via Fisher Information and Information-Theoretic Quantities”
was recently published in TMLR.
Assessing Generalization via Disagreement
Our paper “A Note on ‘Assessing Generalization of SGD via Disagreement’”
was published in TMLR this week and serves as both a short reproduction and a review note. It engages with the claims of “Assessing Generalization of SGD via Disagreement” by Jiang et al. (2022), which received an ICLR 2022 spotlight. We would like to thank the authors for constructively engaging with our note on OpenReview.
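The central quantity in that line of work is simple to state: the disagreement rate between two independently trained models on unlabeled data, which Jiang et al. argue tracks the test error under their Generalization Disagreement Equality. A minimal sketch of that quantity (the function name is ours, for illustration):

```python
import numpy as np

def disagreement_rate(preds_a, preds_b):
    """Fraction of (unlabeled) test points on which two independently
    trained models predict different class labels."""
    preds_a = np.asarray(preds_a)
    preds_b = np.asarray(preds_b)
    return float(np.mean(preds_a != preds_b))

# Under the Generalization Disagreement Equality, this label-free
# disagreement rate is claimed to approximate the test error rate
# (assuming the ensemble is well calibrated).
rate = disagreement_rate([0, 1, 1, 0], [0, 1, 0, 0])
```

No labels are needed to compute it, which is what makes the claim attractive and worth scrutinizing.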
Stirling's Approximation for Binomial Coefficients
In MacKay (2003)
on page 2, the following straightforward approximation for a binomial coefficient is introduced: \[\begin{equation} \log \binom{N}{r} \simeq (N-r) \log \frac{N}{N-r} + r \log \frac{N}{r}. \end{equation}\] The derivation in the book is short but not very intuitive, although it feels like it should be. Information theory would be the likely candidate to provide intuition. But information-theoretic quantities like entropies apply only to random variables, not to fixed observations. Or do they?
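Before looking for intuition, it is worth checking numerically how good the approximation is. A quick sketch comparing it against the exact log-binomial coefficient (computed via `math.comb`):

```python
from math import comb, log

def log_binom_approx(N, r):
    """MacKay's approximation:
    log C(N, r) ~ (N - r) * log(N / (N - r)) + r * log(N / r),
    which equals N times the binary entropy of r/N (in nats)."""
    return (N - r) * log(N / (N - r)) + r * log(N / r)

N, r = 1000, 300
exact = log(comb(N, r))        # exact value of log C(1000, 300)
approx = log_binom_approx(N, r)
relative_error = abs(approx - exact) / exact
```

For N = 1000, r = 300 the relative error is well under one percent; the approximation slightly overestimates, since Stirling's formula drops a lower-order correction term of order log N.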