Abstract : The widespread uptake of machine learning for routine and complex tasks is an ambition that feels closer every year. However, a well-known reproducibility crisis is currently affecting machine-learning-based science, which could damage public confidence in these tools and slow their adoption. The diverse array of speakers in this minisymposium will present talks focussing on practical solutions to different aspects of the reproducibility crisis, on addressing inequalities in algorithm performance to improve fairness, on improving the explainability of models, and on methods for assessing the robustness of algorithms.
[02574] A tale of two crises: COVID-19 and ML reproducibility
Format : Talk at Waseda University
Author(s) :
Michael Thomas Roberts (University of Cambridge)
Abstract : Machine learning, like many fields before it, is suffering from a reproducibility crisis. In this talk we give an overview of four domains in which issues have been identified: (a) imaging, (b) missing-data imputation, (c) learning at scale and (d) engineering of codebases. We also present solutions to the problems identified.
[02265] Leakage and the reproducibility crisis in ML-based science
Format : Online Talk on Zoom
Author(s) :
Sayash Kapoor (Princeton University)
Arvind Narayanan (Princeton University)
Abstract : As quantitative fields adopt ML methods, it is important to ensure reproducibility. We show that data leakage is a widespread problem and has led to severe reproducibility failures. Through a literature survey of research in communities that adopted ML methods, we show that errors have been found in 17 fields, collectively affecting hundreds of papers and leading to wildly overoptimistic conclusions. We propose model info sheets to detect and prevent leakage in ML-based science.
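To make the failure mode concrete, here is a minimal, hedged sketch of one common form of leakage (the synthetic data, feature selector and model are illustrative assumptions, not taken from the talk): fitting any data-dependent step on the full dataset before splitting lets test-set information contaminate training.

```python
# Illustrative sketch: a feature selector fitted on ALL the data leaks
# test-set information into training and inflates accuracy.
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2000))      # pure-noise features
y = rng.integers(0, 2, size=200)      # random labels: true skill is 50%

# WRONG: the selector sees the test labels before the split.
X_leaky = SelectKBest(f_classif, k=20).fit_transform(X, y)
Xtr, Xte, ytr, yte = train_test_split(X_leaky, y, random_state=0)
leaky = LogisticRegression(max_iter=1000).fit(Xtr, ytr)
print("leaky accuracy:", leaky.score(Xte, yte))    # well above chance

# RIGHT: every fitted step stays inside the training fold.
Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)
honest = make_pipeline(SelectKBest(f_classif, k=20),
                       LogisticRegression(max_iter=1000)).fit(Xtr, ytr)
print("honest accuracy:", honest.score(Xte, yte))  # ~0.5, i.e. chance
```

Even though the labels here are pure noise, the leaky variant typically scores well above 50%, while the honest pipeline sits at chance.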
[02088] A critical look: overly optimistic results on the TPEHGDB dataset
Format : Online Talk on Zoom
Author(s) :
Gilles Vandewiele (IDLab, Ghent)
Abstract : I will discuss the overly optimistic prediction results that arise when data are oversampled before being partitioned into training and test sets. Specifically, I will present a case study on predicting preterm birth using the TPEHG database, where many studies report near-perfect predictive performance because of this fundamental mistake. After the mistake is corrected, the predictive power of the models becomes similar to a coin toss.
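As a hedged illustration of the mistake (synthetic data and imbalanced-learn's SMOTE stand in for the talk's TPEHG setting): oversampling before the split places near-duplicates of each minority sample on both sides of the train/test boundary.

```python
# Sketch of the oversampling pitfall with SMOTE (imbalanced-learn).
import numpy as np
from imblearn.over_sampling import SMOTE
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10))                     # noise features
y = np.r_[np.ones(30), np.zeros(470)].astype(int)  # 30 rare positives

# WRONG: oversample first, split second. SMOTE's synthetic positives
# interpolate the 30 real ones, so near-copies of each positive end up
# on both sides of the split and the classifier simply recognises them.
X_os, y_os = SMOTE(random_state=0).fit_resample(X, y)
Xtr, Xte, ytr, yte = train_test_split(X_os, y_os, random_state=0)
clf = RandomForestClassifier(random_state=0).fit(Xtr, ytr)
print("optimistic AUC:", roc_auc_score(yte, clf.predict_proba(Xte)[:, 1]))

# RIGHT: split first, then oversample the training fold only.
Xtr, Xte, ytr, yte = train_test_split(X, y, stratify=y, random_state=0)
Xtr_os, ytr_os = SMOTE(random_state=0).fit_resample(Xtr, ytr)
clf = RandomForestClassifier(random_state=0).fit(Xtr_os, ytr_os)
print("honest AUC:", roc_auc_score(yte, clf.predict_proba(Xte)[:, 1]))  # ~0.5
```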
[01993] Classification of datasets with imputed missing values: does imputation quality matter?
Format : Talk at Waseda University
Author(s) :
Tolou Shadbahr (University of Helsinki)
Michael Thomas Roberts (University of Cambridge)
Jan Stanczuk (University of Cambridge)
Julian Gilbey (University of Cambridge)
Philip Teare (AstraZeneca)
Sören Dittmer (University of Cambridge)
Matthew Thorpe (University of Manchester)
Ramon Vinas Torne (University of Cambridge)
Evis Sala (University of Cambridge)
Pietro Lio (University of Cambridge)
Mishal Patel (AstraZeneca)
James H.F. Rudd (University of Cambridge)
Tuomas Mirtti (University of Helsinki)
Antti Sakari Rannikko (University of Helsinki)
John Aston (University of Cambridge)
Jing Tang (University of Helsinki)
Carola-Bibiane Schönlieb (University of Cambridge)
Abstract : Classifying samples in incomplete datasets is a common but non-trivial task. Missing data are commonly observed in real-world datasets; missing values are typically imputed, and the now-complete samples are then classified. Often, the focus is on optimizing downstream classification performance. In this talk, we highlight the serious consequences of using poorly imputed data, demonstrate how the common measures of imputation quality are flawed, and introduce an improved class of imputation quality measures.
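A minimal sketch of why the common measures can mislead (the Wasserstein distance below is my stand-in for a distribution-aware measure; the talk's improved measures are not reproduced here): imputing the marginal mean minimises RMSE yet collapses the imputed values to a point, which a distributional discrepancy immediately exposes.

```python
# Sketch: RMSE rewards mean-imputation even though it collapses the
# distribution; a distributional measure ranks the two imputations oppositely.
import numpy as np
from scipy.stats import wasserstein_distance

rng = np.random.default_rng(0)
truth = rng.normal(size=10_000)           # held-out true values, ~ N(0, 1)

impute_mean = np.zeros_like(truth)        # fill with the marginal mean
impute_draw = rng.normal(size=truth.size) # fill by sampling from N(0, 1)

def rmse(a, b):
    return np.sqrt(np.mean((a - b) ** 2))

print("mean-impute: RMSE", rmse(impute_mean, truth),   # ~1.00 (better)
      " W1", wasserstein_distance(impute_mean, truth)) # ~0.80 (worse)
print("draw-impute: RMSE", rmse(impute_draw, truth),   # ~1.41 (worse)
      " W1", wasserstein_distance(impute_draw, truth)) # ~0.01 (better)
```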
[01736] Improving reproducibility, trustworthiness and fairness for diverse applications of machine learning
Format : Talk at Waseda University
Author(s) :
Hirotaka Takahashi (Tokyo City University)
Abstract : Our group applies machine learning to a diverse set of problems: research in gravitational-wave physics and astronomy, the development of traffic-safety training and skills-education methods, athlete-support systems for various sports, and educational applications in which teachers and machine learning collaborate. In this presentation, we focus on these various applications and discuss how we can improve reproducibility, trustworthiness and fairness in machine learning.
[01850] Multi-domain & Multi-task Generalisation on Real-World Clinical Data
Format : Talk at Waseda University
Author(s) :
Daniel Kreuter (University of Cambridge)
Samuel Tull (University of Cambridge)
Abstract : Machine learning models have promised to revolutionise healthcare for several years, yet promising approaches rarely translate into clinical deployment. Often this is because of an unexpected drop in performance on unseen test data caused by domain shift. Our novel "Disentanglement Autoencoder" approach accommodates multiple domains and tasks, both continuous and categorical, creating a disentangled embedding which can be used for multiple classification tasks.
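As a loose schematic only (this is a generic disentangled multi-domain, multi-task autoencoder in PyTorch, not the authors' Disentanglement Autoencoder; all layer sizes and heads are arbitrary assumptions), the general idea of splitting the latent code into domain and task factors looks like:

```python
# Generic schematic of a disentangled autoencoder: the latent code is
# split into a domain part and a task part, each with its own head.
import torch
import torch.nn as nn

class DisentangledAE(nn.Module):
    def __init__(self, d_in=32, d_dom=4, d_task=8, n_domains=3):
        super().__init__()
        d_lat = d_dom + d_task
        self.enc = nn.Sequential(nn.Linear(d_in, 64), nn.ReLU(),
                                 nn.Linear(64, d_lat))
        self.dec = nn.Sequential(nn.Linear(d_lat, 64), nn.ReLU(),
                                 nn.Linear(64, d_in))
        self.d_dom = d_dom
        self.domain_head = nn.Linear(d_dom, n_domains)  # predicts the domain
        self.task_head = nn.Linear(d_task, 1)           # e.g. one clinical task

    def forward(self, x):
        z = self.enc(x)
        z_dom, z_task = z[:, :self.d_dom], z[:, self.d_dom:]
        return self.dec(z), self.domain_head(z_dom), self.task_head(z_task)

x = torch.randn(16, 32)
recon, dom_logits, task_out = DisentangledAE()(x)
# Training would combine a reconstruction loss with supervision on each
# head (and, typically, a penalty discouraging task information in z_dom).
```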
[02239] Software engineering for data science
Format : Online Talk on Zoom
Author(s) :
Sören Dittmer (University of Cambridge)
Abstract : Despite democratized data science tools, developing a trustworthy and effective data science system (DSS) is becoming increasingly challenging. The lack of software engineering (SE) skills and perverse incentives are among the root causes. We analyze why SE, and building large complex systems in general, is hard. We identify how SE addresses those difficulties and discuss how to adapt its insights to DSSs. We emphasize two key development philosophies: incremental growth and feedback loops.
[02340] ShearletX: A Mathematical Approach Towards Explainability
Format : Online Talk on Zoom
Author(s) :
Gitta Kutyniok (LMU Munich)
Stefan Kolek (LMU Munich)
Robert Windesheim (LMU Munich)
Hector Andrade Loarca (LMU Munich)
Ron Levie (Technion)
Abstract : Automated decision making using machine learning, in particular deep learning, is becoming an increasingly important component of modern technical systems and often affects humans directly. In this talk, we will present an explainability approach, coined ShearletX, based on a combination of information theory and applied harmonic analysis, which not only often outperforms state-of-the-art methods but is also amenable to mathematical analysis.
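As a simplified, hedged sketch of the underlying mask-optimisation idea (ShearletX optimises a mask over shearlet coefficients, which requires a shearlet transform library; this toy uses a pixel-domain mask, a stand-in classifier, and an assumed sparsity weight):

```python
# Simplified sketch of mask-based explanation in PyTorch: find a sparse
# mask that retains the classifier's decision when the unmasked region
# is filled with noise. ShearletX applies this idea in the shearlet domain.
import torch
import torch.nn.functional as F

torch.manual_seed(0)
classifier = torch.nn.Sequential(          # toy stand-in classifier
    torch.nn.Flatten(), torch.nn.Linear(28 * 28, 10))
x = torch.randn(1, 1, 28, 28)              # input image to explain
target = classifier(x).argmax(dim=1)       # decision to be explained

mask_logits = torch.zeros_like(x, requires_grad=True)
opt = torch.optim.Adam([mask_logits], lr=0.1)
lam = 1e-3                                 # sparsity weight (assumption)

for _ in range(200):
    m = torch.sigmoid(mask_logits)         # mask values in [0, 1]
    # Fill the deleted region with noise so it carries no information.
    x_masked = m * x + (1 - m) * torch.randn_like(x)
    retain = F.cross_entropy(classifier(x_masked), target)  # keep decision
    loss = retain + lam * m.abs().mean()   # ...with as small a mask as possible
    opt.zero_grad(); loss.backward(); opt.step()

explanation = torch.sigmoid(mask_logits)   # highlights decision-relevant pixels
```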