Abstract : Successful applications of machine learning algorithms usually motivate theoretical studies of their computational and consistency properties. These theoretical studies help researchers and practitioners better understand the algorithms, identify appropriate application domains, and set hyperparameters to achieve the best performance. On the other hand, theoretical studies can in turn motivate new algorithms that address the limitations of existing ones, typically improving performance in specific scenarios or broadening the application domains of the existing algorithms. This minisymposium collects talks on recent advances that address the interplay between the mathematical foundations of machine learning and their applications.
Organizer(s) : Andreas Christmann, Han Feng, Qiang Wu
[03668] Learning through empirical gain maximization
Format : Online Talk on Zoom
Author(s) :
Yunlong Feng (State University of New York at Albany)
Qiang Wu (Middle Tennessee State University)
Abstract : In this presentation, we introduce a novel empirical gain maximization (EGM) framework for addressing robust regression problems with heavy-tailed noise or outliers in the response variable. EGM approximates the density function of the noise distribution rather than directly approximating the target function. This approach, stemming from minimum distance estimation, allows abnormal observations to be excluded, unlike traditional maximum likelihood estimation. We demonstrate that well-known robust nonconvex regression techniques, such as Tukey regression and truncated least squares regression, can be reformulated within this new framework. By developing a learning theory for EGM, we provide a unified analysis for these established, yet not fully understood, regression methods. The framework offers fresh insights into existing bounded nonconvex loss functions and reveals close connections between seemingly unrelated terminologies, such as Tukey's biweight loss and the triweight kernel. We also show that other prevalent bounded nonconvex loss functions in machine learning can be reinterpreted in terms of specific smoothing kernels in statistics. Lastly, the framework facilitates the construction of new bounded nonconvex loss functions for robust learning.
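As a minimal sketch under assumed notation (not the authors' exact formulation), the connection between a bounded robust loss and a smoothing kernel mentioned above can be illustrated with Tukey's biweight loss: subtracting it from its maximal value yields a bounded "gain" proportional to the triweight kernel, and the estimator maximizes the average gain over a hypothesis space.

% Sketch under assumed notation; c > 0 is a scale parameter.
\[
  \rho_c(t) =
  \begin{cases}
    \dfrac{c^2}{6}\left[1 - \left(1 - (t/c)^2\right)^3\right], & |t| \le c,\\[4pt]
    \dfrac{c^2}{6}, & |t| > c,
  \end{cases}
  \qquad
  G_c(t) = \frac{c^2}{6} - \rho_c(t).
\]
% Up to normalization, G_c coincides with the triweight kernel
% K(u) = (35/32)(1 - u^2)^3 on |u| <= 1 and vanishes elsewhere.
\[
  \hat{f} \;=\; \operatorname*{arg\,max}_{f \in \mathcal{H}} \; \frac{1}{n} \sum_{i=1}^{n} G_c\bigl(y_i - f(x_i)\bigr).
\]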
[01357] Skeletal-Based Image Processing for CNN-Based Image Classification
Format : Online Talk on Zoom
Author(s) :
Cen Li (Middle Tennessee State University)
Tsega Tsahai (Middle Tennessee State University)
Abstract : This work studies image processing techniques as a preprocessing step in image classification. Deep learning-based human pose estimation was used to preprocess raw images and extract key posture information, and convolutional neural networks (CNNs) were applied to learn classification models for human postures. Two applications have been developed: (1) teaching a humanoid robot to play an interactive game of Simon Says, and (2) a fall detection system for elderly residents in an assisted living facility.
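As a hypothetical sketch of this pipeline (the specific pose estimator, keypoint format, and network architecture used in the talk are not specified here), pose keypoints extracted from each image could be fed to a small 1D CNN classifier along the following lines; extract_keypoints is a placeholder for any deep-learning-based pose estimator.

# Hypothetical sketch: pose-keypoint extraction followed by a small CNN classifier.
import numpy as np
import tensorflow as tf

NUM_KEYPOINTS = 17          # assumed number of skeletal joints per image
NUM_CLASSES = 4             # assumed number of posture classes

def extract_keypoints(image: np.ndarray) -> np.ndarray:
    """Placeholder for a deep-learning pose estimator.

    Should return an array of shape (NUM_KEYPOINTS, 2) with normalized
    (x, y) joint coordinates for the person in the image.
    """
    raise NotImplementedError("plug in a pose-estimation model here")

def build_posture_classifier() -> tf.keras.Model:
    """A small 1D CNN over the ordered joint coordinates (illustrative only)."""
    return tf.keras.Sequential([
        tf.keras.Input(shape=(NUM_KEYPOINTS, 2)),
        tf.keras.layers.Conv1D(32, kernel_size=3, activation="relu"),
        tf.keras.layers.Conv1D(64, kernel_size=3, activation="relu"),
        tf.keras.layers.GlobalAveragePooling1D(),
        tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
    ])

# Usage sketch: keypoint arrays -> posture classifier training.
# X = np.stack([extract_keypoints(img) for img in images])   # shape (n, 17, 2)
# model = build_posture_classifier()
# model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
#               metrics=["accuracy"])
# model.fit(X, labels, epochs=20, validation_split=0.2)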
[01023] Total stability of kernel methods and localized learning
Format : Online Talk on Zoom
Author(s) :
Andreas Christmann (University of Bayreuth)
Hannes Koehler (University of Bayreuth)
Abstract : Regularized kernel-based methods typically depend on the underlying probability measure P and a few hyperparameters. We investigate the influence of simultaneous slight perturbations of P, the hyperparameters, and the kernel on the resulting predictor. Furthermore, kernel methods suffer from super-linear computational requirements for big data, so we extend our results to the context of localized learning. The talk is based on Koehler and Christmann, Journal of Machine Learning Research, 23:1-41, 2022.
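As a minimal illustrative sketch of localized learning (an assumed setup, not the construction analyzed in the paper): the input space is partitioned into regions, a regularized kernel predictor is fit on each region, and predictions are routed to the corresponding local model.

# Illustrative sketch of localized kernel learning (assumed setup).
import numpy as np
from sklearn.cluster import KMeans
from sklearn.kernel_ridge import KernelRidge

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(2000, 1))
y = np.sin(2 * X[:, 0]) + 0.1 * rng.standard_normal(2000)

# Partition the inputs into regions (here via k-means).
partition = KMeans(n_clusters=5, n_init=10, random_state=0).fit(X)

# Fit one regularized kernel predictor (kernel ridge regression) per region,
# so each fit only touches a fraction of the data.
local_models = {}
for region in range(5):
    mask = partition.labels_ == region
    local_models[region] = KernelRidge(alpha=0.1, kernel="rbf", gamma=1.0).fit(X[mask], y[mask])

def predict(X_new: np.ndarray) -> np.ndarray:
    """Route each point to its region and use the corresponding local predictor."""
    regions = partition.predict(X_new)
    y_hat = np.empty(len(X_new))
    for region in range(5):
        mask = regions == region
        if mask.any():
            y_hat[mask] = local_models[region].predict(X_new[mask])
    return y_hat

print(predict(np.array([[0.5], [-2.0]])))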
[01397] Learning Ability of Interpolating Convolutional Neural Networks
Format : Talk at Waseda University
Author(s) :
Tian-Yi Zhou (Georgia Institute of Technology)
Xiaoming Huo (Georgia Institute of Technology)
Abstract : It is frequently observed that overparameterized neural networks generalize well. Existing theoretical work on this phenomenon is mainly devoted to linear settings or fully connected neural networks. This paper studies the learning ability of an important family of deep neural networks, deep convolutional neural networks (DCNNs), in both underparameterized and overparameterized settings. We establish the best learning rates of underparameterized DCNNs without the parameter restrictions presented in the literature. We also show that, by adding well-defined layers to an underparameterized DCNN, we can obtain interpolating DCNNs that maintain the good learning rates of the underparameterized DCNN. This result is achieved by a novel network deepening scheme designed for DCNNs. Our work provides theoretical verification of how overfitted DCNNs generalize well.
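As a purely illustrative sketch of the general idea of moving from an underparameterized to an overparameterized CNN by appending layers (the authors' deepening scheme and its guarantees are not reproduced here), a configurable Keras model could look as follows.

# Illustrative only: appending convolutional layers to a small base CNN.
import tensorflow as tf

def build_dcnn(extra_layers: int = 0) -> tf.keras.Model:
    """A base 1D CNN plus a configurable number of appended conv layers."""
    layers = [
        tf.keras.Input(shape=(64, 1)),
        tf.keras.layers.Conv1D(16, kernel_size=5, padding="same", activation="relu"),
    ]
    # Appending further layers increases the parameter count, moving the model
    # from an underparameterized toward an overparameterized (interpolating) regime.
    for _ in range(extra_layers):
        layers.append(tf.keras.layers.Conv1D(16, kernel_size=5, padding="same", activation="relu"))
    layers += [tf.keras.layers.Flatten(), tf.keras.layers.Dense(1)]
    return tf.keras.Sequential(layers)

underparameterized = build_dcnn(extra_layers=0)
deepened = build_dcnn(extra_layers=6)
print(underparameterized.count_params(), deepened.count_params())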
[03542] Classification with Deep Neural Networks
Format : Talk at Waseda University
Author(s) :
Lei Shi (Fudan University)
Zihan Zhang (Fudan University & City University of Hong Kong)
Dingxuan Zhou (University of Sydney)
Abstract : Classification with deep neural networks (DNNs) has made impressive advancements in various learning tasks. Due to the unboundedness of the target function, generalization analysis for DNN classifiers with the logistic loss remains scarce. This talk will report our recent progress in establishing a unified framework of generalization analysis for both bounded and unbounded target functions. Our analysis is based on a novel oracle-type inequality, which enables us to overcome the boundedness restriction on the target function. In particular, for logistic classifiers trained by deep fully connected neural networks, we obtain optimal convergence rates by requiring only Hölder smoothness of the conditional probability. Under certain circumstances, such as when decision boundaries are smooth and the two classes are separable, the derived convergence rates can be independent of the input dimension. This talk is based on joint work with Zihan Zhang and Prof. Ding-Xuan Zhou.
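For reference (standard definitions, not results specific to the talk), the logistic loss and the target function referred to above can be written as follows, with labels Y in {-1, +1} and conditional probability eta(x) = P(Y = 1 | X = x).

% Standard definitions (not results from the talk).
\[
  \phi(t) = \log\bigl(1 + e^{-t}\bigr),
  \qquad
  \mathcal{E}(f) = \mathbb{E}\bigl[\phi\bigl(Y f(X)\bigr)\bigr].
\]
% The minimizer of \mathcal{E} is the (generally unbounded) target function
\[
  f^{*}(x) = \log\frac{\eta(x)}{1 - \eta(x)},
\]
% which diverges as \eta(x) \to 0 or 1; this is the unboundedness referred to above.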
[03729] Robust Deep Learning with Applications
Format : Talk at Waseda University
Author(s) :
Qiang Wu (Middle Tennessee State University)
Shu Liu (Middle Tennessee State University)
Abstract : Deep neural networks are playing increasingly important roles in machine learning and artificial intelligence. Their performance depends heavily on the network architecture and the loss function. The classical square loss is widely known to be sensitive to outliers. We propose the use of robust losses and two-stage algorithms for deep neural networks, which are able to extract robust features and handle outliers effectively. Applications in regression analysis and adversarial machine learning will be discussed.
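As a generic illustration (the specific robust loss and two-stage algorithm proposed in the talk are not reproduced here), replacing the square loss with a slowly growing alternative such as the Huber loss in a Keras regression network shows where a robust loss enters the training pipeline.

# Generic illustration of robust deep regression with the Huber loss;
# NOT the authors' proposed loss or two-stage algorithm.
import numpy as np
import tensorflow as tf

rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, size=(1000, 1)).astype("float32")
y = np.sin(3 * X[:, 0]) + 0.1 * rng.standard_normal(1000).astype("float32")
y[::50] += 10.0  # inject a few gross outliers into the responses

model = tf.keras.Sequential([
    tf.keras.Input(shape=(1,)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(1),
])

# The Huber loss grows linearly (not quadratically) on large residuals,
# limiting the influence of the injected outliers compared with mean squared error.
model.compile(optimizer="adam", loss=tf.keras.losses.Huber(delta=1.0))
model.fit(X, y, epochs=10, batch_size=32, verbose=0)
print(model.predict(np.array([[0.0]], dtype="float32")))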