Abstract : The recovery of high-dimensional functions from point evaluations
or more general linear measurements is a cornerstone of
approximation theory and numerical analysis. While both are
well-developed areas, recent advances in learning theory and
high-dimensional statistics have sparked new connections and
tools for approximating functions. The field has therefore seen
a remarkable synthesis of old and new results, especially in the
context of nonlinear models such as neural networks and of
randomized techniques.
In this minisymposium we aim to highlight recent developments in
high-dimensional regression and sampling, with modern
applications in machine learning and function approximation.
[05486] Lattice-based algorithms for multivariate function approximation
Format : Talk at Waseda University
Author(s) :
Frances Kuo (UNSW Sydney)
Abstract : This will be a talk on using lattices for multivariate function approximation.
[05565] Efficient recovery of non-periodic multivariate functions via samples
Format : Talk at Waseda University
Author(s) :
Nicolas Nagel (Chemnitz University of Technology)
Tino Ullrich (Chemnitz University of Technology)
Kai Lüttgen (Chemnitz University of Technology)
Felix Bartel (Chemnitz University of Technology)
Abstract : Based on previous observations on function recovery from samples via Chebyshev polynomials, we introduce a periodization technique which preserves smoothness and thus, via known results for the periodic case, yields an efficient recovery algorithm for non-periodic functions. Together with deterministic subsampling algorithms we can prove, for the first time, near-optimal bounds on the approximation error and support these claims with numerical experiments. This talk is based on joint work with Felix Bartel, Kai Lüttgen and Tino Ullrich.
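The particular periodization of this work is not reproduced here; as background, the following minimal Python sketch illustrates the classical cosine substitution $x = \cos(2\pi t)$, which turns a smooth non-periodic function on $[-1,1]$ into a smooth periodic one and identifies its Chebyshev coefficients with cosine coefficients recoverable from equispaced samples. Function names and the oversampling factor are illustrative assumptions.

```python
import numpy as np

def periodize(f):
    """Map f on [-1, 1] to the even, 1-periodic function g(t) = f(cos(2*pi*t)).
    g inherits the smoothness of f, so periodic recovery results apply."""
    return lambda t: f(np.cos(2 * np.pi * np.asarray(t)))

def chebyshev_coefficients_from_samples(f, n):
    """Recover the first n Chebyshev coefficients of f from equispaced samples
    of its periodization (they equal the cosine coefficients of g)."""
    g = periodize(f)
    m = 2 * n                              # simple oversampling choice (assumption)
    t = np.arange(m) / m                   # equispaced nodes in [0, 1)
    samples = g(t)
    k = np.arange(n)
    # quadrature for c_k = 2 * int_0^1 g(t) cos(2*pi*k*t) dt  (c_0 halved below)
    coeffs = (2.0 / m) * samples @ np.cos(2 * np.pi * np.outer(t, k))
    coeffs[0] /= 2.0
    return coeffs

if __name__ == "__main__":
    f = lambda x: np.exp(x)                # smooth but non-periodic test function
    c = chebyshev_coefficients_from_samples(f, 16)
    x = 0.3                                # evaluate sum_k c_k T_k(x), T_k(x) = cos(k arccos x)
    print(sum(ck * np.cos(k * np.arccos(x)) for k, ck in enumerate(c)), np.exp(x))
```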
[04593] Polynomial tractability for integrating functions with slowly decaying Fourier series
Format : Talk at Waseda University
Author(s) :
Takashi Goda (The University of Tokyo)
Abstract : This talk is concerned with the high-dimensional numerical integration problem. In this context, polynomial tractability refers to the scenario where the minimum number of function evaluations required to make the worst-case error less than or equal to a tolerance $\varepsilon$ grows only polynomially with respect to $\varepsilon^{-1}$ and the dimension $d$. For function spaces that are \emph{unweighted}, meaning that all variables contribute equally to the norm of functions, many negative results in the literature exhibit the curse of dimensionality. The aim of this talk is to show a contrasting result by introducing a non-trivial unweighted function space with absolutely convergent Fourier series that exhibits polynomial tractability with an explicit quasi-Monte Carlo rule.
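The explicit quasi-Monte Carlo rule of this work is not reproduced here; the sketch below only shows the generic shape of a rank-1 lattice rule in Python, with a placeholder generating vector (in practice the generating vector would be constructed, e.g. component by component, to match the function space).

```python
import numpy as np

def rank1_lattice_rule(f, n, z):
    """n-point rank-1 lattice rule on [0,1]^d with generating vector z:
    the integral of f is approximated by the average of f at x_i = frac(i*z/n)."""
    i = np.arange(n)[:, None]
    points = np.mod(i * np.asarray(z)[None, :] / n, 1.0)        # (n, d) lattice points
    return float(np.mean(f(points)))

if __name__ == "__main__":
    d, n = 8, 1021                                              # n prime
    z = np.array([1, 433, 229, 841, 325, 103, 487, 199])        # placeholder generating vector
    # Test integrand with exact integral 1 over [0,1]^d
    f = lambda x: np.prod(1.0 + (x - 0.5) / (1.0 + np.arange(d)) ** 2, axis=1)
    print(rank1_lattice_rule(f, n, z))
```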
[04266] Weighted least-squares approximation in expected $L^2$ norm
Format : Talk at Waseda University
Author(s) :
Matthieu Dolbeault (Sorbonne Université)
Albert Cohen (Sorbonne Université)
Abdellah Chkifa (Mohammed VI Polytechnic University)
Abstract : We investigate the problem of approximating a function in $L^2$ with a linear space of functions of dimension $n$, using only evaluations at $m$ chosen points. We improve on earlier results based on the solution to the Kadison-Singer problem by using a randomized greedy strategy, which allows us to reduce the oversampling ratio $m/n$ and provides an algorithm of polynomial complexity.
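The randomized greedy strategy of this work is not reproduced here; the following Python sketch only illustrates the baseline it improves on: drawing i.i.d. samples from the inverse-Christoffel density of the approximation space and solving the corresponding weighted least-squares problem, for a Legendre polynomial space on $[-1,1]$. The pool-based resampling and all names are illustrative assumptions.

```python
import numpy as np
from numpy.polynomial import legendre

def legendre_basis(x, n):
    """Orthonormal Legendre basis L_0, ..., L_{n-1} on [-1,1] w.r.t. dx/2."""
    return legendre.legvander(x, n - 1) * np.sqrt(2 * np.arange(n) + 1)

def weighted_least_squares(f, n, m, rng, pool_size=200000):
    """Approximate f by an n-term Legendre expansion from m weighted samples
    drawn (approximately) from the optimal density ~ k_n(x), where
    k_n(x) = sum_k L_k(x)^2 is the inverse Christoffel function."""
    pool = rng.uniform(-1.0, 1.0, pool_size)             # discrete proposal pool
    k_pool = np.sum(legendre_basis(pool, n) ** 2, axis=1)
    x = rng.choice(pool, size=m, replace=True, p=k_pool / k_pool.sum())
    V = legendre_basis(x, n)
    w = n / np.sum(V ** 2, axis=1)                       # weights n / k_n(x_i)
    sw = np.sqrt(w)
    coeffs, *_ = np.linalg.lstsq(sw[:, None] * V, sw * f(x), rcond=None)
    return coeffs

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    c = weighted_least_squares(np.exp, n=10, m=40, rng=rng)
    print(c[:4])            # leading Legendre coefficients of exp on [-1,1]
```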
[05485] On the relation between adaptive and non-adaptive randomized sampling
Format : Talk at Waseda University
Author(s) :
Stefan Heinrich (RPTU Kaiserslautern-Landau)
Abstract : Recently the author solved a long-standing problem of Information-Based Complexity: Is there a constant $c>0$ such that for all linear problems the randomized non-adaptive and adaptive $n$-th minimal errors can deviate by at most a factor of $c$? The analysis of vector-valued mean computation showed that the answer is negative.
In this talk we give a survey of this and related results concerning further aspects of the problem.
[05488] A multivariate Riesz basis of ReLU neural networks
Format : Talk at Waseda University
Author(s) :
Cornelia Schneider (Friedrich-Alexander Universität Erlangen)
Jan Vybiral (Czech Technical University)
Abstract : We consider the trigonometric-like system of piecewise linear functions introduced recently by Daubechies, DeVore, Foucart, Hanin, and Petrova. We provide an alternative proof, based on the Gershgorin theorem, that this system forms a Riesz basis of $L_2([0,1])$. We also generalize this system to higher dimensions $d>1$ by a construction that avoids (tensor) products. As a consequence, the functions from the new Riesz basis of $L_2([0,1]^d)$ can be easily represented by neural networks. Moreover, the Riesz constants of this system are independent of $d$, making it an attractive building block for future multivariate analysis of neural networks.
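The basis of Daubechies, DeVore, Foucart, Hanin, and Petrova itself is not reproduced here; the short sketch below only illustrates the elementary fact such constructions exploit: a univariate piecewise-linear hat function is exactly a one-hidden-layer ReLU network with three neurons, so piecewise-linear basis elements have an immediate neural-network representation. Names and node placement are illustrative.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def hat(x, a, b, c):
    """Piecewise-linear hat supported on [a, c], equal to 1 at b, written as a
    one-hidden-layer ReLU network with three neurons."""
    return (relu(x - a) / (b - a)
            - relu(x - b) * (1.0 / (b - a) + 1.0 / (c - b))
            + relu(x - c) / (c - b))

if __name__ == "__main__":
    x = np.linspace(0.0, 1.0, 9)
    print(hat(x, 0.25, 0.5, 0.75))   # zero outside [0.25, 0.75], peak 1 at 0.5
```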
Abstract : Machine learning regression methods leverage large datasets for training predictive models. However, using large datasets may not be feasible due to computational limitations or high labelling costs. Therefore, sampling small training sets from large pools of unlabelled data is essential to maximize performance while maintaining computational efficiency. In this work, we study a sampling approach aimed at minimizing the fill distance of the selected set. We derive an upper bound for the maximum expected prediction error that depends linearly on the training set fill distance, conditional on the knowledge of the data features. For empirical validation, we show that selecting a training set by farthest point sampling significantly reduces the maximum prediction error, outperforming existing sampling approaches.
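The authors' implementation is not reproduced here; the following minimal Python sketch shows the generic farthest point sampling heuristic referred to above, together with the fill distance of the selected training set. All names are assumptions.

```python
import numpy as np

def farthest_point_sampling(X, k, rng=None):
    """Greedily select k indices from the pool X (shape (N, d)): each new point
    is the pool point farthest from the already selected set, which keeps the
    fill distance of the selected set small."""
    rng = np.random.default_rng() if rng is None else rng
    selected = [int(rng.integers(X.shape[0]))]               # random first point
    dist = np.linalg.norm(X - X[selected[0]], axis=1)        # distance to selected set
    for _ in range(k - 1):
        nxt = int(np.argmax(dist))                           # farthest pool point
        selected.append(nxt)
        dist = np.minimum(dist, np.linalg.norm(X - X[nxt], axis=1))
    return np.array(selected)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.uniform(size=(1000, 2))                          # unlabelled pool
    idx = farthest_point_sampling(X, 20, rng)
    # Fill distance: largest distance from any pool point to the training set
    fill = np.max(np.min(np.linalg.norm(X[:, None, :] - X[idx][None, :, :], axis=2), axis=1))
    print(fill)
```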
[05514] Efficient training of Gaussian processes with tensor product structure
Format : Talk at Waseda University
Author(s) :
Max Pfeffer (University of Goettingen)
Josie König (Universität Potsdam)
Martin Stoll (TU Chemnitz)
Abstract : We consider the special case of Gaussian process kernel learning where the covariance function is given by a sum of products of RBF kernels. For a given dataset, the parameters of the kernel are learned by minimizing the log marginal likelihood. Computing the log-determinant of the covariance matrix is prohibitive for large datasets or in high dimensions unless one exploits its low-rank tensor structure. We employ stochastic trace estimation together with a Lanczos algorithm for TT-tensors (tensor trains). This allows us to break the curse of dimensionality and to perform a gradient-based optimization even in high dimensions.
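The tensor-train implementation of this work is not reproduced here; the sketch below shows, in plain dense Python, the underlying ingredient: stochastic Lanczos quadrature (Hutchinson probes combined with Lanczos) for estimating the log-determinant of a covariance matrix from matrix-vector products only. Probe and iteration counts, and the test kernel, are illustrative assumptions.

```python
import numpy as np

def lanczos(matvec, v, m):
    """m-step Lanczos on a symmetric operator, started from the unit vector v;
    returns the diagonal and off-diagonal of the tridiagonal matrix T."""
    alphas, betas = [], []
    v_prev, beta = np.zeros_like(v), 0.0
    for _ in range(m):
        w = matvec(v) - beta * v_prev
        alpha = v @ w
        w -= alpha * v
        beta = np.linalg.norm(w)
        alphas.append(alpha)
        betas.append(beta)
        if beta < 1e-12:
            break
        v_prev, v = v, w / beta
    return np.array(alphas), np.array(betas[:-1])

def logdet_slq(matvec, n, n_probes=20, m=50, rng=None):
    """Stochastic Lanczos quadrature estimate of log det(K) for an SPD matrix
    available only through matrix-vector products."""
    rng = np.random.default_rng() if rng is None else rng
    est = 0.0
    for _ in range(n_probes):
        z = rng.choice([-1.0, 1.0], size=n)                  # Rademacher probe
        alphas, betas = lanczos(matvec, z / np.sqrt(n), m)
        T = np.diag(alphas) + np.diag(betas, 1) + np.diag(betas, -1)
        theta, V = np.linalg.eigh(T)
        est += n * np.sum(V[0, :] ** 2 * np.log(theta))      # Gauss quadrature for z^T log(K) z
    return est / n_probes

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(500, 3))
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=2)
    K = np.exp(-0.5 * sq) + 0.1 * np.eye(500)                # RBF kernel plus noise term
    print(logdet_slq(lambda v: K @ v, 500, rng=rng), np.linalg.slogdet(K)[1])
```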
[05610] Tensor decompositions for high-dimensional kernel regression
Author(s) :
Frederiek Wesel (Delft University of Technology)
Kim Batselier (Delft University of Technology)
Abstract : Kernel machines provide a nonlinear extension of linear methods by projecting the data into a higher-dimensional feature space. When considering product kernels, the dimensionality of the feature space grows exponentially with the dimensionality of the data, limiting these methods to low-dimensional data. We lift this limitation by constraining the model weights to be a canonical polyadic decomposition of low rank. This allows us to derive a block coordinate descent algorithm for training kernel machines under this constraint at a computational complexity which is linear both in the number of samples and in the dimensionality of the data, allowing us to tackle both large-sample and high-dimensional data.
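The exact algorithm and kernel features of this work are not reproduced here; the Python sketch below illustrates the general idea under simplifying assumptions: a per-dimension monomial feature map stands in for the product-kernel features, the weights are kept in a rank-R canonical polyadic (CP) format, and block coordinate descent updates one factor at a time by solving a small regularized least-squares problem, at cost linear in the number of samples and in the dimension.

```python
import numpy as np

def features(x, M=4):
    """Hypothetical per-dimension feature map: monomials 1, x, ..., x^(M-1)."""
    return np.stack([x ** m for m in range(M)], axis=-1)          # (N, M)

def cp_kernel_machine(X, y, R=3, M=4, reg=1e-6, sweeps=10, rng=None):
    """Fit f(x) = sum_r prod_d <phi(x_d), W_d[:, r]> with CP-constrained weights
    via block coordinate descent over the factors W_1, ..., W_D."""
    rng = np.random.default_rng() if rng is None else rng
    N, D = X.shape
    Phi = [features(X[:, d], M) for d in range(D)]                # D matrices (N, M)
    W = [rng.normal(scale=0.5, size=(M, R)) for _ in range(D)]    # CP factors
    G = [Phi[d] @ W[d] for d in range(D)]                         # per-dimension scores (N, R)
    for _ in range(sweeps):
        for d in range(D):
            C = np.ones((N, R))                                   # product over the other dimensions
            for d2 in range(D):
                if d2 != d:
                    C *= G[d2]
            # The model is linear in W_d: design matrix A[i, (m, r)] = Phi_d[i, m] * C[i, r]
            A = (Phi[d][:, :, None] * C[:, None, :]).reshape(N, M * R)
            w = np.linalg.solve(A.T @ A + reg * np.eye(M * R), A.T @ y)
            W[d] = w.reshape(M, R)
            G[d] = Phi[d] @ W[d]
    return lambda Xn: np.sum(np.prod([features(Xn[:, d], M) @ W[d] for d in range(D)], axis=0), axis=1)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.uniform(-1.0, 1.0, size=(2000, 6))
    y = np.sin(X[:, 0]) * X[:, 1] + X[:, 2] ** 2
    predict = cp_kernel_machine(X, y, rng=rng)
    print(np.mean((predict(X) - y) ** 2))                         # training error of the sketch
```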
[05481] Optimal sampling for regression: from linear to nonlinear approximation
Format : Online Talk on Zoom
Author(s) :
Anthony Nouy (Nantes Université - Centrale Nantes)
Abstract : We consider the approximation of functions in $L^2$ from point evaluations, using linear or nonlinear approximation tools. For linear approximation, recent results show that weighted least-squares projections yield quasi-optimal approximations with a near-optimal sampling budget.
This can be achieved by drawing i.i.d. samples from suitable distributions (depending on the linear approximation tool) and subsampling methods.
In the first part of this talk, we review different strategies based on i.i.d. sampling and present alternative strategies based on repulsive point processes that achieve the same task with a reduced sampling complexity.
In the second part, we show how these methods can be used to approximate functions with nonlinear approximation tools, in an active learning setting, by coupling iterative algorithms with optimal sampling methods for the projection onto successive linear spaces. We particularly focus on approximation using tree tensor networks, whose architectures allow for an efficient implementation of optimal sampling procedures within coordinate descent algorithms.
This is joint work with R. Gruhlke, B. Michel, C. Miranda and P. Trunschke.