Abstract : The “unreasonable effectiveness” of deep learning for massive datasets has posed numerous mathematical and algorithmic challenges on the path toward a deeper understanding of new phenomena in machine learning. This minisymposium aims to bring together applied mathematicians interested in the mathematical aspects of deep learning, with diverse backgrounds and expertise in modeling high-dimensional scientific computing problems and nonlinear physical systems; the talks reflect the collaborative, multifaceted nature of the mathematical theory and applications of deep neural networks.
[05474] Monte Carlo neural networks: Stochastic gradient descent learns random variables
Format : Online Talk on Zoom
Author(s) :
Sebastian Becker (ETH Zurich)
Arnulf Jentzen (The Chinese University of Hong Kong, Shenzhen & University of Münster)
Marvin Müller (2Xideas Switzerland AG)
Philippe von Wurstemberger (ETH Zurich & The Chinese University of Hong Kong, Shenzhen)
Abstract : In financial engineering, prices of financial products are computed approximately many times each trading day, with (slightly) different parameters in each calculation. Here we introduce a new approximation strategy for such parametric approximation problems, where we employ stochastic gradient descent not to train the parameters of standard neural networks (NNs) but instead to learn random variables appearing in Monte Carlo approximations. In the tested examples, the proposed approach achieves much higher approximation precision than standard NNs.
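As a rough, self-contained illustration of this idea (a minimal sketch on a toy Black-Scholes call-pricing problem; all names and parameter values below are our own assumptions, not the authors' code), stochastic gradient descent can train the Monte Carlo sample points themselves rather than network weights:

    # Train N "sample points" s_1..s_N so that the empirical mean of
    # max(s_i - K, 0) approximates the expected call payoff for all
    # strikes K simultaneously.
    import torch

    torch.manual_seed(0)
    S0, sigma, T, N = 1.0, 0.2, 1.0, 4096         # toy model parameters
    s = torch.nn.Parameter(S0 * torch.exp(        # learnable "random variables",
        sigma * T ** 0.5 * torch.randn(N)         # initialized as genuine GBM samples
        - 0.5 * sigma ** 2 * T))
    opt = torch.optim.Adam([s], lr=1e-3)

    for step in range(2000):
        K = 0.5 + torch.rand(128, 1)              # random strikes in [0.5, 1.5]
        Z = torch.randn(128, 1)                   # fresh Monte Carlo noise
        ST = S0 * torch.exp(sigma * T ** 0.5 * Z - 0.5 * sigma ** 2 * T)
        target = torch.clamp(ST - K, min=0.0)     # unbiased payoff sample
        pred = torch.clamp(s - K, min=0.0).mean(1, keepdim=True)
        loss = ((pred - target) ** 2).mean()      # SGD on the learned samples
        opt.zero_grad(); loss.backward(); opt.step()

Minimizing the expected squared error against fresh, unbiased payoff samples drives the learned point cloud toward one whose empirical average approximates the price across all strikes at once.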
[01329] Deep adaptive basis Galerkin method for evolution equations
Format : Online Talk on Zoom
Author(s) :
Yiqi Gu (University of Electronic Science and Technology of China)
Michael K. Ng (The University of Hong Kong)
Abstract : We study deep neural networks (DNNs) for solving high-dimensional evolution equations. Unlike other existing methods (e.g., the least-squares method) that deal with the time and space variables simultaneously, we propose a deep adaptive basis approximation structure. On the one hand, orthogonal polynomials are employed to form the temporal basis to achieve high accuracy in time. On the other hand, DNNs are employed to form the adaptive spatial basis for high dimensions in space.
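A minimal sketch of what such an ansatz could look like (our reading of the abstract, not the authors' code; sizes, degree, and activation are assumptions):

    # u(x, t) ~ sum_k P_k(t) * N_k(x): Legendre polynomials P_k as temporal
    # basis, one DNN producing the K+1 adaptive spatial basis functions.
    import torch

    K, d = 7, 10                                  # polynomial degree, spatial dim
    net = torch.nn.Sequential(
        torch.nn.Linear(d, 64), torch.nn.Tanh(),
        torch.nn.Linear(64, 64), torch.nn.Tanh(),
        torch.nn.Linear(64, K + 1))               # x -> (N_0(x), ..., N_K(x))

    def legendre(t, K):
        """P_0..P_K on [-1, 1] via the three-term recurrence."""
        P = [torch.ones_like(t), t]
        for k in range(1, K):
            P.append(((2 * k + 1) * t * P[-1] - k * P[-2]) / (k + 1))
        return torch.stack(P[:K + 1], dim=-1)     # shape (..., K+1)

    def u(x, t):                                  # x: (B, d), t: (B,)
        return (legendre(t, K) * net(x)).sum(-1)  # adaptive-basis ansatz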
[01635] Identifying reaction channels via reinforcement learning
Format : Talk at Waseda University
Author(s) :
Senwei Liang (Lawrence Berkeley National Laboratory)
Abstract : Reactive trajectories between metastable states are rare yet important in studying reactions. This talk introduces a new method for identifying the reaction channels where reactive trajectories occur frequently via reinforcement learning (RL). The action function in RL learns to seek connective configurations based on rewards from simulation. We characterize the reaction channels by data points sampled by shooting from the located connective configurations. These data points bridge the stable states and cover most transition regions of interest, enabling us to study reaction mechanisms on narrowed regions rather than the entire configuration space.
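As a heavily simplified, purely illustrative sketch of such a search (everything below, including the stand-in reward, is our assumption; a real implementation would run molecular dynamics in place of the placeholder):

    # A policy proposes configuration moves; the reward from short "shooting"
    # simulations is high for connective (transition-state-like) configurations.
    import torch

    policy = torch.nn.Sequential(torch.nn.Linear(2, 32), torch.nn.Tanh(),
                                 torch.nn.Linear(32, 2))   # mean of proposed move
    opt = torch.optim.Adam(policy.parameters(), lr=1e-3)

    def shooting_reward(x):
        """Placeholder for the fraction of short trajectories shot from x
        that reach both metastable basins."""
        return torch.exp(-(x ** 2).sum(-1))                # toy stand-in

    x = torch.zeros(64, 2)                                 # batch of configurations
    for step in range(500):
        dist = torch.distributions.Normal(policy(x), 0.1)
        a = dist.sample()                                  # proposed displacement
        r = shooting_reward(x + a)                         # simulate and score
        loss = -(dist.log_prob(a).sum(-1) * r).mean()      # REINFORCE update
        opt.zero_grad(); loss.backward(); opt.step()
        x = (x + a).detach()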
[03340] Finite Expression Methods for Discovering Physical Laws from Data
Format : Online Talk on Zoom
Author(s) :
Chunmei Wang (University of Florida)
Abstract : The speaker will present the finite expression method (FEX) for discovering governing equations from data. By design, FEX can provide physically meaningful and interpretable formulas for physical laws, in contrast to black-box deep learning methods. FEX requires only a small number of predefined operators to automatically generate a large class of mathematical formulas. Therefore, compared to existing symbolic approaches, FEX enjoys favorable memory cost and can discover a wider range of governing equations where other methods fail, as shown by extensive numerical tests.
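A minimal sketch of the combinatorial ingredient (the operator pool and tree encoding below are our assumptions, not the speaker's implementation):

    # Candidate formulas are assembled from a small pool of unary and binary
    # operators, so the search space consists of expressions, not weights.
    import random, math

    UNARY = {"id": lambda a: a, "sin": math.sin, "exp": math.exp,
             "sq": lambda a: a * a}
    BINARY = {"+": lambda a, b: a + b, "*": lambda a, b: a * b}

    def sample_expr(depth=2):
        """Sample a random expression tree over the variable x."""
        if depth == 0:
            return ("x",) if random.random() < 0.7 else ("c", random.uniform(-2, 2))
        if random.random() < 0.5:
            return ("u", random.choice(list(UNARY)), sample_expr(depth - 1))
        return ("b", random.choice(list(BINARY)),
                sample_expr(depth - 1), sample_expr(depth - 1))

    def evaluate(e, x):
        if e[0] == "x": return x
        if e[0] == "c": return e[1]
        if e[0] == "u": return UNARY[e[1]](evaluate(e[2], x))
        return BINARY[e[1]](evaluate(e[2], x), evaluate(e[3], x))

Even this four-operator pool spans a combinatorially large class of closed-form candidates, which is what keeps the memory cost low relative to weight-space searches.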
[03345] Approximation Theory for Sequence Modelling
Format : Talk at Waseda University
Author(s) :
Qianxiao Li (National University of Singapore)
Abstract : In this talk, we present some recent results on the approximation theory of deep learning architectures for sequence modelling. In particular, we formulate a basic mathematical framework, under which different popular architectures such as recurrent neural networks, dilated convolutional networks (e.g. WaveNet), encoder-decoder structures, and transformers can be rigorously compared. These analyses reveal some interesting connections between approximation, memory, sparsity and low rank phenomena that may guide the practical selection and design of these network architectures.
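To fix ideas, one schematic instance of such a framework (our paraphrase, not the speaker's exact formulation) treats targets as causal families of functionals on input sequences and each architecture as a hypothesis class:

    % target: a causal family of functionals on input sequences
    y_t = H_t\big(\{x_s\}_{s \le t}\big)
    % hypothesis class, e.g. a recurrent neural network
    \hat{y}_t = c^\top h_t, \qquad h_{t+1} = \sigma\big(W h_t + U x_t + b\big)

Approximation theory then asks how well \{\hat{H}_t\} can approximate \{H_t\}, for instance uniformly in t, and how the required model size scales with the memory decay of the target family.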
[03467] Multi-scale Neural Networks for High Frequency Problems in Regressions and PDEs
Format : Talk at Waseda University
Author(s) :
Wei Cai (Southern Methodist University)
Lizuo Liu (Southern Methodist University)
Bo Wang (LCSM (MOE), School of Mathematics and Statistics, Hunan Normal University, Changsha, China)
Abstract : In this talk, we will introduce multiscale deep neural networks (MscaleDNNs), designed to overcome the spectral bias of deep neural networks when approximating functions with wide-band frequency content. The MscaleDNN uses a radial scaling in the frequency domain, which converts the problem of learning high-frequency content in regression problems or PDE solutions into one of learning lower-frequency functions. As a result, the MscaleDNN achieves fast uniform convergence over multiple scales, as demonstrated in solving regression problems and highly oscillatory Navier-Stokes flows. Moreover, a diffusion equation model in the frequency domain, derived from the neural tangent kernel, clearly shows how the multiple scales in the MscaleDNN improve the convergence of training over wider frequency ranges as more scales are added, compared with a traditional fully connected neural network.
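A minimal sketch of the construction (scale choices and layer sizes below are our assumptions): parallel subnetworks receive radially scaled copies a_i * x of the input, so each subnetwork only has to learn a frequency-down-shifted component:

    import torch

    class MscaleNet(torch.nn.Module):
        def __init__(self, d=1, scales=(1, 2, 4, 8, 16)):
            super().__init__()
            self.scales = scales
            self.nets = torch.nn.ModuleList(
                torch.nn.Sequential(torch.nn.Linear(d, 64), torch.nn.Tanh(),
                                    torch.nn.Linear(64, 64), torch.nn.Tanh(),
                                    torch.nn.Linear(64, 1))
                for _ in scales)

        def forward(self, x):
            # sum of subnetwork outputs, each seeing a scaled input copy
            return sum(net(a * x) for a, net in zip(self.scales, self.nets))

    model = MscaleNet()
    x = torch.rand(256, 1)
    y = torch.sin(50 * torch.pi * x)              # high-frequency regression target
    loss = ((model(x) - y) ** 2).mean()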
[05655] Coupling Deep Learning with Full Waveform Inversion
Format : Online Talk on Zoom
Author(s) :
Wen Ding (Stripe)
Kui Ren (Columbia University)
Lu Zhang (Rice University)
Abstract : In recent years, there has been increasing interest in applying deep learning to geophysical/medical data inversion. However, the direct application of end-to-end data-driven approaches to inversion has quickly shown limitations in practical implementations. Indeed, due to the lack of prior knowledge about the objects of interest, the trained deep neural networks often generalize poorly. This talk presents a new methodology for coupling model-based inversion algorithms with deep learning for full waveform inversion. In particular, we present an offline-online computational strategy that couples classical least-squares-based computational inversion with modern deep learning-based approaches to achieve benefits that neither component can deliver alone.
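A schematic sketch of one way such a coupling can look (our reading of the abstract; forward_op, the prior network, and all sizes are placeholders, not the authors' code):

    # Offline: a network learns a prior from model/data training pairs.
    # Online: classical least-squares iterations are regularized toward it.
    import torch

    prior_net = torch.nn.Sequential(torch.nn.Linear(100, 256), torch.nn.ReLU(),
                                    torch.nn.Linear(256, 100))  # trained offline

    def fwi_step(m, d_obs, forward_op, lam=0.1, lr=1e-2):
        """One online iteration: data misfit plus learned-prior penalty."""
        m = m.clone().requires_grad_(True)
        misfit = ((forward_op(m) - d_obs) ** 2).sum()   # classical FWI term
        penalty = ((m - prior_net(m)) ** 2).sum()       # stay near the prior
        (misfit + lam * penalty).backward()
        return (m - lr * m.grad).detach()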
[03154] Finite Expression Method: A Symbolic Approach for Scientific Machine Learning
Format : Online Talk on Zoom
Author(s) :
Haizhao Yang (University of Maryland College Park)
Abstract : Machine learning has revolutionized computational science and engineering with impressive breakthroughs, e.g., making the efficient solution of high-dimensional computational tasks feasible and advancing domain knowledge via scientific data mining. This has led to an emerging field called scientific machine learning. In this talk, we introduce a symbolic approach to solving scientific machine learning problems. This method seeks interpretable learning outcomes in the space of functions with finitely many analytic expressions and is hence named the finite expression method (FEX). Approximation theory shows that FEX can avoid the curse of dimensionality in discovering high-dimensional complex systems. As a proof of concept, a deep reinforcement learning method is proposed to implement FEX for learning the solutions of high-dimensional PDEs and the governing equations of raw data.
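As a minimal illustration of the scoring step in such a search (our toy example, not the speaker's implementation): given a candidate expression with free constants, fit the constants against the PDE residual, and let the fitted residual act as the reward the reinforcement-learning controller uses to select better operator combinations:

    import torch

    # Candidate expression u(x) = c0 * exp(c1 * x) for the toy problem
    # u'(x) = u(x), u(0) = 1 on [0, 1].
    x = torch.linspace(0, 1, 64, requires_grad=True).reshape(-1, 1)
    c = torch.nn.Parameter(torch.tensor([0.5, 0.5]))
    opt = torch.optim.Adam([c], lr=1e-2)

    for step in range(500):
        u = c[0] * torch.exp(c[1] * x)
        du = torch.autograd.grad(u.sum(), x, create_graph=True)[0]
        residual = ((du - u) ** 2).mean()           # PDE residual for u' = u
        bc = (c[0] - 1.0) ** 2                      # boundary condition u(0) = 1
        loss = residual + bc
        opt.zero_grad(); loss.backward(); opt.step()

    reward = -loss.item()                           # higher = better candidate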
[05449] Implicit bias in deep learning based PDE solvers
Format : Talk at Waseda University
Author(s) :
Tao Luo (Shanghai Jiao Tong University)
Qixuan Zhou (Shanghai Jiao Tong University)
Abstract : We will discuss some recent developments in the theory of deep learning based PDE solvers. We would like to highlight some new ideas on the modeling and analysis of such algorithms, especially certain phenomena observed during the training process. On the theoretical side, both optimization and approximation will be considered.
[01880] Discretization Invariant Operator Learning for Solving Inverse Problems
Format : Talk at Waseda University
Author(s) :
Yong Zheng Ong (National University of Singapore)
Abstract : Discretization invariant learning aims at learning in infinite-dimensional function spaces, with the capacity to process heterogeneous discrete representations of functions as inputs and/or outputs of a learning model. This talk presents IAE-Net, a novel deep learning framework based on integral autoencoders, for discretization invariant learning. An adaptive training scheme with different loss functions is proposed to train IAE-Net. The proposed model is tested on various applications.
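A minimal sketch of the discretization-invariant ingredient (our illustration of the general integral-transform idea, not the IAE-Net code): the layer computes v(y) = \int k(x, y) u(x) dx by quadrature on whatever grid the input arrives on, so the same weights accept inputs of any resolution:

    import torch

    class IntegralLayer(torch.nn.Module):
        def __init__(self):
            super().__init__()
            self.kernel = torch.nn.Sequential(    # learnable kernel k(x, y)
                torch.nn.Linear(2, 32), torch.nn.Tanh(), torch.nn.Linear(32, 1))

        def forward(self, x, u, y):
            # x: (n,) input grid, u: (n,) samples, y: (m,) output grid
            pts = torch.stack(torch.broadcast_tensors(
                x[None, :], y[:, None]), dim=-1)  # (m, n, 2) pairs (x_i, y_j)
            k = self.kernel(pts).squeeze(-1)      # (m, n) kernel values
            w = torch.full_like(x, 1.0 / x.numel())  # uniform quadrature weights
            return (k * (w * u)[None, :]).sum(-1)    # (m,) approx. of the integral

    layer = IntegralLayer()
    coarse = layer(torch.linspace(0, 1, 32), torch.rand(32), torch.linspace(0, 1, 16))
    fine = layer(torch.linspace(0, 1, 256), torch.rand(256), torch.linspace(0, 1, 16))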
[03159] Deep Adaptive Basis Galerkin Method for Evolution Equations
Format : Talk at Waseda University
Author(s) :
Yiqi Gu (University of Electronic Science and Technology of China)
Abstract : We study deep neural networks (DNNs) for solving high-dimensional evolution equations. Unlike other existing methods (e.g., the least-squares method) that deal with the time and space variables simultaneously, we propose a deep adaptive basis approximation structure. On the one hand, orthogonal polynomials are employed to form the temporal basis to achieve high accuracy in time. On the other hand, DNNs are employed to form the adaptive spatial basis for high dimensions in space.