Abstract : The two minisymposia on Advances in Optimization I and II will bring together a diverse group of leading researchers and practitioners from continuous and combinatorial optimization, both theoretical and applied. One goal of these two minisymposia is to raise awareness of the most recent advances in optimization theory, algorithms, and applications, and to develop connections and encourage collaboration.
[05543] Optimal Diagonal Preconditioning: Theory and Practice
Format : Talk at Waseda University
Author(s) :
Yinyu Ye (Stanford University)
Abstract : Preconditioning has long been a staple technique in optimization, often applied to reduce the condition number of a matrix and to speed up the convergence of algorithms. Although there are many popular preconditioning techniques in practice, most lack guarantees on reductions in condition number, and the degree to which we can improve over existing heuristic preconditioners remains an important question. In this paper, we study the problem of optimal diagonal preconditioning: achieving the maximal reduction in the condition number of a full-rank matrix by scaling its rows and/or columns with positive numbers. We first reformulate the problem as a quasi-convex optimization problem and provide a simple algorithm based on bisection. Then we develop an interior point algorithm with $O(\log(1/\epsilon))$ iteration complexity. Next, we specialize to one-sided optimal diagonal preconditioning problems and demonstrate that they can be formulated as standard dual SDP problems. We then develop efficient customized solvers for the SDP approach and study the empirical performance of our optimal diagonal preconditioning procedures through extensive experiments. Our findings suggest that optimal diagonal preconditioners can significantly improve upon existing heuristics-based diagonal preconditioners in reducing condition numbers, and our SDP approach can find such optimal preconditioners efficiently for large matrices. We also extend our SDP approach to compute optimal mixtures of base preconditioners, which further improves its scalability and applicability.
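As a point of reference for what "heuristic" means here, the following minimal numpy sketch (illustrative only; the matrix is synthetic) applies the classical Jacobi heuristic, two-sided scaling by $\mathrm{diag}(A)^{-1/2}$, which is the kind of diagonal preconditioner the optimal SDP-based procedure aims to improve upon:

    import numpy as np

    rng = np.random.default_rng(0)
    # A synthetic ill-conditioned symmetric positive-definite matrix.
    B = rng.standard_normal((200, 200))
    A = B @ B.T + np.diag(10.0 ** rng.uniform(0.0, 4.0, 200))

    # Jacobi heuristic: two-sided scaling D A D with D = diag(A)^{-1/2}.
    d = 1.0 / np.sqrt(np.diag(A))
    A_jacobi = A * d[:, None] * d[None, :]

    print("cond(A)     =", np.linalg.cond(A))
    print("cond(D A D) =", np.linalg.cond(A_jacobi))

An optimal diagonal preconditioner, as studied in the talk, would instead choose the scaling that minimizes the resulting condition number rather than applying this fixed rule.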
[03259] Smart Initial Basis Selection for Linear Programs
Format : Talk at Waseda University
Author(s) :
Yong Zhang (Huawei Technologies Canada Co., Ltd)
Abstract : The simplex method, introduced by Dantzig more than half a century ago, remains to this day one of the most efficient methods for solving large-scale linear programming (LP) problems. While the simplex method is known to have the finite termination property under mild assumptions, the number of iterations until optimality largely depends on the choice of initial basis. Existing strategies for selecting an advanced initial basis are mostly rule-based. These rules usually require extensive expert knowledge and empirical study to develop. Yet, many of them fail to exhibit consistent improvement, even for LP problems that arise in a single application scenario. In this paper, we propose a learning-based approach to initial basis selection. We employ graph neural networks as a building block and develop a model that attempts to capture the relationship between LP problems and their optimal bases. In addition, during the inference phase, we supplement the learning-based prediction with linear algebra tricks to ensure the validity of the generated initial basis. We demonstrate through extensive experiments with state-of-the-art simplex solvers that the proposed strategy achieves substantial speedups and consistently outperforms existing rule-based methods. Furthermore, we extend the proposed approach to generating restricted master problems for column generation methods and present encouraging numerical results.
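As a rough illustration of the validity-repair step mentioned above (a hypothetical numpy sketch, not the paper's actual procedure), one can keep the linearly independent predicted columns and pad with slack columns until the basis matrix is nonsingular:

    import numpy as np

    def repair_basis(A, predicted_cols):
        # A is the m x n constraint matrix of an LP in standard form;
        # appending slack columns guarantees [A | I] contains a basis.
        m, n = A.shape
        full = np.hstack([A, np.eye(m)])
        basis = []
        for j in list(predicted_cols) + list(range(n, n + m)):
            if len(basis) == m:
                break
            # Keep column j only if it enlarges the span built so far.
            if np.linalg.matrix_rank(full[:, basis + [j]]) == len(basis) + 1:
                basis.append(j)
        return basis  # m column indices forming a nonsingular basis matrix

Repeated rank computations are wasteful at scale; a pivoted LU factorization would be the practical choice, but the repair logic is the same.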
[04683] Interior Point Methods are Not Worse Than Simplex
Abstract : We develop a path-following IPM whose number of iterations is at most $O(n^{1.5} \log n)$ times the number of segments of any piecewise linear curve traversing the wide neighborhood of the central path. Our IPM matches the number of iterations of any path-following IPM up to this polynomial factor and admits an $O(2^n n^{1.5} \log n)$ upper bound. The latter result complements the exponential lower bound of Allamigeon et al. (SIAGA 2018) for IPMs.
[03175] Analysis of Algorithms on Growing Networks
Format : Talk at Waseda University
Author(s) :
Shuji Kijima (Shiga University)
Abstract : Real networks are often dynamic. Nevertheless, very little is known about the theoretical analysis of algorithms on dynamic networks. This talk is concerned with some techniques for analyzing dynamics on growing networks, where the number of vertices increases at a moderate speed.
[03367] Two constructive techniques for producing linear extended formulations
Format : Talk at Waseda University
Author(s) :
Yuri Faenza (Columbia University)
Abstract : We will present two constructive techniques for producing extended formulations: one based on communication protocols, the other on Birkhoff's representation theorem for finite distributive lattices. We also show applications to some problems, mostly in combinatorial optimization.
[03641] Interior-point methods on manifolds: theory and applications
Format : Talk at Waseda University
Author(s) :
Hiroshi Hirai (Nagoya University)
Harold Nieuwboer (University of Amsterdam)
Michael Walter (Ruhr University)
Abstract : We extend the theory of self-concordance to manifolds and establish results analogous to those in the Euclidean setting, such as quadratic convergence of Newton's method and polynomial iteration complexity of the path-following method. We show that on symmetric spaces of nonpositive curvature, the squared distance function to a point is self-concordant. These results are applied to norm minimization problems for reductive group actions and to other geometric problems, such as minimum enclosing balls and geometric medians.
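For orientation (our paraphrase, directly following the Euclidean definition): a function $f$ on a Riemannian manifold is self-concordant if along every geodesic $\gamma$ it satisfies $\left|\tfrac{\mathrm{d}^3}{\mathrm{d}t^3} f(\gamma(t))\right| \le 2 \left(\tfrac{\mathrm{d}^2}{\mathrm{d}t^2} f(\gamma(t))\right)^{3/2}$, which recovers the classical condition when the manifold is Euclidean and $\gamma$ is a straight line.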
[03069] Alternating Linear Minimization: Revisiting von Neumann’s alternating projections
Format : Talk at Waseda University
Author(s) :
Gábor Braun (ZIB)
Sebastian Pokutta (ZIB / TUB)
Robert Weismantel (ETH)
Abstract : In 1933, von Neumann proved a beautiful result: one can compute a point in the intersection of two convex sets (under suitable assumptions) by alternating projections, i.e., by successively projecting on one set and then the other. This algorithm assumes that one has access to projection operators for both sets. Here we consider the much weaker setup in which we have access only to linear minimization oracles over the convex sets, and we present an algorithm to find a point in the intersection of the two sets. Moreover, we provide a modification of the algorithm that minimizes a linear function over the intersection, and we discuss further extensions.
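For reference, a minimal numpy sketch of the classical alternating-projection scheme on two balls in the plane, where projections are available in closed form; the algorithm of the talk replaces these projection operators with linear minimization oracles:

    import numpy as np

    def project_ball(x, center, radius):
        # Euclidean projection onto the ball of given center and radius.
        v = x - center
        norm = np.linalg.norm(v)
        return x if norm <= radius else center + radius * v / norm

    # Two intersecting balls in R^2.
    c1, r1 = np.array([0.0, 0.0]), 1.0
    c2, r2 = np.array([1.5, 0.0]), 1.0

    x = np.array([5.0, 4.0])
    for _ in range(100):
        x = project_ball(x, c1, r1)  # project onto the first set ...
        x = project_ball(x, c2, r2)  # ... then onto the second
    print(x)  # approaches a point in the intersection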
[03052] Combinatorial and geometric aspects of linear optimization
Format : Talk at Waseda University
Author(s) :
Antoine Deza (McMaster University)
Abstract : Worst-case constructions have helped provide a deeper understanding of how the structural properties of the input affect the computational performance of linear optimization. Recent examples include the construction of Allamigeon et al. for which the interior point method performs an exponential number of iterations. In a similar spirit, we present recent lower bounds on the number of simplex pivots required in the worst case to perform linear optimization over a lattice polytope.
[05126] On Some Optimization-Related Issues In Deep Learning
Format : Talk at Waseda University
Author(s) :
Yin Zhang (The Chinese University of Hong Kong (Shenzhen))
Abstract : Despite the many great advances achieved by deep learning, our understanding of it remains sorely limited. In this talk, we discuss a few optimization-related issues in deep learning, including model trainability, gradient stability, over-parameterization, quality of (globally optimal) solutions, and interpolation versus extrapolation. We introduce a new neural-layer architecture, using Householder weighting and absolute-value activation, that has low complexity but guarantees gradient stability and 1-Lipschitz continuity. We empirically evaluate the capacity of the proposed layer and demonstrate its potential usefulness.
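As a toy illustration of why such a design can guarantee 1-Lipschitz continuity (a hypothetical numpy sketch, not the architecture from the talk): a Householder reflection is orthogonal, and the elementwise absolute value is 1-Lipschitz, so their composition cannot expand distances:

    import numpy as np

    def householder_abs_layer(x, v, b):
        # Householder reflection H = I - 2 v v^T / (v^T v), which is
        # orthogonal, followed by a bias and the absolute value.
        Hx = x - 2.0 * v * (v @ x) / (v @ v)
        return np.abs(Hx + b)

    rng = np.random.default_rng(0)
    v, b = rng.standard_normal(8), rng.standard_normal(8)
    x, y = rng.standard_normal(8), rng.standard_normal(8)
    lhs = np.linalg.norm(householder_abs_layer(x, v, b)
                         - householder_abs_layer(y, v, b))
    print(lhs <= np.linalg.norm(x - y) + 1e-12)  # 1-Lipschitz check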
[03071] Creating Collaborative Data Representations Using Matrix Manifold Optimization
Format : Talk at Waseda University
Author(s) :
Keiyu Nosaka (University of Tsukuba)
Akiko Yoshise (University of Tsukuba)
Abstract : The trade-off between performance and privacy is a persistent difficulty for centralized machine learning methods. Fed-DR-Filter and Data Collaboration Analysis (DCA) can overcome this difficulty through Collaborative Data Representation (CDR). We propose an alternative algorithm for CDR creation that utilizes matrix manifold optimization, and we devise machine learning models in the DCA setting to evaluate the algorithms. The results show that our algorithm outperforms the state-of-the-art approach in mean recognition performance within acceptable computation time.
[03047] Accelerated and Sparse Algorithms for Approximate Personalized PageRank
Format : Talk at Waseda University
Author(s) :
David Martinez-Rubio (Zuse Institute Berlin)
Elias Wirth (Zuse Institute Berlin)
Sebastian Pokutta (Zuse Institute Berlin)
Abstract : It has recently been shown that ISTA, an unaccelerated first-order optimization method, performs sparse updates for the $\ell_1$-regularized personalized PageRank problem, making each iteration cheap.
In this talk I'll explain our work on accelerated optimization algorithms for this problem that also perform sparse updates, leading to faster convergence in certain parameter regimes.
Further, we design a conjugate directions algorithm that achieves an exact solution while exploiting sparsity.
Our findings apply beyond PageRank and work for any quadratic objective whose Hessian is a positive-definite $M$-matrix.
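To fix ideas, here is a minimal numpy sketch of plain (unaccelerated) ISTA on a generic instance of that final problem class; the Hessian $Q$ below is a synthetic positive-definite $M$-matrix rather than an actual PageRank matrix, and the sparse-update bookkeeping that makes the graph algorithms cheap is omitted:

    import numpy as np

    def ista(Q, b, rho, steps=500):
        # ISTA for min_x 0.5 x^T Q x - b^T x + rho * ||x||_1:
        # a gradient step on the smooth part, then soft-thresholding.
        eta = 1.0 / np.linalg.norm(Q, 2)  # step size 1/L
        x = np.zeros(len(b))
        for _ in range(steps):
            z = x - eta * (Q @ x - b)
            x = np.sign(z) * np.maximum(np.abs(z) - eta * rho, 0.0)
        return x

    # A synthetic symmetric positive-definite M-matrix: nonpositive
    # off-diagonal entries and a strictly dominant positive diagonal.
    rng = np.random.default_rng(0)
    W = np.abs(rng.standard_normal((30, 30)))
    W = (W + W.T) / 2.0
    np.fill_diagonal(W, 0.0)
    Q = np.diag(W.sum(axis=1) + 0.1) - W

    x = ista(Q, rng.standard_normal(30), rho=0.5)
    print(np.count_nonzero(x), "nonzero entries out of", x.size)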
[04893] Optimal Composition Ordering for Linear Functions
Format : Talk at Waseda University
Author(s) :
Kazuhisa Makino (Kyoto Univ.)
Abstract : We outline the composition ordering problem for linear functions: given $n$ linear functions $f_1,\dots,f_n:\mathbb{R}\to\mathbb{R}$ and a constant $c\in\mathbb{R}$, find a permutation $\sigma:[n]\to[n]$ that minimizes $f_{\sigma(n)}\circ f_{\sigma(n-1)}\circ\dots\circ f_{\sigma(1)}(c)$, where $[n]=\{1,\dots,n\}$.
We discuss structural properties of optimal solutions to the problem, as well as the current status of its computational complexity. We also consider the multiplication ordering of $n$ matrices.
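For small $n$ the objective can be made concrete by exhaustive search (a brute-force Python sketch; the talk concerns what can be done without enumerating all $n!$ orderings):

    import math
    from itertools import permutations

    def best_ordering(funcs, c):
        # funcs[i] = (a_i, b_i) encodes f_i(x) = a_i * x + b_i.
        # Minimize f_{sigma(n)} o ... o f_{sigma(1)}(c) over permutations.
        best_val, best_sigma = math.inf, None
        for sigma in permutations(range(len(funcs))):
            x = c
            for i in sigma:  # f_{sigma(1)} is applied first
                a, b = funcs[i]
                x = a * x + b
            if x < best_val:
                best_val, best_sigma = x, sigma
        return best_val, best_sigma

    print(best_ordering([(2.0, -1.0), (-0.5, 3.0), (1.0, 0.5)], c=1.0))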