Abstract : Streamed data are ubiquitous. In this context, a key challenge is to quantify our understanding and account for the interaction between channels. Rough path theory provides new insights for producing actionable inference for multimodal path-like data. The path signature is a mathematical object with desirable approximation properties and geometric interpretation which leads to more effective features and analysis. Further, the expected signature provides a powerful way to describe empirical measures on streams. Applications include award-winning machine learning methods in healthcare and finance, as well as commercial-quality Chinese handwriting software. We expose new challenges and work on applications in this area.
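As a concrete illustration of the path signature mentioned above, the following minimal sketch computes its first two levels for a piecewise-linear path, building them up segment by segment with Chen's identity. The function names `signature_depth2` and `levy_area` are hypothetical and chosen for this example; they are not taken from the talk or from any particular library.

```python
def signature_depth2(path):
    """Levels 1 and 2 of the signature of the piecewise-linear path
    through the points in `path` (a list of equal-length tuples)."""
    d = len(path[0])
    level1 = [0.0] * d                       # total increment per channel
    level2 = [[0.0] * d for _ in range(d)]   # iterated integrals S^{i,j}
    for prev, curr in zip(path, path[1:]):
        delta = [c - p for p, c in zip(prev, curr)]
        # Chen's identity for appending one linear segment:
        # S2 <- S2 + S1 (x) delta + (delta (x) delta) / 2
        for i in range(d):
            for j in range(d):
                level2[i][j] += level1[i] * delta[j] + 0.5 * delta[i] * delta[j]
        for i in range(d):
            level1[i] += delta[i]
    return level1, level2


def levy_area(level2, i=0, j=1):
    """Antisymmetric part of level 2: the signed area between channels i and j."""
    return 0.5 * (level2[i][j] - level2[j][i])


if __name__ == "__main__":
    # A two-channel path that moves right, then up.
    s1, s2 = signature_depth2([(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)])
    print(s1)             # [1.0, 1.0]
    print(s2)             # [[0.5, 1.0], [0.0, 0.5]]
    print(levy_area(s2))  # 0.5
```

The off-diagonal terms of level 2 differ depending on the order in which the channels move, which is one way the signature records the interaction between channels.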
Abstract : Path dependent options can be generated by combinations of signatures. We focus on the case of one asset augmented with time. We construct an incremental basis of signature elements which allows us to write a smooth path dependent payoff as a converging series of signature elements. By recalling the main concepts of Functional Itô Calculus, a natural framework for path-dependence, we draw links between two approximation results, the Taylor expansion and the Wiener chaos decomposition. We also establish the pathwise Intrinsic Expansion and link it to Functional Taylor Expansion.
[01349] Neural Stochastic PDEs: Resolution-Invariant Learning of Continuous Spatiotemporal Dynamics
Format : Talk at Waseda University
Author(s) :
Maud Lemercier (University of Oxford)
Cristopher Salvi (Imperial College London)
Andris Gerasimovics (University of Bath)
Abstract : Neural SDEs are a class of physics-inspired neural networks that are particularly well-suited for modelling temporal dynamics. However, they may not be the most appropriate tool to model systems that vary both in space and in time. In this talk, I will present a way to address this issue, leveraging the notion of a mild solution of an SPDE. I will introduce the Neural SPDE model and demonstrate its ability to learn solution operators of PDEs with stochastic forcing from partially observed data.
[01332] Neural Controlled Differential Equations: The Log-ODE Method
Format : Talk at Waseda University
Author(s) :
Benjamin Walker (University of Oxford)
Abstract : Neural controlled differential equations (NCDEs) are a powerful approach to time-series modelling. Their output is a linear map of a CDE's solution, where the vector field is learnt and the control is a continuous interpolation of the input data. This work demonstrates that NCDEs can achieve state-of-the-art performance on long time-series given two modifications: ensuring the vector field is smooth by bounding its $\mathrm{Lip}(2)$-norm, and applying the Log-ODE method to the learnt vector field.
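The model described in this abstract can be written out as follows. This is a sketch in assumed notation, none of which appears in the abstract: $X$ is the interpolated control, $f_\theta$ the learnt vector field, $\ell_\phi$ the linear readout, and $\bar f_\theta$ the vector field built from $f_\theta$ and its iterated Lie brackets up to depth $N$.

```latex
% NCDE: hidden state driven by the control X, followed by a linear readout.
\[
  z_t = z_0 + \int_0^t f_\theta(z_s)\,\mathrm{d}X_s ,
  \qquad
  y = \ell_\phi(z_T).
\]
% Log-ODE step on [t_i, t_{i+1}]: replace the CDE by an ODE on [0, 1]
% driven by the depth-N log-signature of X over that interval.
\[
  \frac{\mathrm{d}u_r}{\mathrm{d}r}
    = \bar f_\theta(u_r)\,\log S^{N}(X)_{[t_i,\,t_{i+1}]},
  \qquad u_0 = z_{t_i},
  \qquad z_{t_{i+1}} \approx u_1 .
\]
```

Under this reading, each Log-ODE step summarises a whole window of the input by its log-signature, so the number of solver steps need not grow with the number of observations in the window.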
[01297] Nowcasting with signatures
Format : Talk at Waseda University
Author(s) :
Lingyi Yang (Alan Turing Institute)
Samuel Cohen (University of Oxford)
Abstract : Nowcasting refers to inference of the recent past, present, or near future. This is common in economics as key indicators, like GDP, are published with significant delays due to data collection/cleansing. The signature, a mathematical object arising from rough analysis, captures geometric properties and handles missing data from complex sampling patterns. We look at nowcasting with regression on signatures and show that this simple model subsumes the popular Kalman filter in theory and performs well in practice.
Abstract : In this talk we discuss the problem of forecasting general stochastic processes using a path-dependent extension of the Neural Jump ODE (NJ-ODE) framework.
While NJ-ODE was the first framework to establish convergence guarantees for the prediction of irregularly observed time-series, these results were limited to data stemming from Itô diffusions with complete observations, in particular Markov processes where all coordinates are observed simultaneously.
Here, we first revisit the NJ-ODE and its results and then generalise them to generic, possibly non-Markovian or discontinuous, stochastic processes with incomplete observations, by utilising the reconstruction properties of the signature transform.
These theoretical results are supported by empirical studies, where it is shown that the path-dependent NJ-ODE outperforms the original NJ-ODE framework in the case of non-Markovian data.
[00499] Signature Methods for Outlier Detection
Format : Talk at Waseda University
Author(s) :
Paola Arrubarrena (Imperial College London and DataSig)
Terry Lyons (University of Oxford)
Thomas Cass (Imperial College London)
Maud Lemercier (University of Oxford and DataSig)
Abstract : An anomaly detection methodology is presented that identifies whether a given observation is unusual, in the sense that it deviates from a corpus of non-contaminated observations. The signature transform is applied to the streamed data as a vectorization to obtain a faithful representation in a fixed-dimensional feature space. The methodology is applied to radio astronomy data to identify very faint radio frequency interference (RFI) contaminating the rest of the data.
[01312] Path Development Network with Finite-dimensional Lie Group
Format : Online Talk on Zoom
Author(s) :
Hang Lou (University College London)
Hao Ni (University College London)
Siran Li (Shanghai Jiao Tong University)
Abstract : We propose a novel, trainable path development layer that exploits representations of sequential data through finite-dimensional Lie groups. The path development, which originates from rough path theory, inherits useful analytical properties from path signatures while also offering much richer group structures. Empirical results show the superiority of the development layer over signature features in terms of accuracy and dimensionality. The compact hybrid model, which stacks a one-layer LSTM with the development layer, achieves state-of-the-art performance against various RNN and continuous time series models on various datasets.
[01355] From CCTV video streams to inferring NO2 emissions at city-scale
Format : Online Talk on Zoom
Author(s) :
Mohamed Ibrahim (University of Leeds)
Terry Lyons (University of Oxford)
Abstract : In this talk, we show how NO2 emissions can be inferred from CCTV video streams at city-scale through rough path theory. We introduce a framework for mapping objects in CCTV video streams to a stream of paths that highlights the order in which events take place. This temporal representation gives a descriptive summary of video contents, with which we can maximise: 1) data anonymity, and 2) systematic readability of large-scale video streams.
[01359] Addressing bias adversarially in online learning
Format : Online Talk on Zoom
Author(s) :
Elena Gal (University of Oxford)
Abstract : We consider a class of online problems where the true label is only observed when a data point is assigned a positive label by a learner, e.g. for bank loans. In this setting the labelled training set suffers from accumulating bias, since it is created by the learner's past decisions.
We propose to address the bias in the training set using adversarial domain adaptation. Our approach significantly exceeds SOTA on a set of challenging benchmark problems.
[01353] Improving Training of Neural CDEs
Format : Online Talk on Zoom
Author(s) :
Jason Michael Rader (University of Oxford)
Abstract : Neural CDEs are continuous-time analogues of recurrent neural networks which are effective at handling irregular time steps, densely sampled data, and long time series. We present recent advances in training neural CDEs in a memory-efficient manner for large-scale problems.
[03470] From MMD-Regime detection to MMD-Generative Models with Applications
Format : Online Talk on Zoom
Author(s) :
Blanka Horvath (University of Oxford)
Zacharia Issa (King's College London)
Abstract : Time series data derived from asset returns are known to exhibit certain properties, termed stylised facts, that are consistently prevalent across asset classes and markets. For example, asset price returns are widely accepted to be non-stationary and non-autocorrelated, and to exhibit volatility clustering. We refer the reader to Cont (2001) for a thorough discussion of such properties. In this article we turn our attention to one property in particular, the heteroscedastic nature of financial time series, since it is of immediate relevance to financial analysts and quants in a multitude of practical applications. In this context, one may be interested in whether a given asset returns series (or a set of series, in the case of multiple assets) can be divided into periods in which the (random) asset price dynamics can be attributed to the same underlying distribution, up to, perhaps, a small estimation error. Such periods are often referred to as market regimes, and we call the task of finding an effective way of grouping these regimes the market regime clustering problem (MRCP). This article is devoted to the online detection of such regimes, i.e. to developing tools that help us recognise in real time (as data comes in) whether a shift in the underlying regime is happening.
[01236] Capturing Graphs with Hypo-Elliptic Diffusions
Format : Online Talk on Zoom
Author(s) :
Csaba Toth (University of Oxford)
Darrick Lee (University of Oxford)
Celia Hacker (MPI for Mathematics in the Sciences)
Harald Oberhauser (University of Oxford)
Abstract : Convolutional layers within graph neural networks operate by aggregating information about local neighbourhood structures; one common way to encode such substructures is through random walks. The distribution of these random walks evolves according to a diffusion equation defined using the graph Laplacian. We extend this approach by leveraging classic mathematical results about hypo-elliptic diffusions. This results in a novel tensor-valued graph operator, which we call the hypo-elliptic graph Laplacian. We provide theoretical guarantees and efficient low-rank approximation algorithms.
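For orientation, the classical scalar setting that this abstract starts from can be sketched as below: the distribution of a simple random walk on a graph is pushed forward one step at a time, which is the discrete diffusion governed by the random-walk graph Laplacian $L = I - D^{-1}A$. This shows only the standard construction, not the tensor-valued hypo-elliptic operator proposed in the talk, and the function name `random_walk_step` is hypothetical.

```python
def random_walk_step(adjacency, dist):
    """Advance the distribution of a simple random walk by one step.

    `adjacency` is a square list-of-lists of non-negative edge weights and
    `dist` is the current probability of being at each node. The update is
    dist_next = dist @ (D^{-1} A), i.e. dist_next - dist = -dist @ L with
    L = I - D^{-1} A the random-walk graph Laplacian.
    """
    n = len(adjacency)
    degrees = [sum(row) for row in adjacency]
    new_dist = [0.0] * n
    for i in range(n):
        for j in range(n):
            if adjacency[i][j]:
                # Mass at node i moves to neighbour j in proportion to the edge weight.
                new_dist[j] += dist[i] * adjacency[i][j] / degrees[i]
    return new_dist


if __name__ == "__main__":
    # Path graph 0 - 1 - 2, walker started at node 0.
    A = [[0, 1, 0],
         [1, 0, 1],
         [0, 1, 0]]
    p0 = [1.0, 0.0, 0.0]
    p1 = random_walk_step(A, p0)
    p2 = random_walk_step(A, p1)
    print(p1)  # [0.0, 1.0, 0.0]
    print(p2)  # [0.5, 0.0, 0.5]
```

The distribution above records only where the walker is, not the path it took to get there.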