Current research
Forward Orthogonal Least Squares Regression:
Paper submissions to IEEE
My first paper on NARMAX systems has been published by IEEE; in it, I propose an optimized version of the FOrLSR in addition to two new algorithmic classes in that framework. In short, the FOrLSR algorithms make it possible to obtain an analytical expression (i.e. a symbolic representation) for any unknown system or function. The abstracts and conclusions of the papers below give an overview of the work.
Paper 1: Arborescent Orthogonal Least Squares Regression (AOrLSR)
The first part (rFOrLSR) presents a common algorithm, but with its nested loops transformed into a single matrix operation (for heavy parallelization and GPU operation) and made recursive to flatten the complexity from quadratic to linear. → linear-algebra-based optimization with numerics.
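To make the vectorization idea concrete, here is a minimal, hypothetical sketch (the function name `forward_orls` and its exact scoring details are my own, not the paper's): classic forward orthogonal least squares selects dictionary terms one at a time, and the per-candidate inner loop can be replaced by a single matrix product that scores all remaining candidates at once. The paper's recursive complexity reduction is not reproduced here.

```python
import numpy as np

def forward_orls(D, y, n_terms):
    """Toy forward orthogonal least-squares term selection.

    D is a (p, m) dictionary whose columns are candidate regressors,
    y the (p,) system output.  Each iteration scores ALL remaining
    candidates with one matmul instead of a per-candidate loop.
    """
    Q = D.astype(float)                 # orthogonalised candidate pool
    selected = []
    yy = float(y @ y)
    for _ in range(n_terms):
        num = (Q.T @ y) ** 2                    # one matmul scores every column
        den = np.einsum('ij,ij->j', Q, Q) * yy  # column norms, vectorised
        err = np.where(den > 1e-12, num / np.maximum(den, 1e-12), 0.0)
        if selected:
            err[selected] = 0.0                 # never re-pick a term
        k = int(np.argmax(err))
        selected.append(k)
        q = Q[:, k:k + 1]                       # deflate pool against the pick
        Q = Q - q @ (q.T @ Q) / float(q.T @ q)
    theta, *_ = np.linalg.lstsq(D[:, selected], y, rcond=None)
    return selected, theta

# Toy system: y = 2*x + 0.5*sin(3x), hidden in a 4-column dictionary
x = np.linspace(-1, 1, 200)
D = np.column_stack([x, x**2, np.sin(3 * x), np.cos(5 * x)])
y = 2 * x + 0.5 * np.sin(3 * x)
selected, theta = forward_orls(D, y, 2)   # picks columns 0 and 2
```

On this toy system the two true terms are recovered first and the coefficients 2 and 0.5 are reproduced exactly, since the output lies in the span of the selected columns.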
The second part of the paper is the arborescence design, which is essentially a linear-algebra- and graph-theory-based optimization procedure to tackle the NP-hard problem of finding the linear-equation solution with the largest number of zero entries in the solution vector and the smallest solving error.
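To make the NP-hardness concrete, here is a brute-force illustration of the combinatorial problem itself (my own sketch, not the paper's arborescence, which prunes this search tree rather than enumerating it): among all column subsets of A, find the smallest support whose least-squares fit of b is within tolerance.

```python
import numpy as np
from itertools import combinations

def sparsest_solution(A, b, tol=1e-8):
    """Exhaustively search for the smallest column subset of A that
    fits b within `tol` -- exponential cost, shown only to make the
    search space of the sparsity problem concrete."""
    m = A.shape[1]
    for k in range(1, m + 1):           # grow the support size
        best = None
        for idx in combinations(range(m), k):
            x, *_ = np.linalg.lstsq(A[:, list(idx)], b, rcond=None)
            err = np.linalg.norm(A[:, list(idx)] @ x - b)
            if err <= tol and (best is None or err < best[0]):
                best = (err, idx, x)
        if best is not None:            # smallest support found
            return best[1], best[2]
    return tuple(range(m)), np.linalg.lstsq(A, b, rcond=None)[0]

# b is built from exactly two of the five columns; the search finds them
rng = np.random.default_rng(0)
A = rng.standard_normal((10, 5))
b = 2 * A[:, 1] - A[:, 3]
idx, x = sparsest_solution(A, b)
```

Already at five columns the enumeration visits up to 31 subsets; the count grows exponentially with the dictionary size, which is why a pruned tree search is needed in practice.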
Read or download the AOrLSR paper (alt download).
Paper 2: Dictionary Morphing Least Squares Regression (DMOrLSR)
The second paper (DMOrLSR) morphs the user-passed regressors (functions) to allow expansions that adapt their terms to the system output (imagine a Fourier transform that finds the exact peaks and then contains only those terms, making it sparse). This is based on genetic algorithms, linear algebra, and matrix calculus / infinitesimal optimization, with closed-form gradients and Hessians independent of the user-passed function, the number of arguments, and the data.
The library performs optimal sparse least-squares fitting of any vectors passed by the user against the given system response. All provided examples use severely non-linear auto-regressive systems; however, the user can pass dictionaries with any type of functions. This is thus a very general mathematical framework, supporting for example wavelet, RBF, or polynomial fitting. The papers also contain examples of using the library to fit a nested expansion in |x|^n inside a fraction and of nesting non-linearities inside other non-linearities.
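The dictionary idea can be sketched as follows (a hypothetical illustration of mine, not the library's actual API): the user supplies arbitrary functions, each evaluated on the input becomes a column, and the fitting machinery then determines the coefficients. Here a plain dense least-squares fit stands in for the papers' sparse selection, which would additionally discard the zero-coefficient terms.

```python
import numpy as np

# User-chosen dictionary mixing polynomial, |x|^n, trigonometric and
# RBF terms -- each entry becomes one column of the regression matrix.
x = np.linspace(-2, 2, 400)
dictionary = {
    "1":         np.ones_like(x),
    "x":         x,
    "x^2":       x**2,
    "|x|^3":     np.abs(x)**3,
    "sin(pi x)": np.sin(np.pi * x),
    "rbf(0,.5)": np.exp(-(x - 0.0)**2 / 0.5),
}
names = list(dictionary)
D = np.column_stack([dictionary[n] for n in names])

# Example system response built from two dictionary terms; an ordinary
# least-squares fit recovers their coefficients (the papers' algorithms
# would instead return only the sparse subset of active terms).
y = 1.5 * np.abs(x)**3 - 0.3 * np.sin(np.pi * x)
theta, *_ = np.linalg.lstsq(D, y, rcond=None)
model = dict(zip(names, theta))
```

Since y lies exactly in the span of the dictionary, the recovered coefficients are 1.5 for |x|^3 and -0.3 for sin(πx), with the remaining four near zero.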
Read or download the DMOrLSR paper.
Python Package for both Papers:
These algorithms will also soon be released as a GPU-accelerated, open-source Python machine learning package.
Supplementary NARMAX research
Based on my previously described papers, further improvements and papers are planned:
- Together with a professor at Technische Universität Berlin, research is planned to replace some layers of neurons in audio-generating neural networks (for text-to-speech generation, music synthesis, etc.) with more efficient and interpretable NARMAX systems
- In-depth study of the effect of data-set size and excitation signal on fitting quality and convergence speed
- Improvements to the genetic vector-space generator sub-algorithm
- Morphing support for rational NARMAXes and noise stabilization
Dimensionality selection paper
This machine learning algorithm automatically selects the amount of information worth keeping to represent data or a digital system. Applications include, for example, finding ideal compression ratios (for storage and transmission) or ideal approximations of systems.
I propose a version of the algorithm by Zhu and Ghodsi (2006) in which the recursion is unrolled into a much more efficient matrix-form algorithm, which also adds the ability to impose constraints on the dimensionality selection. The derivation and both a C++ and a Python implementation are finished, and examples are documented, but the paper hasn't been written yet.
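For context, here is a hedged sketch of the underlying Zhu and Ghodsi (2006) profile-likelihood idea (the function `select_dim` and its pooled-variance variant are my own re-implementation, not the paper's constrained matrix-form version): for every split point q, the sorted eigenvalues are modelled as two Gaussian groups with a shared variance, and the q maximizing the profile log-likelihood is chosen. Cumulative sums replace the per-q loops, in the spirit of unrolling the recursion.

```python
import numpy as np

def select_dim(eigvals):
    """Profile-likelihood elbow selection after Zhu & Ghodsi (2006).

    Returns the 1-based number of dimensions to keep.  All candidate
    split points are evaluated at once via cumulative sums.
    """
    lam = np.sort(np.asarray(eigvals, float))[::-1]   # descending
    p = lam.size
    c1, c2 = np.cumsum(lam), np.cumsum(lam**2)
    q = np.arange(1, p)                               # candidate splits
    n1, n2 = q, p - q
    mu1 = c1[q - 1] / n1                              # group means
    mu2 = (c1[-1] - c1[q - 1]) / n2
    ss1 = c2[q - 1] - n1 * mu1**2                     # within-group squared dev.
    ss2 = (c2[-1] - c2[q - 1]) - n2 * mu2**2
    var = np.maximum((ss1 + ss2) / p, 1e-12)          # pooled MLE variance
    loglik = -0.5 * p * np.log(2 * np.pi * var) - 0.5 * (ss1 + ss2) / var
    return int(q[np.argmax(loglik)])

# Scree with a clear elbow after three dominant eigenvalues
dims = select_dim([10, 9.5, 9, 0.5, 0.4, 0.3, 0.2, 0.1])   # → 3
```

With a shared group variance, maximizing the profile likelihood reduces to minimizing the pooled within-group variance, which the cumulative-sum form computes for all splits in one pass.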
Alternative link for paper.
Eigen contributions
Eigen is an open-source, highly optimized, lazy-evaluation C++ linear algebra library based on expression trees; it is used, among others, in TensorFlow, on which Google's AI systems are based.
→ Accepted contributions:
- Meta-templated DenseBase concatenation
- Meta-templated binary indexing as found in NumPy/MATLAB
→ Currently in discussion with the engineers whether this conversion workaround is to be adopted or whether the framework should be modified to support it natively.
- Some bug reports
→ Contributions I am working on:
- A least-squares sparsifying system solver for heavily under-determined systems, based on the machine learning research going into my next paper (rFOrLSR)
- Guides/docs for converting NumPy code into Eigen code
- Potentially, C++ bindings allowing Matplotlib to be used for debugging Eigen data structures