Nonlinear Dimensionality Reduction

In statistical learning, many problems require initial preprocessing of multi-dimensional data, which often means reducing the dimensionality of the data so as to compress the features without losing information about relevant data properties. Common linear dimensionality reduction methods, such as PCA or MDS, in many cases cannot properly reduce data dimensionality, especially when the data lie near a nonlinear manifold embedded in a high-dimensional space.

[Figure: ml]

There are many nonlinear dimensionality reduction (NLDR) methods for constructing low-dimensional manifold embeddings. The Julia package ManifoldLearning.jl provides implementations of the most common algorithms.
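As a quick illustration, here is a minimal sketch of embedding the classic Swiss-roll dataset with Isomap. The fit/predict interface, the keyword names k and maxoutdim, and the bundled swiss_roll generator reflect the package's documented API, though details may vary between versions:

```julia
using ManifoldLearning

X, _ = ManifoldLearning.swiss_roll()        # 3×n points sampled from a Swiss roll
M = fit(Isomap, X; k = 12, maxoutdim = 2)   # neighborhood graph + geodesic embedding
Y = predict(M)                              # 2×n coordinates of the unrolled manifold
```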

Read More

MLE Optimization for Mixture Models

Introduction

Mixture models are used for many purposes in data science, e.g. to represent feature distributions or spatial relations. Given a fixed data sample, one can fit a mixture model to it using one of a variety of methods. A very common mixture structure is based on Gaussian distributions: the Gaussian Mixture Model (GMM). The expectation-maximization (EM) algorithm finds GMM parameters by maximizing the model's likelihood. This approach usually requires closed-form expressions for the parameter estimates and converges slowly to the optimal solution. These limitations can be overcome by using (quasi-)Newton optimization. In conjunction with automatic differentiation (AD), finding optimal model parameters becomes straightforward.
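As a minimal sketch of this idea (assuming Distributions.jl and Optim.jl with ForwardDiff-based autodiff; the toy data and the unconstrained parameterization below are illustrative, not the post's actual implementation), one can minimize the negative log-likelihood of a two-component GMM directly with BFGS:

```julia
using Distributions, Optim

x = [randn(200) .- 2.0; 0.5 .* randn(300) .+ 3.0]   # synthetic 1-D mixture data

# Unconstrained parameters θ = (μ₁, μ₂, log σ₁, log σ₂, logit w)
function nll(θ)
    μ1, μ2 = θ[1], θ[2]
    σ1, σ2 = exp(θ[3]), exp(θ[4])        # keep scales positive
    w = 1 / (1 + exp(-θ[5]))             # keep the mixing weight in (0, 1)
    -sum(log.(w .* pdf.(Normal(μ1, σ1), x) .+ (1 - w) .* pdf.(Normal(μ2, σ2), x)))
end

res = optimize(nll, [-1.0, 1.0, 0.0, 0.0, 0.0], BFGS(); autodiff = :forward)
θ̂ = Optim.minimizer(res)                # ML estimates of the transformed parameters
```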

[Figure: GMM]

For modeling a collection of time series, another model based on a mixture of ARMA processes, the Mixture of ARMA Models, can be used. Using a similar MLE optimization and AD approach, we will show how to cluster time series.

Read More

MLE Optimization for Regression Models

Introduction

The goal of regression is to predict the value of one or more continuous target variables $t$ given the value of a $D$-dimensional vector $x$ of input variables. The polynomial is a specific example of a broad class of functions called linear regression models. The simplest form of linear regression model is also a linear function of the input variables. However, a much more useful class of functions can be constructed by taking linear combinations of a fixed set of nonlinear functions of the input variables, known as basis functions [1].
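For instance, with polynomial basis functions $\phi_j(x) = x^j$ the model remains linear in the weights, so the maximum-likelihood solution reduces to least squares. A short sketch with purely illustrative toy data:

```julia
x = range(-1, 1; length = 50)
y = sin.(π .* x) .+ 0.1 .* randn(50)     # noisy targets

M = 3                                    # basis order
Φ = [xi^j for xi in x, j in 0:M]         # 50×(M+1) design matrix of basis functions
w = Φ \ y                                # least-squares (maximum-likelihood) weights
ŷ = Φ * w                                # model predictions
```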

[Figure: Regression]

Regression models can be used for time series modeling. Typically, a regression model provides a projection from the baseline status onto some relevant demographic variables, and curve-type time series data are quite common examples of such variables. A typical time series model is the ARMA model, a combination of two types of time series processes: autoregressive and moving average processes.
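Concretely, an ARMA($p$, $q$) process combines the two parts in the standard form

$$x_t = c + \varepsilon_t + \sum_{i=1}^{p} \varphi_i x_{t-i} + \sum_{j=1}^{q} \theta_j \varepsilon_{t-j},$$

where the $\varphi_i$ are the autoregressive coefficients, the $\theta_j$ are the moving-average coefficients, and $\varepsilon_t$ is a white-noise term.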

Read More

Stochastic Gradient Descent in Data Science

Introduction

Stochastic gradient descent (SGD) is a popular stochastic optimization algorithm in the field of machine learning, especially for optimizing deep neural networks. At its core, this iterative algorithm combines two optimization techniques: stochastic approximation and gradient descent.
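Concretely, instead of the gradient of the full objective, each iteration uses the gradient of the loss on a single randomly drawn sample (or small minibatch) $i$:

$$w_{t+1} = w_t - \eta \, \nabla_w \ell_i(w_t),$$

where $\eta$ is the learning rate and $\ell_i$ is the loss contributed by sample $i$.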

[Figure: sgd]

SGD is commonly used to optimize a wide range of models. We are interested in applying this optimization technique to standard data science tasks such as linear regression and clustering. In addition, we'll use differentiable programming techniques to simplify our SGD implementation and make it more versatile.
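A minimal sketch of what this looks like (assuming Zygote.jl for AD; the linear-regression toy problem and hyperparameters are illustrative only): the gradient comes from the AD system rather than a hand-derived formula, so swapping in a different differentiable loss requires no changes to the SGD loop.

```julia
using LinearAlgebra, Random, Zygote

X = randn(500, 3)
y = X * [1.0, -2.0, 0.5] .+ 0.1 .* randn(500)    # synthetic regression data

loss(w, xi, yi) = (dot(w, xi) - yi)^2            # per-sample squared loss

w, η = zeros(3), 0.01
for epoch in 1:20
    for i in shuffle(1:size(X, 1))               # stochastic: random sample order
        g = gradient(v -> loss(v, X[i, :], y[i]), w)[1]
        w .-= η .* g                             # SGD step: w ← w − η ∇ℓᵢ(w)
    end
end
```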

Read More