t-Distributed Stochastic Neighbor Embedding

t-Distributed Stochastic Neighbor Embedding (t-SNE) is a statistical dimensionality reduction method based on the original SNE method [1] and its t-distributed variant [2]. The method constructs a probability distribution over pairwise distances in the original data space, and then optimizes a similar probability distribution over the pairwise distances of a low-dimensional embedding of the data by minimizing the Kullback-Leibler divergence between the two distributions.
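
For reference, the two distributions and the objective can be written as follows. This is the standard formulation from [2], given here as an assumption about what the description above refers to; the package's implementation may differ in details such as normalization or approximation. Here x_i are the original observations, y_i their low-dimensional images, n the number of observations, and sigma_i a per-point bandwidth determined by the perplexity.

```latex
\begin{align}
p_{j|i} &= \frac{\exp\left(-\lVert x_i - x_j\rVert^2 / 2\sigma_i^2\right)}
               {\sum_{k \neq i} \exp\left(-\lVert x_i - x_k\rVert^2 / 2\sigma_i^2\right)},
\qquad
p_{ij} = \frac{p_{j|i} + p_{i|j}}{2n}, \\
q_{ij} &= \frac{\left(1 + \lVert y_i - y_j\rVert^2\right)^{-1}}
              {\sum_{k \neq l} \left(1 + \lVert y_k - y_l\rVert^2\right)^{-1}}, \\
C &= \mathrm{KL}(P \,\Vert\, Q) = \sum_{i \neq j} p_{ij} \log \frac{p_{ij}}{q_{ij}}.
\end{align}
```

The heavy-tailed Student-t kernel in q_ij is what distinguishes t-SNE from the original SNE, which uses a Gaussian kernel in both spaces.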

This package defines a TSNE type to represent a t-SNE model, and provides a set of methods to access its properties.

ManifoldLearning.TSNE — Type

TSNE{NN <: AbstractNearestNeighbors, T <: Real} <: NonlinearDimensionalityReduction

The TSNE type represents a t-SNE model constructed for data of type T with the help of the NN nearest neighbor algorithm.

StatsAPI.fit — Method

fit(TSNE, data; p=30, maxoutdim=2, kwargs...)

Fit a t-SNE model to data.

Arguments

data: a matrix of observations. Each column of data is an observation.

Keyword arguments

p: the perplexity parameter (default 30).
maxoutdim: the dimension of the reduced space (default 2).
maxiter: the total number of iterations for the search algorithm (default 800).
exploreiter: the number of iterations for the exploration stage of the search algorithm (default 200).
tol: the tolerance threshold (default 1e-7).
exaggeration: a tightness control parameter between the original and the reduced space (default 12).
initialize: the initialization method for the embedding (default :pca).
rng: a random number generator object used to initialize the embedding.
nntype: a nearest neighbor construction class (derived from AbstractNearestNeighbors).
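
To give some intuition for the perplexity parameter p, the following is a minimal standalone sketch of how a perplexity target is commonly turned into a per-point Gaussian precision by bisection, as described in [2]. It is not the package's internal code: the helper names conditional_probs, perplexity, and fit_beta are hypothetical, and the precision β stands for 1/(2σ²) of a single point.

```julia
# Conditional probabilities p_{j|i} for one point i, given the squared
# distances d2 to all other points and a precision β = 1 / (2σ²).
function conditional_probs(d2::AbstractVector{<:Real}, β::Real)
    w = exp.(-β .* (d2 .- minimum(d2)))   # shifting by the minimum avoids underflow
    return w ./ sum(w)
end

# Perplexity is 2^H, where H is the Shannon entropy of P in bits.
perplexity(P) = 2.0 ^ (-sum(p * log2(p) for p in P if p > 0))

# Bisection on β until the perplexity of p_{·|i} matches the target.
function fit_beta(d2, target; tol=1e-5, maxiter=100)
    lo, hi = 0.0, Inf
    β = 1.0
    for _ in 1:maxiter
        perp = perplexity(conditional_probs(d2, β))
        abs(perp - target) < tol && break
        if perp > target                  # distribution too flat: sharpen it
            lo = β
            β = isinf(hi) ? 2β : (lo + hi) / 2
        else                              # distribution too peaked: flatten it
            hi = β
            β = (lo + hi) / 2
        end
    end
    return β
end

# Illustrative squared distances from one point to five others
d2 = [1.0, 4.0, 9.0, 16.0, 25.0]
β = fit_beta(d2, 3.0)                     # precision giving about 3 effective neighbors
```

A larger perplexity yields a smaller β, that is, a wider kernel that spreads probability over more neighbors.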

Examples

using ManifoldLearning

M = fit(TSNE, rand(3,100)) # construct t-SNE model
R = predict(M)             # perform dimensionality reduction
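
A slightly fuller sketch of the same workflow, using only keyword arguments documented above. It assumes the ManifoldLearning package is installed; the parameter values are illustrative rather than recommended settings, and the shape of the result is assumed to follow the column-per-observation convention stated for data.

```julia
using ManifoldLearning
using Random

rng = MersenneTwister(42)        # fixed seed so the data and initialization are repeatable
X = rand(rng, 3, 100)            # 100 observations in 3 dimensions, one per column

# Lower perplexity and an explicit iteration budget (illustrative values only)
M = fit(TSNE, X; p=10, maxoutdim=2, maxiter=500, rng=rng)
Y = predict(M)                   # low-dimensional embedding of X
```

Passing rng explicitly is a convenient way to make runs comparable when experimenting with p or maxiter.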

StatsAPI.predict — Method

predict(R::TSNE)

Transforms the data fitted to the t-SNE model R into a reduced space representation.

References

[1] Hinton, G. E., & Roweis, S. (2002). Stochastic neighbor embedding. Advances in Neural Information Processing Systems, 15.
[2] Van der Maaten, L., & Hinton, G. (2008). Visualizing data using t-SNE. Journal of Machine Learning Research, 9(11).