t-Distributed Stochastic Neighbor Embedding

The t-Distributed Stochastic Neighbor Embedding (t-SNE) is a statistical dimensionality reduction method based on the original SNE method [1], with a Student t-distribution used in the low-dimensional space [2]. The method constructs a probability distribution over pairs of points from their pairwise distances in the original data space, and then fits a similar distribution over the pairwise distances of a low-dimensional embedding of the data by minimizing the Kullback-Leibler divergence between the two distributions.
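
For reference, the two distributions and the objective can be written in the standard form given in [2]. The symbols below ($x_i$ for the original points, $y_i$ for the embedded points, $\sigma_i$ for the per-point Gaussian bandwidth, $n$ for the number of observations) are notation from that paper, not names from this package, and the implementation here may differ in detail.

```latex
\begin{align}
p_{j|i} &= \frac{\exp\left(-\lVert x_i - x_j \rVert^2 / 2\sigma_i^2\right)}
               {\sum_{k \neq i} \exp\left(-\lVert x_i - x_k \rVert^2 / 2\sigma_i^2\right)},
\qquad
p_{ij} = \frac{p_{j|i} + p_{i|j}}{2n}, \\
q_{ij} &= \frac{\left(1 + \lVert y_i - y_j \rVert^2\right)^{-1}}
              {\sum_{k \neq l} \left(1 + \lVert y_k - y_l \rVert^2\right)^{-1}}, \\
C &= \mathrm{KL}(P \,\|\, Q) = \sum_{i \neq j} p_{ij} \log \frac{p_{ij}}{q_{ij}}.
\end{align}
```

The heavy-tailed Student t kernel in $q_{ij}$ is what distinguishes t-SNE from the original SNE, which uses a Gaussian kernel in both spaces.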

This package defines a TSNE type to represent a t-SNE model, and provides a set of methods to access its properties.

ManifoldLearning.TSNE - Type
TSNE{NN <: AbstractNearestNeighbors, T <: Real} <: NonlinearDimensionalityReduction

The TSNE type represents a t-SNE model constructed for data of type T with the help of the nearest neighbor algorithm NN.

StatsAPI.fit - Method
fit(TSNE, data; p=30, maxoutdim=2, kwargs...)

Fit a t-SNE model to data.

Arguments

  • data: a matrix of observations. Each column of data is an observation.

Keyword arguments

  • p: the perplexity parameter (default 30).
  • maxoutdim: the dimension of the reduced space (default 2).
  • maxiter: the total number of iterations for the search algorithm (default 800).
  • exploreiter: the number of iterations for the exploration stage of the search algorithm (default 200).
  • tol: a tolerance threshold (default 1e-7).
  • exaggeration: a tightness control parameter between the original and the reduced space (default 12).
  • initialize: the initialization method for the embedding (default :pca).
  • rng: a random number generator object used to initialize the embedding.
  • nntype: a nearest neighbor construction class (derived from AbstractNearestNeighbors).
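
As background for the p keyword, the sketch below shows how perplexity is defined in [2] and how a per-point Gaussian precision can be tuned to reach a target value. It is a minimal, self-contained illustration in plain Julia. The helper names conditional_probs, perplexity and find_beta are hypothetical and not part of this package, and the package's own search may work differently.

```julia
# Conditional probabilities p_{j|i} for a single point, given the squared
# distances to its neighbors and a precision β = 1 / (2σ²).
function conditional_probs(sqdists::AbstractVector{<:Real}, β::Real)
    w = exp.(-β .* sqdists)
    return w ./ sum(w)
end

# Perplexity 2^H(P), where H is the Shannon entropy measured in bits.
function perplexity(P::AbstractVector{<:Real})
    H = -sum(p * log2(p) for p in P if p > 0)
    return 2.0^H
end

# Bisection on β (hypothetical helper): perplexity decreases as β grows,
# so the bracket [lo, hi] is narrowed until the target is matched.
function find_beta(sqdists, target; tol=1e-5, maxiter=100)
    lo, hi = 0.0, Inf
    β = 1.0
    for _ in 1:maxiter
        perp = perplexity(conditional_probs(sqdists, β))
        abs(perp - target) < tol && break
        if perp > target          # distribution too flat: raise the precision
            lo = β
            β = isinf(hi) ? 2β : (β + hi) / 2
        else                      # distribution too peaked: lower the precision
            hi = β
            β = (lo + β) / 2
        end
    end
    return β
end

sqdists = [0.1, 0.4, 0.9, 1.6, 2.5, 3.6]   # made-up squared distances
β = find_beta(sqdists, 3.0)
perplexity(conditional_probs(sqdists, β))   # close to the target of 3.0
```

Perplexity can be read as an effective number of neighbors: a uniform distribution over k neighbors has perplexity exactly k, so larger values of p make each point attend to a wider neighborhood.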

Examples

using ManifoldLearning

M = fit(TSNE, rand(3,100)) # construct t-SNE model
R = predict(M)             # perform dimensionality reduction
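
A slightly longer sketch using several of the keyword arguments listed above. The data is random and the chosen values are arbitrary illustrations, not recommendations. It assumes that rng accepts a standard generator from Julia's Random library.

```julia
using ManifoldLearning
using Random

# 200 observations with 5 features each, stored one observation per column
X = rand(MersenneTwister(1), 5, 200)

M = fit(TSNE, X;
        p = 20,                      # perplexity
        maxoutdim = 2,               # embed into 2 dimensions
        maxiter = 500,               # fewer iterations than the default 800
        exploreiter = 100,           # shorter exploration stage than the default 200
        rng = MersenneTwister(42))   # seeded generator passed for initialization

Y = predict(M)                       # embedding of the 200 fitted observations
size(Y)
```

Passing a seeded generator is intended to make runs repeatable when the initialization draws random numbers.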
StatsAPI.predict - Method
predict(R::TSNE)

Transforms the data fitted to the t-SNE model R into a reduced space representation.


References

  • [1] Hinton, G. E., & Roweis, S. (2002). Stochastic neighbor embedding. Advances in Neural Information Processing Systems, 15.
  • [2] Van der Maaten, L., & Hinton, G. (2008). Visualizing data using t-SNE. Journal of Machine Learning Research, 9(11).