StatLearn 2012 - Workshop on "Challenging...

2.3 Discriminative clustering for high-dimensional data (Camille Brunet)

Description

A recently introduced family of 12 probabilistic models aims to cluster and visualize high-dimensional data simultaneously. It is based on a mixture model that fits the data in a latent discriminative subspace whose intrinsic dimension is bounded by the number of clusters. An estimation procedure, named the Fisher-EM algorithm, has also been proposed and turns out to outperform other subspace clustering methods in most situations. Moreover, the convergence properties of the Fisher-EM algorithm are discussed; in particular, it is proved that the algorithm is a GEM (generalized EM) algorithm and converges under weak conditions in the general case. Finally, a sparse extension of the Fisher-EM algorithm is proposed to select the original variables that are discriminative.
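To make the idea of alternating between clustering and fitting a discriminative subspace concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the estimation procedure presented in the talk: the function names (`f_step`, `m_step`, `e_step`, `fisher_em_sketch`) are hypothetical, the subspace is found with a classical Fisher discriminant eigen-decomposition computed from soft cluster memberships, the subspace dimension is fixed to K - 1 (the between-cluster scatter of K clusters has rank at most K - 1), each cluster gets a diagonal Gaussian inside the subspace, and any noise outside the subspace is assumed to be shared by all clusters so that it drops out of the posterior probabilities. None of the 12 models of the family is reproduced exactly, and the sparse extension is not covered.

```python
import numpy as np


def f_step(X, T, d, ridge=1e-6):
    """Subspace step: d orthonormal directions maximizing a Fisher-type
    criterion, using the soft memberships T (n x K) as labels."""
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    S = Xc.T @ Xc / n + ridge * np.eye(p)      # regularized total covariance
    nk = T.sum(axis=0) + 1e-12                 # soft cluster sizes
    M = (T.T @ Xc) / nk[:, None]               # centered soft cluster means (K x p)
    SB = (M.T * nk) @ M / n                    # soft between-cluster scatter
    # Whiten with the Cholesky factor of S to get a symmetric eigenproblem.
    L = np.linalg.cholesky(S)
    Linv = np.linalg.inv(L)
    A = Linv @ SB @ Linv.T
    _, V = np.linalg.eigh((A + A.T) / 2)       # eigenvalues in ascending order
    W = Linv.T @ V[:, ::-1][:, :d]             # top-d discriminant directions
    U, _ = np.linalg.qr(W)                     # orthonormalize the basis
    return U


def m_step(Y, T):
    """Update mixture parameters of diagonal Gaussians in the subspace."""
    nk = T.sum(axis=0) + 1e-12
    pi = nk / nk.sum()
    mu = (T.T @ Y) / nk[:, None]
    var = (T.T @ Y**2) / nk[:, None] - mu**2 + 1e-6
    return pi, mu, var


def e_step(Y, pi, mu, var):
    """Posterior membership probabilities, computed in the subspace."""
    diff = Y[:, None, :] - mu[None, :, :]      # n x K x d
    logp = -0.5 * np.sum(diff**2 / var + np.log(2 * np.pi * var), axis=2)
    logp += np.log(pi)
    logp -= logp.max(axis=1, keepdims=True)    # numerical stabilization
    T = np.exp(logp)
    return T / T.sum(axis=1, keepdims=True)


def fisher_em_sketch(X, K, n_iter=30, seed=0):
    """Alternate subspace, parameter, and membership updates."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    T = np.eye(K)[rng.integers(K, size=n)]     # random hard initialization
    d = K - 1                                  # assumed subspace dimension
    for _ in range(n_iter):
        U = f_step(X, T, d)
        Y = X @ U                              # low-dimensional view of the data
        pi, mu, var = m_step(Y, T)
        T = e_step(Y, pi, mu, var)
    return T.argmax(axis=1), U
```

Calling `labels, U = fisher_em_sketch(X, K=3)` on an `n x p` array returns one cluster label per row and a `p x 2` orthonormal basis; plotting `X @ U` colored by `labels` gives the kind of joint clustering and visualization the description refers to. With a random initialization the sketch can stop in a poor local optimum, so several restarts would be advisable in practice.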

More Episodes

StatLearn 2012 - Workshop on "Challenging problems in Statistical Learning"

Published 12/03/14

2.2 Functional estimation in high dimensional data : Application to classification (Sophie Dabo-Niang)

Functional data are becoming increasingly common in a variety of fields. Many studies underline the importance of representing data as functions. This has sparked growing interest in the development of adapted statistical tools for analyzing this kind of data:...

Published 12/03/14

4.1 Data-driven penalties: heuristics, results and thoughts... (Pascal Massart)

The idea of selecting a model via penalizing a log-likelihood type criterion goes back to the early seventies with the pioneering works of Mallows and Akaike. One can find many consistency results in the literature for such criteria. These results are asymptotic in the sense that one deals with a...

Published 12/03/14