4.3 Transfer to an Unlabeled Task using kernel marginal predictors (Gilles Blanchard)
Description
We consider a classification problem: the goal is to assign class labels to an unlabeled test data set, given several labeled training data sets drawn from different but similar distributions. In essence, the goal is to predict labels from (an estimate of) the marginal distribution (of the unlabeled data) by learning the trends present in related classification tasks that are already known. In this sense, this problem belongs to the category of so-called "transfer learning" in machine learning. The probabilistic model used is that the different training and test distributions are themselves i.i.d. realizations from a distribution on distributions. Conceptually, this setting can be related to traditional random effects models in statistics, although here the approach is nonparametric and distribution-free. This problem arises in several applications where data distributions fluctuate because of biological, technical, or other sources of variation. We develop a distribution-free, kernel-based approach to the problem. This approach involves identifying an appropriate reproducing kernel Hilbert space and optimizing a regularized empirical risk over the space. We present generalization error analysis, describe universal kernels, and establish universal consistency of the proposed methodology. Experimental results on flow cytometry data are presented.
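As a rough illustration of the kind of predictor described above, the following sketch builds a kernel on pairs (marginal distribution, point) as the product of a Gaussian kernel between empirical kernel mean embeddings and a Gaussian kernel between points, then minimizes a regularized empirical risk over the corresponding reproducing kernel Hilbert space. Several choices here are assumptions made only for illustration and are not taken from the talk: the squared loss (kernel ridge regression) stands in for whatever loss the authors use, the Gaussian kernels and the bandwidths `gamma`, `tau`, `gamma_x`, the regularization weight `lam`, the helper names `rbf`, `distribution_kernel`, `fit`, `predict`, and the synthetic one-dimensional data, which is a toy stand-in rather than flow cytometry data.

```python
import numpy as np

def rbf(A, B, gamma):
    # Gaussian kernel matrix between the rows of A and the rows of B.
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def distribution_kernel(samples_a, samples_b, gamma, tau):
    # Gaussian kernel between empirical kernel mean embeddings of samples:
    # exp(-tau * ||mu_a - mu_b||^2), with inner products given by averaged kernels.
    cross = np.array([[rbf(Xa, Xb, gamma).mean() for Xb in samples_b]
                      for Xa in samples_a])
    self_a = np.array([rbf(X, X, gamma).mean() for X in samples_a])
    self_b = np.array([rbf(X, X, gamma).mean() for X in samples_b])
    sq = self_a[:, None] + self_b[None, :] - 2.0 * cross
    return np.exp(-tau * np.maximum(sq, 0.0))

def fit(train_X, train_y, gamma=0.5, tau=10.0, gamma_x=0.5, lam=1e-3):
    # train_X: list of (n_i, d) arrays, one per labeled task; train_y: labels in {-1, +1}.
    task_id = np.concatenate([np.full(len(X), i) for i, X in enumerate(train_X)])
    X_all, y_all = np.vstack(train_X), np.concatenate(train_y)
    K_P = distribution_kernel(train_X, train_X, gamma, tau)
    # Product kernel on (marginal, point) pairs.
    K = K_P[np.ix_(task_id, task_id)] * rbf(X_all, X_all, gamma_x)
    n = len(y_all)
    # Assumption: squared loss, so the regularized risk has a closed-form minimizer.
    alpha = np.linalg.solve(K + lam * n * np.eye(n), y_all)
    return {"train_X": train_X, "task_id": task_id, "X_all": X_all,
            "alpha": alpha, "gamma": gamma, "tau": tau, "gamma_x": gamma_x}

def predict(model, test_X):
    # The unlabeled test sample is used twice: to estimate its marginal and as query points.
    k_P = distribution_kernel([test_X], model["train_X"],
                              model["gamma"], model["tau"])[0]
    K = k_P[model["task_id"]][None, :] * rbf(test_X, model["X_all"], model["gamma_x"])
    return np.sign(K @ model["alpha"])

# Toy data (hypothetical): each task is shifted, and the class boundary moves with the shift,
# so a single classifier pooled over all tasks could not label every task correctly.
rng = np.random.default_rng(0)

def make_task(shift, n=40):
    y = np.repeat([-1.0, 1.0], n // 2)
    X = (shift + 1.5 * y + 0.3 * rng.standard_normal(n))[:, None]
    return X, y

train = [make_task(s) for s in np.arange(-3.0, 4.0)]
model = fit([X for X, _ in train], [y for _, y in train])
test_X, test_y = make_task(0.5)   # shift not seen during training
pred = predict(model, test_X)     # test labels are never used for prediction
acc = float((pred == test_y).mean())
print("accuracy on the unlabeled task:", acc)
```

The test labels are generated only to score the predictions. The predictor itself sees the unlabeled test sample and the labeled training tasks, which matches the setting of the talk.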
More Episodes
Functional data are becoming increasingly common in a variety of fields. Many studies underline the importance of representing data as functions. This has sparked growing interest in the development of adapted statistical tools for analyzing such data:...
Published 12/03/14
The idea of selecting a model via penalizing a log-likelihood type criterion goes back to the early seventies with the pioneering works of Mallows and Akaike. One can find many consistency results in the literature for such criteria. These results are asymptotic in the sense that one deals with a...
Published 12/03/14