Talk: High Margin => Low Dimension
03 / 2009

Slides of a short seminar talk, once again about Support Vector Machines:
Kernels, Margins, Coresets and The Fight of High- vs. Low-Dimensional Spaces

Abstract:

Finding a separating hyperplane between two point clouds is the essence of kernel methods, which have become a standard tool in machine learning for all kinds of classification and regression problems (and they can read your thoughts ;-)). It is usually assumed that the points live in an implicit high-dimensional space, and that in order to solve the problem we are only allowed black-box access to the scalar product (called the kernel) in that space.
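
To make the "black-box access" concrete, here is a bare-bones kernel perceptron (a toy sketch of my own, not taken from the slides): it only ever evaluates the kernel k(x, y) = <phi(x), phi(y)> and never needs the high-dimensional map phi itself.

    import numpy as np

    def rbf_kernel(x, y, gamma=1.0):
        # The scalar product <phi(x), phi(y)> in an implicit (here even
        # infinite-dimensional) feature space; phi itself never appears.
        return np.exp(-gamma * np.sum((x - y) ** 2))

    def kernel_perceptron(X, y, kernel=rbf_kernel, epochs=20):
        # Dual view: the normal vector of the hyperplane is a combination of
        # the training points in feature space, represented only through the
        # coefficients alpha.
        n = len(X)
        alpha = np.zeros(n)
        for _ in range(epochs):
            for i in range(n):
                score = sum(alpha[j] * y[j] * kernel(X[j], X[i])
                            for j in range(n))
                if y[i] * score <= 0:   # misclassified (or undecided)
                    alpha[i] += 1.0     # classical mistake-driven update
        return alpha

    # Toy usage on two small point clouds with labels -1 / +1.
    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(-2.0, 1.0, (20, 2)),
                   rng.normal(2.0, 1.0, (20, 2))])
    y = np.array([-1] * 20 + [1] * 20)
    alpha = kernel_perceptron(X, y)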

[Figure: a-and-o-small.png]

The success of these methods has been entirely attributed to just this ability to use very high-dimensional mappings without increasing the computational cost.

Well, not entirely... One small village, a paper by the indomitable Balcan, Blum and Vempala '06, still holds out against the Roman invaders, claiming that a kernel should instead be interpreted as a good mapping to a low-dimensional space.

The idea comes from the Johnson-Lindenstrauss lemma applied to our setting: if the two point clouds are separable with some margin, then even after a random projection onto a low-dimensional subspace, the points should still be separable with roughly the same margin; here "low-dimensional" means a dimension that depends only on the margin and the desired accuracy, not on the original space.
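
In symbols, the guarantee has roughly the following shape (quoted from memory; see the paper below for the exact statement and constants): for data normalized to norm at most 1 and separable with margin gamma in the kernel-induced space, a random projection to

    d = O\!\left( \frac{1}{\gamma^2} \, \log \frac{1}{\varepsilon\,\delta} \right)

dimensions gives, with probability at least 1 - delta, a representation in which all but an epsilon-fraction of the points are still separable with margin gamma/2.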

[Figure: o-by-large-margin-small.png]

We will try to support this idea by advertising coresets as the deterministic analogue of such low-dimensional projections.
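
To give a flavour of such a construction, here is a toy sketch (my own illustration, not the exact algorithm from the slides) in the spirit of Gilbert's greedy algorithm for the closely related polytope-distance problem: the points it picks form the coreset, and, roughly speaking, how many of them are needed is governed by the desired accuracy and the geometry of the instance, not by the ambient dimension.

    import numpy as np

    def gilbert_coreset(P, eps=1e-3, max_iter=1000):
        # Approximate the shortest vector in the convex hull of the rows of P.
        # The indices of the points picked along the way form the coreset.
        i0 = int(np.argmin(np.linalg.norm(P, axis=1)))
        x = P[i0].copy()
        coreset = {i0}
        for _ in range(max_iter):
            i = int(np.argmin(P @ x))           # point most "opposite" to x
            gap = float(x @ x - P[i] @ x)       # duality gap; small => near-optimal
            if gap <= eps * float(x @ x):
                break
            coreset.add(i)
            d = x - P[i]
            # Exact line search: closest point to the origin on the segment [x, P[i]].
            alpha = np.clip(float(x @ d) / float(d @ d), 0.0, 1.0)
            x = x - alpha * d
        return x, sorted(coreset)

    # Toy usage: a cloud shifted away from the origin, in 50 dimensions.
    rng = np.random.default_rng(0)
    P = rng.normal(size=(500, 50)) + 3.0
    x, core = gilbert_coreset(P)
    print(len(core), "coreset points; distance of conv(P) to the origin ~",
          round(float(np.linalg.norm(x)), 3))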

 

Based on a true story:

M. Balcan, A. Blum and S. Vempala. Kernels as features: On kernels, margins, and low-dimensional mappings. Machine Learning 65(1), pp. 79-94, 2006.

or the survey on the same topic:

A. Blum. Random Projection, Margins, Kernels, and Feature-Selection. In: Subspace, Latent Structure and Feature Selection, LNCS, pp. 52-68, 2006.