Here is a simple algorithm for nuclear norm regularized problems, based on the recent approximate SDP solver of Hazan. This is a first-order method that only needs to compute one approximate eigenvector in each iteration, and can be seen as an instance of our more recent general "Convex Optimization without Projection Steps"-framework. And here is a link to the discussion site for our ICML 2010 paper.
Abstract:
Optimization problems with a nuclear norm regularization, such as e.g. low norm matrix factorizations, have seen many applications recently. We propose a new approximation algorithm building upon the recent sparse approximate SDP solver of (Hazan '08). The experimental efficiency of our method is demonstrated on large matrix completion problems such as the Netflix dataset. The algorithm comes with strong convergence guarantees, and can be interpreted as a first theoretically justified variant of Simon-Funk-type SVD heuristics. The method is free of tuning parameters, and very easy to parallelize.
This algorithm can be used to find arbitrary matrix factorizations under some convex loss function, if you want to do this under an additional nuclear norm constraint (regularization). Problems fitting into this framework are known under various names as for example robust PCA, kernel PCA, variants of sparse PCA, LDA, CCA, Locality Preserving Projections, and Spectral Clustering, see e.g. the overview by F. de la Torre. A nice side-effect of our approach is that we do not have any problems of local minima, and obtain a guaranteed approximate solution which is simultaneously of low rank. Paper and talk slides are attached below.
Implementation: The .java file below contains the algorithm we used for the experiments, using the power-method as the internal eigenvector procedure. The code is not yet completely self-contained, as the input data (the known/observed entries of the matrix which we want to complete) come in different formats for different datasets. The algorithm just assumes that all known entries are stored in a sequential list already. The algorithm never stores a full matrix, and therefore is able to scale to e.g. the size of the Netflix dataset.
Also attached below, you find my 10 slides of an earlier short seminar talk on the same topic:
Approximate SDP solvers, Matrix Factorizations, the Netflix Prize, and PageRank [ Mittags-Seminar Talk Abstract ]















































