Distance metric learning with eigenvalue optimization
The main theme of this paper is to develop a novel eigenvalue optimization framework for learning a Mahalanobis metric. Within this context, we introduce a novel metric learning approach called DML-eig which is shown to be equivalent to a well-known ...
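As a quick illustration of the object being learned, here is a minimal NumPy sketch of the squared Mahalanobis distance parameterized by a PSD matrix M; the function name is illustrative, and this is not DML-eig's API or its eigenvalue optimization procedure.

```python
import numpy as np

def mahalanobis_sq(x, y, M):
    """Squared Mahalanobis distance d_M(x, y) = (x - y)^T M (x - y).

    M must be symmetric positive semidefinite for d_M to be a valid
    (pseudo-)metric; metric learning methods optimize over this M.
    """
    d = x - y
    return float(d @ M @ d)

# With M = I the distance reduces to the squared Euclidean distance.
x, y = np.array([1.0, 2.0]), np.array([0.0, 1.0])
print(mahalanobis_sq(x, y, np.eye(2)))  # 2.0
```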
Conditional likelihood maximisation: a unifying framework for information theoretic feature selection
We present a unifying framework for information theoretic feature selection, bringing almost two decades of research on heuristic filter criteria under a single theoretical interpretation. This is in response to the question: "what are the implicit ...
Plug-in approach to active learning
We present a new active learning algorithm based on nonparametric estimators of the regression function. Our investigation provides probabilistic bounds for the rates of convergence of the generalization error achievable by the proposed method over a broad ...
Refinement of operator-valued reproducing kernels
This paper studies the construction of a refinement kernel for a given operator-valued reproducing kernel such that the vector-valued reproducing kernel Hilbert space of the refinement kernel contains that of the given kernel as a subspace. The study is ...
An active learning algorithm for ranking from pairwise preferences with an almost optimal query complexity
Given a set V of n elements, we wish to linearly order them given pairwise preference labels which may be non-transitive (due to irrationality or arbitrary noise). The goal is to linearly order the elements while disagreeing with as few pairwise ...
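To make the objective concrete, here is a small sketch of counting how many pairwise preference labels a candidate linear order violates; the function name and the toy preference set are illustrative, not the paper's algorithm.

```python
def disagreements(order, prefs):
    """Count pairwise preference labels violated by a linear order.

    order: a list giving the proposed linear order of the elements.
    prefs: iterable of (u, v) pairs meaning 'u is preferred to v';
           these labels may be non-transitive.
    """
    pos = {v: i for i, v in enumerate(order)}
    return sum(pos[u] > pos[v] for u, v in prefs)

# A cyclic (non-transitive) preference set: no order satisfies all labels.
prefs = [("a", "b"), ("b", "c"), ("c", "a")]
print(disagreements(["a", "b", "c"], prefs))  # 1 violation is optimal here
```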
Optimal distributed online prediction using mini-batches
Online prediction methods are typically presented as serial algorithms running on a single processor. However, in the age of web-scale prediction problems, it is increasingly common to encounter situations where a single processor cannot keep up with ...
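A minimal serial sketch of the mini-batch update: each step averages per-example gradients over a batch, which is the quantity that, in the distributed setting, workers would compute in parallel before the update. The function name and the least-squares example are illustrative assumptions, not the paper's distributed implementation.

```python
import numpy as np

def minibatch_sgd(grad, w0, data, batch_size=32, lr=0.05, epochs=5, seed=0):
    """Serial simulation of mini-batch stochastic gradient descent."""
    rng = np.random.default_rng(seed)
    w = w0.copy()
    for _ in range(epochs):
        idx = rng.permutation(len(data))
        for start in range(0, len(data), batch_size):
            batch = [data[i] for i in idx[start:start + batch_size]]
            # Average per-example gradients over the mini-batch.
            w -= lr * np.mean([grad(w, z) for z in batch], axis=0)
    return w

# Least-squares example: each z is an (x, y) pair.
rng = np.random.default_rng(1)
X = rng.normal(size=(1000, 3))
y = X @ np.array([1.0, -2.0, 0.5])
sq_grad = lambda w, z: 2.0 * (z[0] @ w - z[1]) * z[0]
print(minibatch_sgd(sq_grad, np.zeros(3), list(zip(X, y))))
```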
Active clustering of biological sequences
Given a point set S and an unknown metric d on S, we study the problem of efficiently partitioning S into k clusters while querying few distances between the points. In our model we assume that we have access to one versus all queries that, given a point ...
Multi kernel learning with online-batch optimization
In recent years there has been a lot of interest in designing principled classification algorithms over multiple cues, based on the intuitive notion that using more features should lead to better performance. In the domain of kernel methods, a ...
Active learning via perfect selective classification
We discover a strong relation between two known learning models: stream-based active learning and perfect selective classification (an extreme case of 'classification with a reject option'). For these models, restricted to the realizable case, we show a ...
Random search for hyper-parameter optimization
Grid search and manual search are the most widely used strategies for hyper-parameter optimization. This paper shows empirically and theoretically that randomly chosen trials are more efficient for hyper-parameter optimization than trials on a grid. ...
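A toy comparison under an assumed synthetic validation score: with the same budget of 16 trials, random search covers each hyper-parameter dimension with 16 distinct values while a 4x4 grid covers only 4 per dimension. The `evaluate` function is a stand-in, not the paper's experiments.

```python
import numpy as np

rng = np.random.default_rng(0)

def evaluate(lr, reg):
    """Stand-in validation score; replace with real model training."""
    return -(np.log10(lr) + 2) ** 2 - (np.log10(reg) + 4) ** 2

# Grid search: 4 x 4 = 16 trials, only 4 distinct values per dimension.
grid_scores = [evaluate(lr, reg)
               for lr in 10.0 ** np.linspace(-4, 0, 4)
               for reg in 10.0 ** np.linspace(-6, -2, 4)]

# Random search: the same budget of 16 independent draws, giving
# 16 distinct values in every individual dimension.
rand_scores = [evaluate(10.0 ** rng.uniform(-4, 0),
                        10.0 ** rng.uniform(-6, -2))
               for _ in range(16)]

print(max(grid_scores), max(rand_scores))
```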
Noise-contrastive estimation of unnormalized statistical models, with applications to natural image statistics
We consider the task of estimating, from observed data, a probabilistic model that is parameterized by a finite number of parameters. In particular, we focus on the situation where the model probability density function is unnormalized. That is, ...
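A minimal sketch of the noise-contrastive estimation idea for a 1-D unnormalized Gaussian: fit the model, including a parameter that absorbs the unknown normalizing constant, by logistic discrimination between data samples and noise samples. The model family, noise distribution, and optimizer here are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(0)
x_data = rng.normal(0.0, 0.5, size=2000)   # data from N(0, 0.5^2)
x_noise = rng.normal(0.0, 1.0, size=2000)  # noise from N(0, 1)

def log_model(x, theta):
    # Unnormalized model: log p(x; prec, c) = -0.5 * prec * x^2 + c,
    # where c absorbs the (unknown) normalizing constant.
    prec, c = theta
    return -0.5 * prec * x**2 + c

def nce_loss(theta):
    # Negative NCE objective: logistic discrimination of data vs noise,
    # with log-odds G(x) = log p_model(x) - log p_noise(x).
    g_d = log_model(x_data, theta) - norm.logpdf(x_data)
    g_n = log_model(x_noise, theta) - norm.logpdf(x_noise)
    return np.mean(np.logaddexp(0.0, -g_d)) + np.mean(np.logaddexp(0.0, g_n))

theta_hat = minimize(nce_loss, x0=np.array([1.0, 0.0])).x
print(theta_hat)  # precision ~ 1/0.25 = 4, c ~ the log-normalizer
```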
Bounding the probability of error for high precision optical character recognition
We consider a model for which it is important, early in processing, to estimate some variables with high precision, but perhaps at relatively low recall. If some variables can be identified with near certainty, they can be conditioned upon, allowing ...
Minimax-optimal rates for sparse additive models over kernel classes via convex programming
Sparse additive models are families of d-variate functions with the additive decomposition f* = Σ_{j∈S} f_j*, where S is an unknown subset of cardinality s < d. In this paper, we consider the case where each univariate component function f_j* lies in a ...
Online learning in the embedded manifold of low-rank matrices
When learning models that are represented in matrix forms, enforcing a low-rank constraint can dramatically improve the memory and run time complexity, while providing a natural regularization of the model. However, naive approaches to minimizing ...
Multi-assignment clustering for boolean data
We propose a probabilistic model for clustering Boolean data where an object can be simultaneously assigned to multiple clusters. By explicitly modeling the underlying generative process that combines the individual source emissions, highly structured ...
Eliminating spammers and ranking annotators for crowdsourced labeling tasks
With the advent of crowdsourcing services it has become quite cheap and reasonably effective to get a data set labeled by multiple annotators in a short amount of time. Various methods have been proposed to estimate the consensus labels by correcting ...
Metric and kernel learning using a linear transformation
Metric and kernel learning arise in several machine learning applications. However, most existing metric learning algorithms are limited to learning metrics over low-dimensional data, while existing kernel learning algorithms are often limited to the ...
MULTIBOOST: a multi-purpose boosting package
The MULTIBOOST package provides a fast C++ implementation of multi-class/multi-label/multi-task boosting algorithms. It is based on ADABOOST.MH, but it also implements popular cascade classifiers and FILTERBOOST. The package contains common multi-class ...
ML-Flex: a flexible toolbox for performing classification analyses in parallel
Motivated by a need to classify high-dimensional, heterogeneous data from the bioinformatics domain, we developed ML-Flex, a machine-learning toolbox that enables users to perform two-class and multi-class classification analyses in a systematic yet ...
A primal-dual convergence analysis of boosting
Boosting combines weak learners into a predictor with low empirical risk. Its dual constructs a high entropy distribution upon which weak learners and training labels are uncorrelated. This manuscript studies this primal-dual relationship under a broad ...
Non-sparse multiple kernel fisher discriminant analysis
Sparsity-inducing multiple kernel Fisher discriminant analysis (MK-FDA) has been studied in the literature. Building on recent advances in non-sparse multiple kernel learning (MKL), we propose a non-sparse version of MK-FDA, which imposes a general ℓp ...
Learning algorithms for the classification restricted Boltzmann machine
Recent developments have demonstrated the capacity of restricted Boltzmann machines (RBM) to be powerful generative models, able to extract useful features from input data or construct deep artificial neural networks. In such settings, the RBM only ...
Structured sparsity and generalization
We present a data dependent generalization bound for a large class of regularized algorithms which implement structured sparsity constraints. The bound can be applied to standard squared-norm regularization, the Lasso, the group Lasso, some versions of ...
A case study on meta-generalising: a Gaussian processes approach
We propose a novel model for meta-generalisation, that is, performing prediction on novel tasks based on information from multiple different but related tasks. The model is based on two coupled Gaussian processes with structured covariance function; one ...
A kernel two-sample test
We propose a framework for analyzing and comparing distributions, which we use to construct statistical tests to determine if two samples are drawn from different distributions. Our test statistic is the largest difference in expectations over functions ...
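A small sketch of the (biased) squared maximum mean discrepancy statistic with an RBF kernel, the quantity underlying the test; the function name, kernel bandwidth, and sample sizes are illustrative.

```python
import numpy as np

def mmd2_biased(X, Y, gamma=1.0):
    """Biased estimate of squared MMD with RBF kernel
    k(a, b) = exp(-gamma * ||a - b||^2):

        MMD^2 = E[k(x, x')] + E[k(y, y')] - 2 E[k(x, y)].
    """
    def k(A, B):
        sq = (np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :]
              - 2.0 * A @ B.T)
        return np.exp(-gamma * sq)

    return k(X, X).mean() + k(Y, Y).mean() - 2.0 * k(X, Y).mean()

rng = np.random.default_rng(0)
same = mmd2_biased(rng.normal(size=(200, 2)), rng.normal(size=(200, 2)))
diff = mmd2_biased(rng.normal(size=(200, 2)),
                   rng.normal(1.0, 1.0, size=(200, 2)))
print(same, diff)  # diff should be markedly larger
```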
GPLP: a local and parallel computation toolbox for Gaussian process regression
This paper presents the Getting Started-style documentation for the local and parallel computation toolbox for Gaussian process regression (GPLP), an open-source software package written in Matlab (but also compatible with Octave). The working ...
Exact covariance thresholding into connected components for large-scale graphical lasso
We consider the sparse inverse covariance regularization problem or graphical lasso with regularization parameter λ. Suppose the sample covariance graph formed by thresholding the entries of the sample covariance matrix at λ is decomposed into connected ...
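A sketch of the screening step the abstract describes, assuming SciPy's connected_components: threshold the off-diagonal entries of the sample covariance at λ and read off the connected components of the resulting graph. The helper name and the value of λ are illustrative.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import connected_components

def covariance_components(S, lam):
    """Threshold |S_ij| at lam (off-diagonal only) and return the
    connected components of the resulting sample covariance graph.
    Per the paper, these components coincide with the block structure
    of the graphical-lasso solution at the same lambda, so each block
    can be solved independently.
    """
    adj = np.abs(S) > lam
    np.fill_diagonal(adj, False)
    n_comp, labels = connected_components(csr_matrix(adj), directed=False)
    return n_comp, labels

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 6))
S = np.cov(X, rowvar=False)
print(covariance_components(S, lam=0.15))
```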
Algorithms for learning kernels based on centered alignment
This paper presents new and effective algorithms for learning kernels. In particular, as shown by our empirical results, these algorithms consistently outperform the so-called uniform combination solution that has proven to be difficult to improve upon ...
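A sketch of centered kernel alignment, the quantity these algorithms are built around: center each Gram matrix with H = I - (1/n)11^T and take the normalized Frobenius inner product. The helper name is illustrative.

```python
import numpy as np

def centered_alignment(K1, K2):
    """Centered alignment <K1c, K2c>_F / (||K1c||_F ||K2c||_F),
    where Kc = H K H and H = I - (1/n) 11^T centers the kernel.
    """
    n = K1.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    K1c, K2c = H @ K1 @ H, H @ K2 @ H
    return np.sum(K1c * K2c) / (np.linalg.norm(K1c) * np.linalg.norm(K2c))

# A kernel perfectly aligned with the target labels yy^T scores ~1.
y = np.array([1.0, 1.0, -1.0, -1.0])
print(centered_alignment(np.outer(y, y), np.outer(y, y)))  # ~1.0
```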
Causal bounds and observable constraints for non-deterministic models
Conditional independence relations involving latent variables do not necessarily imply observable independences. They may imply inequality constraints on observable parameters and causal bounds, which can be used for falsification and identification. ...
NIMFA: a python library for nonnegative matrix factorization
NIMFA is an open-source Python library that provides a unified interface to nonnegative matrix factorization algorithms. It includes implementations of state-of-the-art factorization methods, initialization approaches, and quality scoring. It supports ...
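Since NIMFA's own interface isn't shown here, the following is a plain NumPy sketch of the factorization it implements: basic NMF via Lee-Seung multiplicative updates. This is not NIMFA's API.

```python
import numpy as np

def nmf(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Basic NMF via multiplicative updates: V ~ W H with W, H >= 0,
    minimizing the Frobenius reconstruction error.
    """
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, rank))
    H = rng.random((rank, m))
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)  # update H, then W
        W *= (V @ H.T) / (W @ (H @ H.T) + eps)
    return W, H

V = np.abs(np.random.default_rng(1).normal(size=(20, 10)))
W, H = nmf(V, rank=3)
print(np.linalg.norm(V - W @ H))  # reconstruction error
```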