
Showing 1–35 of 35 results for author: Borgwardt, K

  1. arXiv:2406.03386  [pdf, other]

    cs.LG stat.ML

    Learning Long Range Dependencies on Graphs via Random Walks

    Authors: Dexiong Chen, Till Hendrik Schulz, Karsten Borgwardt

    Abstract: Message-passing graph neural networks (GNNs), while excelling at capturing local relationships, often struggle with long-range dependencies on graphs. Conversely, graph transformers (GTs) enable information exchange between all nodes but oversimplify the graph structure by treating them as a set of fixed-length vectors. This work proposes a novel architecture, NeuralWalker, that overcomes the limi…

    Submitted 5 June, 2024; originally announced June 2024.

  2. arXiv:2401.14819  [pdf, other]

    q-bio.QM cs.LG q-bio.BM

    Endowing Protein Language Models with Structural Knowledge

    Authors: Dexiong Chen, Philip Hartout, Paolo Pellizzoni, Carlos Oliver, Karsten Borgwardt

    Abstract: Understanding the relationships between protein sequence, structure and function is a long-standing biological challenge with manifold implications from drug design to our understanding of evolution. Recently, protein language models have emerged as the preferred method for this challenge, thanks to their ability to harness large sequence databases. Yet, their reliance on expansive sequence data a…

    Submitted 26 January, 2024; originally announced January 2024.

  3. arXiv:2305.07580  [pdf, other]

    stat.ML cs.LG

    Fisher Information Embedding for Node and Graph Learning

    Authors: Dexiong Chen, Paolo Pellizzoni, Karsten Borgwardt

    Abstract: Attention-based graph neural networks (GNNs), such as graph attention networks (GATs), have become popular neural architectures for processing graph-structured data and learning node embeddings. Despite their empirical success, these models rely on labeled data and the theoretical properties of these models have yet to be fully understood. In this work, we propose a novel attention-based node embe…

    Submitted 6 June, 2023; v1 submitted 12 May, 2023; originally announced May 2023.

    Comments: ICML 2023

  4. arXiv:2207.02968  [pdf, other]

    stat.ML cs.LG

    Unsupervised Manifold Alignment with Joint Multidimensional Scaling

    Authors: Dexiong Chen, Bowen Fan, Carlos Oliver, Karsten Borgwardt

    Abstract: We introduce Joint Multidimensional Scaling, a novel approach for unsupervised manifold alignment, which maps datasets from two different domains, without any known correspondences between data instances across the datasets, to a common low-dimensional Euclidean space. Our approach integrates Multidimensional Scaling (MDS) and Wasserstein Procrustes analysis into a joint optimization problem to si…

    Submitted 16 February, 2023; v1 submitted 6 July, 2022; originally announced July 2022.

    Comments: ICLR 2023, see https://openreview.net/forum?id=lUpjsrKItz4

  5. arXiv:2206.01008  [pdf, other]

    cs.LG stat.ML

    Approximate Network Motif Mining Via Graph Learning

    Authors: Carlos Oliver, Dexiong Chen, Vincent Mallet, Pericles Philippopoulos, Karsten Borgwardt

    Abstract: Frequent and structurally related subgraphs, also known as network motifs, are valuable features of many graph datasets. However, the high computational complexity of identifying motif sets in arbitrary datasets (motif mining) has limited their use in many real-world datasets. By automatically leveraging statistical properties of datasets, machine learning approaches have shown promise in several…

    Submitted 7 June, 2022; v1 submitted 2 June, 2022; originally announced June 2022.

  6. arXiv:2202.03036  [pdf, other]

    stat.ML cs.LG

    Structure-Aware Transformer for Graph Representation Learning

    Authors: Dexiong Chen, Leslie O'Bray, Karsten Borgwardt

    Abstract: The Transformer architecture has gained growing attention in graph representation learning recently, as it naturally overcomes several limitations of graph neural networks (GNNs) by avoiding their strict structural inductive biases and instead only encoding the graph structure via positional encoding. Here, we show that the node representations generated by the Transformer with positional encoding…

    Submitted 13 June, 2022; v1 submitted 7 February, 2022; originally announced February 2022.

    Comments: To appear in ICML 2022

  7. arXiv:2112.09992  [pdf, other]

    cs.LG cs.DS cs.NE stat.ML

    Weisfeiler and Leman go Machine Learning: The Story so far

    Authors: Christopher Morris, Yaron Lipman, Haggai Maron, Bastian Rieck, Nils M. Kriege, Martin Grohe, Matthias Fey, Karsten Borgwardt

    Abstract: In recent years, algorithms and neural architectures based on the Weisfeiler--Leman algorithm, a well-known heuristic for the graph isomorphism problem, have emerged as a powerful tool for machine learning with graphs and relational data. Here, we give a comprehensive overview of the algorithm's use in a machine-learning setting, focusing on the supervised regime. We discuss the theoretical backgr…

    Submitted 13 July, 2023; v1 submitted 18 December, 2021; originally announced December 2021.

    Comments: Accepted at JMLR

  8. arXiv:2107.05230  [pdf, other]

    cs.LG

    Predicting sepsis in multi-site, multi-national intensive care cohorts using deep learning

    Authors: Michael Moor, Nicolas Bennet, Drago Plecko, Max Horn, Bastian Rieck, Nicolai Meinshausen, Peter Bühlmann, Karsten Borgwardt

    Abstract: Despite decades of clinical research, sepsis remains a global public health crisis with high mortality and morbidity. Currently, when sepsis is detected and the underlying pathogen is identified, organ damage may have already progressed to irreversible stages. Effective sepsis management is therefore highly time-sensitive. By systematically analysing trends in the plethora of clinical data availa…

    Submitted 12 July, 2021; originally announced July 2021.

  9. arXiv:2106.01098  [pdf, other]

    cs.LG cs.SI stat.ML

    Evaluation Metrics for Graph Generative Models: Problems, Pitfalls, and Practical Solutions

    Authors: Leslie O'Bray, Max Horn, Bastian Rieck, Karsten Borgwardt

    Abstract: Graph generative models are a highly active branch of machine learning. Given the steady development of new models of ever-increasing complexity, it is necessary to provide a principled way to evaluate and compare them. In this paper, we enumerate the desirable criteria for such a comparison metric and provide an overview of the status quo of graph generative model comparison in use today, which p…

    Submitted 18 March, 2022; v1 submitted 2 June, 2021; originally announced June 2021.

    Comments: Accepted as a Spotlight presentation at ICLR 2022

  10. arXiv:2102.07835  [pdf, other]

    cs.LG math.AT stat.ML

    Topological Graph Neural Networks

    Authors: Max Horn, Edward De Brouwer, Michael Moor, Yves Moreau, Bastian Rieck, Karsten Borgwardt

    Abstract: Graph neural networks (GNNs) are a powerful architecture for tackling graph learning tasks, yet have been shown to be oblivious to eminent substructures such as cycles. We present TOGL, a novel layer that incorporates global topological information of a graph using persistent homology. TOGL can be easily integrated into any type of GNN and is strictly more expressive (in terms of the Weisfeiler--Lehm…

    Submitted 17 March, 2022; v1 submitted 15 February, 2021; originally announced February 2021.

    Journal ref: Tenth International Conference on Learning Representations (ICLR), 2022

  11. arXiv:2011.03854  [pdf, other]

    cs.LG stat.ML

    Graph Kernels: State-of-the-Art and Future Challenges

    Authors: Karsten Borgwardt, Elisabetta Ghisu, Felipe Llinares-López, Leslie O'Bray, Bastian Rieck

    Abstract: Graph-structured data are an integral part of many application domains, including chemoinformatics, computational biology, neuroimaging, and social network analysis. Over the last two decades, numerous graph kernels, i.e. kernel functions between graphs, have been proposed to solve the problem of assessing the similarity between graphs, thereby making it possible to perform predictions in both cla…

    Submitted 10 November, 2020; v1 submitted 7 November, 2020; originally announced November 2020.

    Comments: Accepted by Foundations and Trends in Machine Learning, 2020

  12. arXiv:2009.06116  [pdf, other]

    cs.CV cs.DB cs.DL cs.LG eess.IV

    Accelerating COVID-19 Differential Diagnosis with Explainable Ultrasound Image Analysis

    Authors: Jannis Born, Nina Wiedemann, Gabriel Brändle, Charlotte Buhre, Bastian Rieck, Karsten Borgwardt

    Abstract: Controlling the COVID-19 pandemic largely hinges upon the existence of fast, safe, and highly-available diagnostic tools. Ultrasound, in contrast to CT or X-Ray, has many practical advantages and can serve as a globally-applicable first-line examination technique. We provide the largest publicly available lung ultrasound (US) dataset for COVID-19 consisting of 106 videos from three classes (COVID-…

    Submitted 13 September, 2020; originally announced September 2020.

    Comments: 8 pages, 4 figures

    Journal ref: Applied Sciences 2021 (special issue on: "Fighting COVID-19: Emerging Techniques and Aid Systems for Prevention, Forecasting and Diagnosis")

  13. arXiv:2006.07882  [pdf, other]

    q-bio.NC cs.LG eess.IV math.AT stat.ML

    Uncovering the Topology of Time-Varying fMRI Data using Cubical Persistence

    Authors: Bastian Rieck, Tristan Yates, Christian Bock, Karsten Borgwardt, Guy Wolf, Nicholas Turk-Browne, Smita Krishnaswamy

    Abstract: Functional magnetic resonance imaging (fMRI) is a crucial technology for gaining insights into cognitive processes in humans. Data amassed from fMRI measurements result in volumetric data sets that vary over time. However, analysing such data presents a challenge due to the large degree of noise and person-to-person variation in how information is represented in the brain. To address this challeng…

    Submitted 22 October, 2020; v1 submitted 14 June, 2020; originally announced June 2020.

    Comments: Accepted at the Conference on Neural Information Processing Systems (NeurIPS) 2020; camera-ready version

  14. arXiv:2005.12359  [pdf, other]

    cs.LG stat.ML

    Path Imputation Strategies for Signature Models of Irregular Time Series

    Authors: Michael Moor, Max Horn, Christian Bock, Karsten Borgwardt, Bastian Rieck

    Abstract: The signature transform is a 'universal nonlinearity' on the space of continuous vector-valued paths, and has received attention for use in machine learning on time series. However, real-world temporal data is typically observed at discrete points in time, and must first be transformed into a continuous path before signature techniques can be applied. We make this step explicit by characterising i…

    Submitted 6 June, 2020; v1 submitted 25 May, 2020; originally announced May 2020.

  15. arXiv:1909.12064  [pdf, other]

    cs.LG stat.ML

    Set Functions for Time Series

    Authors: Max Horn, Michael Moor, Christian Bock, Bastian Rieck, Karsten Borgwardt

    Abstract: Despite the eminent successes of deep neural networks, many architectures are often hard to transfer to irregularly-sampled and asynchronous time series that commonly occur in real-world datasets, especially in healthcare applications. This paper proposes a novel approach for classifying irregularly-sampled time series with unaligned measurements, focusing on high scalability and data efficiency.…

    Submitted 14 September, 2020; v1 submitted 26 September, 2019; originally announced September 2019.

    Comments: Accepted at the International Conference on Machine Learning (ICML) 2020

  16. arXiv:1909.05114  [pdf, other]

    q-bio.BM cs.LG stat.ML

    PaccMann$^{RL}$: Designing anticancer drugs from transcriptomic data via reinforcement learning

    Authors: Jannis Born, Matteo Manica, Ali Oskooei, Joris Cadow, Karsten Borgwardt, María Rodríguez Martínez

    Abstract: With the advent of deep generative models in computational chemistry, in silico anticancer drug design has undergone an unprecedented transformation. While state-of-the-art deep learning approaches have shown potential in generating compounds with desired chemical properties, they disregard the genetic profile and properties of the target disease. Here, we introduce the first generative model capa…

    Submitted 16 April, 2020; v1 submitted 29 August, 2019; originally announced September 2019.

    Comments: 18 pages total (12 pages main text, 4 pages references, 11 pages appendix) 8 figures

    Journal ref: International Conference on Research in Computational Molecular Biology 2020

  17. arXiv:1906.01277  [pdf, other]

    cs.LG q-bio.MN stat.ML

    Wasserstein Weisfeiler-Lehman Graph Kernels

    Authors: Matteo Togninalli, Elisabetta Ghisu, Felipe Llinares-López, Bastian Rieck, Karsten Borgwardt

    Abstract: Most graph kernels are an instance of the class of $\mathcal{R}$-Convolution kernels, which measure the similarity of objects by comparing their substructures. Despite their empirical success, most graph kernels use a naive aggregation of the final set of substructures, usually a sum or average, thereby potentially discarding valuable information about the distribution of individual components. Fu…

    Submitted 30 October, 2019; v1 submitted 4 June, 2019; originally announced June 2019.

    Comments: Accepted as a Spotlight talk at the 33rd Conference on Neural Information Processing Systems (NeurIPS 2019)

  18. arXiv:1906.00722  [pdf, other]

    cs.LG math.AT stat.ML

    Topological Autoencoders

    Authors: Michael Moor, Max Horn, Bastian Rieck, Karsten Borgwardt

    Abstract: We propose a novel approach for preserving topological structures of the input space in latent representations of autoencoders. Using persistent homology, a technique from topological data analysis, we calculate topological signatures of both the input and latent space to derive a topological loss term. Under weak theoretical assumptions, we construct this loss in a differentiable manner, such tha…

    Submitted 31 May, 2021; v1 submitted 3 June, 2019; originally announced June 2019.

    Comments: Accepted at the International Conference on Machine Learning (ICML) 2020; camera-ready version

  19. arXiv:1904.07990  [pdf]

    cs.LG stat.AP stat.ML

    Machine learning for early prediction of circulatory failure in the intensive care unit

    Authors: Stephanie L. Hyland, Martin Faltys, Matthias Hüser, Xinrui Lyu, Thomas Gumbsch, Cristóbal Esteban, Christian Bock, Max Horn, Michael Moor, Bastian Rieck, Marc Zimmermann, Dean Bodenham, Karsten Borgwardt, Gunnar Rätsch, Tobias M. Merz

    Abstract: Intensive care clinicians are presented with large quantities of patient information and measurements from a multitude of monitoring systems. The limited ability of humans to process such complex information hinders physicians to readily recognize and act on early signs of patient deterioration. We used machine learning to develop an early warning system for circulatory failure based on a high-res…

    Submitted 19 April, 2019; v1 submitted 16 April, 2019; originally announced April 2019.

    Comments: 5 main figures, 1 main table, 13 supplementary figures, 5 supplementary tables; 250ppi images

  20. arXiv:1902.01659  [pdf, other]

    cs.LG stat.AP stat.ML

    Early Recognition of Sepsis with Gaussian Process Temporal Convolutional Networks and Dynamic Time Warping

    Authors: Michael Moor, Max Horn, Bastian Rieck, Damian Roqueiro, Karsten Borgwardt

    Abstract: Sepsis is a life-threatening host response to infection associated with high mortality, morbidity, and health costs. Its management is highly time-sensitive since each hour of delayed treatment increases mortality due to irreversible organ damage. Meanwhile, despite decades of clinical research, robust biomarkers for sepsis are missing. Therefore, detecting sepsis early by utilizing the affluence…

    Submitted 15 October, 2020; v1 submitted 5 February, 2019; originally announced February 2019.

    Comments: Accepted at the Machine Learning for Healthcare 2019 Conference (MLHC). Camera-ready version

  21. arXiv:1812.09764  [pdf, other]

    cs.LG math.AT stat.ML

    Neural Persistence: A Complexity Measure for Deep Neural Networks Using Algebraic Topology

    Authors: Bastian Rieck, Matteo Togninalli, Christian Bock, Michael Moor, Max Horn, Thomas Gumbsch, Karsten Borgwardt

    Abstract: While many approaches to make neural networks more fathomable have been proposed, they are restricted to interrogating the network with input data. Measures for characterizing and monitoring structural properties, however, have not been developed. In this work, we propose neural persistence, a complexity measure for neural network architectures based on topological data analysis on weighted strati…

    Submitted 27 September, 2019; v1 submitted 23 December, 2018; originally announced December 2018.

    Comments: Published as a conference paper at ICLR 2019

  22. arXiv:1702.08694  [pdf, other]

    stat.ML cs.LG stat.ME

    Finding Statistically Significant Interactions between Continuous Features

    Authors: Mahito Sugiyama, Karsten Borgwardt

    Abstract: The search for higher-order feature interactions that are statistically significantly associated with a class variable is of high relevance in fields such as Genetics or Healthcare, but the combinatorial explosion of the candidate space makes this problem extremely challenging in terms of computational efficiency and proper correction for multiple testing. While recent progress has been made regar…

    Submitted 10 May, 2019; v1 submitted 28 February, 2017; originally announced February 2017.

    Comments: 13 pages, 5 figures, 2 tables, accepted to the 28th International Joint Conference on Artificial Intelligence (IJCAI 2019)

  23. arXiv:1508.05803  [pdf, other]

    stat.ML cs.LG

    Searching for significant patterns in stratified data

    Authors: Felipe Llinares-Lopez, Laetitia Papaxanthos, Dean Bodenham, Karsten Borgwardt

    Abstract: Significant pattern mining, the problem of finding itemsets that are significantly enriched in one class of objects, is statistically challenging, as the large space of candidate patterns leads to an enormous multiple testing problem. Recently, the concept of testability was proposed as one approach to correct for multiple testing in pattern mining while retaining statistical power. Still, these s…

    Submitted 24 August, 2015; originally announced August 2015.

    Comments: 18 pages, 6 figures

  24. arXiv:1502.04315  [pdf, other]

    stat.ML

    Fast and Memory-Efficient Significant Pattern Mining via Permutation Testing

    Authors: Felipe Llinares López, Mahito Sugiyama, Laetitia Papaxanthos, Karsten M. Borgwardt

    Abstract: We present a novel algorithm, Westfall-Young light, for detecting patterns, such as itemsets and subgraphs, which are statistically significantly enriched in one of two classes. Our method corrects rigorously for multiple hypothesis testing and correlations between patterns through the Westfall-Young permutation procedure, which empirically estimates the null distribution of pattern frequencies in…

    Submitted 15 February, 2015; originally announced February 2015.

  25. arXiv:1407.1176  [pdf, other]

    stat.ML cs.LG

    Identifying Higher-order Combinations of Binary Features

    Authors: Felipe Llinares, Mahito Sugiyama, Karsten M. Borgwardt

    Abstract: Finding statistically significant interactions between binary variables is computationally and statistically challenging in high-dimensional settings, due to the combinatorial explosion in the number of hypotheses. Terada et al. recently showed how to elegantly address this multiple testing problem by excluding non-testable hypotheses. Still, it remains unclear how their approach scales to large d…

    Submitted 4 July, 2014; originally announced July 2014.

  26. arXiv:1407.0316  [pdf, other]

    stat.ME cs.LG stat.ML

    Significant Subgraph Mining with Multiple Testing Correction

    Authors: Mahito Sugiyama, Felipe Llinares López, Niklas Kasenburg, Karsten M. Borgwardt

    Abstract: The problem of finding itemsets that are statistically significantly enriched in a class of transactions is complicated by the need to correct for multiple hypothesis testing. Pruning untestable hypotheses was recently proposed as a strategy for this task of significant itemset mining. It was shown to lead to greater statistical power, the discovery of more truly significant itemsets, than the sta…

    Submitted 30 January, 2015; v1 submitted 1 July, 2014; originally announced July 2014.

    Comments: 18 pages, 5 figures, accepted to the 2015 SIAM International Conference on Data Mining (SDM15)

  27. arXiv:1303.7390  [pdf, ps, other]

    cs.CV

    Geometric tree kernels: Classification of COPD from airway tree geometry

    Authors: Aasa Feragen, Jens Petersen, Dominik Grimm, Asger Dirksen, Jesper Holst Pedersen, Karsten Borgwardt, Marleen de Bruijne

    Abstract: Methodological contributions: This paper introduces a family of kernels for analyzing (anatomical) trees endowed with vector valued measurements made along the tree. While state-of-the-art graph and tree kernels use combinatorial tree/graph structure with discrete node and edge labels, the kernels presented in this paper can include geometric information such as branch shape, branch radius or othe…

    Submitted 8 April, 2013; v1 submitted 29 March, 2013; originally announced March 2013.

    Comments: 12 pages

    MSC Class: 68T10

  28. arXiv:1212.4788  [pdf, other]

    q-bio.GN cs.CE cs.DL stat.AP

    easyGWAS: An integrated interspecies platform for performing genome-wide association studies

    Authors: Dominik Grimm, Bastian Greshake, Stefan Kleeberger, Christoph Lippert, Oliver Stegle, Bernhard Schölkopf, Detlef Weigel, Karsten Borgwardt

    Abstract: Motivation: The rapid growth in genome-wide association studies (GWAS) in plants and animals has brought about the need for a central resource that facilitates i) performing GWAS, ii) accessing data and results of other GWAS, and iii) enabling all users regardless of their background to exploit the latest statistical techniques without having to manage complex software and computing resources. R…

    Submitted 19 December, 2012; originally announced December 2012.

  29. arXiv:1211.2315  [pdf, other]

    stat.ML q-bio.QM

    Efficient network-guided multi-locus association mapping with graph cuts

    Authors: Chloé-Agathe Azencott, Dominik Grimm, Mahito Sugiyama, Yoshinobu Kawahara, Karsten M. Borgwardt

    Abstract: As an increasing number of genome-wide association studies reveal the limitations of attempting to explain phenotypic heritability by single genetic loci, there is growing interest for associating complex phenotypes with sets of genetic loci. While several methods for multi-locus mapping have been proposed, it is often unclear how to relate the detected loci to the growing knowledge about gene pat…

    Submitted 18 April, 2013; v1 submitted 10 November, 2012; originally announced November 2012.

    Comments: 20 pages, 6 figures, accepted at ISMB (International Conference on Intelligent Systems for Molecular Biology) 2013

  30. arXiv:1210.2850  [pdf, other]

    q-bio.GN q-bio.PE q-bio.QM

    A mixed model approach for joint genetic analysis of alternatively spliced transcript isoforms using RNA-Seq data

    Authors: Barbara Rakitsch, Christoph Lippert, Hande Topa, Karsten Borgwardt, Antti Honkela, Oliver Stegle

    Abstract: RNA-Seq technology allows for studying the transcriptional state of the cell at an unprecedented level of detail. Beyond quantification of whole-gene expression, it is now possible to disentangle the abundance of individual alternatively spliced transcript isoforms of a gene. A central question is to understand the regulatory processes that lead to differences in relative abundance variation due t…

    Submitted 10 October, 2012; originally announced October 2012.

  31. arXiv:1205.6986  [pdf, ps, other]

    q-bio.PE q-bio.GN q-bio.QM stat.AP

    LMM-Lasso: A Lasso Multi-Marker Mixed Model for Association Mapping with Population Structure Correction

    Authors: Barbara Rakitsch, Christoph Lippert, Oliver Stegle, Karsten Borgwardt

    Abstract: Exploring the genetic basis of heritable traits remains one of the central challenges in biomedical research. In simple cases, single polymorphic loci explain a significant fraction of the phenotype variability. However, many traits of interest appear to be subject to multifactorial control by groups of genetic loci instead. Accurate detection of such multivariate associations is nontrivial and of…

    Submitted 21 September, 2012; v1 submitted 30 May, 2012; originally announced May 2012.

  32. arXiv:0906.4032  [pdf, ps, other]

    cs.LG

    Bayesian two-sample tests

    Authors: Karsten M. Borgwardt, Zoubin Ghahramani

    Abstract: In this paper, we present two classes of Bayesian approaches to the two-sample problem. Our first class of methods extends the Bayesian t-test to include all parametric models in the exponential family and their conjugate priors. Our second class of methods uses Dirichlet process mixtures (DPM) of such conjugate-exponential distributions as flexible nonparametric priors over the unknown distribu…

    Submitted 22 June, 2009; originally announced June 2009.

  33. arXiv:0807.0093  [pdf, other]

    cs.LG

    Graph Kernels

    Authors: S. V. N. Vishwanathan, Karsten M. Borgwardt, Imre Risi Kondor, Nicol N. Schraudolph

    Abstract: We present a unified framework to study graph kernels, special cases of which include the random walk graph kernel \citep{GaeFlaWro03,BorOngSchVisetal05}, marginalized graph kernel \citep{KasTsuIno03,KasTsuIno04,MahUedAkuPeretal04}, and geometric kernel on graphs \citep{Gaertner02}. Through extensions of linear algebra to Reproducing Kernel Hilbert Spaces (RKHS) and reduction to a Sylvester equa…

    Submitted 1 July, 2008; originally announced July 2008.

    Comments: http://jmlr.csail.mit.edu/papers/v11/vishwanathan10a.html

    Journal ref: Journal of Machine Learning Research 11 (Apr): 1201-1242, 2010

  34. arXiv:0805.2368  [pdf, ps, other]

    cs.LG cs.AI

    A Kernel Method for the Two-Sample Problem

    Authors: Arthur Gretton, Karsten Borgwardt, Malte J. Rasch, Bernhard Scholkopf, Alexander J. Smola

    Abstract: We propose a framework for analyzing and comparing distributions, allowing us to design statistical tests to determine if two samples are drawn from different distributions. Our test statistic is the largest difference in expectations over functions in the unit ball of a reproducing kernel Hilbert space (RKHS). We present two tests based on large deviation bounds for the test statistic, while a…

    Submitted 15 May, 2008; originally announced May 2008.

    ACM Class: G.3; I.2.6

  35. arXiv:0704.2668  [pdf, other]

    cs.LG

    Supervised Feature Selection via Dependence Estimation

    Authors: Le Song, Alex Smola, Arthur Gretton, Karsten Borgwardt, Justin Bedo

    Abstract: We introduce a framework for filtering features that employs the Hilbert-Schmidt Independence Criterion (HSIC) as a measure of dependence between the features and the labels. The key idea is that good features should maximise such dependence. Feature selection for various supervised learning problems (including classification and regression) is unified under this framework, and the solutions can…

    Submitted 20 April, 2007; originally announced April 2007.

    Comments: 9 pages