
Showing 1–14 of 14 results for author: Landa, B

  1. arXiv:2407.01718  [pdf, other]

    stat.ML cs.LG math.ST

    Entropic Optimal Transport Eigenmaps for Nonlinear Alignment and Joint Embedding of High-Dimensional Datasets

    Authors: Boris Landa, Yuval Kluger, Rong Ma

    Abstract: Embedding high-dimensional data into a low-dimensional space is an indispensable component of data analysis. In numerous applications, it is necessary to align and jointly embed multiple datasets from different studies or experimental conditions. Such datasets may share underlying structures of interest but exhibit individual distortions, resulting in misaligned embeddings using traditional techni…

    Submitted 1 July, 2024; originally announced July 2024.

  2. arXiv:2306.11263  [pdf, ps, other]

    math.ST

    The Dyson Equalizer: Adaptive Noise Stabilization for Low-Rank Signal Detection and Recovery

    Authors: Boris Landa, Yuval Kluger

    Abstract: Detecting and recovering a low-rank signal in a noisy data matrix is a fundamental task in data analysis. Typically, this task is addressed by inspecting and manipulating the spectrum of the observed data, e.g., thresholding the singular values of the data matrix at a certain critical level. This approach is well-established in the case of homoskedastic noise, where the noise variance is identical…

    Submitted 19 June, 2023; originally announced June 2023.

  3. arXiv:2209.08004  [pdf, ps, other]

    math.ST cs.LG stat.ML

    Robust Inference of Manifold Density and Geometry by Doubly Stochastic Scaling

    Authors: Boris Landa, Xiuyuan Cheng

    Abstract: The Gaussian kernel and its traditional normalizations (e.g., row-stochastic) are popular approaches for assessing similarities between data points. Yet, they can be inaccurate under high-dimensional noise, especially if the noise magnitude varies considerably across the data, e.g., under heteroskedasticity or outliers. In this work, we investigate a more robust alternative -- the doubly stochasti…

    Submitted 10 July, 2023; v1 submitted 16 September, 2022; originally announced September 2022.

  4. arXiv:2206.11386  [pdf, ps, other]

    math.ST cs.LG stat.ML

    Bi-stochastically normalized graph Laplacian: convergence to manifold Laplacian and robustness to outlier noise

    Authors: Xiuyuan Cheng, Boris Landa

    Abstract: Bi-stochastic normalization provides an alternative normalization of graph Laplacians in graph-based data analysis and can be computed efficiently by Sinkhorn-Knopp (SK) iterations. This paper proves the convergence of bi-stochastically normalized graph Laplacian to manifold (weighted-)Laplacian with rates, when $n$ data points are i.i.d. sampled from a general $d$-dimensional manifold embedded in…

    Submitted 26 January, 2023; v1 submitted 22 June, 2022; originally announced June 2022.

  5. arXiv:2103.13840  [pdf, other]

    math.ST cs.IT

    Biwhitening Reveals the Rank of a Count Matrix

    Authors: Boris Landa, Thomas T. C. K. Zhang, Yuval Kluger

    Abstract: Estimating the rank of a corrupted data matrix is an important task in data analysis, most notably for choosing the number of components in PCA. Significant progress on this task was achieved using random matrix theory by characterizing the spectral properties of large noise matrices. However, utilizing such tools is not straightforward when the data matrix consists of count random variables, e.g.…

    Submitted 2 November, 2021; v1 submitted 25 March, 2021; originally announced March 2021.

    MSC Class: 62H12; 62H25

  6. arXiv:2012.06393  [pdf, ps, other]

    math.PR math.NA

    Scaling positive random matrices: concentration and asymptotic convergence

    Authors: Boris Landa

    Abstract: It is well known that any positive matrix can be scaled to have prescribed row and column sums by multiplying its rows and columns by certain positive scaling factors (which are unique up to a positive scalar). This procedure is known as matrix scaling, and has found numerous applications in operations research, economics, image processing, and machine learning. In this work, we investigate the be…

    Submitted 11 December, 2020; originally announced December 2020.

    MSC Class: 60B20; 60F10; 65F35

  7. arXiv:2011.03418  [pdf, other]

    math.ST

    Local Two-Sample Testing over Graphs and Point-Clouds by Random-Walk Distributions

    Authors: Boris Landa, Rihao Qu, Joseph Chang, Yuval Kluger

    Abstract: Rejecting the null hypothesis in two-sample testing is a fundamental tool for scientific discovery. Yet, aside from concluding that two samples do not come from the same probability distribution, it is often of interest to characterize how the two distributions differ. Given samples from two densities $f_1$ and $f_0$, we consider the task of localizing occurrences of the inequality $f_1 > f_0$. To…

    Submitted 7 September, 2021; v1 submitted 6 November, 2020; originally announced November 2020.

    MSC Class: 62G10

  8. arXiv:2006.00402  [pdf, ps, other]

    stat.ML cs.IT cs.LG

    Doubly-Stochastic Normalization of the Gaussian Kernel is Robust to Heteroskedastic Noise

    Authors: Boris Landa, Ronald R. Coifman, Yuval Kluger

    Abstract: A fundamental step in many data-analysis techniques is the construction of an affinity matrix describing similarities between data points. When the data points reside in Euclidean space, a widespread approach is to form an affinity matrix by the Gaussian kernel with pairwise distances, and to follow with a certain normalization (e.g. the row-stochastic normalization or its symmetric variant). We d…

    Submitted 25 January, 2021; v1 submitted 30 May, 2020; originally announced June 2020.

  9. arXiv:1912.06500  [pdf, other]

    physics.data-an stat.ML

    KLT Picker: Particle Picking Using Data-Driven Optimal Templates

    Authors: Amitay Eldar, Boris Landa, Yoel Shkolnisky

    Abstract: Particle picking is currently a critical step in the cryo-EM single particle reconstruction pipeline. Despite extensive work on this problem, for many data sets it is still challenging, especially for low SNR micrographs. We present the KLT (Karhunen Loeve Transform) picker, which is fully automatic and requires as an input only the approximated particle size. In particular, it does not require an…

    Submitted 12 December, 2019; originally announced December 2019.

  10. arXiv:1907.05377  [pdf, other]

    math.NA math.OC q-bio.BM stat.AP

    Method of moments for 3-D single particle ab initio modeling with non-uniform distribution of viewing angles

    Authors: Nir Sharon, Joe Kileel, Yuehaw Khoo, Boris Landa, Amit Singer

    Abstract: Single-particle reconstruction in cryo-electron microscopy (cryo-EM) is an increasingly popular technique for determining the 3-D structure of a molecule from several noisy 2-D projection images taken at unknown viewing angles. Most reconstruction algorithms require a low-resolution initialization for the 3-D structure, which is the goal of ab initio modeling. Suggested by Zvi Kam in 1980, the me…

    Submitted 23 November, 2019; v1 submitted 11 July, 2019; originally announced July 2019.

    Comments: 41 pages. v2: additional numerical experiments, appendices edited, other updates

  11. arXiv:1906.00211  [pdf, ps, other]

    math.ST cs.DS cs.IT

    Multi-reference factor analysis: low-rank covariance estimation under unknown translations

    Authors: Boris Landa, Yoel Shkolnisky

    Abstract: We consider the problem of estimating the covariance matrix of a random signal observed through unknown translations (modeled by cyclic shifts) and corrupted by noise. Solving this problem allows one to discover low-rank structures masked by the existence of translations (which act as nuisance parameters), with direct application to Principal Components Analysis (PCA). We assume that the underlying si…

    Submitted 21 September, 2020; v1 submitted 1 June, 2019; originally announced June 2019.

  12. arXiv:1905.12442  [pdf, other]

    math.ST cs.DS cs.IT

    Rank-one Multi-Reference Factor Analysis

    Authors: Yariv Aizenbud, Boris Landa, Yoel Shkolnisky

    Abstract: In recent years, there is a growing need for processing methods aimed at extracting useful information from large datasets. In many cases the challenge is to discover a low-dimensional structure in the data, often concealed by the existence of nuisance parameters and noise. Motivated by such challenges, we consider the problem of estimating a signal from its scaled, cyclically-shifted and noisy ob…

    Submitted 4 June, 2019; v1 submitted 29 May, 2019; originally announced May 2019.

  13. arXiv:1802.01894  [pdf, ps, other]

    cs.CV cs.LG

    The steerable graph Laplacian and its application to filtering image data-sets

    Authors: Boris Landa, Yoel Shkolnisky

    Abstract: In recent years, improvements in various image acquisition techniques gave rise to the need for adaptive processing methods, aimed particularly for large datasets corrupted by noise and deformations. In this work, we consider datasets of images sampled from a low-dimensional manifold (i.e. an image-valued manifold), where the images can assume arbitrary planar rotations. To derive an adaptive and…

    Submitted 7 August, 2018; v1 submitted 6 February, 2018; originally announced February 2018.

  14. arXiv:1608.02702  [pdf, ps, other]

    cs.CV math.NA

    Steerable Principal Components for Space-Frequency Localized Images

    Authors: Boris Landa, Yoel Shkolnisky

    Abstract: This paper describes a fast and accurate method for obtaining steerable principal components from a large dataset of images, assuming the images are well localized in space and frequency. The obtained steerable principal components are optimal for expanding the images in the dataset and all of their rotations. The method relies upon first expanding the images using a series of two-dimensional Prol…

    Submitted 9 August, 2018; v1 submitted 9 August, 2016; originally announced August 2016.