
Showing 1–17 of 17 results for author: Pe'er, I

  1. arXiv:2406.05227  [pdf, other]

    cs.LG

    Mixed-Curvature Decision Trees and Random Forests

    Authors: Philippe Chlenski, Quentin Chu, Itsik Pe'er

    Abstract: We extend decision tree and random forest algorithms to product space manifolds: Cartesian products of Euclidean, hyperspherical, and hyperbolic manifolds. Such spaces have extremely expressive geometries capable of representing many arrangements of distances with low metric distortion. To date, all classifiers for product spaces fit a single linear decision boundary, and no regressor has been des…

    Submitted 18 July, 2024; v1 submitted 7 June, 2024; originally announced June 2024.

  2. arXiv:2406.03242  [pdf, other]

    cs.LG stat.CO

    Variational Pseudo Marginal Methods for Jet Reconstruction in Particle Physics

    Authors: Hanming Yang, Antonio Khalil Moretti, Sebastian Macaluso, Philippe Chlenski, Christian A. Naesseth, Itsik Pe'er

    Abstract: Reconstructing jets, which provide vital insights into the properties and histories of subatomic particles produced in high-energy collisions, is a main problem in data analyses in collider physics. This intricate task deals with estimating the latent structure of a jet (binary tree) and involves parameters such as particle energy, momentum, and types. While Bayesian methods offer a natural approa…

    Submitted 5 June, 2024; originally announced June 2024.

  3. arXiv:2406.01652  [pdf]

    stat.ME cs.LG q-bio.QM

    Distributional bias compromises leave-one-out cross-validation

    Authors: George I. Austin, Itsik Pe'er, Tal Korem

    Abstract: Cross-validation is a common method for estimating the predictive performance of machine learning models. In a data-scarce regime, where one typically wishes to maximize the number of instances used for training the model, an approach called "leave-one-out cross-validation" is often used. In this design, a separate model is built for predicting each data instance after training on all other instan…

    Submitted 3 June, 2024; originally announced June 2024.

    Comments: 20 pages, 5 figures, supplementary information

  4. arXiv:2310.13841  [pdf, other]

    cs.LG

    Fast hyperboloid decision tree algorithms

    Authors: Philippe Chlenski, Ethan Turok, Antonio Moretti, Itsik Pe'er

    Abstract: Hyperbolic geometry is gaining traction in machine learning for its effectiveness at capturing hierarchical structures in real-world data. Hyperbolic spaces, where neighborhoods grow exponentially, offer substantial advantages and consistently deliver state-of-the-art results across diverse applications. However, hyperbolic classifiers often grapple with computational challenges. Methods reliant o…

    Submitted 4 March, 2024; v1 submitted 20 October, 2023; originally announced October 2023.

    Journal ref: International Conference on Learning Representations (2024)

  5. ICLR 2022 Challenge for Computational Geometry and Topology: Design and Results

    Authors: Adele Myers, Saiteja Utpala, Shubham Talbar, Sophia Sanborn, Christian Shewmake, Claire Donnat, Johan Mathe, Umberto Lupo, Rishi Sonthalia, Xinyue Cui, Tom Szwagier, Arthur Pignet, Andri Bergsson, Soren Hauberg, Dmitriy Nielsen, Stefan Sommer, David Klindt, Erik Hermansen, Melvin Vaupel, Benjamin Dunn, Jeffrey Xiong, Noga Aharony, Itsik Pe'er, Felix Ambellan, Martin Hanik, et al. (3 additional authors not shown)

    Abstract: This paper presents the computational challenge on differential geometry and topology that was hosted within the ICLR 2022 workshop "Geometric and Topological Representation Learning". The competition asked participants to provide implementations of machine learning algorithms on manifolds that would respect the API of the open-source software Geomstats (manifold part) and Scikit-Learn (machine l…

    Submitted 26 June, 2022; v1 submitted 17 June, 2022; originally announced June 2022.

  6. arXiv:2106.00075  [pdf, other]

    stat.ML cs.LG stat.CO

    Variational Combinatorial Sequential Monte Carlo Methods for Bayesian Phylogenetic Inference

    Authors: Antonio Khalil Moretti, Liyi Zhang, Christian A. Naesseth, Hadiah Venner, David Blei, Itsik Pe'er

    Abstract: Bayesian phylogenetic inference is often conducted via local or sequential search over topologies and branch lengths using algorithms such as random-walk Markov chain Monte Carlo (MCMC) or Combinatorial Sequential Monte Carlo (CSMC). However, when MCMC is used for evolutionary parameter learning, convergence requires long runs with inefficient exploration of the state space. We introduce Variation…

    Submitted 17 June, 2021; v1 submitted 31 May, 2021; originally announced June 2021.

    Comments: 15 pages, 9 figures

  7. arXiv:1911.05531  [pdf, other]

    q-bio.BM cs.LG stat.ML

    Accurate Protein Structure Prediction by Embeddings and Deep Learning Representations

    Authors: Iddo Drori, Darshan Thaker, Arjun Srivatsa, Daniel Jeong, Yueqi Wang, Linyong Nan, Fan Wu, Dimitri Leggas, Jinhao Lei, Weiyi Lu, Weilong Fu, Yuan Gao, Sashank Karri, Anand Kannan, Antonio Moretti, Mohammed AlQuraishi, Chen Keasar, Itsik Pe'er

    Abstract: Proteins are the major building blocks of life, and actuators of almost all chemical and biophysical events in living organisms. Their native structures in turn enable their biological functions which have a fundamental role in drug design. This motivates predicting the structure of a protein from its sequence of amino acids, a fundamental problem in computational biology. In this work, we demonst…

    Submitted 8 November, 2019; originally announced November 2019.

    Journal ref: Machine Learning in Computational Biology, 2019

  8. arXiv:1909.09734  [pdf, other]

    stat.ML cs.LG

    Particle Smoothing Variational Objectives

    Authors: Antonio Khalil Moretti, Zizhao Wang, Luhuan Wu, Iddo Drori, Itsik Pe'er

    Abstract: A body of recent work has focused on constructing a variational family of filtered distributions using Sequential Monte Carlo (SMC). Inspired by this work, we introduce Particle Smoothing Variational Objectives (SVO), a novel backward simulation technique and smoothed approximate posterior defined through a subsampling process. SVO augments support of the proposal and boosts particle diversity. Re…

    Submitted 20 September, 2019; originally announced September 2019.

    Comments: 13 pages, 5 figures

  9. arXiv:1811.07143  [pdf, other]

    cs.LG q-bio.QM stat.ML

    High Quality Prediction of Protein Q8 Secondary Structure by Diverse Neural Network Architectures

    Authors: Iddo Drori, Isht Dwivedi, Pranav Shrestha, Jeffrey Wan, Yueqi Wang, Yunchu He, Anthony Mazza, Hugh Krogh-Freeman, Dimitri Leggas, Kendal Sandridge, Linyong Nan, Kaveri Thakoor, Chinmay Joshi, Sonam Goenka, Chen Keasar, Itsik Pe'er

    Abstract: We tackle the problem of protein secondary structure prediction using a common task framework. This led to the introduction of multiple ideas for neural architectures based on state-of-the-art building blocks, used in this task for the first time. We take a principled machine learning approach, which provides genuine, unbiased performance measures, correcting longstanding errors in the applicatio…

    Submitted 17 November, 2018; originally announced November 2018.

    Comments: NIPS 2018 Workshop on Machine Learning for Molecules and Materials, 10 pages

  10. arXiv:1808.10795  [pdf, other]

    q-bio.QM

    Latent Space Temporal Model of Microbial Abundance to Predict Domination and Bacteremia

    Authors: Ruiqi Zhong, Tyler Joseph, Joao B Xavier, Itsik Pe'er

    Abstract: Gut microbial composition has been linked to multiple health outcomes. Yet, temporal analysis of this composition had been limited to deterministic models. In this paper, we introduce a probabilistic model for the dynamics of intestinal microbiomes that takes into account interaction among bacteria as well as external effects such as antibiotics. The model successfully deals with pragmatic issues…

    Submitted 31 August, 2018; originally announced August 2018.

    Comments: Experiment code available at https://github.com/ZhongRuiqi1997/NIPS2017MLCB, software at https://github.com/ZhongRuiqi1997/Kalman-Filter-Intestinal-Microbiota

  11. arXiv:1711.04078  [pdf, other]

    q-bio.QM cs.AI stat.ML

    Parkinson's Disease Digital Biomarker Discovery with Optimized Transitions and Inferred Markov Emissions

    Authors: Avinash Bukkittu, Baihan Lin, Trung Vu, Itsik Pe'er

    Abstract: We search for digital biomarkers from Parkinson's Disease by observing approximate repetitive patterns matching hypothesized step and stride periodic cycles. These observations were modeled as a cycle of hidden states with randomness allowing deviation from a canonical pattern of transitions and emissions, under the hypothesis that the averaged features of hidden states would serve to informativel…

    Submitted 11 November, 2017; originally announced November 2017.

    Comments: 10th RECOMB/ISCB Conference on Regulatory & Systems Genomics with DREAM Challenges

  12. arXiv:1509.05904  [pdf, other]

    q-bio.PE

    A note on the distribution of admixture segment lengths and ancestry proportions under pulse and two-wave admixture models

    Authors: Shai Carmi, James Xue, Itsik Pe'er

    Abstract: Admixed populations are formed by the merging of two or more ancestral populations, and the ancestry of each locus in an admixed genome derives from either source. Consider a simple "pulse" admixture model, where populations A and B merged t generations ago without subsequent gene flow. We derive the distribution of the proportion of an admixed chromosome that has A (or B) ancestry, as a function…

    Submitted 19 September, 2015; originally announced September 2015.

    Comments: 12 pages, 3 figures

  13. A renewal theory approach to IBD sharing

    Authors: Shai Carmi, Peter Wilton, John Wakeley, Itsik Pe'er

    Abstract: A long genomic segment inherited by a pair of individuals from a single, recent common ancestor is said to be identical-by-descent (IBD). Shared IBD segments have numerous applications in genetics, from demographic inference to phasing, imputation, pedigree reconstruction, and disease mapping. Here, we provide a theoretical analysis of IBD sharing under Markovian approximations of the coalescent w…

    Submitted 11 September, 2014; v1 submitted 5 March, 2014; originally announced March 2014.

    Comments: 35 pages, 9 figures

    Journal ref: Theoretical Population Biology 97, 35-48 (2014)

  14. arXiv:1308.2150  [pdf, other]

    q-bio.QM q-bio.GN

    GeneZip: A software package for storage-efficient processing of genotype data

    Authors: Cameron Palmer, Itsik Pe'er

    Abstract: Genome wide association studies directly assay 10^6 single nucleotide polymorphisms (SNPs) across a study cohort. Probabilistic estimation of additional sites by genotype imputation can increase this set of variants by 10- to 40-fold. Even with modest sample sizes (10^3-10^4), these resulting imputed datasets, containing 10^10-10^11 double-precision values, are incompatible with simultaneous lossl…

    Submitted 17 November, 2013; v1 submitted 9 August, 2013; originally announced August 2013.

    Comments: 6 pages, 1 figure. Stylistic edits, added references, added author who joined project between versions; conclusions unchanged

  15. The variance of identity-by-descent sharing in the Wright-Fisher model

    Authors: Shai Carmi, Pier Francesco Palamara, Vladimir Vacic, Todd Lencz, Ariel Darvasi, Itsik Pe'er

    Abstract: Widespread sharing of long, identical-by-descent (IBD) genetic segments is a hallmark of populations that have experienced recent genetic drift. Detection of these IBD segments has recently become feasible, enabling a wide range of applications from phasing and imputation to demographic inference. Here, we study the distribution of IBD sharing in the Wright-Fisher model. Specifically, using coales…

    Submitted 12 August, 2013; v1 submitted 20 June, 2012; originally announced June 2012.

    Comments: Includes Supplementary Material

    Journal ref: Genetics 193, 911-928 (2013)

  16. arXiv:1102.3720  [pdf]

    q-bio.GN

    Low-pass Genomewide Sequencing and Variant Imputation Using Identity-by-descent in an Isolated Human Population

    Authors: A Gusev, MJ Shah, EE Kenny, A Ramachandran, JK Lowe, J Salit, CC Lee, EC Levandowsky, TN Weaver, QC Doan, HE Peckham, SF McLaughlin, MR Lyons, VN Sheth, M Stoffel, FM De La Vega, JM Friedman, JL Breslow, I Pe'er

    Abstract: Whole-genome sequencing in an isolated population with few founders directly ascertains variants from the population bottleneck that may be rare elsewhere. In such populations, shared haplotypes allow imputation of variants in unsequenced samples without resorting to statistical methods, as in studies of outbred cohorts. We focus on an isolated population cohort from the Pacific Island of Kosrae,…

    Submitted 17 February, 2011; originally announced February 2011.

  17. arXiv:0911.0215  [pdf, ps, other]

    q-bio.GN

    Age, Sex, and Genetic Architecture of Human Gene Expression in EBV Transformed Cell Lines

    Authors: Manuel A. Rivas, Mark J. Daly, Itsik Pe'er

    Abstract: Individual expression profiles from EBV transformed cell lines are an emerging resource for genomic investigation. In this study we characterize the effects of age, sex, and genetic variation on gene expression by surveying public datasets of such profiles. We establish that the expression space of cell lines maintains genetic as well as non-germline information, in an individual-specific and cr…

    Submitted 1 November, 2009; originally announced November 2009.

    Comments: 27 pages, 3 figures