Showing 1–10 of 10 results for author: Aghazadeh, A

Searching in archive cs.
  1. arXiv:2405.06645

    q-bio.BM cs.AI cs.LG

    On Recovering Higher-order Interactions from Protein Language Models

    Authors: Darin Tsui, Amirali Aghazadeh

    Abstract: Protein language models leverage evolutionary information to perform state-of-the-art 3D structure and zero-shot variant prediction. Yet, extracting and explaining all the mutational interactions that govern model predictions remains difficult as it requires querying the entire amino acid space for $n$ sites using $20^n$ sequences, which is computationally expensive even for moderate values of…

    Submitted 15 March, 2024; originally announced May 2024.

  2. arXiv:2307.10343

    q-bio.GN cs.LG

    ProtiGeno: a prokaryotic short gene finder using protein language models

    Authors: Tony Tu, Gautham Krishna, Amirali Aghazadeh

    Abstract: Prokaryotic gene prediction plays an important role in understanding the biology of organisms and their function with applications in medicine and biotechnology. Although the current gene finders are highly sensitive in finding long genes, their sensitivity decreases noticeably in finding shorter genes (<180 nts). The culprit is insufficient annotated gene data to identify distinguishing features…

    Submitted 19 July, 2023; originally announced July 2023.

    Comments: Accepted at the 2023 ICML Workshop on Computational Biology

    ACM Class: I.2.1; J.3

  3. arXiv:2301.06200

    eess.SP cs.LG

    Efficiently Computing Sparse Fourier Transforms of $q$-ary Functions

    Authors: Yigit Efe Erginbas, Justin Singh Kang, Amirali Aghazadeh, Kannan Ramchandran

    Abstract: Fourier transformations of pseudo-Boolean functions are popular tools for analyzing functions of binary sequences. Real-world functions often have structures that manifest in a sparse Fourier transform, and previous works have shown that under the assumption of sparsity the transform can be computed efficiently. But what if we want to compute the Fourier transform of functions defined over a $q$-a…

    Submitted 15 January, 2023; originally announced January 2023.

    Comments: 29 pages, 3 figures

  4. arXiv:2210.02604

    stat.ML cs.LG

    Spectral Regularization Allows Data-frugal Learning over Combinatorial Spaces

    Authors: Amirali Aghazadeh, Nived Rajaraman, Tony Tu, Kannan Ramchandran

    Abstract: Data-driven machine learning models are being increasingly employed in several important inference problems in biology, chemistry, and physics which require learning over combinatorial spaces. Recent empirical evidence (see, e.g., [1], [2], [3]) suggests that regularizing the spectral representation of such models improves their generalization power when labeled data is scarce. However, despite th…

    Submitted 5 October, 2022; originally announced October 2022.

  5. arXiv:2106.10324

    cs.LG stat.ML

    Group-Structured Adversarial Training

    Authors: Farzan Farnia, Amirali Aghazadeh, James Zou, David Tse

    Abstract: Robust training methods against perturbations to the input data have received great attention in the machine learning literature. A standard approach in this direction is adversarial training which learns a model using adversarially-perturbed training samples. However, adversarial training performs suboptimally against perturbations structured across samples such as universal and group-sparse shif…

    Submitted 18 June, 2021; originally announced June 2021.

  6. arXiv:2010.13829

    cs.LG

    BEAR: Sketching BFGS Algorithm for Ultra-High Dimensional Feature Selection in Sublinear Memory

    Authors: Amirali Aghazadeh, Vipul Gupta, Alex DeWeese, O. Ozan Koyluoglu, Kannan Ramchandran

    Abstract: We consider feature selection for applications in machine learning where the dimensionality of the data is so large that it exceeds the working memory of the (local) computing machine. Unfortunately, current large-scale sketching algorithms show poor memory-accuracy trade-off due to the irreversible collision and accumulation of the stochastic gradient noise in the sketched domain. Here, we develo…

    Submitted 26 May, 2021; v1 submitted 26 October, 2020; originally announced October 2020.

  7. arXiv:2005.05824

    cs.CV

    A Distributed Approximate Nearest Neighbor Method for Real-Time Face Recognition

    Authors: Aysan Aghazadeh, Maryam Amirmazlaghani

    Abstract: Nowadays, face recognition and more generally image recognition have many applications in the modern world and are widely used in our daily tasks. This paper aims to propose a distributed approximate nearest neighbor (ANN) method for real-time face recognition using a big dataset that involves a lot of classes. The proposed approach is based on using a clustering method to separate the dataset int…

    Submitted 27 August, 2020; v1 submitted 12 May, 2020; originally announced May 2020.

  8. arXiv:1806.04310

    cs.DS cs.LG stat.ML

    MISSION: Ultra Large-Scale Feature Selection using Count-Sketches

    Authors: Amirali Aghazadeh, Ryan Spring, Daniel LeJeune, Gautam Dasarathy, Anshumali Shrivastava, Richard G. Baraniuk

    Abstract: Feature selection is an important challenge in machine learning. It plays a crucial role in the explainability of machine-driven decisions that are rapidly permeating throughout modern society. Unfortunately, the explosion in the size and dimensionality of real-world datasets poses a severe challenge to standard feature selection algorithms. Today, it is not uncommon for datasets to have billions…

    Submitted 11 June, 2018; originally announced June 2018.

  9. arXiv:1702.07670

    cs.OH

    Insense: Incoherent Sensor Selection for Sparse Signals

    Authors: Amirali Aghazadeh, Mohammad Golbabaee, Andrew S. Lan, Richard G. Baraniuk

    Abstract: Sensor selection refers to the problem of intelligently selecting a small subset of a collection of available sensors to reduce the sensing cost while preserving signal acquisition performance. The majority of sensor selection algorithms find the subset of sensors that best recovers an arbitrary signal from a number of linear measurements that is larger than the dimension of the signal. In this pa…

    Submitted 16 February, 2017; originally announced February 2017.

  10. arXiv:1603.03836

    cs.DS

    Near-Isometric Binary Hashing for Large-scale Datasets

    Authors: Amirali Aghazadeh, Andrew Lan, Anshumali Shrivastava, Richard Baraniuk

    Abstract: We develop a scalable algorithm to learn binary hash codes for indexing large-scale datasets. Near-isometric binary hashing (NIBH) is a data-dependent hashing scheme that quantizes the output of a learned low-dimensional embedding to obtain a binary hash code. In contrast to conventional hashing schemes, which typically rely on an $\ell_2$-norm (i.e., average distortion) minimization, NIBH is base…

    Submitted 11 March, 2016; originally announced March 2016.