
Showing 1–50 of 166 results for author: Abbas, A

  1. arXiv:2407.04717  [pdf, other]

    cs.ET cs.AI cs.NE cs.RO nlin.CD physics.flu-dyn quant-ph

    Classical and Quantum Physical Reservoir Computing for Onboard Artificial Intelligence Systems: A Perspective

    Authors: A. H. Abbas, Hend Abdel-Ghani, Ivan S. Maksymov

    Abstract: Artificial intelligence (AI) systems of autonomous systems such as drones, robots and self-driving cars may consume up to 50% of total power available onboard, thereby limiting the vehicle's range of functions and considerably reducing the distance the vehicle can travel on a single charge. Next-generation onboard AI systems need an even higher power since they collect and process even larger amou…

    Submitted 14 June, 2024; originally announced July 2024.

    Comments: review article

  2. arXiv:2407.04335  [pdf, ps, other]

    cs.LG cs.AI

    Geometrically Inspired Kernel Machines for Collaborative Learning Beyond Gradient Descent

    Authors: Mohit Kumar, Alexander Valentinitsch, Magdalena Fuchs, Mathias Brucker, Juliana Bowles, Adnan Husakovic, Ali Abbas, Bernhard A. Moser

    Abstract: This paper develops a novel mathematical framework for collaborative learning by means of geometrically inspired kernel machines which includes statements on the bounds of generalisation and approximation errors, and sample complexity. For classification problems, this approach allows us to learn bounded geometric structures around given data points and hence solve the global model learning proble…

    Submitted 5 July, 2024; originally announced July 2024.

  3. arXiv:2407.02231  [pdf, other]

    cs.RO cs.LG

    Safety-Driven Deep Reinforcement Learning Framework for Cobots: A Sim2Real Approach

    Authors: Ammar N. Abbas, Shakra Mehak, Georgios C. Chasparis, John D. Kelleher, Michael Guilfoyle, Maria Chiara Leva, Aswin K Ramasubramanian

    Abstract: This study presents a novel methodology incorporating safety constraints into a robotic simulation during the training of deep reinforcement learning (DRL). The framework integrates specific parts of the safety requirements, such as velocity constraints, as specified by ISO 10218, directly within the DRL model that becomes a part of the robot's learning algorithm. The study then evaluated the effi…

    Submitted 2 July, 2024; originally announced July 2024.

    Comments: This paper has been accepted for publication in the proceedings of the IEEE/IFAC International Conference on Control, Decision, and Information Technologies (CoDIT), 2024

  4. arXiv:2406.11794  [pdf, other]

    cs.LG cs.CL

    DataComp-LM: In search of the next generation of training sets for language models

    Authors: Jeffrey Li, Alex Fang, Georgios Smyrnis, Maor Ivgi, Matt Jordan, Samir Gadre, Hritik Bansal, Etash Guha, Sedrick Keh, Kushal Arora, Saurabh Garg, Rui Xin, Niklas Muennighoff, Reinhard Heckel, Jean Mercat, Mayee Chen, Suchin Gururangan, Mitchell Wortsman, Alon Albalak, Yonatan Bitton, Marianna Nezhurina, Amro Abbas, Cheng-Yu Hsieh, Dhruba Ghosh, Josh Gardner , et al. (34 additional authors not shown)

    Abstract: We introduce DataComp for Language Models (DCLM), a testbed for controlled dataset experiments with the goal of improving language models. As part of DCLM, we provide a standardized corpus of 240T tokens extracted from Common Crawl, effective pretraining recipes based on the OpenLM framework, and a broad suite of 53 downstream evaluations. Participants in the DCLM benchmark can experiment with dat…

    Submitted 20 June, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

    Comments: Project page: https://www.datacomp.ai/dclm/

  5. arXiv:2406.07485  [pdf, other]

    cs.HC

    PITCH: Productivity and Mental Well-being Coaching through Daily Conversational Interaction

    Authors: Adnan Abbas, Sang Won Lee

    Abstract: Efficient task planning is essential for productivity and mental well-being, yet individuals often struggle to create realistic plans and reflect upon their productivity. Leveraging the advancement in artificial intelligence (AI), conversational agents have emerged as a promising tool for enhancing productivity. Our work focuses on externalizing plans through conversation, aiming to solidify inten…

    Submitted 11 June, 2024; originally announced June 2024.

  6. arXiv:2406.04927  [pdf, other]

    eess.AS cs.CL

    LLM-based speaker diarization correction: A generalizable approach

    Authors: Georgios Efstathiadis, Vijay Yadav, Anzar Abbas

    Abstract: Speaker diarization is necessary for interpreting conversations transcribed using automated speech recognition (ASR) tools. Despite significant developments in diarization methods, diarization accuracy remains an issue. Here, we investigate the use of large language models (LLMs) for diarization correction as a post-processing step. LLMs were fine-tuned using the Fisher corpus, a large dataset of…

    Submitted 7 June, 2024; originally announced June 2024.

    Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

  7. arXiv:2405.02321  [pdf, other]

    cs.AI cs.IR

    Accelerating Medical Knowledge Discovery through Automated Knowledge Graph Generation and Enrichment

    Authors: Mutahira Khalid, Raihana Rahman, Asim Abbas, Sushama Kumari, Iram Wajahat, Syed Ahmad Chan Bukhari

    Abstract: Knowledge graphs (KGs) serve as powerful tools for organizing and representing structured knowledge. While their utility is widely recognized, challenges persist in their automation and completeness. Despite efforts in automation and the utilization of expert-created ontologies, gaps in connectivity remain prevalent within KGs. In response to these challenges, we propose an innovative approach ter…

    Submitted 21 April, 2024; originally announced May 2024.

    Comments: 18 pages, 5 figures

  8. arXiv:2404.06650  [pdf]

    physics.geo-ph

    A Frequency-Domain Beamforming Procedure for Extracting Rayleigh Wave Attenuation Coefficients and Small-Strain Damping Ratio from 2D Ambient Noise Array Measurements

    Authors: Aser Abbas, Mauro Aimar, Brady R. Cox, Sebastiano Foti

    Abstract: The small-strain damping ratio plays a crucial role in assessing the response of soil deposits to earthquake-induced ground motions and general dynamic loading. The damping ratio can theoretically be inverted for after extracting frequency-dependent Rayleigh wave attenuation coefficients from wavefields collected during surface wave testing. However, determining reliable estimates of in-situ atten…

    Submitted 9 April, 2024; originally announced April 2024.

    Comments: 42 pages, 12 figures

  9. arXiv:2403.13184  [pdf, other]

    astro-ph.GA

    The Mass Density of MgII Absorbers from the Australian Dark Energy Survey

    Authors: Asif Abbas, Christopher W. Churchill, Glenn G. Kacprzak, Christopher Lidman, Susanna Guatelli, Sabine Bellstedt

    Abstract: We present an all-southern sky survey for MgII doublet absorbers in 951 z < 4 AGN/quasar spectra from the Australian Dark Energy Survey (OzDES). The spectral resolution ranges from R = 1400-1700 over the wavelengths 3700 A-8800 A. The survey has a 5sigma detection completeness of 50% and above for rest-frame equivalent widths W_r(2796) >= 0.3 A. We studied 656 MgII absorption systems over the reds…

    Submitted 19 March, 2024; originally announced March 2024.

    Comments: 27 pages, 15 figures, 3 tables. Accepted on Pi Day 2024

  10. arXiv:2403.01024  [pdf, other]

    cs.NE cs.AI quant-ph

    Reservoir Computing Using Measurement-Controlled Quantum Dynamics

    Authors: A. H. Abbas, Ivan S. Maksymov

    Abstract: Physical reservoir computing (RC) is a machine learning algorithm that employs the dynamics of a physical system to forecast highly nonlinear and chaotic phenomena. In this paper, we introduce a quantum RC system that employs the dynamics of a probed atom in a cavity. The atom experiences coherent driving at a particular rate, leading to a measurement-controlled quantum evolution. The proposed qua…

    Submitted 1 March, 2024; originally announced March 2024.

  11. arXiv:2402.18065  [pdf, other]

    cs.RO

    A Probabilistic Motion Model for Skid-Steer Wheeled Mobile Robot Navigation on Off-Road Terrains

    Authors: Ananya Trivedi, Mark Zolotas, Adeeb Abbas, Sarvesh Prajapati, Salah Bazzi, Taskın Padır

    Abstract: Skid-Steer Wheeled Mobile Robots (SSWMRs) are increasingly being used for off-road autonomy applications. When turning at high speeds, these robots tend to undergo significant skidding and slipping. In this work, using Gaussian Process Regression (GPR) and Sigma-Point Transforms, we estimate the non-linear effects of tire-terrain interaction on robot velocities in a probabilistic fashion. Using th…

    Submitted 29 February, 2024; v1 submitted 28 February, 2024; originally announced February 2024.

    Comments: Accepted for publication at IEEE ICRA 2024

  12. arXiv:2402.13219  [pdf, other]

    cs.AI cs.HC cs.LG cs.MA eess.SY

    Analyzing Operator States and the Impact of AI-Enhanced Decision Support in Control Rooms: A Human-in-the-Loop Specialized Reinforcement Learning Framework for Intervention Strategies

    Authors: Ammar N. Abbas, Chidera W. Amazu, Joseph Mietkiewicz, Houda Briwa, Andres Alonzo Perez, Gabriele Baldissone, Micaela Demichela, Georgios G. Chasparis, John D. Kelleher, Maria Chiara Leva

    Abstract: In complex industrial and chemical process control rooms, effective decision-making is crucial for safety and efficiency. The experiments in this paper evaluate the impact and applications of an AI-based decision support system integrated into an improved human-machine interface, using dynamic influence diagrams, a hidden Markov model, and deep reinforcement learning. The enhanced support system a…

    Submitted 20 February, 2024; originally announced February 2024.

  13. arXiv:2402.08093  [pdf, other]

    cs.LG cs.CL eess.AS

    BASE TTS: Lessons from building a billion-parameter Text-to-Speech model on 100K hours of data

    Authors: Mateusz Łajszczak, Guillermo Cámbara, Yang Li, Fatih Beyhan, Arent van Korlaar, Fan Yang, Arnaud Joly, Álvaro Martín-Cortinas, Ammar Abbas, Adam Michalski, Alexis Moinet, Sri Karlapati, Ewa Muszyńska, Haohan Guo, Bartosz Putrycz, Soledad López Gambino, Kayeon Yoo, Elena Sokolova, Thomas Drugman

    Abstract: We introduce a text-to-speech (TTS) model called BASE TTS, which stands for $\textbf{B}$ig $\textbf{A}$daptive $\textbf{S}$treamable TTS with $\textbf{E}$mergent abilities. BASE TTS is the largest TTS model to-date, trained on 100K hours of public domain speech data, achieving a new state-of-the-art in speech naturalness. It deploys a 1-billion-parameter autoregressive Transformer that converts ra…

    Submitted 15 February, 2024; v1 submitted 12 February, 2024; originally announced February 2024.

    Comments: v1.1 (fixed typos)

  14. arXiv:2402.03973  [pdf, other]

    cs.CV cs.LG

    Humans Beat Deep Networks at Recognizing Objects in Unusual Poses, Given Enough Time

    Authors: Netta Ollikka, Amro Abbas, Andrea Perin, Markku Kilpeläinen, Stéphane Deny

    Abstract: Deep learning is closing the gap with humans on several object recognition benchmarks. Here we investigate this gap in the context of challenging images where objects are seen from unusual viewpoints. We find that humans excel at recognizing objects in unusual poses, in contrast with state-of-the-art pretrained networks (EfficientNet, SWAG, ViT, SWIN, BEiT, ConvNext) which are systematically britt…

    Submitted 6 February, 2024; originally announced February 2024.

  15. arXiv:2401.04578  [pdf, other]

    cs.CV

    Effective pruning of web-scale datasets based on complexity of concept clusters

    Authors: Amro Abbas, Evgenia Rusak, Kushal Tirumala, Wieland Brendel, Kamalika Chaudhuri, Ari S. Morcos

    Abstract: Utilizing massive web-scale datasets has led to unprecedented performance gains in machine learning models, but also imposes outlandish compute requirements for their training. In order to improve training and data efficiency, we here push the limits of pruning large-scale multimodal datasets for training CLIP-style models. Today's most effective pruning method on ImageNet clusters data samples in…

    Submitted 12 March, 2024; v1 submitted 9 January, 2024; originally announced January 2024.

    Comments: Accepted at ICLR 2024, code available at https://github.com/amro-kamal/effective_pruning

  16. arXiv:2312.07810  [pdf]

    cond-mat.mtrl-sci cond-mat.mes-hall

    Efficient Up-Conversion in CsPbBr3 Nanocrystals via Phonon-Driven Exciton-Polaron Formation

    Authors: Abdullah S. Abbas, Beiye C. Li, Richard D. Schaller, Vitali B. Prakapenka, Stella Chariton, Gregory S. Engel, A. Paul Alivisatos

    Abstract: Lead halide perovskite nanocrystals demonstrate efficient up-conversion, although the precise mechanism remains a subject of active research. This study utilizes steady-state and time-resolved spectroscopy methods to unravel the mechanism driving the up-conversion process in CsPbBr3 nanocrystals. Employing above- and below-gap photoluminescence measurements, we extract a distinct phonon mode with…

    Submitted 19 December, 2023; v1 submitted 12 December, 2023; originally announced December 2023.

    Comments: Main text has 6 figures, supporting information has 7 figures. Total number of pages: 39

  17. arXiv:2312.03189  [pdf, other]

    cond-mat.quant-gas quant-ph

    $n$-body anti-bunching in a degenerate Fermi gas of $^3$He* atoms

    Authors: Kieran F. Thomas, Shijie Li, A. H. Abbas, Andrew G. Truscott, Sean S. Hodgman

    Abstract: A key observable in investigations into quantum systems are the $n$-body correlation functions, which provide a powerful tool for experimentally determining coherence and directly probing the many-body wavefunction. While the (bosonic) correlations of photonic systems are well explored, the correlations present in matter-wave systems, particularly for fermionic atoms, are still an emerging field.…

    Submitted 5 December, 2023; originally announced December 2023.

    Comments: 11 pages, 7 figures

    Journal ref: Phys. Rev. Research 6, L022003 (2024)

  18. arXiv:2312.02279  [pdf, other]

    quant-ph math.OC

    Quantum Optimization: Potential, Challenges, and the Path Forward

    Authors: Amira Abbas, Andris Ambainis, Brandon Augustino, Andreas Bärtschi, Harry Buhrman, Carleton Coffrin, Giorgio Cortiana, Vedran Dunjko, Daniel J. Egger, Bruce G. Elmegreen, Nicola Franco, Filippo Fratini, Bryce Fuller, Julien Gacon, Constantin Gonciulea, Sander Gribling, Swati Gupta, Stuart Hadfield, Raoul Heese, Gerhard Kircher, Thomas Kleinert, Thorsten Koch, Georgios Korpas, Steve Lenk, Jakub Marecek , et al. (21 additional authors not shown)

    Abstract: Recent advances in quantum computers are demonstrating the ability to solve problems at a scale beyond brute force classical simulation. As such, a widespread interest in quantum algorithms has developed in many areas, with optimization being one of the most pronounced domains. Across computer science and physics, there are a number of algorithmic approaches, often with little linkage. This is fur…

    Submitted 4 December, 2023; originally announced December 2023.

    Comments: 70 pages, 9 Figures, 4 Tables

  19. arXiv:2310.18811  [pdf]

    cs.AI cs.LG eess.SY

    Hierarchical Framework for Interpretable and Probabilistic Model-Based Safe Reinforcement Learning

    Authors: Ammar N. Abbas, Georgios C. Chasparis, John D. Kelleher

    Abstract: The difficulty of identifying the physical model of complex systems has led to exploring methods that do not rely on such complex modeling of the systems. Deep reinforcement learning has been the pioneer for solving this problem without the need for relying on the physical model of complex systems by just interacting with it. However, it uses a black-box learning approach that makes it difficult t…

    Submitted 28 October, 2023; originally announced October 2023.

    Comments: arXiv admin note: text overlap with arXiv:2206.13433

    Journal ref: Data & Knowledge Engineering, 2023

  20. arXiv:2310.14788  [pdf]

    cs.LG cs.AI eess.SY

    Specialized Deep Residual Policy Safe Reinforcement Learning-Based Controller for Complex and Continuous State-Action Spaces

    Authors: Ammar N. Abbas, Georgios C. Chasparis, John D. Kelleher

    Abstract: Traditional controllers have limitations as they rely on prior knowledge about the physics of the problem, require modeling of dynamics, and struggle to adapt to abnormal situations. Deep reinforcement learning has the potential to address these problems by learning optimal control policies through exploration in an environment. For safety-critical environments, it is impractical to explore random…

    Submitted 15 October, 2023; originally announced October 2023.

  21. arXiv:2310.08230  [pdf, other]

    cs.CV

    Fast Discrete Optimisation for Geometrically Consistent 3D Shape Matching

    Authors: Paul Roetzer, Ahmed Abbas, Dongliang Cao, Florian Bernard, Paul Swoboda

    Abstract: In this work we propose to combine the advantages of learning-based and combinatorial formalisms for 3D shape matching. While learning-based shape matching solutions lead to state-of-the-art matching performance, they do not ensure geometric consistency, so that obtained matchings are locally unsmooth. On the contrary, axiomatic methods allow to take geometric consistency into account by explicitl…

    Submitted 12 October, 2023; originally announced October 2023.

    Comments: Paul Roetzer and Ahmed Abbas contributed equally

  22. arXiv:2310.02110  [pdf, other]

    cs.CV

    Sieve: Multimodal Dataset Pruning Using Image Captioning Models

    Authors: Anas Mahmoud, Mostafa Elhoushi, Amro Abbas, Yu Yang, Newsha Ardalani, Hugh Leather, Ari Morcos

    Abstract: Vision-Language Models (VLMs) are pretrained on large, diverse, and noisy web-crawled datasets. This underscores the critical need for dataset pruning, as the quality of these datasets is strongly correlated with the performance of VLMs on downstream tasks. Using CLIPScore from a pretrained model to only train models using highly-aligned samples is one of the most successful methods for pruning. W…

    Submitted 10 March, 2024; v1 submitted 3 October, 2023; originally announced October 2023.

    Comments: Accepted in CVPR 2024

  23. arXiv:2309.15940  [pdf, other]

    cs.RO cs.CV

    Context-Aware Entity Grounding with Open-Vocabulary 3D Scene Graphs

    Authors: Haonan Chang, Kowndinya Boyalakuntla, Shiyang Lu, Siwei Cai, Eric Jing, Shreesh Keskar, Shijie Geng, Adeeb Abbas, Lifeng Zhou, Kostas Bekris, Abdeslam Boularias

    Abstract: We present an Open-Vocabulary 3D Scene Graph (OVSG), a formal framework for grounding a variety of entities, such as object instances, agents, and regions, with free-form text-based queries. Unlike conventional semantic-based object localization approaches, our system facilitates context-aware entity localization, allowing for queries such as "pick up a cup on a kitchen table" or "navigate to a…

    Submitted 27 September, 2023; originally announced September 2023.

    Comments: The code and dataset used for evaluation can be found at https://github.com/changhaonan/OVSG. This paper has been accepted by CoRL 2023

  24. arXiv:2309.14233  [pdf]

    cs.CL cs.LG

    Urdu Poetry Generated by Using Deep Learning Techniques

    Authors: Muhammad Shoaib Farooq, Ali Abbas

    Abstract: This study provides Urdu poetry generated using different deep-learning techniques and algorithms. The data was collected through the Rekhta website, containing 1341 text files with several couplets. The data on poetry was not from any specific genre or poet. Instead, it was a collection of mixed Urdu poems and Ghazals. Different deep learning techniques, such as the model applied Long Short-term…

    Submitted 25 September, 2023; originally announced September 2023.

    Comments: 11 pages, 2 figures

  25. arXiv:2307.16679  [pdf, other]

    eess.AS cs.CL cs.LG

    Comparing normalizing flows and diffusion models for prosody and acoustic modelling in text-to-speech

    Authors: Guangyan Zhang, Thomas Merritt, Manuel Sam Ribeiro, Biel Tura-Vecino, Kayoko Yanagisawa, Kamil Pokora, Abdelhamid Ezzerg, Sebastian Cygert, Ammar Abbas, Piotr Bilinski, Roberto Barra-Chicote, Daniel Korzekwa, Jaime Lorenzo-Trueba

    Abstract: Neural text-to-speech systems are often optimized on L1/L2 losses, which make strong assumptions about the distributions of the target data space. Aiming to improve those assumptions, Normalizing Flows and Diffusion Probabilistic Models were recently proposed as alternatives. In this paper, we compare traditional L1/L2-based approaches to diffusion and flow-based approaches for the tasks of prosod…

    Submitted 31 July, 2023; originally announced July 2023.

    Comments: 5 pages, 2 figures, 5 tables. Interspeech 2023

  26. arXiv:2307.07062  [pdf, other]

    eess.AS cs.LG cs.SD

    Controllable Emphasis with zero data for text-to-speech

    Authors: Arnaud Joly, Marco Nicolis, Ekaterina Peterova, Alessandro Lombardi, Ammar Abbas, Arent van Korlaar, Aman Hussain, Parul Sharma, Alexis Moinet, Mateusz Lajszczak, Penny Karanasou, Antonio Bonafonte, Thomas Drugman, Elena Sokolova

    Abstract: We present a scalable method to produce high quality emphasis for text-to-speech (TTS) that does not require recordings or annotations. Many TTS models include a phoneme duration model. A simple but effective method to achieve emphasized speech consists in increasing the predicted duration of the emphasised word. We show that this is significantly better than spectrogram modification techniques im…

    Submitted 13 July, 2023; originally announced July 2023.

    Comments: In Proceedings of the 12th Speech Synthesis Workshop (SSW) 2023

  27. arXiv:2306.11327  [pdf, other]

    eess.AS cs.SD

    eCat: An End-to-End Model for Multi-Speaker TTS & Many-to-Many Fine-Grained Prosody Transfer

    Authors: Ammar Abbas, Sri Karlapati, Bastian Schnell, Penny Karanasou, Marcel Granero Moya, Amith Nagaraj, Ayman Boustati, Nicole Peinelt, Alexis Moinet, Thomas Drugman

    Abstract: We present eCat, a novel end-to-end multispeaker model capable of: a) generating long-context speech with expressive and contextually appropriate prosody, and b) performing fine-grained prosody transfer between any pair of seen speakers. eCat is trained using a two-stage training approach. In Stage I, the model learns speaker-independent word-level prosody representations in an end-to-end fashion…

    Submitted 20 June, 2023; originally announced June 2023.

    Comments: Accepted to be published in the Proceedings of InterSpeech 2023

  28. arXiv:2305.15578  [pdf]

    physics.geo-ph

    An Open-Access Database of Active-source and Passive-wavefield DAS and Nodal Station Measurements at the Newberry Florida Site

    Authors: Aser Abbas, Brady R. Cox, Khiem T. Tran, Isabella Corey, Nishkarsha Dawadi

    Abstract: This paper documents a comprehensive subsurface imaging experiment using stress waves in Newberry, Florida, at a site known for significant spatial variability, karstic voids, and underground anomalies. The experiment utilized advanced sensing technologies, including approximately two kilometers of distributed acoustic sensing (DAS) fiber optic cable, forming a dense 2D array of 1920 channels, and…

    Submitted 24 May, 2023; originally announced May 2023.

    Comments: 33 pages, 12 figures, dataset paper

  29. arXiv:2305.13362  [pdf, other]

    quant-ph cs.LG

    On quantum backpropagation, information reuse, and cheating measurement collapse

    Authors: Amira Abbas, Robbie King, Hsin-Yuan Huang, William J. Huggins, Ramis Movassagh, Dar Gilboa, Jarrod R. McClean

    Abstract: The success of modern deep learning hinges on the ability to train neural networks at scale. Through clever reuse of intermediate information, backpropagation facilitates training through gradient computation at a total cost roughly proportional to running the function, rather than incurring an additional factor proportional to the number of parameters - which can now be in the trillions. Naively,…

    Submitted 22 May, 2023; originally announced May 2023.

    Comments: 29 pages, 2 figures

    Journal ref: Advances in Neural Information Processing Systems 36 (2024)

  30. arXiv:2303.09540  [pdf, other]

    cs.LG cs.AI cs.CV

    SemDeDup: Data-efficient learning at web-scale through semantic deduplication

    Authors: Amro Abbas, Kushal Tirumala, Dániel Simig, Surya Ganguli, Ari S. Morcos

    Abstract: Progress in machine learning has been driven in large part by massive increases in data. However, large web-scale datasets such as LAION are largely uncurated beyond searches for exact duplicates, potentially leaving much redundancy. Here, we introduce SemDeDup, a method which leverages embeddings from pre-trained models to identify and remove semantic duplicates: data pairs which are semantically…

    Submitted 22 March, 2023; v1 submitted 16 March, 2023; originally announced March 2023.

  31. arXiv:2301.12159  [pdf, ps, other]

    cs.CV cs.LG

    ClusterFuG: Clustering Fully connected Graphs by Multicut

    Authors: Ahmed Abbas, Paul Swoboda

    Abstract: We propose a graph clustering formulation based on multicut (a.k.a. weighted correlation clustering) on the complete graph. Our formulation does not need specification of the graph topology as in the original sparse formulation of multicut, making our approach simpler and potentially better performing. In contrast to unweighted correlation clustering we allow for a more expressive weighted cost st…

    Submitted 5 June, 2023; v1 submitted 28 January, 2023; originally announced January 2023.

    Comments: ICML 2023

  32. arXiv:2207.09580  [pdf]

    cs.LG eess.SP physics.geo-ph

    A Frequency-Velocity CNN for Developing Near-Surface 2D Vs Images from Linear-Array, Active-Source Wavefield Measurements

    Authors: Aser Abbas, Joseph P. Vantassel, Brady R. Cox, Krishna Kumar, Jodie Crocker

    Abstract: This paper presents a frequency-velocity convolutional neural network (CNN) for rapid, non-invasive 2D shear wave velocity (Vs) imaging of near-surface geo-materials. Operating in the frequency-velocity domain allows for significant flexibility in the linear-array, active-source experimental testing configurations used for generating the CNN input, which are normalized dispersion images. Unlike wa…

    Submitted 19 July, 2022; originally announced July 2022.

    Comments: 34 pages, 13 figures, 2 tables

  33. arXiv:2207.08034  [pdf, other]

    cs.CV

    Progress and limitations of deep networks to recognize objects in unusual poses

    Authors: Amro Abbas, Stéphane Deny

    Abstract: Deep networks should be robust to rare events if they are to be successfully deployed in high-stakes real-world applications (e.g., self-driving cars). Here we study the capability of deep networks to recognize objects in unusual poses. We create a synthetic dataset of images of objects in unusual orientations, and evaluate the robustness of a collection of 38 recent and competitive deep networks…

    Submitted 16 July, 2022; originally announced July 2022.

  34. arXiv:2206.14643  [pdf, other]

    eess.AS cs.CL

    Simple and Effective Multi-sentence TTS with Expressive and Coherent Prosody

    Authors: Peter Makarov, Ammar Abbas, Mateusz Łajszczak, Arnaud Joly, Sri Karlapati, Alexis Moinet, Thomas Drugman, Penny Karanasou

    Abstract: Generating expressive and contextually appropriate prosody remains a challenge for modern text-to-speech (TTS) systems. This is particularly evident for long, multi-sentence inputs. In this paper, we examine simple extensions to a Transformer-based FastSpeech-like system, with the goal of improving prosody for multi-sentence TTS. We find that long context, powerful text features, and training on m…

    Submitted 29 June, 2022; originally announced June 2022.

    Comments: Accepted to be published in the Proceedings of InterSpeech 2022

  35. arXiv:2206.14165  [pdf, other]

    eess.AS cs.SD

    Expressive, Variable, and Controllable Duration Modelling in TTS

    Authors: Ammar Abbas, Thomas Merritt, Alexis Moinet, Sri Karlapati, Ewa Muszynska, Simon Slangen, Elia Gatti, Thomas Drugman

    Abstract: Duration modelling has become an important research problem once more with the rise of non-attention neural text-to-speech systems. The current approaches largely fall back to relying on previous statistical parametric speech synthesis technology for duration prediction, which poorly models the expressiveness and variability in speech. In this paper, we propose two alternate approaches to improve…

    Submitted 28 June, 2022; originally announced June 2022.

    Comments: Accepted to be published in the Proceedings of InterSpeech 2022

  36. arXiv:2206.13443  [pdf, other]

    eess.AS cs.SD

    CopyCat2: A Single Model for Multi-Speaker TTS and Many-to-Many Fine-Grained Prosody Transfer

    Authors: Sri Karlapati, Penny Karanasou, Mateusz Lajszczak, Ammar Abbas, Alexis Moinet, Peter Makarov, Ray Li, Arent van Korlaar, Simon Slangen, Thomas Drugman

    Abstract: In this paper, we present CopyCat2 (CC2), a novel model capable of: a) synthesizing speech with different speaker identities, b) generating speech with expressive and contextually appropriate prosody, and c) transferring prosody at fine-grained level between any pair of seen speakers. We do this by activating distinct parts of the network for different tasks. We train our model using a novel appro…

    Submitted 27 June, 2022; originally announced June 2022.

    Comments: Accepted to be published in the Proceedings of InterSpeech 2022

  37. Interpretable Hidden Markov Model-Based Deep Reinforcement Learning Hierarchical Framework for Predictive Maintenance of Turbofan Engines

    Authors: Ammar N. Abbas, Georgios Chasparis, John D. Kelleher

    Abstract: An open research question in deep reinforcement learning is how to focus the policy learning of key decisions within a sparse domain. This paper emphasizes combining the advantages of input-output hidden Markov models and reinforcement learning towards interpretable maintenance decisions. We propose a novel hierarchical-modeling methodology that, at a high level, detects and interprets the root cau…

    Submitted 11 January, 2023; v1 submitted 27 June, 2022; originally announced June 2022.

    Journal ref: Preprint: International Conference on Big Data Analytics and Knowledge Discovery Proceedings, 2022

  38. arXiv:2205.11638  [pdf, other]

    cs.LG math.OC

    DOGE-Train: Discrete Optimization on GPU with End-to-end Training

    Authors: Ahmed Abbas, Paul Swoboda

    Abstract: We present a fast, scalable, data-driven approach for solving relaxations of 0-1 integer linear programs. We use a combination of graph neural networks (GNN) and the Lagrange decomposition based algorithm FastDOG (Abbas and Swoboda 2022b). We make the latter differentiable for end-to-end training and use GNNs to predict its algorithmic parameters. This allows to retain the algorithm's theoretical…

    Submitted 28 December, 2023; v1 submitted 23 May, 2022; originally announced May 2022.

    Comments: AAAI 2024. Alert before printing: pg. 16-20 only contain per instance results, can possibly be skipped

  39. arXiv:2202.03574  [pdf, other]

    cs.LG cs.CV

    Structured Prediction Problem Archive

    Authors: Paul Swoboda, Bjoern Andres, Andrea Hornakova, Florian Bernard, Jannik Irmai, Paul Roetzer, Bogdan Savchynskyy, David Stein, Ahmed Abbas

    Abstract: Structured prediction problems are one of the fundamental tools in machine learning. In order to facilitate algorithm development for their numerical solution, we collect in one place a large number of datasets in easy to read formats for a diverse set of problem classes. We provide archival links to datasets, description of the considered problems and problem formats, and a short summary of probl…

    Submitted 17 November, 2023; v1 submitted 4 February, 2022; originally announced February 2022.

    Comments: Added multicast instances from Andres group

  40. arXiv:2201.00855  [pdf]

    cs.CY cs.CL

    AI & Racial Equity: Understanding Sentiment Analysis Artificial Intelligence, Data Security, and Systemic Theory in Criminal Justice Systems

    Authors: Alia Abbas

    Abstract: Various forms of implications of artificial intelligence that either exacerbate or decrease racial systemic injustice have been explored in this applied research endeavor. Taking each thematic area of identifying, analyzing, and debating a systemic issue have been leveraged in investigating merits and drawbacks of using algorithms to automate human decision making in racially sensitive environmen…

    Submitted 3 January, 2022; originally announced January 2022.

    Comments: 25 pages

  41. arXiv:2112.04807  [pdf, other]

    cs.LG stat.ML

    Effective dimension of machine learning models

    Authors: Amira Abbas, David Sutter, Alessio Figalli, Stefan Woerner

    Abstract: Making statements about the performance of trained models on tasks involving new data is one of the primary goals of machine learning, i.e., to understand the generalization power of a model. Various capacity measures try to capture this ability, but usually fall short in explaining important characteristics of models that we observe in practice. In this study, we propose the local effective dimen…

    Submitted 9 December, 2021; originally announced December 2021.

    Comments: 17 pages, 2 figures

  42. arXiv:2111.10270  [pdf, other]

    math.OC cs.CV cs.DC cs.GT

    FastDOG: Fast Discrete Optimization on GPU

    Authors: Ahmed Abbas, Paul Swoboda

    Abstract: We present a massively parallel Lagrange decomposition method for solving 0-1 integer linear programs occurring in structured prediction. We propose a new iterative update scheme for solving the Lagrangean dual and a perturbation technique for decoding primal solutions. For representing subproblems we follow Lange et al. (2021) and use binary decision diagrams (BDDs). Our primal and dual algorith…

    Submitted 19 April, 2022; v1 submitted 19 November, 2021; originally announced November 2021.

    Comments: Published at CVPR 2022. Alert before printing: last 10 pages just contains detailed results table

  43. arXiv:2109.01838  [pdf, other]

    cs.DC cs.CV cs.DS cs.LG

    RAMA: A Rapid Multicut Algorithm on GPU

    Authors: Ahmed Abbas, Paul Swoboda

    Abstract: We propose a highly parallel primal-dual algorithm for the multicut (a.k.a. correlation clustering) problem, a classical graph clustering problem widely used in machine learning and computer vision. Our algorithm consists of three steps executed recursively: (1) Finding conflicted cycles that correspond to violated inequalities of the underlying multicut relaxation, (2) Performing message passing…

    Submitted 11 March, 2022; v1 submitted 4 September, 2021; originally announced September 2021.

    Comments: Published in CVPR 2022

  44. arXiv:2106.15649  [pdf, other]

    eess.AS cs.LG cs.SD

    Multi-Scale Spectrogram Modelling for Neural Text-to-Speech

    Authors: Ammar Abbas, Bajibabu Bollepalli, Alexis Moinet, Arnaud Joly, Penny Karanasou, Peter Makarov, Simon Slangens, Sri Karlapati, Thomas Drugman

    Abstract: We propose a novel Multi-Scale Spectrogram (MSS) modelling approach to synthesise speech with an improved coarse and fine-grained prosody. We present a generic multi-scale spectrogram prediction mechanism where the system first predicts coarser scale mel-spectrograms that capture the suprasegmental information in speech, and later uses these coarser scale mel-spectrograms to predict finer scale me…

    Submitted 29 June, 2021; originally announced June 2021.

    Comments: Accepted for the 11th ISCA Speech Synthesis Workshop (SSW11)

  45. arXiv:2106.10229  [pdf, other]

    eess.AS cs.LG cs.SD

    A learned conditional prior for the VAE acoustic space of a TTS system

    Authors: Penny Karanasou, Sri Karlapati, Alexis Moinet, Arnaud Joly, Ammar Abbas, Simon Slangen, Jaime Lorenzo Trueba, Thomas Drugman

    Abstract: Many factors influence speech yielding different renditions of a given sentence. Generative models, such as variational autoencoders (VAEs), capture this variability and allow multiple renditions of the same sentence via sampling. The degree of prosodic variability depends heavily on the prior that is used when sampling. In this paper, we propose a novel method to compute an informative prior for…

    Submitted 14 June, 2021; originally announced June 2021.

    Comments: in Proceedings of Interspeech 2021

  46. arXiv:2106.07015  [pdf]

    cs.CV cs.AI

    Siamese Network Training Using Artificial Triplets By Sampling and Image Transformation

    Authors: Ammar N. Abbas, David Moser

    Abstract: The device used in this work detects the objects over the surface of the water using two thermal cameras which aid the users to detect and avoid the objects in scenarios where the human eyes cannot (night, fog, etc.). To avoid the obstacle collision autonomously, it is required to track the objects in real-time and assign a specific identity to each object to determine its dynamics (trajectory, ve…

    Submitted 26 September, 2021; v1 submitted 13 June, 2021; originally announced June 2021.

  47. arXiv:2106.07003  [pdf]

    cs.RO cs.AI cs.CV

    Experimental Analysis of Trajectory Control Using Computer Vision and Artificial Intelligence for Autonomous Vehicles

    Authors: Ammar N. Abbas, Muhammad Asad Irshad, Hossam Hassan Ammar

    Abstract: Perception of the lane boundaries is crucial for the tasks related to autonomous trajectory control. In this paper, several methodologies for lane detection are discussed with an experimental illustration: Hough transformation, Blob analysis, and Bird's eye view. Following the abstraction of lane marks from the boundary, the next approach is applying a control law based on the perception to contro…

    Submitted 13 June, 2021; originally announced June 2021.

  48. arXiv:2106.03188  [pdf, other]

    cs.CV

    Combinatorial Optimization for Panoptic Segmentation: A Fully Differentiable Approach

    Authors: Ahmed Abbas, Paul Swoboda

    Abstract: We propose a fully differentiable architecture for simultaneous semantic and instance segmentation (a.k.a. panoptic segmentation) consisting of a convolutional neural network and an asymmetric multiway cut problem solver. The latter solves a combinatorial optimization problem that elegantly incorporates semantic and boundary predictions to produce a panoptic labeling. Our formulation allows to dir…

    Submitted 25 October, 2021; v1 submitted 6 June, 2021; originally announced June 2021.

    Comments: To be presented at NeurIPS 2021

  49. arXiv:2105.01133  [pdf, other]

    cs.CV

    Prediction of clinical tremor severity using Rank Consistent Ordinal Regression

    Authors: Li Zhang, Vijay Yadav, Vidya Koesmahargyo, Anzar Abbas, Isaac Galatzer-Levy

    Abstract: Tremor is a key diagnostic feature of Parkinson's Disease (PD), Essential Tremor (ET), and other central nervous system (CNS) disorders. Clinicians or trained raters assess tremor severity with TETRAS scores by observing patients. Lacking quantitative measures, inter- or intra- observer variabilities are almost inevitable as the distinction between adjacent tremor scores is subtle. Moreover, clini…

    Submitted 3 May, 2021; originally announced May 2021.

  50. Rapid generation of metastable helium Bose-Einstein condensates

    Authors: A. H. Abbas, X. Meng, R. S. Patil, J. A. Ross, A. G. Truscott, S. S. Hodgman

    Abstract: We report the realisation of Bose-Einstein condensation (BEC) of metastable helium atoms using an in-vacuum coil magnetic trap and a crossed beam optical dipole trap. A novel quadrupole-Ioffe configuration (QUIC) magnetic trap made from in-vacuum hollow copper tubes provides fast switching times while generating traps with a 10G bias, without compromising optical access. The bias enables in-trap 1…

    Submitted 22 February, 2021; originally announced February 2021.

    Comments: 6 pages, 4 figures

    Journal ref: Phys. Rev. A 103, 053317 (2021)