Showing 1–25 of 25 results for author: Nayak, N

Searching in archive cs.
  1. arXiv:2407.06868  [pdf, other]

    cs.IT cs.LG eess.SP

    Energy Efficient Fair STAR-RIS for Mobile Users

    Authors: Ashok S. Kumar, Nancy Nayak, Sheetal Kalyani, Himal A. Suraweera

    Abstract: In this work, we propose a method to improve the energy efficiency and fairness of simultaneously transmitting and reflecting reconfigurable intelligent surfaces (STAR-RIS) for mobile users, ensuring reduced power consumption while maintaining reliable communication. To achieve this, we introduce a new parameter known as the subsurface assignment variable, which determines the number of STAR-RIS e…

    Submitted 9 July, 2024; originally announced July 2024.

  2. arXiv:2406.10491  [pdf, other]

    cs.AR

    FuseMax: Leveraging Extended Einsums to Optimize Attention Accelerator Design

    Authors: Nandeeka Nayak, Xinrui Wu, Toluwanimi O. Odemuyiwa, Michael Pellauer, Joel S. Emer, Christopher W. Fletcher

    Abstract: Attention for transformers is a critical workload that has recently received significant "attention" as a target for custom acceleration. Yet, while prior work succeeds in reducing attention's memory-bandwidth requirements, it creates load imbalance between attention operators (resulting in severe compute under-utilization) and requires on-chip memory that scales with sequence length (which is exp…

    Submitted 25 June, 2024; v1 submitted 15 June, 2024; originally announced June 2024.

    Comments: 15 pages, 10 figures

  3. arXiv:2402.18334  [pdf, other]

    cs.CL cs.LG

    Learning to Generate Instruction Tuning Datasets for Zero-Shot Task Adaptation

    Authors: Nihal V. Nayak, Yiyang Nan, Avi Trost, Stephen H. Bach

    Abstract: We introduce Bonito, an open-source model for conditional task generation that converts unannotated text into task-specific training datasets for instruction tuning. We aim to enable zero-shot task adaptation of large language models on users' specialized, private data. We train Bonito by fine-tuning a pretrained large language model on a new large-scale dataset with 1.65M examples created by remi…

    Submitted 6 June, 2024; v1 submitted 28 February, 2024; originally announced February 2024.

    Comments: ACL Findings 2024

  4. TeAAL: A Declarative Framework for Modeling Sparse Tensor Accelerators

    Authors: Nandeeka Nayak, Toluwanimi O. Odemuyiwa, Shubham Ugare, Christopher W. Fletcher, Michael Pellauer, Joel S. Emer

    Abstract: Over the past few years, the explosion in sparse tensor algebra workloads has led to a corresponding rise in domain-specific accelerators to service them. Due to the irregularity present in sparse tensors, these accelerators employ a wide variety of novel solutions to achieve good performance. At the same time, prior work on design-flexible sparse accelerator modeling does not express this full ra…

    Submitted 11 June, 2024; v1 submitted 16 April, 2023; originally announced April 2023.

    Comments: 17 pages, 13 figures

  5. arXiv:2212.13854  [pdf, other]

    cs.IT eess.SP

    A DRL Approach for RIS-Assisted Full-Duplex UL and DL Transmission: Beamforming, Phase Shift and Power Optimization

    Authors: Nancy Nayak, Sheetal Kalyani, Himal A. Suraweera

    Abstract: We propose a deep reinforcement learning (DRL) approach for a full-duplex (FD) transmission that predicts the phase shifts of the reconfigurable intelligent surface (RIS), base station (BS) active beamformers, and the transmit powers to maximize the weighted sum rate of uplink and downlink users. Existing methods require channel state information (CSI) and residual self-interference (SI) knowledge…

    Submitted 20 June, 2024; v1 submitted 28 December, 2022; originally announced December 2022.

  6. arXiv:2212.10537  [pdf, other]

    cs.CV cs.AI cs.CL

    Does CLIP Bind Concepts? Probing Compositionality in Large Image Models

    Authors: Martha Lewis, Nihal V. Nayak, Peilin Yu, Qinan Yu, Jack Merullo, Stephen H. Bach, Ellie Pavlick

    Abstract: Large-scale neural network models combining text and images have made incredible progress in recent years. However, it remains an open question to what extent such models encode compositional representations of the concepts over which they operate, such as correctly identifying "red cube" by reasoning over the constituents "red" and "cube". In this work, we focus on the ability of a large pr…

    Submitted 29 March, 2023; v1 submitted 20 December, 2022; originally announced December 2022.

  7. arXiv:2211.05100  [pdf, other]

    cs.CL

    BLOOM: A 176B-Parameter Open-Access Multilingual Language Model

    Authors: BigScience Workshop: Teven Le Scao, Angela Fan, Christopher Akiki, Ellie Pavlick, Suzana Ilić, Daniel Hesslow, Roman Castagné, Alexandra Sasha Luccioni, François Yvon, Matthias Gallé, Jonathan Tow, Alexander M. Rush, Stella Biderman, Albert Webson, Pawan Sasanka Ammanamanchi, Thomas Wang, Benoît Sagot, Niklas Muennighoff, Albert Villanova del Moral, Olatunji Ruwase, Rachel Bawden, Stas Bekman, Angelina McMillan-Major, et al. (369 additional authors not shown)

    Abstract: Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access…

    Submitted 27 June, 2023; v1 submitted 9 November, 2022; originally announced November 2022.

  8. arXiv:2210.00064  [pdf, other]

    cs.LG

    CEREAL: Few-Sample Clustering Evaluation

    Authors: Nihal V. Nayak, Ethan R. Elenberg, Clemens Rosenbaum

    Abstract: Evaluating clustering quality with reliable evaluation metrics like normalized mutual information (NMI) requires labeled data that can be expensive to annotate. We focus on the underexplored problem of estimating clustering quality with limited labels. We adapt existing approaches from the few-sample model evaluation literature to actively sub-sample, with a learned surrogate model, the most infor…

    Submitted 30 September, 2022; originally announced October 2022.

  9. arXiv:2206.00488  [pdf, other]

    cs.LG

    Rotate the ReLU to implicitly sparsify deep networks

    Authors: Nancy Nayak, Sheetal Kalyani

    Abstract: In the era of Deep Neural Network based solutions for a variety of real-life tasks, having a compact and energy-efficient deployable model has become fairly important. Most of the existing deep architectures use Rectified Linear Unit (ReLU) activation. In this paper, we propose a novel idea of rotating the ReLU activation to give one more degree of freedom to the architecture. We show that this ac…

    Submitted 1 June, 2022; originally announced June 2022.

  10. arXiv:2205.04883  [pdf, other]

    cs.CV cs.SE

    Identical Image Retrieval using Deep Learning

    Authors: Sayan Nath, Nikhil Nayak

    Abstract: In recent years, interaction with images has increased. Image similarity involves fetching similar-looking images for a given reference image. The target is to find out whether the image searched as a query can result in similar pictures. We are using the BigTransfer Model, which is itself a state-of-the-art model. BigTransfer (BiT) is essentially a ResNet but pre-trained on a l…

    Submitted 18 May, 2022; v1 submitted 10 May, 2022; originally announced May 2022.

  11. arXiv:2204.03574  [pdf, other]

    cs.LG cs.CL cs.CV

    Learning to Compose Soft Prompts for Compositional Zero-Shot Learning

    Authors: Nihal V. Nayak, Peilin Yu, Stephen H. Bach

    Abstract: We introduce compositional soft prompting (CSP), a parameter-efficient learning technique to improve the zero-shot compositionality of large-scale pretrained vision-language models (VLMs) like CLIP. We develop CSP for compositional zero-shot learning, the task of predicting unseen attribute-object compositions (e.g., old cat and young tiger). VLMs have a flexible text encoder that can represent ar…

    Submitted 24 April, 2023; v1 submitted 7 April, 2022; originally announced April 2022.

    Comments: ICLR 2023

  12. arXiv:2202.01279  [pdf, other]

    cs.LG cs.CL

    PromptSource: An Integrated Development Environment and Repository for Natural Language Prompts

    Authors: Stephen H. Bach, Victor Sanh, Zheng-Xin Yong, Albert Webson, Colin Raffel, Nihal V. Nayak, Abheesht Sharma, Taewoon Kim, M Saiful Bari, Thibault Fevry, Zaid Alyafeai, Manan Dey, Andrea Santilli, Zhiqing Sun, Srulik Ben-David, Canwen Xu, Gunjan Chhablani, Han Wang, Jason Alan Fries, Maged S. Al-shaibani, Shanya Sharma, Urmish Thakker, Khalid Almubarak, Xiangru Tang, Dragomir Radev, et al. (2 additional authors not shown)

    Abstract: PromptSource is a system for creating, sharing, and using natural language prompts. Prompts are functions that map an example from a dataset to a natural language input and target output. Using prompts to train and query language models is an emerging area in NLP that requires new tools that let users develop and refine these prompts collaboratively. PromptSource addresses the emergent challenges…

    Submitted 29 March, 2022; v1 submitted 2 February, 2022; originally announced February 2022.

    Comments: ACL 2022 Demo

  13. arXiv:2111.04798  [pdf, other]

    cs.LG cs.CV

    TAGLETS: A System for Automatic Semi-Supervised Learning with Auxiliary Data

    Authors: Wasu Piriyakulkij, Cristina Menghini, Ross Briden, Nihal V. Nayak, Jeffrey Zhu, Elaheh Raisi, Stephen H. Bach

    Abstract: Machine learning practitioners often have access to a spectrum of data: labeled data for the target task (which is often limited), unlabeled data, and auxiliary data, the many available labeled datasets for other tasks. We describe TAGLETS, a system built to study techniques for automatically exploiting all three types of data and creating high-quality, servable classifiers. The key components of…

    Submitted 5 May, 2022; v1 submitted 8 November, 2021; originally announced November 2021.

    Comments: Paper published at MLSys 2022. It passed the artifact evaluation earning two ACM badges: (1) Artifacts Evaluated Functional v1.1 and (2) Artifacts Available v1.1

  14. arXiv:2110.14357  [pdf, other]

    cs.IT cs.AI eess.SP

    Binarized ResNet: Enabling Robust Automatic Modulation Classification at the resource-constrained Edge

    Authors: Deepsayan Sadhukhan, Nitin Priyadarshini Shankar, Nancy Nayak, Thulasi Tholeti, Sheetal Kalyani

    Abstract: Recently, deep neural networks (DNNs) have been used extensively for automatic modulation classification (AMC), and the results have been quite promising. However, DNNs have high memory and computation requirements making them impractical for edge networks where the devices are resource-constrained. They are also vulnerable to adversarial attacks, which is a significant security concern. This work…

    Submitted 17 April, 2023; v1 submitted 27 October, 2021; originally announced October 2021.

    Comments: This version has a total of 8 figures and 3 tables. It has extra content on the adversarial robustness of the proposed method that was not present in the previous submission. Also one more ensemble method called RBLResNet-MCK is proposed to improve the performance further

  15. arXiv:2110.08207  [pdf, other]

    cs.LG cs.CL

    Multitask Prompted Training Enables Zero-Shot Task Generalization

    Authors: Victor Sanh, Albert Webson, Colin Raffel, Stephen H. Bach, Lintang Sutawika, Zaid Alyafeai, Antoine Chaffin, Arnaud Stiegler, Teven Le Scao, Arun Raja, Manan Dey, M Saiful Bari, Canwen Xu, Urmish Thakker, Shanya Sharma Sharma, Eliza Szczechla, Taewoon Kim, Gunjan Chhablani, Nihal Nayak, Debajyoti Datta, Jonathan Chang, Mike Tian-Jian Jiang, Han Wang, Matteo Manica, Sheng Shen, et al. (16 additional authors not shown)

    Abstract: Large language models have recently been shown to attain reasonable zero-shot generalization on a diverse set of tasks (Brown et al., 2020). It has been hypothesized that this is a consequence of implicit multitask learning in language models' pretraining (Radford et al., 2019). Can zero-shot generalization instead be directly induced by explicit multitask learning? To test this question at scale,…

    Submitted 17 March, 2022; v1 submitted 15 October, 2021; originally announced October 2021.

    Comments: ICLR 2022 Spotlight (with extended discussion)

  16. arXiv:2110.07992  [pdf, ps, other]

    eess.SP cs.LG

    BayesAoA: A Bayesian method for Computation Efficient Angle of Arrival Estimation

    Authors: Akshay Sharma, Nancy Nayak, Sheetal Kalyani

    Abstract: Angle of arrival (AoA) estimation is of great interest in modern communication systems. Traditional maximum likelihood-based iterative algorithms are sensitive to initialization and cannot be used online. We propose a Bayesian method to find AoA that is insensitive towards initialization. The proposed method is less complex and needs fewer computing resources than traditional deep learning-bas…

    Submitted 15 October, 2021; originally announced October 2021.

  17. arXiv:2110.03781  [pdf]

    cs.LG cs.NI

    5G Traffic Prediction with Time Series Analysis

    Authors: Nikhil Nayak, Rujula Singh R

    Abstract: In today's day and age, a mobile phone has become a basic requirement needed for anyone to thrive. With the cellular traffic demand increasing so dramatically, it is now necessary to accurately predict the user traffic in cellular networks, so as to improve the performance in terms of resource allocation and utilisation. By leveraging the power of machine learning and identifying its usefulness in…

    Submitted 7 October, 2021; originally announced October 2021.

  18. arXiv:2106.09925  [pdf, other]

    cs.IT eess.SP

    Realizing Neural Decoder at the Edge with Ensembled BNN

    Authors: Devannagari Vikas, Nancy Nayak, Sheetal Kalyani

    Abstract: In this work, we propose extreme compression techniques like binarization, ternarization for Neural Decoders such as TurboAE. These methods reduce memory and computation by a factor of 64 with a performance better than the quantized (with 1-bit or 2-bits) Neural Decoders. However, because of the limited representation capability of the Binary and Ternary networks, the performance is not as good as…

    Submitted 18 June, 2021; originally announced June 2021.

  19. arXiv:2006.10713  [pdf, other]

    cs.LG cs.CL cs.CV stat.ML

    Zero-Shot Learning with Common Sense Knowledge Graphs

    Authors: Nihal V. Nayak, Stephen H. Bach

    Abstract: Zero-shot learning relies on semantic class representations such as hand-engineered attributes or learned embeddings to predict classes without any labeled examples. We propose to learn class representations by embedding nodes from common sense knowledge graphs in a vector space. Common sense knowledge graphs are an untapped source of explicit high-level knowledge that requires little human effort…

    Submitted 25 August, 2022; v1 submitted 18 June, 2020; originally announced June 2020.

    Comments: Paper published in TMLR

  20. arXiv:2006.07522  [pdf, other]

    cs.LG stat.ML

    Understanding Learning Dynamics of Binary Neural Networks via Information Bottleneck

    Authors: Vishnu Raj, Nancy Nayak, Sheetal Kalyani

    Abstract: Compact neural networks are essential for affordable and power efficient deep learning solutions. Binary Neural Networks (BNNs) take compactification to the extreme by constraining both weights and activations to two levels, $\{+1, -1\}$. However, training BNNs is not easy due to the discontinuity in activation functions, and the training dynamics of BNNs are not well understood. In this paper, we…

    Submitted 12 June, 2020; originally announced June 2020.

  21. arXiv:2003.09446  [pdf, ps, other]

    cs.IT cs.LG eess.SP

    Green DetNet: Computation and Memory efficient DetNet using Smart Compression and Training

    Authors: Nancy Nayak, Thulasi Tholeti, Muralikrishnan Srinivasan, Sheetal Kalyani

    Abstract: This paper introduces an incremental training framework for compressing popular Deep Neural Network (DNN) based unfolded multiple-input-multiple-output (MIMO) detection algorithms like DetNet. The idea of incremental training is explored to select the optimal depth while training. To reduce the computation requirements or the number of FLoating point OPerations (FLOPs) and enforce sparsity in weig…

    Submitted 16 April, 2021; v1 submitted 20 March, 2020; originally announced March 2020.

  22. arXiv:2003.08553  [pdf, other]

    cs.IR cs.CL

    QnAMaker: Data to Bot in 2 Minutes

    Authors: Parag Agrawal, Tulasi Menon, Aya Kamel, Michel Naim, Chaikesh Chouragade, Gurvinder Singh, Rohan Kulkarni, Anshuman Suri, Sahithi Katakam, Vineet Pratik, Prakul Bansal, Simerpreet Kaur, Neha Rajput, Anand Duggal, Achraf Chalabi, Prashant Choudhari, Reddy Satti, Niranjan Nayak

    Abstract: Having a bot for seamless conversations is a much-desired feature that products and services today seek for their websites and mobile apps. These bots help reduce traffic received by human support significantly by handling frequent and directly answerable known questions. Many such services have huge reference documents such as FAQ pages, which makes it hard for users to browse through this data.…

    Submitted 18 March, 2020; originally announced March 2020.

    Comments: Published at The Web Conference 2020 in the demo track

  23. arXiv:2001.09251  [pdf, other]

    eess.SP cs.IT cs.LG

    Deep Reinforcement Learning based Blind mmWave MIMO Beam Alignment

    Authors: Vishnu Raj, Nancy Nayak, Sheetal Kalyani

    Abstract: Directional beamforming is a crucial component for realizing robust wireless communication systems using millimeter wave (mmWave) technology. Beam alignment using brute-force search of the space introduces time overhead while location aided blind beam alignment adds additional hardware requirements to the system. In this paper, we introduce a method for blind beam alignment based on the RF fingerp…

    Submitted 20 February, 2021; v1 submitted 24 January, 2020; originally announced January 2020.

  24. arXiv:1907.07201  [pdf, other]

    cs.IT eess.SP

    Leveraging online learning for CSS in frugal IoT network

    Authors: Nancy Nayak, Vishnu Raj, Sheetal Kalyani

    Abstract: We present a novel method for centralized collaborative spectrum sensing for IoT network leveraging cognitive radio network. Based on an online learning framework, we propose an algorithm to efficiently combine the individual sensing results based on the past performance of each detector. Additionally, we show how to utilize the learned normalized weights as a proxy metric of detection accuracy an…

    Submitted 24 January, 2020; v1 submitted 16 July, 2019; originally announced July 2019.

  25. arXiv:1801.04871  [pdf, other]

    cs.AI cs.CL

    Building a Conversational Agent Overnight with Dialogue Self-Play

    Authors: Pararth Shah, Dilek Hakkani-Tür, Gokhan Tür, Abhinav Rastogi, Ankur Bapna, Neha Nayak, Larry Heck

    Abstract: We propose Machines Talking To Machines (M2M), a framework combining automation and crowdsourcing to rapidly bootstrap end-to-end dialogue agents for goal-oriented dialogues in arbitrary domains. M2M scales to new tasks with just a task schema and an API client from the dialogue system developer, but it is also customizable to cater to task-specific interactions. Compared to the Wizard-of-Oz appro…

    Submitted 15 January, 2018; originally announced January 2018.

    Comments: 11 pages, 4 figures