
Showing 1–50 of 211 results for author: Paul, S

Searching in archive cs.
  1. arXiv:2406.16612  [pdf, other]

    cs.RO cs.MA

    Towards Physically Talented Aerial Robots with Tactically Smart Swarm Behavior thereof: An Efficient Co-design Approach

    Authors: Prajit KrisshnaKumar, Steve Paul, Hemanth Manjunatha, Mary Corra, Ehsan Esfahani, Souma Chowdhury

    Abstract: The collective performance or capacity of collaborative autonomous systems such as a swarm of robots is jointly influenced by the morphology and the behavior of individual systems in that collective. In that context, this paper explores how morphology impacts the learned tactical behavior of unmanned aerial/ground robots performing reconnaissance and search & rescue. This is achieved by presenting…

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: Accepted for presentation in proceedings of ASME IDETC-CIE 2024

  2. arXiv:2406.10328  [pdf, other]

    cs.CV cs.CL cs.LG

    From Pixels to Prose: A Large Dataset of Dense Image Captions

    Authors: Vasu Singla, Kaiyu Yue, Sukriti Paul, Reza Shirkavand, Mayuka Jayawardhana, Alireza Ganjdanesh, Heng Huang, Abhinav Bhatele, Gowthami Somepalli, Tom Goldstein

    Abstract: Training large vision-language models requires extensive, high-quality image-text pairs. Existing web-scraped datasets, however, are noisy and lack detailed image descriptions. To bridge this gap, we introduce PixelProse, a comprehensive dataset of over 16M (million) synthetically generated captions, leveraging cutting-edge vision-language models for detailed and accurate descriptions. To ensure d…

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: pixelprose 16M dataset

  3. arXiv:2406.06424  [pdf, other]

    cs.CV

    Margin-aware Preference Optimization for Aligning Diffusion Models without Reference

    Authors: Jiwoo Hong, Sayak Paul, Noah Lee, Kashif Rasul, James Thorne, Jongheon Jeong

    Abstract: Modern alignment techniques based on human preferences, such as RLHF and DPO, typically employ divergence regularization relative to the reference model to ensure training stability. However, this often limits the flexibility of models during alignment, especially when there is a clear distributional discrepancy between the preference data and the reference model. In this paper, we focus on the al…

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: Preprint

  4. arXiv:2406.00375  [pdf, other]

    cs.RO

    Teledrive: An Embodied AI based Telepresence System

    Authors: Snehasis Banerjee, Sayan Paul, Ruddradev Roychoudhury, Abhijan Bhattacharya, Chayan Sarkar, Ashis Sau, Pradip Pramanick, Brojeshwar Bhowmick

    Abstract: This article presents Teledrive, a telepresence robotic system with embodied AI features that empowers an operator to navigate the telerobot in any unknown remote place with minimal human intervention. We conceive Teledrive in the context of democratizing remote care-giving for elderly citizens as well as for isolated patients, affected by contagious diseases. In particular, this paper focuses on…

    Submitted 1 June, 2024; originally announced June 2024.

    Comments: Accepted in Journal of Intelligent Robotic System

    Journal ref: Journal of Intelligent Robotic System 2024

  5. arXiv:2405.16517  [pdf, other]

    cs.CV

    Sp2360: Sparse-view 360 Scene Reconstruction using Cascaded 2D Diffusion Priors

    Authors: Soumava Paul, Christopher Wewer, Bernt Schiele, Jan Eric Lenssen

    Abstract: We aim to tackle sparse-view reconstruction of a 360 3D scene using priors from latent diffusion models (LDM). The sparse-view setting is ill-posed and underconstrained, especially for scenes where the camera rotates 360 degrees around a point, as no visual information is available beyond some frontal views focused on the central object(s) of interest. In this work, we show that pretrained 2D diff…

    Submitted 2 June, 2024; v1 submitted 26 May, 2024; originally announced May 2024.

    Comments: 18 pages, 11 figures, 4 tables

  6. arXiv:2405.01421  [pdf, ps, other]

    cs.IT

    Systematic Construction of Golay Complementary Sets of Arbitrary Lengths and Alphabet Sizes

    Authors: Abhishek Roy, Sudhan Majhi, Subhabrata Paul

    Abstract: One of the important applications of Golay complementary sets (GCSs) is the reduction of peak-to-mean envelope power ratio (PMEPR) in orthogonal frequency division multiplexing (OFDM) systems. OFDM has played a major role in modern wireless systems such as long-term-evolution (LTE), 5th generation (5G) wireless standards, etc. This paper searches for systematic constructions of GCSs of arbitrary l…

    Submitted 8 May, 2024; v1 submitted 2 May, 2024; originally announced May 2024.

    MSC Class: 94A55; 94A15; 94D10

  7. arXiv:2404.02447  [pdf]

    cs.CV cs.AI

    A Novel Approach to Breast Cancer Histopathological Image Classification Using Cross-Colour Space Feature Fusion and Quantum-Classical Stack Ensemble Method

    Authors: Sambit Mallick, Snigdha Paul, Anindya Sen

    Abstract: Breast cancer classification stands as a pivotal pillar in ensuring timely diagnosis and effective treatment. This study with histopathological images underscores the profound significance of harnessing the synergistic capabilities of colour space ensembling and quantum-classical stacking to elevate the precision of breast cancer classification. By delving into the distinct colour spaces of RGB, H…

    Submitted 3 April, 2024; originally announced April 2024.

  8. arXiv:2404.01197  [pdf, other]

    cs.CV

    Getting it Right: Improving Spatial Consistency in Text-to-Image Models

    Authors: Agneet Chatterjee, Gabriela Ben Melech Stan, Estelle Aflalo, Sayak Paul, Dhruba Ghosh, Tejas Gokhale, Ludwig Schmidt, Hannaneh Hajishirzi, Vasudev Lal, Chitta Baral, Yezhou Yang

    Abstract: One of the key shortcomings in current text-to-image (T2I) models is their inability to consistently generate images which faithfully follow the spatial relationships specified in the text prompt. In this paper, we offer a comprehensive investigation of this limitation, while also developing datasets and methods that achieve state-of-the-art performance. First, we find that current vision-language…

    Submitted 1 April, 2024; originally announced April 2024.

    Comments: project webpage : https://spright-t2i.github.io/

  9. arXiv:2403.14687  [pdf, other]

    cs.LG cs.AI

    On the Performance of Imputation Techniques for Missing Values on Healthcare Datasets

    Authors: Luke Oluwaseye Joel, Wesley Doorsamy, Babu Sena Paul

    Abstract: Missing values or data is one popular characteristic of real-world datasets, especially healthcare data. This could be frustrating when using machine learning algorithms on such datasets, simply because most machine learning models perform poorly in the presence of missing values. The aim of this study is to compare the performance of seven imputation techniques, namely Mean imputation, Median Imp…

    Submitted 13 March, 2024; originally announced March 2024.

  10. arXiv:2403.07131  [pdf, other]

    cs.AI cs.MA

    Bigraph Matching Weighted with Learnt Incentive Function for Multi-Robot Task Allocation

    Authors: Steve Paul, Nathan Maurer, Souma Chowdhury

    Abstract: Most real-world Multi-Robot Task Allocation (MRTA) problems require fast and efficient decision-making, which is often achieved using heuristics-aided methods such as genetic algorithms, auction-based methods, and bipartite graph matching methods. These methods often assume a form that lends better explainability compared to an end-to-end (learnt) neural network based policy for MRTA. However, der…

    Submitted 11 March, 2024; originally announced March 2024.

    Comments: This paper was accepted for presentation in proceedings of IEEE International Conference on Robotics and Automation 2024

  11. arXiv:2403.04962  [pdf, other]

    eess.IV cs.CV cs.LG

    C2P-GCN: Cell-to-Patch Graph Convolutional Network for Colorectal Cancer Grading

    Authors: Sudipta Paul, Bulent Yener, Amanda W. Lund

    Abstract: Graph-based learning approaches, due to their ability to encode tissue/organ structure information, are increasingly favored for grading colorectal cancer histology images. Recent graph-based techniques involve dividing whole slide images (WSIs) into smaller or medium-sized patches, and then building graphs on each patch for direct use in training. This method, however, fails to capture the tissue…

    Submitted 13 May, 2024; v1 submitted 7 March, 2024; originally announced March 2024.

    Comments: Accepted at IEEE EMBC 2024

  12. arXiv:2403.04537  [pdf]

    cs.RO

    VLSI Architectures of Forward Kinematic Processor for Robotics Applications

    Authors: Sourav Roy, Subhadeep Paul, Tapas Kumar Maiti

    Abstract: This paper aims to get a comprehensive review of current-day robotic computation technologies at VLSI architecture level. We studied several reports in the domain of robotic processor architecture. In this work, we focused on the forward kinematics architectures which consider CORDIC algorithms, VLSI circuits of WE DSP16 chip, parallel processing and pipelined architecture, and lookup table formula…

    Submitted 7 March, 2024; originally announced March 2024.

    Comments: 8 pages, 22 figures

  13. arXiv:2402.17412  [pdf, other]

    cs.CV

    DiffuseKronA: A Parameter Efficient Fine-tuning Method for Personalized Diffusion Models

    Authors: Shyam Marjit, Harshit Singh, Nityanand Mathur, Sayak Paul, Chia-Mu Yu, Pin-Yu Chen

    Abstract: In the realm of subject-driven text-to-image (T2I) generative models, recent developments like DreamBooth and BLIP-Diffusion have led to impressive results yet encounter limitations due to their intensive fine-tuning demands and substantial parameter requirements. While the low-rank adaptation (LoRA) module within DreamBooth offers a reduction in trainable parameters, it introduces a pronounced se…

    Submitted 28 February, 2024; v1 submitted 27 February, 2024; originally announced February 2024.

    Comments: Project Page: https://diffusekrona.github.io/

  14. arXiv:2402.09757  [pdf, ps, other]

    cs.IT math.CO

    Construction of CCC and ZCCS Through Additive Characters Over Galois Field

    Authors: Gobinda Ghosh, Sudhan Majhi, Subhabrata Paul

    Abstract: With the rapid progression in wireless communication technologies, especially in multicarrier code-division multiple access (MC-CDMA), there is a need for advanced code construction methods. Traditional approaches, mainly based on generalized Boolean functions, have limitations in code length versatility. This paper introduces a novel approach to constructing complete complementary codes (CCC) and Z-com…

    Submitted 18 March, 2024; v1 submitted 15 February, 2024; originally announced February 2024.

  15. arXiv:2402.04814  [pdf, other]

    cs.LG

    BOWLL: A Deceptively Simple Open World Lifelong Learner

    Authors: Roshni Kamath, Rupert Mitchell, Subarnaduti Paul, Kristian Kersting, Martin Mundt

    Abstract: The quest to improve scalar performance numbers on predetermined benchmarks seems to be deeply engraved in deep learning. However, the real world is seldom carefully curated and applications are seldom limited to excelling on test sets. A practical system is generally required to recognize novel concepts, refrain from actively including uninformative data, and retain previously acquired knowledge…

    Submitted 7 February, 2024; originally announced February 2024.

  16. arXiv:2402.03388  [pdf, other]

    cs.AI cs.IR cs.LG

    Delivery Optimized Discovery in Behavioral User Segmentation under Budget Constraint

    Authors: Harshita Chopra, Atanu R. Sinha, Sunav Choudhary, Ryan A. Rossi, Paavan Kumar Indela, Veda Pranav Parwatala, Srinjayee Paul, Aurghya Maiti

    Abstract: Users' behavioral footprints online enable firms to discover behavior-based user segments (or, segments) and deliver segment specific messages to users. Following the discovery of segments, delivery of messages to users through preferred media channels like Facebook and Google can be challenging, as only a portion of users in a behavior segment find match in a medium, and only a fraction of those…

    Submitted 15 March, 2024; v1 submitted 4 February, 2024; originally announced February 2024.

  17. arXiv:2402.00637  [pdf, other]

    cs.CV

    Fisheye Camera and Ultrasonic Sensor Fusion For Near-Field Obstacle Perception in Bird's-Eye-View

    Authors: Arindam Das, Sudarshan Paul, Niko Scholz, Akhilesh Kumar Malviya, Ganesh Sistu, Ujjwal Bhattacharya, Ciarán Eising

    Abstract: Accurate obstacle identification represents a fundamental challenge within the scope of near-field perception for autonomous driving. Conventionally, fisheye cameras are frequently employed for comprehensive surround-view perception, including rear-view obstacle localization. However, the performance of such cameras can significantly deteriorate in low-light conditions, during nighttime, or when s…

    Submitted 1 February, 2024; originally announced February 2024.

    Comments: 16 pages, 12 Figures, 6 tables

  18. arXiv:2401.05252  [pdf, other]

    cs.CV

    PIXART-δ: Fast and Controllable Image Generation with Latent Consistency Models

    Authors: Junsong Chen, Yue Wu, Simian Luo, Enze Xie, Sayak Paul, Ping Luo, Hang Zhao, Zhenguo Li

    Abstract: This technical report introduces PIXART-δ, a text-to-image synthesis framework that integrates the Latent Consistency Model (LCM) and ControlNet into the advanced PIXART-α model. PIXART-α is recognized for its ability to generate high-quality images of 1024px resolution through a remarkably efficient training process. The integration of LCM in PIXART-δ significantly accelerates the inference speed…

    Submitted 10 January, 2024; originally announced January 2024.

    Comments: Technical Report

  19. arXiv:2401.04851  [pdf, other]

    cs.MA cs.AI cs.LG

    Graph Learning-based Fleet Scheduling for Urban Air Mobility under Operational Constraints, Varying Demand & Uncertainties

    Authors: Steve Paul, Jhoel Witter, Souma Chowdhury

    Abstract: This paper develops a graph reinforcement learning approach to online planning of the schedule and destinations of electric aircraft that comprise an urban air mobility (UAM) fleet operating across multiple vertiports. This fleet scheduling problem is formulated to consider time-varying demand, constraints related to vertiport capacity, aircraft capacity and airspace safety guidelines, uncertainti…

    Submitted 9 January, 2024; originally announced January 2024.

    Comments: This paper is accepted to be presented at the ACM Symposium on Applied Computing 2024

  20. arXiv:2401.02677  [pdf, other]

    cs.CV cs.AI

    Progressive Knowledge Distillation Of Stable Diffusion XL Using Layer Level Loss

    Authors: Yatharth Gupta, Vishnu V. Jaddipal, Harish Prabhala, Sayak Paul, Patrick Von Platen

    Abstract: Stable Diffusion XL (SDXL) has become the best open source text-to-image model (T2I) for its versatility and top-notch image quality. Efficiently addressing the computational demands of SDXL models is crucial for wider reach and applicability. In this work, we introduce two scaled-down variants, Segmind Stable Diffusion (SSD-1B) and Segmind-Vega, with 1.3B and 0.74B parameter UNets, respectively,…

    Submitted 5 January, 2024; originally announced January 2024.

  21. arXiv:2312.11805  [pdf, other]

    cs.CL cs.AI cs.CV

    Gemini: A Family of Highly Capable Multimodal Models

    Authors: Gemini Team, Rohan Anil, Sebastian Borgeaud, Jean-Baptiste Alayrac, Jiahui Yu, Radu Soricut, Johan Schalkwyk, Andrew M. Dai, Anja Hauth, Katie Millican, David Silver, Melvin Johnson, Ioannis Antonoglou, Julian Schrittwieser, Amelia Glaese, Jilin Chen, Emily Pitler, Timothy Lillicrap, Angeliki Lazaridou, Orhan Firat, James Molloy, Michael Isard, Paul R. Barham, Tom Hennigan, Benjamin Lee, et al. (1325 additional authors not shown)

    Abstract: This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding. The Gemini family consists of Ultra, Pro, and Nano sizes, suitable for applications ranging from complex reasoning tasks to on-device memory-constrained use-cases. Evaluation on a broad range of benchmarks shows that our most-capable Gemini Ultr…

    Submitted 17 June, 2024; v1 submitted 18 December, 2023; originally announced December 2023.

  22. arXiv:2312.07368  [pdf]

    cs.AI cs.RO

    Sequential Planning in Large Partially Observable Environments guided by LLMs

    Authors: Swarna Kamal Paul

    Abstract: Sequential planning in large state space and action space quickly becomes intractable due to combinatorial explosion of the search space. Heuristic methods, like Monte Carlo tree search, though effective for large state spaces, struggle if the action space is large. Pure reinforcement learning methods, relying only on reward signals, need prohibitively large interactions with the environment to de…

    Submitted 12 December, 2023; originally announced December 2023.

    Comments: 8 pages, 2 figures, 1 table

  23. arXiv:2312.02420  [pdf, other]

    cs.CV

    Towards Granularity-adjusted Pixel-level Semantic Annotation

    Authors: Rohit Kundu, Sudipta Paul, Rohit Lal, Amit K. Roy-Chowdhury

    Abstract: Recent advancements in computer vision predominantly rely on learning-based systems, leveraging annotations as the driving force to develop specialized models. However, annotating pixel-level information, particularly in semantic segmentation, presents a challenging and labor-intensive task, prompting the need for autonomous processes. In this work, we propose GranSAM which distinguishes itself by…

    Submitted 4 December, 2023; originally announced December 2023.

  24. arXiv:2311.17475  [pdf, other]

    cs.CV eess.IV

    CLiSA: A Hierarchical Hybrid Transformer Model using Orthogonal Cross Attention for Satellite Image Cloud Segmentation

    Authors: Subhajit Paul, Ashutosh Gupta

    Abstract: Clouds in optical satellite images are a major concern since their presence hinders the ability to carry out accurate analysis as well as processing. Presence of clouds also affects the image tasking schedule and results in wastage of valuable storage space on ground as well as space-based systems. Due to these reasons, deriving accurate cloud masks from optical remote-sensing images is an important t…

    Submitted 1 December, 2023; v1 submitted 29 November, 2023; originally announced November 2023.

    Comments: 14 pages, 11 figures, 7 tables

  25. arXiv:2311.16490  [pdf, other]

    eess.IV cs.CV cs.LG

    SIRAN: Sinkhorn Distance Regularized Adversarial Network for DEM Super-resolution using Discriminative Spatial Self-attention

    Authors: Subhajit Paul, Ashutosh Gupta

    Abstract: Digital Elevation Model (DEM) is an essential aspect in the remote sensing domain to analyze and explore different applications related to surface elevation information. In this study, we intend to address the generation of high-resolution DEMs using high-resolution multi-spectral (MX) satellite imagery by incorporating adversarial learning. To promptly regulate this process, we utilize the notion…

    Submitted 27 November, 2023; originally announced November 2023.

    Comments: 15 pages, 14 figures

  26. arXiv:2311.03374  [pdf, other]

    cs.SE cs.AI cs.IR

    Generative AI for Software Metadata: Overview of the Information Retrieval in Software Engineering Track at FIRE 2023

    Authors: Srijoni Majumdar, Soumen Paul, Debjyoti Paul, Ayan Bandyopadhyay, Samiran Chattopadhyay, Partha Pratim Das, Paul D Clough, Prasenjit Majumder

    Abstract: The Information Retrieval in Software Engineering (IRSE) track aims to develop solutions for automated evaluation of code comments in a machine learning framework based on human and large language model generated labels. In this track, there is a binary classification task to classify comments as useful and not useful. The dataset consists of 9048 code comments and surrounding code snippet pairs e…

    Submitted 27 October, 2023; originally announced November 2023.

    Comments: Overview Paper of the Information Retrieval of Software Engineering Track at the Forum for Information Retrieval, 2023

  27. arXiv:2311.00724  [pdf]

    cs.LG cs.AI cs.DC

    Fraud Analytics Using Machine-learning & Engineering on Big Data (FAME) for Telecom

    Authors: Sudarson Roy Pratihar, Subhadip Paul, Pranab Kumar Dash, Amartya Kumar Das

    Abstract: Telecom industries lose 46.3 billion USD globally due to fraud. Data mining and machine learning techniques (apart from rules oriented approach) have been used in the past, but efficiency has been low as fraud patterns change very rapidly. This paper presents an industrialized solution approach with self adaptive data mining technique and application of big data technologies to detect fraud and discov…

    Submitted 31 October, 2023; originally announced November 2023.

    Comments: Presented in International Conference in Indian Institute of Management, Bangalore, India

  28. arXiv:2310.07465  [pdf, other]

    cs.DS math.CO

    Algorithmic study on liar's vertex-edge domination problem

    Authors: Debojyoti Bhattacharya, Subhabrata Paul

    Abstract: Let $G=(V,E)$ be a graph. For an edge $e=xy\in E$, the closed neighbourhood of $e$, denoted by $N_G[e]$ or $N_G[xy]$, is the set $N_G[x]\cup N_G[y]$. A vertex set $L\subseteq V$ is a liar's vertex-edge dominating set of a graph $G=(V,E)$ if for every $e_i\in E$, $|N_G[e_i]\cap L|\geq 2$ and for every pair of distinct edges $e_i$ and $e_j$, $|(N_G[e_i]\cup N_G[e_j])\cap L|\geq 3$. This paper introduc…

    Submitted 24 January, 2024; v1 submitted 11 October, 2023; originally announced October 2023.

  29. arXiv:2310.07452  [pdf, other]

    math.CO cs.DM

    On $k$-vertex-edge domination of graph

    Authors: Debojyoti Bhattacharya, Subhabrata Paul

    Abstract: Let $G=(V,E)$ be a simple undirected graph. The open neighbourhood of a vertex $v$ in $G$ is defined as $N_G(v)=\{u\in V~|~ uv\in E\}$; whereas the closed neighbourhood is defined as $N_G[v]= N_G(v)\cup \{v\}$. For an integer $k$, a subset $D\subseteq V$ is called a $k$-vertex-edge dominating set of $G$ if for every edge $uv\in E$, $|(N_G[u]\cup N_G[v]) \cap D|\geq k$. In $k$-vertex-edge dominatio…

    Submitted 11 October, 2023; originally announced October 2023.

  30. arXiv:2310.06279  [pdf, other]

    cs.NI

    MEC-Intelligent Agent Support for Low-Latency Data Plane in Private NextG Core

    Authors: Shalini Choudhury, Sushovan Das, Sanjoy Paul, Prasanthi Maddala, Ivan Seskar, Dipankar Raychaudhuri

    Abstract: Private 5G networks will soon be ubiquitous across the future-generation smart wireless access infrastructures hosting a wide range of performance-critical applications. A high-performing User Plane Function (UPF) in the data plane is critical to achieving such stringent performance goals, as it governs fast packet processing and supports several key control-plane operations. Based on a private 5G…

    Submitted 9 October, 2023; originally announced October 2023.

  31. arXiv:2309.14389  [pdf, other]

    cs.CV cs.AI

    Analyzing the Efficacy of an LLM-Only Approach for Image-based Document Question Answering

    Authors: Nidhi Hegde, Sujoy Paul, Gagan Madan, Gaurav Aggarwal

    Abstract: Recent document question answering models consist of two key components: the vision encoder, which captures layout and visual elements in images, and a Large Language Model (LLM) that helps contextualize questions to the image and supplements them with external world knowledge to generate accurate answers. However, the relative contributions of the vision encoder and the language model in these ta…

    Submitted 25 September, 2023; originally announced September 2023.

  32. arXiv:2308.15037  [pdf, other]

    cs.CV

    Is it an i or an l: Test-time Adaptation of Text Line Recognition Models

    Authors: Debapriya Tula, Sujoy Paul, Gagan Madan, Peter Garst, Reeve Ingle, Gaurav Aggarwal

    Abstract: Recognizing text lines from images is a challenging problem, especially for handwritten documents due to large variations in writing styles. While text line recognition models are generally trained on large corpora of real and synthetic data, such models can still make frequent mistakes if the handwriting is inscrutable or the image acquisition process adds corruptions, such as noise, blur, compre…

    Submitted 29 August, 2023; originally announced August 2023.

  33. arXiv:2308.09075  [pdf, other]

    cs.MA cs.AI cs.LG cs.RO

    Fast Decision Support for Air Traffic Management at Urban Air Mobility Vertiports using Graph Learning

    Authors: Prajit KrisshnaKumar, Jhoel Witter, Steve Paul, Hanvit Cho, Karthik Dantu, Souma Chowdhury

    Abstract: Urban Air Mobility (UAM) promises a new dimension to decongested, safe, and fast travel in urban and suburban hubs. These UAM aircraft are conceived to operate from small airports called vertiports each comprising multiple take-off/landing and battery-recharging spots. Since they might be situated in dense urban areas and need to handle many aircraft landings and take-offs each hour, managing this…

    Submitted 17 August, 2023; originally announced August 2023.

    Comments: Accepted for presentation in proceedings of IEEE/RSJ International Conference on Intelligent Robots and Systems 2023

  34. arXiv:2308.02825  [pdf, other]

    math.CO cs.DM

    Burning a binary tree and its generalization

    Authors: Sandip Das, Sk Samim Islam, Ritam M Mitra, Sanchita Paul

    Abstract: Graph burning is a graph process that models the spread of social contagion. Initially, all the vertices of a graph $G$ are unburnt. At each step, an unburnt vertex is put on fire and the fire from burnt vertices of the previous step spreads to their adjacent unburnt vertices. This process continues till all the vertices are burnt. The burning number $b(G)$ of the graph $G$ is the minimum number o…

    Submitted 14 November, 2023; v1 submitted 5 August, 2023; originally announced August 2023.

  35. arXiv:2306.12213  [pdf, ps, other]

    cs.CL

    Limits for Learning with Language Models

    Authors: Nicholas Asher, Swarnadeep Bhar, Akshay Chaturvedi, Julie Hunter, Soumya Paul

    Abstract: With the advent of large language models (LLMs), the trend in NLP has been to train LLMs on vast amounts of data to solve diverse language understanding and generation tasks. The list of LLM successes is long and varied. Nevertheless, several recent papers provide empirical evidence that LLMs fail to capture important aspects of linguistic meaning. Focusing on universal quantification, we provide…

    Submitted 21 June, 2023; originally announced June 2023.

  36. arXiv:2306.06823  [pdf, other]

    cs.CV cs.CL

    Weakly supervised information extraction from inscrutable handwritten document images

    Authors: Sujoy Paul, Gagan Madan, Akankshya Mishra, Narayan Hegde, Pradeep Kumar, Gaurav Aggarwal

    Abstract: State-of-the-art information extraction methods are limited by OCR errors. They work well for printed text in form-like documents, but unstructured, handwritten documents still remain a challenge. Adapting existing models to domain-specific training data is quite expensive, because of two factors, 1) limited availability of the domain-specific documents (such as handwritten prescriptions, lab note…

    Submitted 11 June, 2023; originally announced June 2023.

    Comments: Accepted at ICDAR 2023

  37. arXiv:2306.05243  [pdf, ps, other]

    cs.DS

    Analysis of Knuth's Sampling Algorithm D and D'

    Authors: Mridul Nandi, Soumit Paul

    Abstract: In this research paper, we address the Distinct Elements estimation problem in the context of streaming algorithms. The problem involves estimating the number of distinct elements in a given data stream $\mathcal{A} = (a_1, a_2,\ldots, a_m)$, where $a_i \in \{1, 2, \ldots, n\}$. Over the past four decades, the Distinct Elements problem has received considerable attention, theoretically and empiric…

    Submitted 11 June, 2023; v1 submitted 8 June, 2023; originally announced June 2023.

    Comments: We have provided an unbiased analysis (using exactly the same idea as the previous version) for the continuous score distribution instead of the discrete version

    MSC Class: F.2.0;

  38. arXiv:2306.04047  [pdf, other]

    cs.CV cs.AI cs.CL cs.LG

    CAVEN: An Embodied Conversational Agent for Efficient Audio-Visual Navigation in Noisy Environments

    Authors: Xiulong Liu, Sudipta Paul, Moitreya Chatterjee, Anoop Cherian

    Abstract: Audio-visual navigation of an agent towards locating an audio goal is a challenging task especially when the audio is sporadic or the environment is noisy. In this paper, we present CAVEN, a Conversation-based Audio-Visual Embodied Navigation framework in which the agent may interact with a human/oracle for solving the task of navigating to an audio goal. Specifically, CAVEN is modeled as a budget…

    Submitted 26 December, 2023; v1 submitted 6 June, 2023; originally announced June 2023.

    Comments: Accepted at AAAI 2024

  39. arXiv:2306.03542  [pdf, other]

    cs.LG

    Masked Autoencoders are Efficient Continual Federated Learners

    Authors: Subarnaduti Paul, Lars-Joel Frey, Roshni Kamath, Kristian Kersting, Martin Mundt

    Abstract: Machine learning is typically framed from a perspective of i.i.d., and more importantly, isolated data. In parts, federated learning lifts this assumption, as it sets out to solve the real-world challenge of collaboratively learning a shared model from data distributed across clients. However, motivated primarily by privacy and computational constraints, the fact that data may change, distribution…

    Submitted 6 June, 2023; originally announced June 2023.

  40. arXiv:2305.01442  [pdf, ps, other]

    cs.IT

    A Direct Construction of Optimal Symmetrical Z-Complementary Code Sets of Prime Power Lengths

    Authors: Praveen Kumar, Sudhan Majhi, Subhabrata Paul

    Abstract: This paper presents a direct construction of an optimal symmetrical Z-complementary code set (SZCCS) of prime power lengths using a multi-variable function (MVF). SZCCS is a natural extension of the Z-complementary code set (ZCCS), which has only front-end zero correlation zone (ZCZ) width. SZCCS has both front-end and tail-end ZCZ width. SZCCSs are used in developing optimal training sequences fo…

    Submitted 2 May, 2023; originally announced May 2023.

  41. arXiv:2304.14604  [pdf, other]

    stat.ME cs.CV cs.LG math.NA

    Deep Neural-network Prior for Orbit Recovery from Method of Moments

    Authors: Yuehaw Khoo, Sounak Paul, Nir Sharon

    Abstract: Orbit recovery problems are a class of problems that often arise in practice and in various forms. In these problems, we aim to estimate an unknown function after being distorted by a group action and observed via a known operator. Typically, the observations are contaminated with a non-trivial level of noise. Two particular orbit recovery problems of interest in this paper are multireference alignme…

    Submitted 30 January, 2024; v1 submitted 27 April, 2023; originally announced April 2023.

    Journal ref: J. Comput. Appl. Math. 115782 (2024)

  42. arXiv:2303.08954  [pdf, other]

    cs.CL

    PRESTO: A Multilingual Dataset for Parsing Realistic Task-Oriented Dialogs

    Authors: Rahul Goel, Waleed Ammar, Aditya Gupta, Siddharth Vashishtha, Motoki Sano, Faiz Surani, Max Chang, HyunJeong Choe, David Greene, Kyle He, Rattima Nitisaroj, Anna Trukhina, Shachi Paul, Pararth Shah, Rushin Shah, Zhou Yu

    Abstract: Research interest in task-oriented dialogs has increased as systems such as Google Assistant, Alexa and Siri have become ubiquitous in everyday life. However, the impact of academic research in this area has been limited by the lack of datasets that realistically capture the wide array of user pain points. To enable research on some of the more challenging aspects of parsing realistic conversation…

    Submitted 16 March, 2023; v1 submitted 15 March, 2023; originally announced March 2023.

    Comments: PRESTO v1 Release

  43. arXiv:2303.08933  [pdf, other]

    cs.MA cs.RO

    Efficient Planning of Multi-Robot Collective Transport using Graph Reinforcement Learning with Higher Order Topological Abstraction

    Authors: Steve Paul, Wenyuan Li, Brian Smyth, Yuzhou Chen, Yulia Gel, Souma Chowdhury

    Abstract: Efficient multi-robot task allocation (MRTA) is fundamental to various time-sensitive applications such as disaster response, warehouse operations, and construction. This paper tackles a particular class of these problems that we call MRTA-collective transport or MRTA-CT -- here tasks present varying workloads and deadlines, and robots are subject to flight range, communication range, and payload…

    Submitted 17 August, 2023; v1 submitted 15 March, 2023; originally announced March 2023.

    Comments: This paper has been accepted to be presented at the IEEE International Conference on Robotics and Automation, 2023

  44. arXiv:2303.01243  [pdf, other]

    cs.LG cs.CR cs.PF

    Poster: Sponge ML Model Attacks of Mobile Apps

    Authors: Souvik Paul, Nicolas Kourtellis

    Abstract: Machine Learning (ML)-powered apps are used in pervasive devices such as phones, tablets, smartwatches and IoT devices. Recent advances in collaborative, distributed ML such as Federated Learning (FL) attempt to solve privacy concerns of users and data owners, and are thus used by tech industry leaders such as Google, Facebook and Apple. However, FL systems and models are still vulnerable to adversari…

    Submitted 1 March, 2023; originally announced March 2023.

    Comments: 2 pages, 6 figures. Proceedings of the 24th International Workshop on Mobile Computing Systems and Applications (HotMobile). Feb. 2023

    MSC Class: 68M25; 68P27; 68Txx ACM Class: I.2.11

  45. arXiv:2302.05849  [pdf, other]

    cs.MA

    Graph Learning Based Decision Support for Multi-Aircraft Take-Off and Landing at Urban Air Mobility Vertiports

    Authors: Prajit KrisshnaKumar, Jhoel Witter, Steve Paul, Karthik Dantu, Souma Chowdhury

    Abstract: The majority of aircraft under the Urban Air Mobility (UAM) concept are expected to be of the electric vertical takeoff and landing (eVTOL) vehicle type, which will operate out of vertiports. While this is akin to the relationship between general aviation aircraft and airports, the conceived location of vertiports within dense urban environments presents unique challenges in managing the air traffic s…

    Submitted 11 February, 2023; originally announced February 2023.

    Comments: Presented at AIAA Scitech Forum 2022

  46. arXiv:2301.03294  [pdf, ps, other]

    cs.IT

    Construction of Optimal Binary Z-Complementary Code Sets with New Lengths

    Authors: Gobinda Ghosh, Sudhan Majhi, Shubabrata Paul

    Abstract: Z-complementary code sets (ZCCSs) are used in multicarrier code-division multiple access (MC-CDMA) systems, for interference-free communication over multiuser and quasi-asynchronous environments. In this letter, we propose three new constructions of optimal binary $\left(R2^{k+1},2^{k+1}, Rγ,γ\right)$-ZCCS, $\left(R2^{k+1},2^{k+1}, R2^{m_{2}},2^{m_{2}}\right)$-ZCCS and…

    Submitted 22 February, 2023; v1 submitted 9 January, 2023; originally announced January 2023.

  47. arXiv:2301.02400  [pdf, ps, other]

    cs.IT

    A Direct Construction of Optimal 2D-ZCACS with Flexible Array Size and Large Set Size

    Authors: Gobinda Ghosh, Sudhan Majhi, Shubhabrata Paul

    Abstract: In this paper, we propose a direct construction of optimal two-dimensional Z-complementary array code sets (2D-ZCACS) using multivariable functions (MVFs). In contrast to earlier works, the proposed construction allows for a flexible array size and a large set size. Additionally, the proposed design can be transformed into a one-dimensional Z-complementary code set (1D-ZCCS). Many of the 1D-ZCCS d…

    Submitted 6 January, 2023; originally announced January 2023.

  48. arXiv:2212.13599  [pdf]

    cs.CV

    Brain Cancer Segmentation Using YOLOv5 Deep Neural Network

    Authors: Sudipto Paul, Dr. Md Taimur Ahad, Md. Mahedi Hasan

    Abstract: An expansion of aberrant brain cells is referred to as a brain tumor. The brain's architecture is extremely intricate, with several regions controlling various nervous system processes. Any portion of the brain or skull can develop a brain tumor, including the brain's protective coating, the base of the skull, the brainstem, the sinuses, the nasal cavity, and many other places. Over the past ten y…

    Submitted 27 December, 2022; originally announced December 2022.

  49. arXiv:2212.04061  [pdf, other]

    cs.CV cs.MA

    Elixir: A system to enhance data quality for multiple analytics on a video stream

    Authors: Sibendu Paul, Kunal Rao, Giuseppe Coviello, Murugan Sankaradas, Oliver Po, Y. Charlie Hu, Srimat T. Chakradhar

    Abstract: IoT sensors, especially video cameras, are ubiquitously deployed around the world to perform a variety of computer vision tasks in several verticals including retail, healthcare, safety and security, transportation, manufacturing, etc. To amortize their high deployment effort and cost, it is desirable to perform multiple video analytics tasks, which we refer to as Analytical Units (AUs), off the v…

    Submitted 7 December, 2022; originally announced December 2022.

  50. arXiv:2212.02010  [pdf]

    cs.MA cs.AI cs.GT

    Multi Agent Path Finding using Evolutionary Game Theory

    Authors: Sheryl Paul, Jyotirmoy V. Deshmukh

    Abstract: In this paper, we consider the problem of path finding for a set of homogeneous and autonomous agents navigating a previously unknown stochastic environment. In our problem setting, each agent attempts to maximize a given utility function while respecting safety properties. Our solution is based on ideas from evolutionary game theory, namely replicating policies that perform well and diminishing o…

    Submitted 4 December, 2022; originally announced December 2022.