-
RAVEN: Multitask Retrieval Augmented Vision-Language Learning
Authors:
Varun Nagaraj Rao,
Siddharth Choudhary,
Aditya Deshpande,
Ravi Kumar Satzoda,
Srikar Appalaraju
Abstract:
The scaling of large language models to encode all the world's knowledge in model parameters is unsustainable and has exacerbated resource barriers. Retrieval-Augmented Generation (RAG) presents a potential solution, yet its application to vision-language models (VLMs) is underexplored. Existing methods focus on models designed for single tasks. Furthermore, they are limited by the need for resource-intensive pretraining, additional parameter requirements, unaddressed modality prioritization, and a lack of clear benefit over non-retrieval baselines. This paper introduces RAVEN, a multitask retrieval-augmented VLM framework that enhances base VLMs through efficient, task-specific fine-tuning. By integrating retrieval-augmented samples without the need for additional retrieval-specific parameters, we show that the model acquires retrieval properties that are effective across multiple tasks. Our results and extensive ablations across retrieved modalities for the image captioning and VQA tasks indicate significant performance improvements compared to non-retrieved baselines: +1 CIDEr on MSCOCO, +4 CIDEr on NoCaps, and nearly +3% accuracy on specific VQA question types. This underscores the efficacy of applying RAG approaches to VLMs, marking a stride toward more efficient and accessible multimodal learning.
Submitted 27 June, 2024;
originally announced June 2024.
-
Rideshare Transparency: Translating Gig Worker Insights on AI Platform Design to Policy
Authors:
Varun Nagaraj Rao,
Samantha Dalal,
Eesha Agarwal,
Dana Calacci,
Andrés Monroy-Hernández
Abstract:
Rideshare platforms exert significant control over workers through algorithmic systems that can result in financial, emotional, and physical harm. What steps can platforms, designers, and practitioners take to mitigate these negative impacts and meet worker needs? In this paper, through a novel mixed-methods study combining an LLM-based analysis of over 1 million comments posted to online platform worker communities with semi-structured interviews of workers, we thickly characterize transparency-related harms, mitigation strategies, and worker needs while validating and contextualizing our findings within the broader worker community. Our findings expose a transparency gap between existing platform designs and the information drivers need, particularly concerning promotions, fares, routes, and task allocation. Our analysis suggests that rideshare workers need key pieces of information, which we refer to as indicators, to make informed work decisions. These indicators include details about rides, driver statistics, algorithmic implementation details, and platform policy information. We argue that instead of relying on platforms to include such information in their designs, new regulations that require platforms to publish public transparency reports may be a more effective solution to improve worker well-being. We offer recommendations for implementing such a policy.
Submitted 19 June, 2024; v1 submitted 15 June, 2024;
originally announced June 2024.
-
PLT-D3: A High-fidelity Dynamic Driving Simulation Dataset for Stereo Depth and Scene Flow
Authors:
Joshua Tokarsky,
Ibrahim Abdulhafiz,
Satya Ayyalasomayajula,
Mostafa Mohsen,
Navya G. Rao,
Adam Forbes
Abstract:
Autonomous driving has experienced remarkable progress, bolstered by innovations in computational hardware and sophisticated deep learning methodologies. The foundation of these advancements rests on the availability and quality of datasets, which are crucial for the development and refinement of dependable and versatile autonomous driving algorithms. While numerous datasets have been developed to support the evolution of autonomous driving perception technologies, few offer the diversity required to thoroughly test and enhance system robustness under varied weather conditions. Many public datasets lack the comprehensive coverage of challenging weather scenarios and detailed, high-resolution data, which are critical for training and validating advanced autonomous-driving perception models. In this paper, we introduce PLT-D3, a Dynamic-weather Driving Dataset designed specifically to enhance autonomous driving systems' adaptability to diverse weather conditions. PLT-D3 provides high-fidelity stereo depth and scene flow ground truth data generated using Unreal Engine 5. In particular, this dataset includes synchronized high-resolution stereo image sequences that replicate a wide array of dynamic weather scenarios including rain, snow, fog, and diverse lighting conditions, offering an unprecedented level of realism in simulation-based testing. The primary aim of PLT-D3 is to address the scarcity of comprehensive training and testing resources that can simulate real-world weather variations. Benchmarks have been established for several critical autonomous driving tasks using PLT-D3, such as depth estimation, optical flow, and scene flow, to measure and enhance the performance of state-of-the-art models.
Submitted 11 June, 2024;
originally announced June 2024.
-
Vision Transformers for End-to-End Vision-Based Quadrotor Obstacle Avoidance
Authors:
Anish Bhattacharya,
Nishanth Rao,
Dhruv Parikh,
Pratik Kunapuli,
Nikolai Matni,
Vijay Kumar
Abstract:
We demonstrate the capabilities of an attention-based end-to-end approach for high-speed quadrotor obstacle avoidance in dense, cluttered environments, with comparison to various state-of-the-art architectures. Quadrotor unmanned aerial vehicles (UAVs) have tremendous maneuverability when flown fast; however, as flight speed increases, traditional vision-based navigation via independent mapping, planning, and control modules breaks down due to increased sensor noise, compounding errors, and increased processing latency. Thus, learning-based, end-to-end planning and control networks have been shown to be effective for online control of these fast robots through cluttered environments. We train and compare convolutional, U-Net, and recurrent architectures against vision transformer models for depth-based end-to-end control, in a photorealistic, high-physics-fidelity simulator as well as in hardware, and observe that the attention-based models are more effective as quadrotor speeds increase, while recurrent models with many layers provide smoother commands at lower speeds. To the best of our knowledge, this is the first work to utilize vision transformers for end-to-end vision-based quadrotor control.
Submitted 16 May, 2024;
originally announced May 2024.
-
MS MARCO Web Search: a Large-scale Information-rich Web Dataset with Millions of Real Click Labels
Authors:
Qi Chen,
Xiubo Geng,
Corby Rosset,
Carolyn Buractaon,
Jingwen Lu,
Tao Shen,
Kun Zhou,
Chenyan Xiong,
Yeyun Gong,
Paul Bennett,
Nick Craswell,
Xing Xie,
Fan Yang,
Bryan Tower,
Nikhil Rao,
Anlei Dong,
Wenqi Jiang,
Zheng Liu,
Mingqin Li,
Chuanjie Liu,
Zengzhong Li,
Rangan Majumder,
Jennifer Neville,
Andy Oakley,
Knut Magne Risvik
, et al. (6 additional authors not shown)
Abstract:
Recent breakthroughs in large models have highlighted the critical significance of data scale, labels, and modalities. In this paper, we introduce MS MARCO Web Search, the first large-scale information-rich web dataset, featuring millions of real clicked query-document labels. This dataset closely mimics real-world web document and query distribution, provides rich information for various kinds of downstream tasks, and encourages research in various areas, such as generic end-to-end neural indexer models, generic embedding models, and next-generation information access systems with large language models. MS MARCO Web Search offers a retrieval benchmark with three web retrieval challenge tasks that demand innovations in both machine learning and information retrieval system research domains. As the first dataset that meets large, real, and rich data requirements, MS MARCO Web Search paves the way for future advancements in AI and system research. The MS MARCO Web Search dataset is available at: https://github.com/microsoft/MS-MARCO-Web-Search.
Submitted 13 May, 2024;
originally announced May 2024.
-
QuaLLM: An LLM-based Framework to Extract Quantitative Insights from Online Forums
Authors:
Varun Nagaraj Rao,
Eesha Agarwal,
Samantha Dalal,
Dan Calacci,
Andrés Monroy-Hernández
Abstract:
Online discussion forums provide crucial data to understand the concerns of a wide range of real-world communities. However, the typical qualitative and quantitative methods used to analyze those data, such as thematic analysis and topic modeling, are infeasible to scale or require significant human effort to translate outputs to human-readable forms. This study introduces QuaLLM, a novel LLM-based framework to analyze and extract quantitative insights from text data on online forums. The framework consists of a novel prompting methodology and evaluation strategy. We applied this framework to analyze over one million comments from two of Reddit's rideshare worker communities, marking the largest study of its type. We uncover significant worker concerns regarding AI and algorithmic platform decisions, responding to regulatory calls about worker insights. In short, our work sets a new precedent for AI-assisted quantitative data analysis to surface concerns from online forums.
Submitted 8 May, 2024;
originally announced May 2024.
-
HLSFactory: A Framework Empowering High-Level Synthesis Datasets for Machine Learning and Beyond
Authors:
Stefan Abi-Karam,
Rishov Sarkar,
Allison Seigler,
Sean Lowe,
Zhigang Wei,
Hanqiu Chen,
Nanditha Rao,
Lizy John,
Aman Arora,
Cong Hao
Abstract:
Machine learning (ML) techniques have been applied to high-level synthesis (HLS) flows for quality-of-result (QoR) prediction and design space exploration (DSE). Nevertheless, the scarcity of accessible high-quality HLS datasets and the complexity of building such datasets present challenges. Existing datasets have limitations in terms of benchmark coverage, design space enumeration, vendor extensibility, or lack of reproducible and extensible software for dataset construction. Many works also lack user-friendly ways to add more designs, limiting wider adoption of such datasets.
In response to these challenges, we introduce HLSFactory, a comprehensive framework designed to facilitate the curation and generation of high-quality HLS design datasets. HLSFactory has three main stages: 1) a design space expansion stage to elaborate single HLS designs into large design spaces using various optimization directives across multiple vendor tools, 2) a design synthesis stage to execute HLS and FPGA tool flows concurrently across designs, and 3) a data aggregation stage for extracting standardized data into packaged datasets for ML usage. This tripartite architecture ensures broad design space coverage via design space expansion and supports multiple vendor tools. Users can contribute to each stage with their own HLS designs and synthesis results and extend the framework itself with custom frontends and tool flows. We also include an initial set of built-in designs from common HLS benchmarks and curated open-source HLS designs.
We showcase the versatility and multi-functionality of our framework through six case studies: I) Design space sampling; II) Fine-grained parallelism backend speedup; III) Targeting Intel's HLS flow; IV) Adding new auxiliary designs; V) Integrating published HLS data; VI) HLS tool version regression benchmarking.
Code at https://github.com/sharc-lab/HLSFactory.
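To make the design space expansion stage concrete, the following is a minimal sketch of how a single design could be elaborated into many design points by enumerating combinations of optimization directives. It is not HLSFactory's API: the directive names, their values, and the function `expand_design_space` are hypothetical and chosen only for illustration.

```python
from itertools import product

# Hypothetical directive options for one HLS design (illustrative values only).
directive_options = {
    "unroll_factor": [1, 2, 4],
    "pipeline": [False, True],
    "array_partition": ["none", "cyclic"],
}


def expand_design_space(options):
    """Return one dict per combination of directive values (Cartesian product)."""
    names = list(options)
    value_lists = [options[name] for name in names]
    return [dict(zip(names, values)) for values in product(*value_lists)]


design_points = expand_design_space(directive_options)
print(len(design_points))  # 3 * 2 * 2 combinations
```

Full enumeration grows multiplicatively with each added directive, which is presumably why the case studies also cover design space sampling.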
Submitted 17 May, 2024; v1 submitted 1 May, 2024;
originally announced May 2024.
-
Researchy Questions: A Dataset of Multi-Perspective, Decompositional Questions for LLM Web Agents
Authors:
Corby Rosset,
Ho-Lam Chung,
Guanghui Qin,
Ethan C. Chau,
Zhuo Feng,
Ahmed Awadallah,
Jennifer Neville,
Nikhil Rao
Abstract:
Existing question answering (QA) datasets are no longer challenging for the most powerful Large Language Models (LLMs). Traditional QA benchmarks like TriviaQA, NaturalQuestions, ELI5 and HotpotQA mainly study "known unknowns" with clear indications of both what information is missing, and how to find it to answer the question. Hence, good performance on these benchmarks provides a false sense of security. A yet unmet need of the NLP community is a bank of non-factoid, multi-perspective questions involving a great deal of unclear information needs, i.e. "unknown unknowns". We claim we can find such questions in search engine logs, which is surprising because most question-intent queries are indeed factoid. We present Researchy Questions, a dataset of search engine queries tediously filtered to be non-factoid, "decompositional" and multi-perspective. We show that users spend a lot of "effort" on these questions in terms of signals like clicks and session length, and that they are also challenging for GPT-4. We also show that "slow thinking" answering techniques, like decomposition into sub-questions, show benefit over answering directly. We release ~100k Researchy Questions, along with the Clueweb22 URLs that were clicked.
Submitted 27 February, 2024;
originally announced February 2024.
-
Culturally-Attuned Moral Machines: Implicit Learning of Human Value Systems by AI through Inverse Reinforcement Learning
Authors:
Nigini Oliveira,
Jasmine Li,
Koosha Khalvati,
Rodolfo Cortes Barragan,
Katharina Reinecke,
Andrew N. Meltzoff,
Rajesh P. N. Rao
Abstract:
Constructing a universal moral code for artificial intelligence (AI) is difficult or even impossible, given that different human cultures have different definitions of morality and different societal norms. We therefore argue that the value system of an AI should be culturally attuned: just as a child raised in a particular culture learns the specific values and norms of that culture, we propose that an AI agent operating in a particular human community should acquire that community's moral, ethical, and cultural codes. How AI systems might acquire such codes from human observation and interaction has remained an open question. Here, we propose using inverse reinforcement learning (IRL) as a method for AI agents to acquire a culturally-attuned value system implicitly. We test our approach using an experimental paradigm in which AI agents use IRL to learn different reward functions, which govern the agents' moral values, by observing the behavior of different cultural groups in an online virtual world requiring real-time decision making. We show that an AI agent learning from the average behavior of a particular cultural group can acquire altruistic characteristics reflective of that group's behavior, and this learned value system can generalize to new scenarios requiring altruistic judgments. Our results provide, to our knowledge, the first demonstration that AI agents could potentially be endowed with the ability to continually learn their values and norms from observing and interacting with humans, thereby becoming attuned to the culture they are operating in.
Submitted 29 December, 2023;
originally announced December 2023.
-
Knowledge Graph Reasoning Based on Attention GCN
Authors:
Meera Gupta,
Ravi Khanna,
Divya Choudhary,
Nandini Rao
Abstract:
We propose a novel technique to enhance Knowledge Graph Reasoning by combining Graph Convolution Neural Network (GCN) with the Attention Mechanism. This approach utilizes the Attention Mechanism to examine the relationships between entities and their neighboring nodes, which helps to develop detailed feature vectors for each entity. The GCN uses shared parameters to effectively represent the characteristics of adjacent entities. We first learn the similarity of entities for node representation learning. By integrating the attributes of the entities and their interactions, this method generates extensive implicit feature vectors for each entity, improving performance in tasks including entity classification and link prediction, outperforming traditional neural network models. To conclude, this work provides crucial methodological support for a range of applications, such as search engines, question-answering systems, recommendation systems, and data integration tasks.
Submitted 27 January, 2024; v1 submitted 2 December, 2023;
originally announced December 2023.
-
SEFGAN: Harvesting the Power of Normalizing Flows and GANs for Efficient High-Quality Speech Enhancement
Authors:
Martin Strauss,
Nicola Pia,
Nagashree K. S. Rao,
Bernd Edler
Abstract:
This paper proposes SEFGAN, a Deep Neural Network (DNN) combining maximum likelihood training and Generative Adversarial Networks (GANs) for efficient speech enhancement (SE). For this, a DNN is trained to synthesize the enhanced speech conditioned on noisy speech using a Normalizing Flow (NF) as generator in a GAN framework. While the combination of likelihood models and GANs is not trivial, SEFGAN demonstrates that a hybrid adversarial and maximum likelihood training approach enables the model to maintain high quality audio generation and log-likelihood estimation. Our experiments indicate that this approach strongly outperforms the baseline NF-based model without introducing additional complexity to the enhancement network. A comparison using computational metrics and a listening experiment reveals that SEFGAN is competitive with other state-of-the-art models.
Submitted 4 December, 2023;
originally announced December 2023.
-
Hyperbolic Graph Neural Networks at Scale: A Meta Learning Approach
Authors:
Nurendra Choudhary,
Nikhil Rao,
Chandan K. Reddy
Abstract:
Progress in hyperbolic neural network (HNN) research is hindered by the absence of inductive bias mechanisms, which are essential for generalizing to new tasks and facilitating scalable learning over large datasets. In this paper, we aim to alleviate these issues by learning generalizable inductive biases from the nodes' local subgraphs and transferring them for faster learning over new subgraphs with a disjoint set of nodes, edges, and labels in a few-shot setting. We introduce a novel method, Hyperbolic GRAph Meta Learner (H-GRAM), that, for the tasks of node classification and link prediction, learns transferable information from a set of support local subgraphs in the form of hyperbolic meta gradients and label hyperbolic protonets to enable faster learning over a query set of new tasks dealing with disjoint subgraphs. Furthermore, we show that an extension of our meta-learning framework also mitigates the scalability challenges that existing HNN approaches face. Our comparative analysis shows that H-GRAM effectively learns and transfers information in multiple challenging few-shot settings compared to other state-of-the-art baselines. Additionally, we demonstrate that, unlike standard HNNs, our approach is able to scale over large graph datasets and improve performance over its Euclidean counterparts.
Submitted 29 October, 2023;
originally announced October 2023.
-
Protocols for counterfactual and twin-field quantum digital signature
Authors:
Vinod N. Rao,
Shrikant Utagi,
Anirban Pathak,
R. Srikanth
Abstract:
Quantum digital signature (QDS) is the quantum version of its classical counterpart, and can offer security against attacks of repudiation, signature forging and external eavesdropping, on the basis of quantum mechanical no-go principles. Here we propose a QDS scheme based on quantum counterfactuality, which leverages the concept of interaction-free measurement. Employing the idea behind twin-field cryptography, we show how this two-way protocol can be turned into an equivalent non-counterfactual, one-way protocol, that is both more practical and also theoretically helpful in assessing the experimental feasibility of the first protocol. The proposed QDS protocol can be experimentally implemented with current quantum technology.
Submitted 19 June, 2024; v1 submitted 17 October, 2023;
originally announced October 2023.
-
Normality of I-V Measurements Using ML
Authors:
Anees Al-Najjar,
Nageswara S. V. Rao,
Craig A. Bridges,
Sheng Dai
Abstract:
Electrochemistry ecosystems are promising for accelerating the design and discovery of electrochemical systems for energy storage and conversion, by automating significant parts of workflows that combine synthesis and characterization experiments with computations. They require the integration of flow controllers, solvent containers, pumps, fraction collectors, and potentiostats, all connected to an electrochemical cell. These are specialized instruments with custom software that is not originally designed for network integration. We developed network and software solutions for electrochemical workflows that adapt system and instrument settings in real-time for multiple rounds of experiments. We demonstrate this automated workflow by remotely operating the instruments and collecting their measurements to generate a voltammogram (I-V profile) of an electrolyte solution in an electrochemical cell. These measurements are made available at the remote computing system and used for subsequent analysis. In this paper, we focus on a novel, analytically validated machine learning (ML) method for an electrochemistry ecosystem to ensure that I-V measurements are consistent with the normal experimental conditions, and to detect abnormal conditions, such as disconnected electrodes or low cell content volume.
Submitted 28 September, 2023;
originally announced October 2023.
-
Dodo: Dynamic Contextual Compression for Decoder-only LMs
Authors:
Guanghui Qin,
Corby Rosset,
Ethan C. Chau,
Nikhil Rao,
Benjamin Van Durme
Abstract:
Transformer-based language models (LMs) are inefficient in long contexts. We propose Dodo, a solution for context compression. Instead of one vector per token in a standard transformer model, Dodo represents text with a dynamic number of hidden states at each layer, reducing the cost of self-attention to a fraction of typical time and space. Moreover, off-the-shelf models such as LLaMA can be adapted to Dodo by efficient parameter tuning methods such as LoRA. In use, Dodo can act as either an autoregressive LM or a context compressor for downstream tasks. We demonstrate through experiments in language modeling, question answering, and summarization that Dodo retains capabilities in these tasks, while drastically reducing the overhead during decoding. For example, in the autoencoding task, Dodo shrinks context at a 20x compression ratio with a BLEU score of 98% for reconstruction, achieving nearly lossless encoding.
Submitted 13 June, 2024; v1 submitted 3 October, 2023;
originally announced October 2023.
-
Automatic Pair Construction for Contrastive Post-training
Authors:
Canwen Xu,
Corby Rosset,
Ethan C. Chau,
Luciano Del Corro,
Shweti Mahajan,
Julian McAuley,
Jennifer Neville,
Ahmed Hassan Awadallah,
Nikhil Rao
Abstract:
Alignment serves as an important step to steer large language models (LLMs) towards human preferences. In this paper, we propose an automatic way to construct contrastive data for LLMs, using preference pairs from multiple models of varying strengths (e.g., InstructGPT, ChatGPT and GPT-4). We compare the contrastive techniques of SLiC and DPO to SFT baselines and find that DPO provides a step-function improvement even after continued SFT saturates. We also explore a data curriculum learning scheme for contrastive post-training, which starts by learning from "easier" pairs and transitions to "harder" ones, and which further improves alignment. Finally, we scale up our experiments to train with more data and larger models like Orca. Remarkably, our automatic contrastive post-training further improves the performance of Orca, already a state-of-the-art instruction learning model tuned with GPT-4 outputs, to outperform ChatGPT.
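For readers unfamiliar with DPO, the sketch below computes the standard DPO loss for a single preference pair from sequence log-probabilities. It illustrates the published DPO objective in general, not this paper's training code; the function name `dpo_loss`, its argument names, and the default `beta` are assumptions made for the example.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one (chosen, rejected) pair.

    loss = -log(sigmoid(beta * ((pi_c - ref_c) - (pi_r - ref_r)))),
    where each term is a sequence log-probability.
    """
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    z = beta * (chosen_margin - rejected_margin)
    # -log(sigmoid(z)) == log(1 + exp(-z)); branch on sign for numerical stability.
    if z >= 0:
        return math.log1p(math.exp(-z))
    return -z + math.log1p(math.exp(z))


# When the policy equals the reference, the loss is log(2).
print(dpo_loss(-1.0, -1.0, -1.0, -1.0))
```

The loss falls below log(2) once the policy raises the chosen response's log-probability relative to the reference more than it raises the rejected one's.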
Submitted 2 April, 2024; v1 submitted 3 October, 2023;
originally announced October 2023.
-
CAT-LM: Training Language Models on Aligned Code And Tests
Authors:
Nikitha Rao,
Kush Jain,
Uri Alon,
Claire Le Goues,
Vincent J. Hellendoorn
Abstract:
Testing is an integral part of the software development process. Yet, writing tests is time-consuming and therefore often neglected. Classical test generation tools such as EvoSuite generate behavioral test suites by optimizing for coverage, but tend to produce tests that are hard to understand. Language models trained on code can generate code that is highly similar to that written by humans, but current models are trained to generate each file separately, as is standard practice in natural language processing, and thus fail to consider the code-under-test context when producing a test file. In this work, we propose the Aligned Code And Tests Language Model (CAT-LM), a GPT-style language model with 2.7 Billion parameters, trained on a corpus of Python and Java projects. We utilize a novel pretraining signal that explicitly considers the mapping between code and test files when available. We also drastically increase the maximum sequence length of inputs to 8,192 tokens, 4x more than typical code generation models, to ensure that the code context is available to the model when generating test code. We analyze its usefulness for realistic applications, showing that sampling with filtering (e.g., by compilability, coverage) allows it to efficiently produce tests that achieve coverage similar to ones written by developers while resembling their writing style. By utilizing the code context, CAT-LM generates more valid tests than even much larger language models trained with more data (CodeGen 16B and StarCoder) and substantially outperforms a recent test-specific model (TeCo) at test completion. Overall, our work highlights the importance of incorporating software-specific insights when training language models for code and paves the way to more powerful automated test generation.
Submitted 2 October, 2023;
originally announced October 2023.
-
Multidimensional well-being of US households at a fine spatial scale using fused household surveys: fusionACS
Authors:
Kevin Ummel,
Miguel Poblete-Cazenave,
Karthik Akkiraju,
Nick Graetz,
Hero Ashman,
Cora Kingdon,
Steven Herrera Tenorio,
Aaryaman "Sunny" Singhal,
Daniel Aldana Cohen,
Narasimha D. Rao
Abstract:
Social science often relies on surveys of households and individuals. Dozens of such surveys are regularly administered by the U.S. government. However, they field independent, unconnected samples with specialized questions, limiting research questions to those that can be answered by a single survey. The fusionACS project seeks to integrate data from multiple U.S. household surveys by statistically "fusing" variables from "donor" surveys onto American Community Survey (ACS) microdata. This results in an integrated microdataset of household attributes and well-being dimensions that can be analyzed to address research questions in ways that are not currently possible. The presented data comprise the fusion onto the ACS of select donor variables from the Residential Energy Consumption Survey (RECS) of 2015, the National Household Transportation Survey (NHTS) of 2017, the American Housing Survey (AHS) of 2019, and the Consumer Expenditure Survey - Interview (CEI) for the years 2015-2019. The underlying statistical techniques are included in an open-source $R$ package, fusionModel, that provides generic tools for the creation, analysis, and validation of fused microdata.
Submitted 15 September, 2023;
originally announced September 2023.
-
Capital Structure Dynamics and Financial Performance in Indian Banks (An Analysis of Mergers and Acquisitions)
Authors:
Kurada T S S Satyanarayana,
Addada Narasimha Rao,
Kumpatla Jaya Surya
Abstract:
This research investigates the multifaceted relationship underlying capital structure dynamics along with financial performance as a result of mergers and acquisitions, or M&As, in Indian banks. In the face of increasing competition, banks have deliberately embraced M&A as a strategy of improving commercial prospects and maintaining financial stability. The primary goal of this study is to examine the changes in the capital framework and financial results of banks before and after M&A transactions. The investigation, which employs a paired t-test as a method of statistical analysis, is based on a review of annual reports from selected banks over a two-year period before and after M&A transactions. The paired t-test approach allows for a thorough statistical analysis of interconnected datasets, revealing the subtle influence of M&A attempts on both bank financial performance as well as capital structure dynamics. The study's findings have the potential to add to the current body of knowledge on organisational planning, managing finances, and capital structure optimisation. The research has practical significance for financial companies, legislators, and scholars interested in understanding the profound effects of M&A inside the arena of financial institutions that operate within fiercely competitive landscapes because it provides comprehensive insights regarding the complex consequences of banking merger and acquisition (M&A) deals on capital structure as well as financial performance. Finally, the goal of this research is to provide the banking sector with educated decision-making capabilities and strategic guidance to businesses facing heightened competition while coping with the complexities of capital structure.
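The paired t-test named in the abstract can be sketched as follows. The figures are invented purely for illustration (they are not from the study), and `paired_t_statistic` is a hypothetical helper that returns the test statistic and degrees of freedom without computing a p-value.

```python
import math
import statistics


def paired_t_statistic(before, after):
    """Return (t, degrees_of_freedom) for paired observations."""
    if len(before) != len(after) or len(before) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    # The test works on the per-pair differences, not on the raw samples.
    diffs = [a - b for b, a in zip(before, after)]
    sd = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    n = len(diffs)
    t = statistics.mean(diffs) / (sd / math.sqrt(n))
    return t, n - 1


# Hypothetical ratio values for four banks before and after a merger.
before = [10.0, 12.0, 9.0, 11.0]
after = [11.0, 14.0, 10.0, 13.0]
t_stat, dof = paired_t_statistic(before, after)
```

The statistic would then be compared against a t distribution with `dof` degrees of freedom to judge significance.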
Submitted 30 August, 2023;
originally announced August 2023.
-
Expressive probabilistic sampling in recurrent neural networks
Authors:
Shirui Chen,
Linxing Preston Jiang,
Rajesh P. N. Rao,
Eric Shea-Brown
Abstract:
In sampling-based Bayesian models of brain function, neural activities are assumed to be samples from probability distributions that the brain uses for probabilistic computation. However, a comprehensive understanding of how mechanistic models of neural dynamics can sample from arbitrary distributions is still lacking. We use tools from functional analysis and stochastic differential equations to explore the minimum architectural requirements for $\textit{recurrent}$ neural circuits to sample from complex distributions. We first consider the traditional sampling model consisting of a network of neurons whose outputs directly represent the samples (sampler-only network). We argue that synaptic current and firing-rate dynamics in the traditional model have limited capacity to sample from a complex probability distribution. We show that the firing rate dynamics of a recurrent neural circuit with a separate set of output units can sample from an arbitrary probability distribution. We call such circuits reservoir-sampler networks (RSNs). We propose an efficient training procedure based on denoising score matching that finds recurrent and output weights such that the RSN implements Langevin sampling. We empirically demonstrate our model's ability to sample from several complex data distributions using the proposed neural dynamics and discuss its applicability to developing the next generation of sampling-based brain models.
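For readers unfamiliar with Langevin sampling, a minimal discrete-time sketch follows. It is not the RSN model: there is no recurrent network, and the score function is written down analytically for a one-dimensional Gaussian (an assumption made for illustration) rather than learned by denoising score matching. The names `langevin_samples` and `gaussian_score` are hypothetical.

```python
import math
import random


def langevin_samples(score, x0, step, n_steps, rng):
    """Unadjusted Langevin iteration: x <- x + step*score(x) + sqrt(2*step)*noise."""
    x = x0
    samples = []
    for _ in range(n_steps):
        noise = rng.gauss(0.0, 1.0)
        x = x + step * score(x) + math.sqrt(2.0 * step) * noise
        samples.append(x)
    return samples


def gaussian_score(x, mean=2.0):
    """Score (gradient of the log-density) of a unit-variance Gaussian."""
    return -(x - mean)


rng = random.Random(0)  # fixed seed so the run is repeatable
samples = langevin_samples(gaussian_score, x0=0.0, step=0.05, n_steps=20000, rng=rng)
kept = samples[2000:]  # discard an initial burn-in segment
sample_mean = sum(kept) / len(kept)
```

With a small step size, the retained samples should scatter around the target mean; the finite step introduces a small bias in the variance.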
Submitted 14 November, 2023; v1 submitted 22 August, 2023;
originally announced August 2023.
-
Brain-Inspired Computational Intelligence via Predictive Coding
Authors:
Tommaso Salvatori,
Ankur Mali,
Christopher L. Buckley,
Thomas Lukasiewicz,
Rajesh P. N. Rao,
Karl Friston,
Alexander Ororbia
Abstract:
Artificial intelligence (AI) is rapidly becoming one of the key technologies of this century. The majority of results in AI thus far have been achieved using deep neural networks trained with the error backpropagation learning algorithm. However, the ubiquitous adoption of this approach has highlighted some important limitations such as substantial computational cost, difficulty in quantifying uncertainty, lack of robustness, unreliability, and biological implausibility. It is possible that addressing these limitations may require schemes that are inspired and guided by neuroscience theories. One such theory, called predictive coding (PC), has shown promising performance in machine intelligence tasks, exhibiting exciting properties that make it potentially valuable for the machine learning community: PC can model information processing in different brain areas, can be used in cognitive control and robotics, and has a solid mathematical grounding in variational inference, offering a powerful inversion scheme for a specific class of continuous-state generative models. With the hope of foregrounding research in this direction, we survey the literature that has contributed to this perspective, highlighting the many ways that PC might play a role in the future of machine learning and computational intelligence at large.
Submitted 15 August, 2023;
originally announced August 2023.
-
Impact of Oxygen Pressure on Ferroelectric Stability of La-doped Hafnia Grown by PLD
Authors:
Badari Narayana Rao,
Shintaro Yasui,
Hiroko Yokota
Abstract:
Thin films of HfO2 doped with 4% La were fabricated on LSMO/STO (100) substrates using pulsed laser deposition. The stability of the ferroelectric orthorhombic phase in the hafnia films was investigated with respect to varying oxygen pressure during deposition. X-ray diffraction and X-ray photoelectron spectroscopy measurements were carried out to analyze the structure and composition of the films and correlated with their ferroelectric properties. Surprisingly, the ferroelectricity of the hafnia films also showed a dependence on the oxygen pressure during deposition of the LSMO bottom electrode. The reason for this dependence is discussed in terms of the active role of non-lattice oxygen in the ferroelectric switching of hafnia.
Submitted 15 August, 2023;
originally announced August 2023.
-
Capital Structure Theories and its Practice, A study with reference to select NSE listed public sectors banks, India
Authors:
Kurada T S S Satyanarayana,
Addada Narasimha Rao
Abstract:
Among the various factors affecting a firm's positioning and performance in modern-day markets, the capital structure of the firm is a crucial one. With the rapid changes in technology, firms are being pushed onto a paradigm that is burdening the capital management process. Hence the study of capital structure changes gives investors an insight into a firm's behavior and intrinsic goals. These changes will vary for firms in different sectors. This work considers the banking sector, which has a unique capital structure given the regulations governing its operations in India. The capital structure behavioral changes in a few public sector banks are studied in this paper. A theoretical framework has been developed from the popular capital structure theories, and hypotheses are derived from them accordingly. The main idea is to validate different theories against the actual performance of the select banks from 2011 to 2022. Using statistical techniques such as regression and correlation, the tested hypotheses establish the relation between the debt component and the financial performance variables of the select banks, which helps in understanding the theories in practice.
Submitted 26 July, 2023;
originally announced July 2023.
-
Cyber Framework for Steering and Measurements Collection Over Instrument-Computing Ecosystems
Authors:
Anees Al-Najjar,
Nageswara S. V. Rao,
Ramanan Sankaran,
Helia Zandi,
Debangshu Mukherjee,
Maxim Ziatdinov,
Craig Bridges
Abstract:
We propose a framework to develop cyber solutions to support the remote steering of science instruments and measurements collection over instrument-computing ecosystems. It is based on provisioning separate data and control connections at the network level, and developing software modules consisting of Python wrappers for instrument commands and Pyro server-client codes that make them available across the ecosystem network. We demonstrate automated measurement transfers and remote steering operations in a microscopy use case for materials research over an ecosystem of Nion microscopes and computing platforms connected over site networks. The proposed framework is currently under further refinement and being adopted to science workflows with automated remote experiments steering for autonomous chemistry laboratories and smart energy grid simulations.
Submitted 12 July, 2023;
originally announced July 2023.
-
Discrimination through Image Selection by Job Advertisers on Facebook
Authors:
Varun Nagaraj Rao,
Aleksandra Korolova
Abstract:
Targeted advertising platforms are widely used by job advertisers to reach potential employees; thus issues of discrimination due to targeting that have surfaced have received widespread attention. Advertisers could misuse targeting tools to exclude people based on gender, race, location and other protected attributes from seeing their job ads. In response to legal actions, Facebook disabled the ability for explicit targeting based on many attributes for some ad categories, including employment. Although this is a step in the right direction, prior work has shown that discrimination can take place not just due to the explicit targeting tools of the platforms, but also due to the impact of the biased ad delivery algorithm. Thus, one must look at the potential for discrimination more broadly, and not merely through the lens of the explicit targeting tools.
In this work, we propose and investigate the prevalence of a new means for discrimination in job advertising, that combines both targeting and delivery -- through the disproportionate representation or exclusion of people of certain demographics in job ad images. We use the Facebook Ad Library to demonstrate the prevalence of this practice through: (1) evidence of advertisers running many campaigns using ad images of people of only one perceived gender, (2) systematic analysis for gender representation in all current ad campaigns for truck drivers and nurses, (3) longitudinal analysis of ad campaign image use by gender and race for select advertisers. After establishing that the discrimination resulting from a selective choice of people in job ad images, combined with algorithmic amplification of skews by the ad delivery algorithm, is of immediate concern, we discuss approaches and challenges for addressing it.
Submitted 12 June, 2023;
originally announced June 2023.
-
Single-Image-Based Deep Learning for Segmentation of Early Esophageal Cancer Lesions
Authors:
Haipeng Li,
Dingrui Liu,
Yu Zeng,
Shuaicheng Liu,
Tao Gan,
Nini Rao,
Jinlin Yang,
Bing Zeng
Abstract:
Accurate segmentation of lesions is crucial for diagnosis and treatment of early esophageal cancer (EEC). However, neither traditional nor deep learning-based methods to date can meet the clinical requirements, with the mean Dice score - the most important metric in medical image analysis - hardly exceeding 0.75. In this paper, we present a novel deep learning approach for segmenting EEC lesions. Our approach stands out for its uniqueness, as it relies solely on a single image coming from one patient, forming the so-called "You-Only-Have-One" (YOHO) framework. On the one hand, this "one-image-one-network" learning ensures complete patient privacy as it does not use any images from other patients as the training data. On the other hand, it avoids nearly all generalization-related problems since each trained network is applied only to the input image itself. In particular, we can push the training to "over-fitting" as much as possible to increase the segmentation accuracy. Our technical details include an interaction with clinical physicians to utilize their expertise, a geometry-based rendering of a single lesion image to generate the training set (the biggest novelty), and an edge-enhanced UNet. We have evaluated YOHO over an EEC dataset created by ourselves and achieved a mean Dice score of 0.888, which represents a significant advance toward clinical applications.
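The Dice score quoted above is twice the overlap between the predicted and reference masks divided by their combined size. A minimal sketch on flattened binary masks follows; the masks are made up for illustration, `dice_score` is a hypothetical helper, and returning 1.0 when both masks are empty is one common convention assumed here.

```python
def dice_score(pred, target):
    """Dice coefficient 2*|A & B| / (|A| + |B|) for flattened binary masks."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    if total == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * intersection / total


# Toy six-pixel masks (illustrative only, not data from the paper).
pred = [1, 1, 0, 0, 1, 0]
target = [1, 0, 0, 0, 1, 1]
score = dice_score(pred, target)
```

Here two of the three predicted lesion pixels overlap the three reference pixels, so the score is 4/6.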
Submitted 9 June, 2023;
originally announced June 2023.
-
Estimation of Poverty Measures for Small Areas Under a Two-Fold Nested Error Linear Regression Model: Comparison of Two Methods
Authors:
Maryam Sohrabi,
J. N. K. Rao
Abstract:
Demand for reliable statistics at a local area (small area) level has greatly increased in recent years. Traditional area-specific estimators based on probability samples are not adequate because of small sample size or even zero sample size in a local area. As a result, methods based on models linking the areas are widely used. The World Bank focused on estimating poverty measures, in particular the poverty incidence and poverty gap (known as FGT measures), using a simulated census method, called ELL, based on a one-fold nested error model for a suitable transformation of the welfare variable. Modified ELL methods leading to significant gains in efficiency over ELL have also been proposed under the one-fold model. An advantage of ELL and modified ELL methods is that distributional assumptions on the random effects in the model are not needed. In this paper, we extend ELL and modified ELL to two-fold nested error models to estimate poverty indicators for areas (say a state) and subareas (say counties within a state). Our simulation results indicate that the modified ELL estimators lead to large efficiency gains over ELL at the area level and subarea level. Further, the modified ELL method retaining both area and subarea estimated effects in the model (called MELL2) performs significantly better in terms of mean squared error (MSE) for sampled subareas than the modified ELL retaining only the estimated area effect in the model (called MELL1).
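The FGT (Foster-Greer-Thorbecke) measures referred to above average the normalized shortfall below the poverty line, raised to a power alpha, over the whole population: alpha = 0 gives the poverty incidence and alpha = 1 the poverty gap. The sketch below computes them from fully known welfare values, so it shows the target quantity only, not the ELL or modified ELL estimators; the data and the helper name `fgt` are assumptions for illustration.

```python
def fgt(welfare, poverty_line, alpha):
    """FGT measure: mean over all units of ((z - y) / z) ** alpha for y < z."""
    if not welfare:
        raise ValueError("welfare list must not be empty")
    if poverty_line <= 0:
        raise ValueError("poverty line must be positive")
    total = 0.0
    for y in welfare:
        if y < poverty_line:  # only units below the line contribute
            total += ((poverty_line - y) / poverty_line) ** alpha
    return total / len(welfare)


# Hypothetical welfare values for four households, poverty line z = 100.
welfare = [50.0, 80.0, 120.0, 200.0]
incidence = fgt(welfare, 100.0, alpha=0)  # share of households below the line
gap = fgt(welfare, 100.0, alpha=1)        # average normalized shortfall
```

In small area estimation, these quantities have to be estimated for areas and subareas where few or no households are sampled, which is the problem the ELL-type methods address.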
Submitted 7 June, 2023;
originally announced June 2023.
-
AI for Low-Code for AI
Authors:
Nikitha Rao,
Jason Tsay,
Kiran Kate,
Vincent J. Hellendoorn,
Martin Hirzel
Abstract:
Low-code programming allows citizen developers to create programs with minimal coding effort, typically via visual (e.g. drag-and-drop) interfaces. In parallel, recent AI-powered tools such as Copilot and ChatGPT generate programs from natural language instructions. We argue that these modalities are complementary: tools like ChatGPT greatly reduce the need to memorize large APIs but still require their users to read (and modify) programs, whereas visual tools abstract away most or all programming but struggle to provide easy access to large APIs. At their intersection, we propose LowCoder, the first low-code tool for developing AI pipelines that supports both a visual programming interface (LowCoder_VP) and an AI-powered natural language interface (LowCoder_NL). We leverage this tool to provide some of the first insights into whether and how these two modalities help programmers by conducting a user study. We task 20 developers with varying levels of AI expertise with implementing four ML pipelines using LowCoder, replacing the LowCoder_NL component with a simple keyword search in half the tasks. Overall, we find that LowCoder is especially useful for (i) Discoverability: using LowCoder_NL, participants discovered new operators in 75% of the tasks, compared to just 32.5% and 27.5% using web search or scrolling through options respectively in the keyword-search condition, and (ii) Iterative Composition: 82.5% of tasks were successfully completed and many initial pipelines were further successfully improved. Qualitative analysis shows that AI helps users discover how to implement constructs when they know what to do, but still fails to support novices when they lack clarity on what they want to accomplish. Overall, our work highlights the benefits of combining the power of AI with low-code programming.
Submitted 31 May, 2023;
originally announced May 2023.
-
Simplifying Distributed Neural Network Training on Massive Graphs: Randomized Partitions Improve Model Aggregation
Authors:
Jiong Zhu,
Aishwarya Reganti,
Edward Huang,
Charles Dickens,
Nikhil Rao,
Karthik Subbian,
Danai Koutra
Abstract:
Distributed training of GNNs enables learning on massive graphs (e.g., social and e-commerce networks) that exceed the storage and computational capacity of a single machine. To reach performance comparable to centralized training, distributed frameworks focus on maximally recovering cross-instance node dependencies with either communication across instances or periodic fallback to centralized training, which create overhead and limit the framework scalability. In this work, we present a simplified framework for distributed GNN training that does not rely on the aforementioned costly operations, and has improved scalability, convergence speed and performance over the state-of-the-art approaches. Specifically, our framework (1) assembles independent trainers, each of which asynchronously learns a local model on locally-available parts of the training graph, and (2) only conducts periodic (time-based) model aggregation to synchronize the local models. Backed by our theoretical analysis, instead of maximizing the recovery of cross-instance node dependencies -- which has been considered the key behind closing the performance gap between model aggregation and centralized training -- our framework leverages randomized assignment of nodes or super-nodes (i.e., collections of original nodes) to partition the training graph such that it improves data uniformity and minimizes the discrepancy of gradient and loss function across instances. In our experiments on social and e-commerce networks with up to 1.3 billion edges, our proposed RandomTMA and SuperTMA approaches -- despite using less training data -- achieve state-of-the-art performance and 2.31x speedup compared to the fastest baseline, and show better robustness to trainer failures.
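The two ingredients described in the abstract, randomized assignment of nodes to trainers and periodic aggregation of local models, can be sketched in a few lines. This is only an illustration under stated assumptions: plain parameter averaging is assumed as the aggregation rule (the abstract does not specify it), the time-based scheduling and the GNN training itself are omitted, and `random_partition` and `average_models` are hypothetical names.

```python
import random


def random_partition(node_ids, num_trainers, rng):
    """Assign each node to a trainer uniformly at random."""
    parts = [[] for _ in range(num_trainers)]
    for node in node_ids:
        parts[rng.randrange(num_trainers)].append(node)
    return parts


def average_models(local_models):
    """Aggregate local parameter vectors by element-wise averaging (assumed rule)."""
    n = len(local_models)
    dim = len(local_models[0])
    return [sum(model[i] for model in local_models) / n for i in range(dim)]


rng = random.Random(0)
parts = random_partition(range(10), num_trainers=3, rng=rng)

# Stand-ins for the parameters each trainer learned on its own partition.
local_models = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
global_model = average_models(local_models)
```

Each trainer would then continue from `global_model` until the next aggregation round.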
Submitted 16 May, 2023;
originally announced May 2023.
-
Long-term cybersecurity applications enabled by quantum networks
Authors:
Nicholas A. Peters,
Muneer Alshowkan,
Joseph C. Chapman,
Raphael C. Pooser,
Nageswara S. V. Rao,
Raymond T. Newell
Abstract:
If continental-scale quantum networks are realized, they will provide the resources needed to fulfill the potential for dramatic advances in cybersecurity through quantum-enabled cryptography applications. We describe recent progress and where the US is headed, and argue that we should go one step further and jointly develop quantum and conventional cryptography methods for joint deployments along the quantum backbone infrastructure.
Submitted 27 April, 2023;
originally announced April 2023.
-
Two-mode squeezing over deployed fiber coexisting with conventional communications
Authors:
Joseph C. Chapman,
Alexander Miloshevsky,
Hsuan-Hao Lu,
Nageswara Rao,
Muneer Alshowkan,
Nicholas A. Peters
Abstract:
Squeezed light is a crucial resource for continuous-variable (CV) quantum information science. Distributed multi-mode squeezing is critical for enabling CV quantum networks and distributed quantum sensing. To date, multi-mode squeezing measured by homodyne detection has been limited to single-room experiments without coexisting classical signals, i.e., on "dark" fiber. Here, after distribution through separate fiber spools (5 km), $-0.9\pm0.1$-dB coexistent two-mode squeezing is measured. Moreover, after distribution through separate deployed campus fibers (about 250 m and 1.2 km), $-0.5\pm0.1$-dB coexistent two-mode squeezing is measured. Prior to distribution, the squeezed modes are each frequency multiplexed with several classical signals -- including the local oscillator and conventional network signals -- demonstrating that the squeezed modes do not need dedicated dark fiber. After distribution, joint two-mode squeezing is measured and recorded for post-processing using triggered homodyne detection in separate locations. This demonstration enables future applications in quantum networks and quantum sensing that rely on distributed multi-mode squeezing.
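To put the quoted decibel figures on a linear scale, the sketch below converts them to noise-variance ratios. It assumes the usual convention that squeezing in dB is 10 log10 of the measured noise variance relative to the shot-noise reference; the abstract does not state the reference explicitly, and `db_to_variance_ratio` is a hypothetical helper.

```python
def db_to_variance_ratio(level_db):
    """Convert a level in dB to a linear power (variance) ratio."""
    return 10.0 ** (level_db / 10.0)


# Central values quoted in the abstract; the +/- 0.1 dB uncertainty is ignored.
spool_ratio = db_to_variance_ratio(-0.9)   # 5 km fiber spools
campus_ratio = db_to_variance_ratio(-0.5)  # deployed campus fibers
```

Under that assumption, the two measurements correspond to noise variances roughly 19% and 11% below the reference level, respectively.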
Submitted 12 July, 2023; v1 submitted 19 April, 2023;
originally announced April 2023.
-
Deep Learning for Automated Experimentation in Scanning Transmission Electron Microscopy
Authors:
Sergei V. Kalinin,
Debangshu Mukherjee,
Kevin M. Roccapriore,
Ben Blaiszik,
Ayana Ghosh,
Maxim A. Ziatdinov,
A. Al-Najjar,
Christina Doty,
Sarah Akers,
Nageswara S. Rao,
Joshua C. Agar,
Steven R. Spurgeon
Abstract:
Machine learning (ML) has become critical for post-acquisition data analysis in (scanning) transmission electron microscopy, (S)TEM, imaging and spectroscopy. An emerging trend is the transition to real-time analysis and closed-loop microscope operation. The effective use of ML in electron microscopy now requires the development of strategies for microscopy-centered experiment workflow design and optimization. Here, we discuss the associated challenges with the transition to active ML, including sequential data analysis and out-of-distribution drift effects, the requirements for the edge operation, local and cloud data storage, and theory in the loop operations. Specifically, we discuss the relative contributions of human scientists and ML agents in the ideation, orchestration, and execution of experimental workflows and the need to develop universal hyper languages that can apply across multiple platforms. These considerations will collectively inform the operationalization of ML in next-generation experimentation.
Submitted 4 April, 2023;
originally announced April 2023.
-
The carbon star DY Persei may be a cool R Coronae Borealis variable
Authors:
D. A. Garcia-Hernandez,
N. Kameswara Rao,
D. L. Lambert,
K. Eriksson,
A. B. S. Reddy,
T. Masseron
Abstract:
Optical and near-IR photometry suggests that the carbon star DY Persei exhibits fadings similar to those of R Coronae Borealis (RCB) variables. Photometric surveys of the Galaxy and Magellanic Clouds uncovered new DY Per variables with infrared photometry identifying them with cool carbon stars, perhaps with an unusual tendency to shed mass. In an attempt to resolve DY Per's identity crisis -- a cool carbon giant or a cool RCB variable? -- we analyze a high-resolution H&K band spectrum of DY Per. The CO first-overtone bands in the K-band of DY Per show a high abundance of 18O such that 16O/18O = 4 ± 1, a ratio sharply at odds with published results for "regular" cool carbon giants with 16O/18O ~ 1000, but this exceptionally low ratio is characteristic of RCB variables and HdC stars. This similarity suggests that DY Per indeed may be a cool RCB variable. Current opinion considers RCB variables to result from the merger of a He onto a CO white dwarf; observed abundances of these H-deficient stars, including the exceptionally low 16O/18O ratios, are in fair accord with predicted compositions for white dwarf merger products. A H-deficiency for DY Per is not directly observable but is suggested from the strength of a HF line and an assumption that F may be overabundant, as observed and predicted for RCB stars.
Submitted 22 March, 2023;
originally announced March 2023.
-
You Only Transfer What You Share: Intersection-Induced Graph Transfer Learning for Link Prediction
Authors:
Wenqing Zheng,
Edward W Huang,
Nikhil Rao,
Zhangyang Wang,
Karthik Subbian
Abstract:
Link prediction is central to many real-world applications, but its performance may be hampered when the graph of interest is sparse. To alleviate issues caused by sparsity, we investigate a previously overlooked phenomenon: in many cases, a densely connected, complementary graph can be found for the original graph. The denser graph may share nodes with the original graph, which offers a natural bridge for transferring selective, meaningful knowledge. We identify this setting as Graph Intersection-induced Transfer Learning (GITL), which is motivated by practical applications in e-commerce or academic co-authorship predictions. We develop a framework to effectively leverage the structural prior in this setting. We first create an intersection subgraph using the shared nodes between the two graphs, then transfer knowledge from the source-enriched intersection subgraph to the full target graph. In the second step, we consider two approaches: a modified label propagation, and a multi-layer perceptron (MLP) model in a teacher-student regime. Experimental results on proprietary e-commerce datasets and open-source citation graphs show that the proposed workflow outperforms existing transfer learning baselines that do not explicitly utilize the intersection structure.
Submitted 18 June, 2023; v1 submitted 27 February, 2023;
originally announced February 2023.
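The first step described in the GITL abstract above, building an intersection subgraph from the nodes shared by two graphs, can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function name `intersection_subgraph`, the edge-list representation, and the choice to keep the union of both graphs' edges among shared nodes are all assumptions made here for clarity.

```python
def intersection_subgraph(source_edges, target_edges):
    """Return the shared nodes and the undirected edges among them.

    Assumption (not from the abstract): an edge is kept if it appears in
    either graph and both of its endpoints are shared nodes.
    """
    source_nodes = {node for edge in source_edges for node in edge}
    target_nodes = {node for edge in target_edges for node in edge}
    shared = source_nodes & target_nodes

    edges = set()
    for u, v in list(source_edges) + list(target_edges):
        if u in shared and v in shared:
            # Sort endpoints so (u, v) and (v, u) count as the same edge.
            edges.add(tuple(sorted((u, v))))
    return shared, edges


# Toy data: a denser "source" graph and a sparser "target" graph.
source = [("a", "b"), ("b", "c"), ("c", "d"), ("a", "c")]
target = [("a", "b"), ("c", "e")]

shared, edges = intersection_subgraph(source, target)
print(sorted(shared))
print(sorted(edges))
```

In this toy case the sparse target graph contributes only one edge among the shared nodes, while the denser source graph contributes the rest, which is the kind of enrichment the abstract describes transferring to the full target graph.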
-
Exploring the substrate-driven morphological changes in Nd0.6Sr0.4MnO3 thin films
Authors:
R S Mrinaleni,
E P Amaladass,
S Amirthapandian,
A. T. Sathyanarayana,
Jegadeesan P,
Ganesan K,
R M Sarguna,
P. N. Rao,
Pooja Gupta,
T Geetha Kumary,
S. K. Rai,
Awadhesh Mani
Abstract:
Manganite thin films are promising candidates for studying strongly correlated electron systems. Understanding the growth- and morphology-driven changes in the physical properties of manganite thin films is vital for their applications in oxitronics. This work reports the morphological, structural, and electrical transport properties of nanostructured Nd0.6Sr0.4MnO3 (NSMO) thin films fabricated using the pulsed laser deposition technique. Scanning electron microscopy (SEM) imaging of the thin films revealed two prominent surface morphologies: a granular morphology and a unique crossed-nano-rod-type morphology. From X-ray diffraction (XRD) and atomic force microscopy (AFM) analysis, we found that the observed nanostructures resulted from altered growth modes occurring on the terraced substrate surface. Furthermore, investigations of the electrical transport properties revealed that the films with crossed-nano-rod-type morphology showed a sharp resistive transition near the metal-to-insulator transition (MIT). Compared to the films with granular morphology, they also showed a temperature coefficient of resistance (TCR) enhanced by up to one order of magnitude. Such enhancement in TCR by tuning the morphology makes these thin films promising candidates for developing oxide-based temperature sensors and detectors.
Submitted 13 January, 2023;
originally announced January 2023.
-
UOCS-IX. AstroSat/UVIT study of the open cluster NGC 2818: Blue Stragglers, Yellow Stragglers, Planetary Nebula, and their membership
Authors:
Sharmila Rani,
Gajendra Pandey,
Annapurni Subramaniam,
N. Kameswara Rao
Abstract:
We present the first far-UV (FUV) imaging results of the intermediate-age Galactic open cluster NGC 2818, which has a planetary nebula (PN) within the field, using images taken with the Ultra-violet Imaging Telescope (UVIT) aboard AstroSat. We identify cluster members by combining UVIT-detected sources with Gaia EDR3 data. We detect four bright and hot blue straggler stars (BSSs) and two yellow straggler stars (YSSs) based on their location in the optical and FUV-optical color-magnitude diagrams. Based on the parameters estimated using Spectral Energy Distributions (SEDs), we infer that the BSSs are either collisional products or might have undetectable white dwarf (WD) companions. Our photometric analysis of the YSSs confirms their binarity, consistent with the spectroscopic results. We find that the YSSs formed through a mass-transfer scenario and that the hot components are likely to be A-type subdwarfs. A comparison of the radial velocity (RV) and Gaia EDR3 proper motion of the PN with those of the cluster, and of the reddening towards the PN and the cluster, does not rule out the membership of the PN. Comparing the central star's position with theoretical pAGB models suggests that it has already entered the WD cooling phase, and its mass is deduced to be ~0.66 Msun. The corresponding progenitor mass turns out to be ~2.1 Msun, comparable to the turn-off mass of the cluster, implying that the progenitor could have formed in the cluster. We suggest that NGC 2818 might be one of the few known clusters to host a PN, providing a unique opportunity to test stellar evolution models.
Submitted 5 January, 2023;
originally announced January 2023.
-
Temporal Waypoint Navigation of Multi-UAV Payload System using Barrier Functions
Authors:
Nishanth Rao,
Suresh Sundaram,
Pushpak Jagtap
Abstract:
Aerial package transportation often requires complex spatial and temporal specifications to be satisfied in order to ensure safe and timely delivery from one point to another. It is usually efficient to transport versatile payloads using multiple UAVs that can work collaboratively to achieve the desired task. The complex temporal specifications can be handled coherently by applying Signal Temporal Logic (STL) to dynamical systems. This paper addresses the problem of waypoint navigation of a multi-UAV payload system under temporal specifications using higher-order time-varying control barrier functions (HOCBFs). The complex nonlinear system of relative degree two is transformed into a simple linear system using input-output feedback linearization. An optimization-based control law is then derived to achieve the temporal waypoint navigation of the payload. The controller's efficacy and real-time implementability are demonstrated by simulating a package delivery scenario inside a high-fidelity Gazebo simulation environment.
Submitted 25 November, 2022;
originally announced November 2022.
-
Search Behavior Prediction: A Hypergraph Perspective
Authors:
Yan Han,
Edward W Huang,
Wenqing Zheng,
Nikhil Rao,
Zhangyang Wang,
Karthik Subbian
Abstract:
Although bipartite shopping graphs are a straightforward way to model search behavior, they suffer from two challenges: 1) The majority of items are sporadically searched and hence have noisy/sparse query associations, leading to a long-tail distribution. 2) Infrequent queries are more likely to link to popular items, leading to another hurdle known as disassortative mixing. To address these two challenges, we go beyond the bipartite graph to take a hypergraph perspective, introducing a new paradigm that leverages auxiliary information from anonymized customer engagement sessions to assist the main task of query-item link prediction. This auxiliary information is available at web scale in the form of search logs. We treat all items appearing in the same customer session as a single hyperedge. The hypothesis is that items in a customer session are unified by a common shopping interest. With these hyperedges, we augment the original bipartite graph into a new hypergraph. We develop a Dual-Channel Attention-Based Hypergraph Neural Network (DCAH), which synergizes information from two potentially noisy sources (original query-item edges and item-item hyperedges). In this way, items on the tail are better connected due to the extra hyperedges, thereby enhancing their link prediction performance. We further integrate DCAH with self-supervised graph pre-training and/or DropEdge training, both of which effectively alleviate disassortative mixing. Extensive experiments on three proprietary E-Commerce datasets show that DCAH yields significant improvements of up to 24.6% in mean reciprocal rank (MRR) and 48.3% in recall compared to GNN-based baselines. Our source code is available at https://github.com/amazon-science/dual-channel-hypergraph-neural-network.
Submitted 28 November, 2022; v1 submitted 23 November, 2022;
originally announced November 2022.
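The DCAH abstract above treats all items in one customer session as a single hyperedge. A common way to encode such a hypergraph is an item-by-hyperedge incidence matrix, sketched below. This is only an illustration under stated assumptions: the helper `session_incidence`, the toy item names, and the sessions are hypothetical, and the authors' released code may represent hyperedges differently.

```python
import numpy as np


def session_incidence(sessions, items):
    """Build an item-by-session incidence matrix H.

    H[i, j] = 1 if item i appears in session j (one hyperedge per session).
    """
    index = {item: i for i, item in enumerate(items)}
    H = np.zeros((len(items), len(sessions)), dtype=int)
    for j, session in enumerate(sessions):
        for item in set(session):  # ignore repeated views within a session
            H[index[item], j] = 1
    return H


# Hypothetical toy data, not taken from the paper.
items = ["tent", "stove", "lantern", "novel"]
sessions = [["tent", "stove"], ["tent", "lantern", "stove"], ["novel"]]

H = session_incidence(sessions, items)
item_degree = H.sum(axis=1)  # number of hyperedges each item belongs to
edge_size = H.sum(axis=0)    # number of items in each hyperedge
print(H)
print(item_degree, edge_size)
```

Here a rarely queried item such as "lantern" becomes connected to "tent" and "stove" through a shared session, which illustrates how hyperedges can add connectivity for tail items.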
-
Computationally Light Spectrally Normalized Memory Neuron Network based Estimator for GPS-Denied operation of Micro UAV
Authors:
Nishanth Rao,
Suresh Sundaram,
Varun Raghavendra
Abstract:
This paper addresses the problem of position estimation in UAVs operating in a cluttered environment where GPS information is unavailable. A model learning-based approach is proposed that takes the rotor RPMs and past state as input and predicts the one-step-ahead position of the UAV using a novel spectral-normalized memory neural network (SN-MNN). The spectral normalization guarantees stable and reliable prediction performance. The predicted position is transformed to the global coordinate frame and then fused with the measurements of other peripheral sensors, such as the IMU, barometer, and compass, using the onboard extended Kalman filter to estimate the states of the UAV. Experimental flight data collected from a motion capture facility using a micro-UAV is used to train the SN-MNN. The PX4-ECL library is used to replay the flight data using the proposed algorithm, and the estimated position is compared with ground truth data. The proposed algorithm does not require any additional onboard sensors and is computationally light. The performance of the proposed approach is compared with current state-of-the-art GPS-denied algorithms, and the proposed algorithm has the lowest RMSE for position estimates.
Submitted 3 December, 2022; v1 submitted 11 November, 2022;
originally announced November 2022.
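The SN-MNN abstract above relies on spectral normalization. In its generic form, spectral normalization divides a weight matrix by an estimate of its largest singular value, often obtained by power iteration. The sketch below shows that generic technique only; it is not the SN-MNN architecture, and the function name `spectral_normalize`, the iteration count, and the toy matrix are assumptions chosen for illustration.

```python
import numpy as np


def spectral_normalize(W, n_iter=50, seed=0):
    """Scale W so that its largest singular value is approximately 1.

    The largest singular value is estimated with power iteration.
    """
    rng = np.random.default_rng(seed)
    u = rng.standard_normal(W.shape[0])
    for _ in range(n_iter):
        v = W.T @ u
        v /= np.linalg.norm(v)
        u = W @ v
        u /= np.linalg.norm(u)
    sigma = u @ W @ v  # estimate of the spectral norm of W
    return W / sigma, sigma


# Toy weight matrix whose spectral norm is known to be 3.
W = np.array([[3.0, 0.0], [0.0, 1.0]])
W_sn, sigma = spectral_normalize(W)
print(sigma)
print(np.linalg.norm(W_sn, 2))
```

Bounding the spectral norm of each layer bounds how much that layer can amplify its input, which is one plausible reading of the stability claim in the abstract.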
-
Active Predictive Coding: A Unified Neural Framework for Learning Hierarchical World Models for Perception and Planning
Authors:
Rajesh P. N. Rao,
Dimitrios C. Gklezakos,
Vishwas Sathish
Abstract:
Predictive coding has emerged as a prominent model of how the brain learns through predictions, anticipating the importance accorded to predictive learning in recent AI architectures such as transformers. Here we propose a new framework for predictive coding called active predictive coding which can learn hierarchical world models and solve two radically different open problems in AI: (1) how do we learn compositional representations, e.g., part-whole hierarchies, for equivariant vision? and (2) how do we solve large-scale planning problems, which are hard for traditional reinforcement learning, by composing complex action sequences from primitive policies? Our approach exploits hypernetworks, self-supervised learning and reinforcement learning to learn hierarchical world models that combine task-invariant state transition networks and task-dependent policy networks at multiple abstraction levels. We demonstrate the viability of our approach on a variety of vision datasets (MNIST, FashionMNIST, Omniglot) as well as on a scalable hierarchical planning problem. Our results represent, to our knowledge, the first demonstration of a unified solution to the part-whole learning problem posed by Hinton, the nested reference frames problem posed by Hawkins, and the integrated state-action hierarchy learning problem in reinforcement learning.
Submitted 23 October, 2022;
originally announced October 2022.
-
TransLIST: A Transformer-Based Linguistically Informed Sanskrit Tokenizer
Authors:
Jivnesh Sandhan,
Rathin Singha,
Narein Rao,
Suvendu Samanta,
Laxmidhar Behera,
Pawan Goyal
Abstract:
Sanskrit Word Segmentation (SWS) is essential in making digitized texts available and in deploying downstream tasks. It is, however, non-trivial because of the sandhi phenomenon, which modifies the characters at word boundaries and needs special treatment. Existing lexicon-driven approaches for SWS make use of the Sanskrit Heritage Reader, a lexicon-driven shallow parser, to generate the complete candidate solution space, over which various methods are applied to produce the most valid solution. However, these approaches fail when they encounter out-of-vocabulary tokens. On the other hand, purely engineering methods for SWS have made use of recent advances in deep learning, but cannot make use of latent word information when it is available.
To mitigate the shortcomings of both families of approaches, we propose the Transformer-based Linguistically Informed Sanskrit Tokenizer (TransLIST), consisting of (1) a module that encodes the character input along with latent-word information, which takes into account the sandhi phenomenon specific to SWS and is apt to work with partial or no candidate solutions, (2) a novel soft-masked attention to prioritize potential candidate words, and (3) a novel path ranking algorithm to rectify corrupted predictions. Experiments on the benchmark datasets for SWS show that TransLIST outperforms the current state-of-the-art system by an average absolute gain of 7.2 points in the perfect match (PM) metric. The codebase and datasets are publicly available at https://github.com/rsingha108/TransLIST
Submitted 21 October, 2022;
originally announced October 2022.
-
Neural Co-Processors for Restoring Brain Function: Results from a Cortical Model of Grasping
Authors:
Matthew J. Bryan,
Linxing Preston Jiang,
Rajesh P N Rao
Abstract:
Objective: A major challenge in designing closed-loop brain-computer interfaces is finding optimal stimulation patterns as a function of ongoing neural activity for different subjects and objectives. Approach: To achieve goal-directed closed-loop neurostimulation, we propose "neural co-processors" which use artificial neural networks and deep learning to learn optimal closed-loop stimulation policies, shaping neural activity and bridging injured neural circuits for targeted repair and rehabilitation. The co-processor adapts the stimulation policy as the biological circuit itself adapts to the stimulation, achieving a form of brain-device co-adaptation. Here we use simulations to lay the groundwork for future in vivo tests of neural co-processors. We leverage a cortical model of grasping, to which we applied various forms of simulated lesions, allowing us to develop the critical learning algorithms and study adaptations to non-stationarity. Main results: Our simulations show the ability of a neural co-processor to learn a stimulation policy using a supervised learning approach, and to adapt that policy as the underlying brain and sensors change. Our co-processor successfully co-adapted with the simulated brain to accomplish the reach-and-grasp task after a variety of lesions were applied, achieving recovery towards healthy function. Significance: Our results provide the first proof-of-concept demonstration of a co-processor for adaptive activity-dependent closed-loop neurostimulation, optimizing for a rehabilitation goal. While a gap remains between simulations and applications, our results provide insights on how co-processors may be developed for learning complex adaptive stimulation policies for a variety of neural rehabilitation and neuroprosthetic applications.
Submitted 20 March, 2023; v1 submitted 19 October, 2022;
originally announced October 2022.
-
Enabling Autonomous Electron Microscopy for Networked Computation and Steering
Authors:
Anees Al-Najjar,
Nageswara S. V. Rao,
Ramanan Sankaran,
Maxim Ziatdinov,
Debangshu Mukherjee,
Olga Ovchinnikova,
Kevin Roccapriore,
Andrew R. Lupini,
Sergei V. Kalinin
Abstract:
Advanced electron microscopy workflows require an ecosystem of microscope instruments and computing systems, possibly located at different sites, to conduct remotely steered and automated experiments. Current workflow executions involve manual operations for steering and measurement tasks, which are typically performed from control workstations co-located with microscopes; consequently, their operational tempo and effectiveness are limited. We propose an approach based on separate data and control channels for such an ecosystem of Scanning Transmission Electron Microscopes (STEM) and computing systems, for which no general solutions presently exist, unlike for neutron and light source instruments. We demonstrate automated measurement transfers and remote steering of Nion STEM physical instruments over site networks. We propose a Virtual Infrastructure Twin (VIT) of this ecosystem, which is used to develop and test our steering software modules without requiring access to the physical instrument infrastructure. Additionally, we develop a VIT for a multi-laboratory scenario, which illustrates the applicability of this approach to ecosystems connected over wide-area networks, for the development and testing of software modules and their later field deployment.
Submitted 18 October, 2022;
originally announced October 2022.
-
Globular Cluster UVIT legacy Survey (GlobUleS) III. Omega Centauri in Far-Ultraviolet
Authors:
Deepthi S. Prabhu,
Annapurni Subramaniam,
Snehalata Sahu,
Chul Chung,
Nathan W. C. Leigh,
Emanuele Dalessandro,
Sourav Chatterjee,
N. Kameswara Rao,
Michael Shara,
Patrick Cote,
Samyaday Choudhury,
Gajendra Pandey,
Aldo A. R. Valcarce,
Gaurav Singh,
Joesph E. Postma,
Sharmila Rani,
Avrajit Bandyopadhyay,
Aaron M. Geller,
John Hutchings,
Thomas Puzia,
Mirko Simunovic,
Young-Jong Sohn,
Sivarani Thirupathi,
Ramakant Singh Yadav
Abstract:
We present the first comprehensive study of the most massive globular cluster Omega Centauri in the far-ultraviolet (FUV) extending from the center to ~ 28% of the tidal radius using the Ultraviolet Imaging Telescope aboard AstroSat. A comparison of the FUV-optical color-magnitude diagrams with available canonical models reveals that the horizontal branch (HB) stars bluer than the knee (hHBs) and the white dwarfs (WDs) are fainter in the FUV by ~ 0.5 mag than model predictions. They are also fainter than their counterparts in M13, another massive cluster. We simulated HB with at least five subpopulations, including three He-rich populations with a substantial He enrichment of Y up to 0.43 dex, to reproduce the observed FUV distribution. We find the He-rich younger subpopulations to be radially more segregated than the He-normal older ones, suggesting an in-situ enrichment from older generations. The Omega Cen hHBs span the same effective temperature range as their M13 counterparts, but some have smaller radii and lower luminosities. This may suggest that a fraction of Omega Cen hHBs are less massive than those of M13, similar to the result derived from earlier spectroscopic studies of outer extreme HB stars. The WDs in Omega Cen and M13 have similar luminosity-radius-effective temperature parameters, and 0.44 - 0.46 M$_\odot$ He-core WD model tracks evolving from progenitors with Y = 0.4 dex are found to fit the majority of these. This study provides constraints on the formation models of Omega Cen based on the estimated range in age, [Fe/H] and Y (in particular), for the HB stars.
Submitted 11 October, 2022;
originally announced October 2022.
-
A roadmap for edge computing enabled automated multidimensional transmission electron microscopy
Authors:
Debangshu Mukherjee,
Kevin M. Roccapriore,
Anees Al-Najjar,
Ayana Ghosh,
Jacob D. Hinkle,
Andrew R. Lupini,
Rama K. Vasudevan,
Sergei V. Kalinin,
Olga S. Ovchinnikova,
Maxim A. Ziatdinov,
Nageswara S. Rao
Abstract:
The advent of modern, high-speed electron detectors has made the collection of multidimensional hyperspectral transmission electron microscopy datasets, such as 4D-STEM, routine. However, many microscopists find such experiments daunting, since the analysis, collection, long-term storage, and networking of these datasets remain challenging. Some common issues are the large and unwieldy size of these datasets, often running into several gigabytes, non-standardized data analysis routines, and a lack of clarity about the computing and network resources needed to utilize the electron microscope fully. The existing computing and networking bottlenecks introduce significant penalties in each step of these experiments, and thus real-time analysis-driven automated experimentation for multidimensional TEM is exceptionally challenging. One solution is integrating microscopy with edge computing, where moderately powerful computational hardware performs the preliminary analysis before handing off the heavier computation to HPC systems. In this perspective, we trace the roots of computation in modern electron microscopy, demonstrate deep learning experiments running on an edge system, and discuss the networking requirements for tying together microscopes, edge computers, and HPC systems.
Submitted 5 October, 2022;
originally announced October 2022.
-
Effects of Solar Activity, Solar Insolation and the Lower Atmospheric Dust on the Martian Thermosphere
Authors:
N. V. Rao,
V. Leelavathi,
Ch. Yaswanth,
Anil Bhardwaj,
S. V. B. Rao
Abstract:
A diagnosis of the Ar densities measured by the Neutral Gas and Ion Mass Spectrometer aboard the Mars Atmosphere and Volatile EvolutioN (MAVEN) spacecraft, and of the temperatures derived from these densities, shows that solar activity, solar insolation, and lower atmospheric dust are the dominant forcings of the Martian thermosphere. A methodology based on multiple linear regression analysis is developed to quantify the contributions of the dominant forcings to the densities and temperatures. The results of the present study show that a 100 sfu (solar flux units) change in solar activity results in a corresponding change of approx. 136 K in the thermospheric temperatures. The solar insolation constrains the seasonal, latitudinal, and diurnal variations to be interdependent. Diurnal variation dominates the solar insolation variability, followed by the latitudinal and seasonal variations. Both global and regional dust storms lead to considerable enhancements in the densities and temperatures of the Martian thermosphere. Using past data of the solar fluxes and the dust optical depths, the state of the Martian thermosphere is extrapolated back to Martian year (MY) 24. While the global dust storms of MY 25, MY 28 and MY 34 raise the thermospheric temperatures by approx. 22-38 K, the regional dust storm of MY 34 leads to approx. 15 K warming. Dust-driven thermospheric temperatures alone can enhance the hydrogen escape fluxes by 1.67-2.14 times compared to those without the dust. Dust effects are relatively more significant for global dust storms that occur at solar minimum than for those that occur at solar maximum.
Submitted 3 October, 2022;
originally announced October 2022.
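The abstract above quantifies forcing contributions with multiple linear regression. As a rough sketch of that general technique, the example below fits a temperature-like response to three predictors by ordinary least squares. Everything in it is synthetic and assumed for illustration: the predictor ranges, the coefficients in `true_coef`, and the noise level are invented and do not reproduce the paper's data, model terms, or results.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Synthetic, hypothetical predictors (not MAVEN data).
solar_flux = rng.uniform(60.0, 160.0, n)  # solar activity proxy
insolation = rng.uniform(0.0, 1.0, n)     # normalized insolation term
dust = rng.uniform(0.0, 2.0, n)           # dust optical depth proxy

# Arbitrary coefficients used only to generate the synthetic response:
# intercept, solar flux, insolation, dust.
true_coef = np.array([100.0, 1.0, 30.0, 10.0])

# Design matrix with a column of ones for the intercept.
X = np.column_stack([np.ones(n), solar_flux, insolation, dust])
temperature = X @ true_coef + rng.normal(0.0, 0.1, n)

# Ordinary least-squares fit of all coefficients at once.
coef, *_ = np.linalg.lstsq(X, temperature, rcond=None)
print(coef)
```

Each fitted coefficient estimates the change in the response per unit change of one predictor with the others held fixed, which is the sense in which a regression of this kind can attribute variability to individual forcings.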
-
Search for relativistic fractionally charged particles in space
Authors:
DAMPE Collaboration,
F. Alemanno,
C. Altomare,
Q. An,
P. Azzarello,
F. C. T. Barbato,
P. Bernardini,
X. J. Bi,
M. S. Cai,
E. Casilli,
E. Catanzani,
J. Chang,
D. Y. Chen,
J. L. Chen,
Z. F. Chen,
M. Y. Cui,
T. S. Cui,
Y. X. Cui,
H. T. Dai,
A. De-Benedittis,
I. De Mitri,
F. de Palma,
M. Deliyergiyev,
A. Di Giovanni,
M. Di Santo
, et al. (126 additional authors not shown)
Abstract:
More than a century after the oil drop experiment, the possible existence of fractionally charged particles (FCPs) remains unsettled. The search for FCPs is crucial for some extensions of the Standard Model in particle physics. Most previous searches for FCPs in cosmic rays were based on experiments underground or at high altitudes. However, there have been few searches for FCPs in cosmic rays carried out in orbit, other than AMS-01 flown by a space shuttle and BESS flown by a balloon at the top of the atmosphere. In this study, we conduct an FCP search in space based on on-orbit data obtained using the DArk Matter Particle Explorer (DAMPE) satellite over a period of five years. Unlike underground experiments, which require an FCP energy of the order of hundreds of GeV, our FCP search starts at only a few GeV. An upper limit of $6.2\times 10^{-10}~\mathrm{cm^{-2}\,sr^{-1}\,s^{-1}}$ is obtained for the flux. Our results demonstrate that DAMPE exhibits a sensitivity three orders of magnitude higher than that of experiments of similar types, which more stringently restricts the conditions for the existence of FCPs in primary cosmic rays.
Submitted 9 September, 2022;
originally announced September 2022.
-
Coexistent quantum channel characterization using spectrally resolved Bayesian quantum process tomography
Authors:
Joseph C. Chapman,
Joseph M. Lukens,
Muneer Alshowkan,
Nageswara Rao,
Brian T. Kirby,
Nicholas A. Peters
Abstract:
The coexistence of quantum and classical signals over the same optical fiber with minimal degradation of the transmitted quantum information is critical for operating large-scale quantum networks over the existing communications infrastructure. Here, we systematically characterize the quantum channel that results from simultaneously distributing approximate single-photon polarization-encoded qubits and classical light of varying intensities through fiber-optic channels of up to 15 km. Using spectrally resolved quantum process tomography with a Bayesian reconstruction method we developed, we estimate the full quantum channel from experimental photon counting data, both with and without classical background. Furthermore, although we find the exact channel description to be a weak function of the pump polarization, we nevertheless show that the coexistent fiber-based quantum channel has high process fidelity with an ideal depolarizing channel when the noise is dominated by Raman scattering. These results provide a basis for the future development of quantum repeater designs and quantum error correcting codes for real-world channels and inform models used in the analysis and simulation of quantum networks.
Submitted 16 March, 2023; v1 submitted 30 August, 2022;
originally announced August 2022.
-
ECP SOLLVE: Validation and Verification Testsuite Status Update and Compiler Insight for OpenMP
Authors:
Thomas Huber,
Swaroop Pophale,
Nolan Baker,
Michael Carr,
Nikhil Rao,
Jaydon Reap,
Kristina Holsapple,
Joshua Hoke Davis,
Tobias Burnus,
Seyong Lee,
David E. Bernholdt,
Sunita Chandrasekaran
Abstract:
The OpenMP language continues to evolve with every new specification release, as does the need to validate and verify the new features that have been introduced. With the release of OpenMP 5.0 and OpenMP 5.1, plenty of new target offload and host-based features have been introduced to the programming model. While OpenMP continues to grow in maturity, there is an observable growth in the number of compiler and hardware vendors that support OpenMP. In this manuscript, we focus on evaluating the conformity and implementation progress of various compiler vendors such as Cray, IBM, GNU, Clang/LLVM, NVIDIA, Intel and AMD. We specifically address the 4.5, 5.0, and 5.1 versions of the specification.
Submitted 14 November, 2022; v1 submitted 28 August, 2022;
originally announced August 2022.
-
Blending type Approximations by Kantorovich variant of $α$-Schurer operators
Authors:
Nadeem Rao,
Mamta Rani,
Adem Kiliçman,
Pradeep Malik,
Mohammad Ayman-Mursaleen
Abstract:
In the present manuscript, we present a new sequence of operators, i.e., $α$-Bernstein-Schurer-Kantorovich operators depending on two parameters $α\in[0,1]$ and $ρ>0$, for one and two variables to approximate measurable functions on $[0, 1+q]$, $q>0$. Next, we give basic results and discuss the rapidity of convergence and order of approximation for the univariate and bivariate cases of these sequences in their respective sections. Further, graphical and numerical analyses are presented. Moreover, local and global approximation properties are discussed in terms of first- and second-order moduli of smoothness, Peetre's K-functional and weight functions for these sequences in different spaces of functions.
Submitted 21 August, 2022;
originally announced August 2022.