-
Quasi-Bayes meets Vines
Authors:
David Huk,
Yuanhe Zhang,
Mark Steel,
Ritabrata Dutta
Abstract:
Recently proposed quasi-Bayesian (QB) methods initiated a new era in Bayesian computation by directly constructing the Bayesian predictive distribution through recursion, removing the need for expensive computations involved in sampling the Bayesian posterior distribution. This has proved to be data-efficient for univariate predictions, but extensions to multiple dimensions rely on a conditional decomposition resulting from predefined assumptions on the kernel of the Dirichlet Process Mixture Model, which is the implicit nonparametric model used. Here, we propose a different way to extend Quasi-Bayesian prediction to high dimensions through the use of Sklar's theorem by decomposing the predictive distribution into one-dimensional predictive marginals and a high-dimensional copula. Thus, we use the efficient recursive QB construction for the one-dimensional marginals and model the dependence using highly expressive vine copulas. Further, we tune hyperparameters using robust divergences (e.g., energy score) and show that our proposed Quasi-Bayesian Vine (QB-Vine) is a fully non-parametric density estimator with an analytical form and convergence rate independent of the dimension of data in some situations. Our experiments illustrate that the QB-Vine is appropriate for high-dimensional distributions (~64), needs very few samples to train (~200) and outperforms state-of-the-art methods with analytical forms for density estimation and supervised tasks by a considerable margin.
Submitted 18 June, 2024;
originally announced June 2024.
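For readers unfamiliar with the decomposition this abstract refers to, Sklar's theorem can be written for densities as follows. This is the standard textbook statement under the assumption of an absolutely continuous joint distribution, not the paper's own notation; the symbols $f$, $F_i$, $f_i$ and $c$ are chosen here for illustration.

```latex
f(x_1,\dots,x_d) \;=\; c\bigl(F_1(x_1),\dots,F_d(x_d)\bigr)\,\prod_{i=1}^{d} f_i(x_i)
```

Here $f$ is the joint density, $F_i$ and $f_i$ are the marginal distribution and density functions of the $i$-th coordinate, and $c$ is the copula density on $[0,1]^d$. In the abstract's terms, the $f_i$ would be the one-dimensional predictive marginals and $c$ the vine copula.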
-
IITK at SemEval-2024 Task 1: Contrastive Learning and Autoencoders for Semantic Textual Relatedness in Multilingual Texts
Authors:
Udvas Basak,
Rajarshi Dutta,
Shivam Pandey,
Ashutosh Modi
Abstract:
This paper describes our system developed for the SemEval-2024 Task 1: Semantic Textual Relatedness. The challenge is focused on automatically detecting the degree of relatedness between pairs of sentences for 14 languages including both high and low-resource Asian and African languages. Our team participated in two subtasks consisting of Track A: supervised and Track B: unsupervised. This paper focuses on a BERT-based contrastive learning and similarity-metric-based approach primarily for the supervised track while exploring autoencoders for the unsupervised track. It also aims to create a bigram relatedness corpus using a negative sampling strategy, thereby producing refined word embeddings.
Submitted 6 April, 2024;
originally announced April 2024.
-
Digital Twins for Supporting AI Research with Autonomous Vehicle Networks
Authors:
Anıl Gürses,
Gautham Reddy,
Saad Masrur,
Özgür Özdemir,
İsmail Güvenç,
Mihail L. Sichitiu,
Alphan Şahin,
Ahmed Alkhateeb,
Rudra Dutta
Abstract:
Digital twins (DTs), which are virtual environments that simulate, predict, and optimize the performance of their physical counterparts, are envisioned to be essential technologies for advancing next-generation wireless networks. While DTs have been studied extensively for wireless networks, their use in conjunction with autonomous vehicles with programmable mobility remains relatively under-explored. In this paper, we study DTs used as a development environment to design, deploy, and test artificial intelligence (AI) techniques that use real-time observations, e.g. radio key performance indicators, for vehicle trajectory and network optimization decisions in an autonomous vehicle network (AVN). We first compare and contrast the use of simulation, digital twin (software-in-the-loop (SITL)), sandbox (hardware-in-the-loop (HITL)), and physical testbed environments for their suitability in developing and testing AI algorithms for AVNs. We then review various representative use cases of DTs for AVN scenarios. Finally, we provide an example from the NSF AERPAW platform where a DT is used to develop and test AI-aided solutions for autonomous unmanned aerial vehicles for localizing a signal source based solely on link quality measurements. Our results in the physical testbed show that SITL DTs, when supplemented with data from real-world (RW) measurements and simulations, can serve as an ideal environment for developing and testing innovative AI solutions for AVNs.
Submitted 1 April, 2024;
originally announced April 2024.
-
Self-evolving Autoencoder Embedded Q-Network
Authors:
J. Senthilnath,
Bangjian Zhou,
Zhen Wei Ng,
Deeksha Aggarwal,
Rajdeep Dutta,
Ji Wei Yoon,
Aye Phyu Phyu Aung,
Keyu Wu,
Min Wu,
Xiaoli Li
Abstract:
In the realm of sequential decision-making tasks, the exploration capability of a reinforcement learning (RL) agent is paramount for achieving high rewards through interactions with the environment. To enhance this crucial ability, we propose SAQN, a novel approach wherein a self-evolving autoencoder (SA) is embedded with a Q-Network (QN). In SAQN, the self-evolving autoencoder architecture adapts and evolves as the agent explores the environment. This evolution enables the autoencoder to capture a diverse range of raw observations and represent them effectively in its latent space. By leveraging the disentangled states extracted from the encoder generated latent space, the QN is trained to determine optimal actions that improve rewards. During the evolution of the autoencoder architecture, a bias-variance regulatory strategy is employed to elicit the optimal response from the RL agent. This strategy involves two key components: (i) fostering the growth of nodes to retain previously acquired knowledge, ensuring a rich representation of the environment, and (ii) pruning the least contributing nodes to maintain a more manageable and tractable latent space. Extensive experimental evaluations conducted on three distinct benchmark environments and a real-world molecular environment demonstrate that the proposed SAQN significantly outperforms state-of-the-art counterparts. The results highlight the effectiveness of the self-evolving autoencoder and its collaboration with the Q-Network in tackling sequential decision-making tasks.
Submitted 18 February, 2024;
originally announced February 2024.
-
Hardware Phi-1.5B: A Large Language Model Encodes Hardware Domain Specific Knowledge
Authors:
Weimin Fu,
Shijie Li,
Yifang Zhao,
Haocheng Ma,
Raj Dutta,
Xuan Zhang,
Kaichen Yang,
Yier Jin,
Xiaolong Guo
Abstract:
In the rapidly evolving semiconductor industry, where research, design, verification, and manufacturing are intricately linked, the potential of Large Language Models to revolutionize hardware design and security verification is immense. The primary challenge, however, lies in the complexity of hardware specific issues that are not adequately addressed by the natural language or software code knowledge typically acquired during the pretraining stage. Additionally, the scarcity of datasets specific to the hardware domain poses a significant hurdle in developing a foundational model. Addressing these challenges, this paper introduces Hardware Phi 1.5B, an innovative large language model specifically tailored for the hardware domain of the semiconductor industry. We have developed a specialized, tiered dataset comprising small, medium, and large subsets and focused our efforts on pretraining using the medium dataset. This approach harnesses the compact yet efficient architecture of the Phi 1.5B model. The creation of this first pretrained, hardware domain specific large language model marks a significant advancement, offering improved performance in hardware design and verification tasks and illustrating a promising path forward for AI applications in the semiconductor sector.
Submitted 27 January, 2024;
originally announced February 2024.
-
LLM4SecHW: Leveraging Domain Specific Large Language Model for Hardware Debugging
Authors:
Weimin Fu,
Kaichen Yang,
Raj Gautam Dutta,
Xiaolong Guo,
Gang Qu
Abstract:
This paper presents LLM4SecHW, a novel framework for hardware debugging that leverages a domain-specific Large Language Model (LLM). Despite the success of LLMs in automating various software development tasks, their application in the hardware security domain has been limited due to the constraints of commercial LLMs and the scarcity of domain-specific data. To address these challenges, we propose a unique approach to compile a dataset of open source hardware design defects and their remediation steps, utilizing version control data. This dataset provides a substantial foundation for training machine learning models for hardware. LLM4SecHW employs fine tuning of medium sized LLMs based on this dataset, enabling the identification and rectification of bugs in hardware designs. This pioneering approach offers a reference workflow for the application of fine tuning domain-specific LLMs in other research areas. We evaluate the performance of our proposed system on various open source hardware designs, demonstrating its efficacy in accurately identifying and correcting defects. Our work brings a new perspective on automating the quality control process in hardware design.
Submitted 28 January, 2024;
originally announced January 2024.
-
DecodingTrust: A Comprehensive Assessment of Trustworthiness in GPT Models
Authors:
Boxin Wang,
Weixin Chen,
Hengzhi Pei,
Chulin Xie,
Mintong Kang,
Chenhui Zhang,
Chejian Xu,
Zidi Xiong,
Ritik Dutta,
Rylan Schaeffer,
Sang T. Truong,
Simran Arora,
Mantas Mazeika,
Dan Hendrycks,
Zinan Lin,
Yu Cheng,
Sanmi Koyejo,
Dawn Song,
Bo Li
Abstract:
Generative Pre-trained Transformer (GPT) models have exhibited exciting progress in their capabilities, capturing the interest of practitioners and the public alike. Yet, while the literature on the trustworthiness of GPT models remains limited, practitioners have proposed employing capable GPT models for sensitive applications such as healthcare and finance -- where mistakes can be costly. To this end, this work proposes a comprehensive trustworthiness evaluation for large language models with a focus on GPT-4 and GPT-3.5, considering diverse perspectives -- including toxicity, stereotype bias, adversarial robustness, out-of-distribution robustness, robustness on adversarial demonstrations, privacy, machine ethics, and fairness. Based on our evaluations, we discover previously unpublished vulnerabilities to trustworthiness threats. For instance, we find that GPT models can be easily misled to generate toxic and biased outputs and leak private information in both training data and conversation history. We also find that although GPT-4 is usually more trustworthy than GPT-3.5 on standard benchmarks, GPT-4 is more vulnerable given jailbreaking system or user prompts, potentially because GPT-4 follows (misleading) instructions more precisely. Our work illustrates a comprehensive trustworthiness evaluation of GPT models and sheds light on the trustworthiness gaps. Our benchmark is publicly available at https://decodingtrust.github.io/ ; our dataset can be previewed at https://huggingface.co/datasets/AI-Secure/DecodingTrust ; a concise version of this work is at https://openreview.net/pdf?id=kaHpo8OZw2 .
Submitted 26 February, 2024; v1 submitted 20 June, 2023;
originally announced June 2023.
-
Data driven localized wave solution of the Fokas-Lenells equation using modified PINN
Authors:
Gautam Kumar Saharia,
Sagardeep Talukdar,
Riki Dutta,
Sudipta Nandy
Abstract:
We investigate data-driven localized wave solutions of the Fokas-Lenells equation by using a physics-informed neural network (PINN). We improve the basic PINN by incorporating control parameters into the residual loss function. We also add a conserved quantity as another loss term to modify the PINN. Using the modified PINN, we obtain the data-driven bright soliton and dark soliton solutions of the Fokas-Lenells equation. The conserved-quantity-informed loss function achieves higher accuracy in terms of relative L2 error between predicted and exact soliton solutions. We hope that the present investigation will be useful for studying the applications of deep learning in nonlinear optics and other branches of nonlinear physics. Source code is available at https://github.com/gautamksaharia/Fokas-Lenells
Submitted 3 June, 2023;
originally announced June 2023.
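The abstract above reports accuracy as a relative L2 error but does not define it. A minimal sketch of the commonly used definition, ||u_pred - u_exact||_2 / ||u_exact||_2 over sampled grid points, is given below. The function name and the flat-list input format are assumptions made for illustration and are not taken from the authors' repository.

```python
import math


def relative_l2_error(predicted, exact):
    """Relative L2 error between two equally long sequences of samples.

    Values may be real or complex (soliton solutions are typically
    complex-valued), since abs() gives the modulus in both cases.
    """
    if len(predicted) != len(exact):
        raise ValueError("predicted and exact must have the same length")
    # L2 norm of the pointwise difference.
    numerator = math.sqrt(sum(abs(p - e) ** 2 for p, e in zip(predicted, exact)))
    # L2 norm of the reference solution.
    denominator = math.sqrt(sum(abs(e) ** 2 for e in exact))
    if denominator == 0:
        raise ValueError("exact solution has zero norm")
    return numerator / denominator
```

A value of 0 means the prediction matches the reference at every sampled point; the error scales with the size of the mismatch relative to the size of the reference solution.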
-
S-REINFORCE: A Neuro-Symbolic Policy Gradient Approach for Interpretable Reinforcement Learning
Authors:
Rajdeep Dutta,
Qincheng Wang,
Ankur Singh,
Dhruv Kumarjiguda,
Li Xiaoli,
Senthilnath Jayavelu
Abstract:
This paper presents a novel RL algorithm, S-REINFORCE, which is designed to generate interpretable policies for dynamic decision-making tasks. The proposed algorithm leverages two types of function approximators, namely Neural Network (NN) and Symbolic Regressor (SR), to produce numerical and symbolic policies, respectively. The NN component learns to generate a numerical probability distribution over the possible actions using a policy gradient, while the SR component captures the functional form that relates the associated states with the action probabilities. The SR-generated policy expressions are then utilized through importance sampling to improve the rewards received during the learning process. We have tested the proposed S-REINFORCE algorithm on various dynamic decision-making problems with low and high dimensional action spaces, and the results demonstrate its effectiveness and impact in achieving interpretable solutions. By leveraging the strengths of both NN and SR, S-REINFORCE produces policies that are not only well-performing but also easy to interpret, making it an ideal choice for real-world applications where transparency and causality are crucial.
Submitted 12 May, 2023;
originally announced May 2023.
-
A Dynamic Obstacle Tracking Strategy for Proactive Handoffs in Millimeter-wave Networks
Authors:
Rathindra Nath Dutta,
Subhojit Sarkar,
Sasthi C. Ghosh
Abstract:
Stringent line-of-sight demands necessitated by the fast attenuating nature of millimeter waves (mmWaves) through obstacles pose one of the central problems of next generation wireless networks. These mmWave links are easily disrupted due to obstacles, including vehicles and pedestrians, which cause degradation in link quality and even link failure. Dynamic obstacles are usually tracked by dedicated tracking hardware like RGB-D cameras, which usually have small ranges, and hence lead to prohibitively increased deployment costs to achieve complete coverage of the deployment area. In this manuscript, we propose an altogether different approach to track multiple dynamic obstacles in an mmWave network, solely based on short-term historical link failure information, without resorting to any dedicated tracking hardware. After proving that the said problem is NP-complete, we employ a greedy set-cover based approach to solve it. Using the obtained trajectories, we perform proactive handoffs for at-risk links. We compare our approach with an RGB-D camera-based approach and show that our approach provides better tracking and handoff performances when the camera coverage is low to moderate, which is often the case in real deployment scenarios.
Submitted 12 July, 2023; v1 submitted 30 April, 2023;
originally announced May 2023.
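The abstract above mentions a greedy set-cover based approach without giving details. As a rough illustration only, the generic textbook greedy heuristic for set cover is sketched below. How link-failure observations would be mapped to the universe and candidate subsets is not described in the abstract, so the function name and inputs here are hypothetical.

```python
def greedy_set_cover(universe, subsets):
    """Return indices of subsets chosen greedily until `universe` is covered.

    This is the classic greedy heuristic, not the authors' algorithm:
    at each step, take the subset covering the most uncovered elements.
    """
    subsets = [set(s) for s in subsets]
    uncovered = set(universe)
    chosen = []
    while uncovered:
        # Index of the subset with the largest overlap with what is left.
        best = max(range(len(subsets)), key=lambda i: len(uncovered & subsets[i]))
        gain = uncovered & subsets[best]
        if not gain:
            raise ValueError("subsets do not cover the universe")
        chosen.append(best)
        uncovered -= gain
    return chosen
```

For example, with universe `{1, 2, 3, 4, 5}` and subsets `[{1, 2, 3}, {2, 4}, {3, 4}, {4, 5}]`, the heuristic first takes index 0 and then index 3. The greedy choice is an approximation and does not always return a minimum-size cover.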
-
Diversity Awareness in Software Engineering Participant Research
Authors:
Riya Dutta,
Diego Elias Costa,
Emad Shihab,
Tanja Tajmel
Abstract:
Diversity and inclusion are necessary prerequisites for shaping technological innovation that benefits society as a whole. A common indicator of diversity consideration is the representation of different social groups among software engineering (SE) researchers, developers, and students. However, this does not necessarily entail that diversity is considered in the SE research itself.
In our study, we examine how diversity is embedded in SE research, particularly research that involves participant studies. To this end, we have selected 79 research papers containing 105 participant studies spanning three years of ICSE technical tracks. Using a content analytical approach, we identified how SE researchers report the various diversity categories of their study participants and investigated: 1) the extent to which participants are described, 2) what diversity categories are commonly reported, and 3) the function diversity serves in the SE studies.
We identified 12 different diversity categories reported in SE participant studies. Our results demonstrate that even though most SE studies report on the diversity of participants, SE research often emphasizes professional diversity data, such as occupation and work experience, over social diversity data, such as gender or location of the participants. Furthermore, our results show that participant diversity is seldom analyzed or reflected upon when SE researchers discuss their study results, outcome or limitations. To help researchers self-assess their study diversity awareness, we propose a diversity awareness model and guidelines that SE researchers can apply to their research. With this study, we hope to shed light on a new approach to tackling the diversity and inclusion crisis in the SE field.
Submitted 31 January, 2023;
originally announced February 2023.
-
Open RAN Testbeds with Controlled Air Mobility
Authors:
Magreth Mushi,
Yuchen Liu,
Shreyas Sreenivasa,
Ozgur Ozdemir,
Ismail Guvenc,
Mihail Sichitiu,
Rudra Dutta,
Russ Gyurek
Abstract:
With its promise of increasing softwarization, improving disaggregability, and creating an open-source based ecosystem in the area of Radio Access Networks, the idea of Open RAN has generated rising interest in the community. Even as the community races to provide and verify complete Open RAN systems, the importance of verification of systems based on Open RAN under real-world conditions has become clear, and testbed facilities for general use have been envisioned, in addition to private testing facilities. Aerial robots, including autonomous ones, are among the increasingly important and interesting clients of RAN systems, but also present a challenge for testbeds. Based on our experience in architecting and operating an advanced wireless testbed with aerial robots as a primary citizen, we present considerations relevant to the design of Open RAN testbeds, with particular attention to making such a testbed capable of controlled experimentation with aerial clients. We also present representative results from the NSF AERPAW testbed on Open RAN slicing, programmable vehicles, and programmable radios.
Submitted 26 January, 2023;
originally announced January 2023.
-
Likelihood-Free Inference with Generative Neural Networks via Scoring Rule Minimization
Authors:
Lorenzo Pacchiardi,
Ritabrata Dutta
Abstract:
Bayesian Likelihood-Free Inference methods yield posterior approximations for simulator models with intractable likelihood. Recently, many works trained neural networks to approximate either the intractable likelihood or the posterior directly. Most proposals use normalizing flows, namely neural networks parametrizing invertible maps used to transform samples from an underlying base measure; the probability density of the transformed samples is then accessible and the normalizing flow can be trained via maximum likelihood on simulated parameter-observation pairs. A recent work [Ramesh et al., 2022] approximated instead the posterior with generative networks, which drop the invertibility requirement and are thus a more flexible class of distributions scaling to high-dimensional and structured data. However, generative networks only allow sampling from the parametrized distribution; for this reason, Ramesh et al. [2022] follow the common solution of adversarial training, where the generative network plays a min-max game against a "critic" network. This procedure is unstable and can lead to a learned distribution underestimating the uncertainty, in extreme cases collapsing to a single point. Here, we propose to approximate the posterior with generative networks trained by Scoring Rule minimization, an overlooked adversarial-free method enabling smooth training and better uncertainty quantification. In simulation studies, the Scoring Rule approach yields better performance with shorter training time than the adversarial framework.
Submitted 31 May, 2022;
originally announced May 2022.
-
RSTGen: Imbuing Fine-Grained Interpretable Control into Long-Form Text Generators
Authors:
Rilwan A. Adewoyin,
Ritabrata Dutta,
Yulan He
Abstract:
In this paper, we study the task of improving the cohesion and coherence of long-form text generated by language models. To this end, we propose RSTGen, a framework that utilises Rhetorical Structure Theory (RST), a classical language theory, to control the discourse structure, semantics and topics of generated text. Firstly, we demonstrate our model's ability to control structural discourse and semantic features of generated text in open generation evaluation. Then we experiment on the two challenging long-form text tasks of argument generation and story generation. Evaluation using automated metrics and a metric with high correlation to human evaluation shows that our model performs competitively against existing models, while offering significantly more control over generated text than alternative methods.
Submitted 25 May, 2022;
originally announced May 2022.
-
A Reinforcement Approach for Detecting P2P Botnet Communities in Dynamic Communication Graphs
Authors:
Harshvardhan P. Joshi,
Rudra Dutta
Abstract:
Peer-to-peer (P2P) botnets use decentralized command and control networks that make them resilient to disruptions. The P2P botnet overlay networks manifest structures in mutual-contact graphs, also called communication graphs, formed using network traffic information. It has been shown that these structures can be detected using community detection techniques from graph theory. These previous works, however, treat the communication graphs and the P2P botnet structures as static. In reality, communication graphs are dynamic as they represent the continuously changing network traffic flows. Similarly, the P2P botnets also evolve with time, as new bots join and existing bots leave either temporarily or permanently. In this paper we address the problem of detecting such evolving P2P botnet communities in dynamic communication graphs. We propose a reinforcement-based approach, suitable for large communication graphs, that improves precision and recall of P2P botnet community detection in dynamic communication graphs.
Submitted 23 March, 2022;
originally announced March 2022.
-
Probabilistic Forecasting with Generative Networks via Scoring Rule Minimization
Authors:
Lorenzo Pacchiardi,
Rilwan Adewoyin,
Peter Dueben,
Ritabrata Dutta
Abstract:
Probabilistic forecasting relies on past observations to provide a probability distribution for a future outcome, which is often evaluated against the realization using a scoring rule. Here, we perform probabilistic forecasting with generative neural networks, which parametrize distributions on high-dimensional spaces by transforming draws from a latent variable. Generative networks are typically trained in an adversarial framework. In contrast, we propose to train generative networks to minimize a predictive-sequential (or prequential) scoring rule on a recorded temporal sequence of the phenomenon of interest, which is appealing as it corresponds to the way forecasting systems are routinely evaluated. Adversarial-free minimization is possible for some scoring rules; hence, our framework avoids the cumbersome hyperparameter tuning and uncertainty underestimation due to unstable adversarial training, thus unlocking reliable use of generative networks in probabilistic forecasting. Further, we prove consistency of the minimizer of our objective with dependent data, while adversarial training assumes independence. We perform simulation studies on two chaotic dynamical models and a benchmark data set of global weather observations; for this last example, we define scoring rules for spatial data by drawing from the relevant literature. Our method outperforms state-of-the-art adversarial approaches, especially in probabilistic calibration, while requiring less hyperparameter tuning.
Submitted 13 February, 2024; v1 submitted 15 December, 2021;
originally announced December 2021.
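The abstract above notes that adversarial-free minimization is possible for some scoring rules, because those rules can be estimated directly from draws of the generative network. The energy score is one well-known rule of this kind; the abstract does not say which rules the authors use, so choosing it here is an assumption. A minimal sample-based estimate, with a hypothetical function name and plain tuples as inputs, might look like this:

```python
import math
from itertools import combinations


def energy_score(samples, observation):
    """Sample-based estimate of the energy score (lower is better).

    ES(P, y) = E||X - y|| - 0.5 * E||X - X'||, where X and X' are
    independent draws from the forecast distribution P.
    """
    m = len(samples)
    if m < 2:
        raise ValueError("need at least two forecast samples")
    # Average distance from each forecast draw to the observed outcome.
    term_obs = sum(math.dist(x, observation) for x in samples) / m
    # Average distance over all distinct pairs of forecast draws.
    pair_sum = sum(math.dist(a, b) for a, b in combinations(samples, 2))
    term_spread = pair_sum / (m * (m - 1) / 2)
    return term_obs - 0.5 * term_spread
```

The first term rewards draws that land near the observation, while the second term rewards spread among the draws, which is what discourages collapse to a single point. In a training loop the same expression would be written with differentiable tensor operations so that it can be minimized with respect to the network parameters.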
-
TRU-NET: A Deep Learning Approach to High Resolution Prediction of Rainfall
Authors:
Rilwan Adewoyin,
Peter Dueben,
Peter Watson,
Yulan He,
Ritabrata Dutta
Abstract:
Climate models (CM) are used to evaluate the impact of climate change on the risk of floods and strong precipitation events. However, these numerical simulators have difficulties representing precipitation events accurately, mainly due to limited spatial resolution when simulating multi-scale dynamics in the atmosphere. To improve the prediction of high-resolution precipitation, we apply a Deep Learning (DL) approach using an input of CM simulations of the model fields (weather variables) that are more predictable than local precipitation. To this end, we present TRU-NET (Temporal Recurrent U-Net), an encoder-decoder model featuring a novel 2D cross attention mechanism between contiguous convolutional-recurrent layers to effectively model multi-scale spatio-temporal weather processes. We use a conditional-continuous loss function to capture the zero-skewed patterns of rainfall. Experiments show that our model consistently attains lower RMSE and MAE scores than a DL model prevalent in short term precipitation prediction and improves upon the rainfall predictions of a state-of-the-art dynamical weather model. Moreover, by evaluating the performance of our model under various training and testing data formulation strategies, we show that there is enough data for our deep learning approach to output robust, high-quality results across seasons and varying regions.
Submitted 12 February, 2021; v1 submitted 20 August, 2020;
originally announced August 2020.
-
A Survey of Machine Learning Methods for Detecting False Data Injection Attacks in Power Systems
Authors:
Ali Sayghe,
Yaodan Hu,
Ioannis Zografopoulos,
XiaoRui Liu,
Raj Gautam Dutta,
Yier Jin,
Charalambos Konstantinou
Abstract:
Over the last decade, the number of cyberattacks targeting power systems and causing physical and economic damage has increased rapidly. Among them, False Data Injection Attacks (FDIAs) are a class of cyberattacks against power grid monitoring systems. Adversaries can successfully perform FDIAs in order to manipulate the power system State Estimation (SE) by compromising sensors or modifying system data. SE is an essential process performed by the Energy Management System (EMS) to estimate unknown state variables based on redundant system measurements and network topology. SE routines include Bad Data Detection (BDD) algorithms to eliminate errors from the acquired measurements, e.g., in case of sensor failures. FDIAs can bypass BDD modules to inject malicious data vectors into a subset of measurements without being detected, and thus manipulate the results of the SE process. In order to overcome the limitations of traditional residual-based BDD approaches, data-driven solutions based on machine learning algorithms have been widely adopted for detecting malicious manipulation of sensor data due to their fast execution times and accurate results. This paper provides a comprehensive review of the most up-to-date machine learning methods for detecting FDIAs against power system SE algorithms.
Submitted 16 August, 2020;
originally announced August 2020.
-
Analysing Meso and Macro conversation structures in an online suicide support forum
Authors:
Sagar Joglekar,
Sumithra Velupillai,
Rina Dutta,
Nishanth Sastry
Abstract:
Platforms like Reddit and Twitter offer internet users an opportunity to talk about diverse issues, including those pertaining to physical and mental health. Some of these forums also function as a safe space for severely distressed mental health patients to get social support from peers. The online community platform Reddit's SuicideWatch is one example of an online forum dedicated specifically to people who suffer from suicidal thoughts, or who are concerned about people who might be at risk. It remains to be seen if these forums can be used to understand and model the nature of online social support, not least because of the noisy and informal nature of conversations. Moreover, understanding how a community of volunteering peers react to calls for help in cases of suicidal posts, would help to devise better tools for online mitigation of such episodes. In this paper, we propose an approach to characterise conversations in online forums. Using data from the SuicideWatch subreddit as a case study, we propose metrics at a macroscopic level -- measuring the structure of the entire conversation as a whole. We also develop a framework to measure structures in supportive conversations at a mesoscopic level -- measuring interactions with the immediate neighbours of the person in distress. We statistically show through comparison with baseline conversations from random Reddit threads that certain macro and meso-scale structures in an online conversation exhibit signatures of social support, and are particularly over-expressed in SuicideWatch conversations.
Submitted 20 July, 2020;
originally announced July 2020.
-
Synthetic Event Time Series Health Data Generation
Authors:
Saloni Dash,
Ritik Dutta,
Isabelle Guyon,
Adrien Pavao,
Andrew Yale,
Kristin P. Bennett
Abstract:
Synthetic medical data which preserves privacy while maintaining utility can be used as an alternative to real medical data, which has privacy costs and resource constraints associated with it. At present, most models focus on generating cross-sectional health data which is not necessarily representative of real data. In reality, medical data is longitudinal in nature, with a single patient having multiple health events, non-uniformly distributed throughout their lifetime. These events are influenced by patient covariates such as comorbidities, age group, gender etc. as well as external temporal effects (e.g. flu season). While there exist seminal methods to model time series data, it becomes increasingly challenging to extend these methods to medical event time series data. Due to the complexity of the real data, in which each patient visit is an event, we transform the data by using summary statistics to characterize the events for a fixed set of time intervals, to facilitate analysis and interpretability. We then train a generative adversarial network to generate synthetic data. We demonstrate this approach by generating human sleep patterns, from a publicly available dataset. We empirically evaluate the generated data and show close univariate resemblance between synthetic and real data. However, we also demonstrate how stratification by covariates is required to gain a deeper understanding of synthetic data quality.
Submitted 27 November, 2019; v1 submitted 14 November, 2019;
originally announced November 2019.
-
Bengali Handwritten Character Classification using Transfer Learning on Deep Convolutional Neural Network
Authors:
Swagato Chatterjee,
Rwik Kumar Dutta,
Debayan Ganguly,
Kingshuk Chatterjee,
Sudipta Roy
Abstract:
In this paper, we propose a solution which uses state-of-the-art techniques in Deep Learning to tackle the problem of Bengali Handwritten Character Recognition (HCR). Our method uses fewer iterations to train than most other comparable methods. We employ Transfer Learning on ResNet 50, a state-of-the-art deep Convolutional Neural Network model, pretrained on the ImageNet dataset. We also use other techniques, such as a modified version of One Cycle Policy and varying the input image sizes, to ensure that our training is fast. For evaluation of our technique we use the BanglaLekha-Isolated Dataset, which consists of 84 classes (50 Basic, 10 Numerals and 24 Compound Characters). We are able to achieve 96.12% accuracy in just 47 epochs on the BanglaLekha-Isolated dataset. When comparing our method with that of other researchers, considering number of classes and without using Ensemble Learning, the proposed solution achieves a state-of-the-art result for Handwritten Bengali Character Recognition. Code and weight files are available at https://github.com/swagato-c/bangla-hwcr-present.
Submitted 25 February, 2019;
originally announced February 2019.
-
UC Secure Issuer-Free Adaptive Oblivious Transfer with Hidden Access Policy
Authors:
Vandana Guleria,
Ratna Dutta
Abstract:
Privacy is a major concern in designing any cryptographic primitive when frequent transactions are done electronically. During electronic transactions, people reveal their personal data to several servers and believe that this information does not leak too much about them. The adaptive oblivious transfer with hidden access policy (AOT-HAP) takes measures against such privacy issues. The existing AOT-HAP schemes involve a sender and multiple receivers apart from a designated issuer. The security of these schemes relies on the fact that the issuer cannot collude with a set of receivers. Moreover, they lose security when run with multiple protocol instances during concurrent execution. We present the first issuer-free AOT-HAP in the universally composable (UC) framework, in which the protocol is secure even when composed with itself or with other protocols. A concrete security analysis is given assuming the hardness of the q-strong Diffie-Hellman (SDH), decision Linear (DLIN) and decision bilinear Diffie-Hellman (DBDH) problems against a malicious adversary in the UC model. Moreover, the protocol outperforms the existing similar schemes.
Submitted 29 November, 2017;
originally announced November 2017.
-
Intelligent Personal Assistant with Knowledge Navigation
Authors:
Amit Kumar,
Rahul Dutta,
Harbhajan Rai
Abstract:
An Intelligent Personal Agent (IPA) is an agent whose purpose is to help the user gain information from reliable resources with the help of knowledge navigation techniques, saving time in searching for the best content. The agent is also responsible for responding to chat-based queries with the help of a Conversation Corpus. We will be testing different methods for optimal query generation. To facilitate ease of use of the application, the agent will be able to accept input through Text (Keyboard), Voice (Speech Recognition) and Server (Facebook) and output responses using the same method. Existing chat bots reply by making changes to the input, but we will give responses based on multiple SRT files. The model will learn using the human dialogs dataset and will be able to respond in a human-like manner. Responses to queries about famous things (places, people, and words) can be provided using web scraping, which will enable the bot to have knowledge navigation features. The agent will even learn from its past experiences, supporting semi-supervised learning.
Submitted 28 April, 2017;
originally announced April 2017.
-
Grading of Mammalian Cumulus Oocyte Complexes using Machine Learning for in Vitro Embryo Culture
Authors:
Viswanath P Sudarshan,
Tobias Weiser,
Phalgun Chintala,
Subhamoy Mandal,
Rahul Dutta
Abstract:
Visual observation of Cumulus Oocyte Complexes provides only limited information about their functional competence, whereas molecular evaluation methods are cumbersome or costly. Image analysis of mammalian oocytes can provide an attractive alternative to address this challenge. However, it is complex, given the huge number of oocytes under inspection and the subjective nature of the features inspected for identification. Supervised machine learning methods like random forests, with annotations from expert biologists, can standardize the analysis task and reduce inter-subject variability. We present a semi-automatic framework for predicting the class an oocyte belongs to, based on multi-object parametric segmentation of the acquired microscopic image followed by feature-based classification using random forests.
Submitted 5 March, 2016;
originally announced March 2016.
-
Modelling-based experiment retrieval: A case study with gene expression clustering
Authors:
Paul Blomstedt,
Ritabrata Dutta,
Sohan Seth,
Alvis Brazma,
Samuel Kaski
Abstract:
Motivation: Public and private repositories of experimental data are growing to sizes that require dedicated methods for finding relevant data. To improve on the state of the art of keyword searches from annotations, methods for content-based retrieval have been proposed. In the context of gene expression experiments, most methods retrieve gene expression profiles, requiring each experiment to be expressed as a single profile, typically of case vs. control. A more general, recently suggested alternative is to retrieve experiments whose models are good for modelling the query dataset. However, for very noisy and high-dimensional query data, this retrieval criterion turns out to be very noisy as well.
Results: We propose doing retrieval using a denoised model of the query dataset, instead of the original noisy dataset itself. To this end, we introduce a general probabilistic framework, where each experiment is modelled separately and the retrieval is done by finding related models. For retrieval of gene expression experiments, we use a probabilistic model called product partition model, which induces a clustering of genes that show similar expression patterns across a number of samples. The suggested metric for retrieval using clusterings is the normalized information distance. Empirical results finally suggest that inference for the full probabilistic model can be approximated with good performance using computationally faster heuristic clustering approaches (e.g. $k$-means). The method is highly scalable and straightforward to apply to construct a general-purpose gene expression experiment retrieval method.
Availability: The method can be implemented using standard clustering algorithms and normalized information distance, available in many statistical software packages.
Submitted 4 January, 2016; v1 submitted 19 May, 2015;
originally announced May 2015.
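As a rough illustration of the retrieval metric named in the abstract above, the sketch below computes a normalized information distance between two clusterings of the same genes. The formula used, 1 - I(A;B) / max(H(A), H(B)), is one common definition and is an assumption here; the paper may use a different variant. The function names `entropy`, `mutual_information` and `nid` are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in nats) of a clustering given as a list of labels."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def mutual_information(a, b):
    """Mutual information (in nats) between two clusterings of the same items."""
    n = len(a)
    count_a, count_b, count_ab = Counter(a), Counter(b), Counter(zip(a, b))
    return sum(
        (c / n) * math.log(c * n / (count_a[x] * count_b[y]))
        for (x, y), c in count_ab.items()
    )


def nid(a, b):
    """Assumed definition: 1 - I(A;B) / max(H(A), H(B)); 0 means identical."""
    h = max(entropy(a), entropy(b))
    if h == 0.0:
        # Both clusterings put every item in one cluster, so they coincide.
        return 0.0
    return 1.0 - mutual_information(a, b) / h


# Toy gene-to-cluster assignments (illustrative data only).
query = [0, 0, 1, 1]
relabelled = [1, 1, 0, 0]   # same partition, different label names
unrelated = [0, 1, 0, 1]    # statistically independent of `query`

same_distance = nid(query, relabelled)
far_distance = nid(query, unrelated)
```

Under this definition a stored experiment whose clustering matches the query up to relabelling sits at distance near 0, while an unrelated clustering sits near 1, so retrieval would rank candidates by increasing distance.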
-
Clear, Concise and Effective UI: Opinion and Suggestions
Authors:
Rishabh Jain,
Rupanta Rwiteej Dutta,
Rajat Tandon
Abstract:
The most important aspect of any Software is the operability for the intended audience. This factor of operability is encompassed in the user interface, which serves as the only window to the features of the system. It is thus essential that the User Interface provided is robust, concise and lucid. Presently there are no properly defined rules or guidelines for user interface design enabling a perfect design, since such a system cannot be perceived. This article aims at providing suggestions in the design of the User Interface, which would make it easier for the user to navigate through the system features and also the developers to guide the users towards better utilization of the features.
Submitted 13 September, 2014;
originally announced September 2014.
-
Retrieval of Experiments with Sequential Dirichlet Process Mixtures in Model Space
Authors:
Ritabrata Dutta,
Sohan Seth,
Samuel Kaski
Abstract:
We address the problem of retrieving relevant experiments given a query experiment, motivated by the public databases of datasets in molecular biology and other experimental sciences, and the need of scientists to relate to earlier work on the level of actual measurement data. Since experiments are inherently noisy and databases ever accumulating, we argue that a retrieval engine should possess two particular characteristics. First, it should compare models learnt from the experiments rather than the raw measurements themselves: this allows incorporating experiment-specific prior knowledge to suppress noise effects and focus on what is important. Second, it should be updated sequentially from newly published experiments, without explicitly storing either the measurements or the models, which is critical for saving storage space and protecting data privacy: this promotes lifelong learning. We formulate the retrieval as a "supermodelling" problem of sequentially learning a model of the set of posterior distributions, represented as sets of MCMC samples, and suggest the use of a Particle-Learning-based sequential Dirichlet process mixture (DPM) for this purpose. The relevance measure for retrieval is derived from the supermodel through the mixture representation. We demonstrate the performance of the proposed retrieval method on simulated data and molecular biological experiments.
Submitted 6 March, 2014; v1 submitted 8 October, 2013;
originally announced October 2013.
-
Collusion resistant self-healing key distribution in mobile wireless networks
Authors:
Ratna Dutta,
Sugata Sanyal
Abstract:
A fundamental concern of any secure group communication system is key management and wireless environments create new challenges. One core requirement in these emerging networks is self-healing. In systems where users can be offline and miss updates, self-healing allows a user to recover lost session keys and get back into the secure communication without putting extra burden on the group manager. Clearly, self-healing must only be available to authorized users. This paper fixes the problem of collusion attack in an existing self-healing key distribution scheme and provides a highly efficient scheme as compared to the existing works. It is computationally secure, resists collusion attacks made between newly joined users and revoked users and achieves forward and backward secrecy. Our security analysis is in an appropriate security model. Unlike the existing constructions, our scheme does not forbid revoked users from rejoining in later sessions.
Submitted 26 June, 2012;
originally announced June 2012.
-
Offering A Product Recommendation System in E-commerce
Authors:
Ruma Dutta,
Debajyoti Mukhopadhyay
Abstract:
This paper proposes a number of explicit and implicit ratings in a product recommendation system for business-to-customer e-commerce. The system recommends products to a new user. It depends on the purchase patterns of previous users whose purchase patterns are close to that of the user who asks for a recommendation. The system uses a weighted cosine similarity measure to find the closest user profile among the profiles of all users in the database. It also applies association rule mining in recommending the products. In addition, the system takes into consideration the transaction time of purchased items, thus eliminating the sequence recognition problem. Experimental results show that, for implicit ratings, the proposed method gives acceptable performance in recommending products. They also show that introducing association rules improves the performance of the recommendation system.
Submitted 20 September, 2011;
originally announced September 2011.
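The closest-profile step described in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the per-item weighting scheme, the profile layout (one rating per product) and the names `weighted_cosine` and `closest_profile` are all assumptions made for the example.

```python
import math


def weighted_cosine(u, v, w):
    """Cosine similarity of rating vectors u and v under non-negative item weights w."""
    dot = sum(wi * ui * vi for wi, ui, vi in zip(w, u, v))
    norm_u = math.sqrt(sum(wi * ui * ui for wi, ui in zip(w, u)))
    norm_v = math.sqrt(sum(wi * vi * vi for wi, vi in zip(w, v)))
    if norm_u == 0.0 or norm_v == 0.0:
        # A profile with no weighted ratings is treated as dissimilar to everything.
        return 0.0
    return dot / (norm_u * norm_v)


def closest_profile(query, profiles, w):
    """Return the name of the stored profile most similar to the query vector."""
    return max(profiles, key=lambda name: weighted_cosine(query, profiles[name], w))


# Hypothetical ratings over four products; the weights emphasise the third product.
profiles = {
    "alice": [5, 0, 3, 0],
    "bob": [0, 4, 0, 5],
    "carol": [4, 1, 2, 0],
}
weights = [1.0, 0.5, 2.0, 1.0]
query = [4, 0, 3, 0]

best = closest_profile(query, profiles, weights)
```

In a recommender of this kind, the products bought by the best-matching user but not yet by the querying user would then be the candidates to recommend.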