-
Beyond the Visible: Jointly Attending to Spectral and Spatial Dimensions with HSI-Diffusion for the FINCH Spacecraft
Authors:
Ian Vyse,
Rishit Dagli,
Dav Vrat Chadha,
John P. Ma,
Hector Chen,
Isha Ruparelia,
Prithvi Seran,
Matthew Xie,
Eesa Aamer,
Aidan Armstrong,
Naveen Black,
Ben Borstein,
Kevin Caldwell,
Orrin Dahanaggamaarachchi,
Joe Dai,
Abeer Fatima,
Stephanie Lu,
Maxime Michet,
Anoushka Paul,
Carrie Ann Po,
Shivesh Prakash,
Noa Prosser,
Riddhiman Roy,
Mirai Shinjo,
Iliya Shofman
, et al. (4 additional authors not shown)
Abstract:
Satellite remote sensing missions have gained popularity over the past fifteen years due to their ability to cover large swaths of land at regular intervals, making them ideal for monitoring environmental trends. The FINCH mission, a 3U+ CubeSat equipped with a hyperspectral camera, aims to monitor crop residue cover in agricultural fields. Although hyperspectral imaging captures both spectral and…
▽ More
Satellite remote sensing missions have gained popularity over the past fifteen years due to their ability to cover large swaths of land at regular intervals, making them ideal for monitoring environmental trends. The FINCH mission, a 3U+ CubeSat equipped with a hyperspectral camera, aims to monitor crop residue cover in agricultural fields. Although hyperspectral imaging captures both spectral and spatial information, it is prone to various types of noise, including random noise, stripe noise, and dead pixels. Effective denoising of these images is crucial for downstream scientific tasks. Traditional methods, including hand-crafted techniques encoding strong priors, learned 2D image denoising methods applied across different hyperspectral bands, or diffusion generative models applied independently on bands, often struggle with varying noise strengths across spectral bands, leading to significant spectral distortion. This paper presents a novel approach to hyperspectral image denoising using latent diffusion models that integrate spatial and spectral information. We particularly do so by building a 3D diffusion model and presenting a 3-stage training approach on real and synthetically crafted datasets. The proposed method preserves image structure while reducing noise. Evaluations on both popular hyperspectral denoising datasets and synthetically crafted datasets for the FINCH mission demonstrate the effectiveness of this approach.
△ Less
Submitted 15 June, 2024;
originally announced June 2024.
-
Biological Neurons Compete with Deep Reinforcement Learning in Sample Efficiency in a Simulated Gameworld
Authors:
Moein Khajehnejad,
Forough Habibollahi,
Aswin Paul,
Adeel Razi,
Brett J. Kagan
Abstract:
How do biological systems and machine learning algorithms compare in the number of samples required to show significant improvements in completing a task? We compared the learning efficiency of in vitro biological neural networks to the state-of-the-art deep reinforcement learning (RL) algorithms in a simplified simulation of the game `Pong'. Using DishBrain, a system that embodies in vitro neural…
▽ More
How do biological systems and machine learning algorithms compare in the number of samples required to show significant improvements in completing a task? We compared the learning efficiency of in vitro biological neural networks to the state-of-the-art deep reinforcement learning (RL) algorithms in a simplified simulation of the game `Pong'. Using DishBrain, a system that embodies in vitro neural networks with in silico computation using a high-density multi-electrode array, we contrasted the learning rate and the performance of these biological systems against time-matched learning from three state-of-the-art deep RL algorithms (i.e., DQN, A2C, and PPO) in the same game environment. This allowed a meaningful comparison between biological neural systems and deep RL. We find that when samples are limited to a real-world time course, even these very simple biological cultures outperformed deep RL algorithms across various game performance characteristics, implying a higher sample efficiency. Ultimately, even when tested across multiple types of information input to assess the impact of higher dimensional data input, biological neurons showcased faster learning than all deep reinforcement learning agents.
△ Less
Submitted 27 May, 2024;
originally announced May 2024.
-
A Real-Time Rescheduling Algorithm for Multi-robot Plan Execution
Authors:
Ying Feng,
Adittyo Paul,
Zhe Chen,
Jiaoyang Li
Abstract:
One area of research in multi-agent path finding is to determine how replanning can be efficiently achieved in the case of agents being delayed during execution. One option is to reschedule the passing order of agents, i.e., the sequence in which agents visit the same location. In response, we propose Switchable-Edge Search (SES), an A*-style algorithm designed to find optimal passing orders. We p…
▽ More
One area of research in multi-agent path finding is to determine how replanning can be efficiently achieved in the case of agents being delayed during execution. One option is to reschedule the passing order of agents, i.e., the sequence in which agents visit the same location. In response, we propose Switchable-Edge Search (SES), an A*-style algorithm designed to find optimal passing orders. We prove the optimality of SES and evaluate its efficiency via simulations. The best variant of SES takes less than 1 second for small- and medium-sized problems and runs up to 4 times faster than baselines for large-sized problems.
△ Less
Submitted 26 March, 2024;
originally announced March 2024.
-
On Predictive planning and counterfactual learning in active inference
Authors:
Aswin Paul,
Takuya Isomura,
Adeel Razi
Abstract:
Given the rapid advancement of artificial intelligence, understanding the foundations of intelligent behaviour is increasingly important. Active inference, regarded as a general theory of behaviour, offers a principled approach to probing the basis of sophistication in planning and decision-making. In this paper, we examine two decision-making schemes in active inference based on 'planning' and 'l…
▽ More
Given the rapid advancement of artificial intelligence, understanding the foundations of intelligent behaviour is increasingly important. Active inference, regarded as a general theory of behaviour, offers a principled approach to probing the basis of sophistication in planning and decision-making. In this paper, we examine two decision-making schemes in active inference based on 'planning' and 'learning from experience'. Furthermore, we also introduce a mixed model that navigates the data-complexity trade-off between these strategies, leveraging the strengths of both to facilitate balanced decision-making. We evaluate our proposed model in a challenging grid-world scenario that requires adaptability from the agent. Additionally, our model provides the opportunity to analyze the evolution of various parameters, offering valuable insights and contributing to an explainable framework for intelligent decision-making.
△ Less
Submitted 19 March, 2024;
originally announced March 2024.
-
Design & Implementation of Automatic Machine Condition Monitoring and Maintenance System in Limited Resource Situations
Authors:
Abu Hanif Md. Ripon,
Muhammad Ahsan Ullah,
Arindam Kumar Paul,
Md. Mortaza Morshed
Abstract:
In the era of the fourth industrial revolution, it is essential to automate fault detection and diagnosis of machineries so that a warning system can be developed that will help to take an appropriate action before any catastrophic damage. Some machines health monitoring systems are used globally but they are expensive and need trained personnel to operate and analyse. Predictive maintenance and o…
▽ More
In the era of the fourth industrial revolution, it is essential to automate fault detection and diagnosis of machineries so that a warning system can be developed that will help to take an appropriate action before any catastrophic damage. Some machines health monitoring systems are used globally but they are expensive and need trained personnel to operate and analyse. Predictive maintenance and occupational health and safety culture are not available due to inadequate infrastructure, lack of skilled manpower, financial crisis, and others in developing countries. Starting from developing a cost-effective DAS for collecting fault data in this study, the effect of limited data and resources has been investigated while automating the process. To solve this problem, A feature engineering and data reduction method has been developed combining the concepts from wavelets, differential calculus, and signal processing. Finally, for automating the whole process, all the necessary theoretical and practical considerations to develop a predictive model have been proposed. The DAS successfully collected the required data from the machine that is 89% accurate compared to the professional manual monitoring system. SVM and NN were proposed for the prediction purpose because of their high predicting accuracy greater than 95% during training and 100% during testing the new samples. In this study, the combination of the simple algorithm with a rule-based system instead of a data-intensive system turned out to be hybridization by validating with collected data. The outcome of this research can be instantly applied to small and medium-sized industries for finding other issues and developing accordingly. As one of the foundational studies in automatic FDD, the findings and procedure of this study can lead others to extend, generalize, or add other dimensions to FDD automation.
△ Less
Submitted 22 January, 2024;
originally announced January 2024.
-
Active Inference and Intentional Behaviour
Authors:
Karl J. Friston,
Tommaso Salvatori,
Takuya Isomura,
Alexander Tschantz,
Alex Kiefer,
Tim Verbelen,
Magnus Koudahl,
Aswin Paul,
Thomas Parr,
Adeel Razi,
Brett Kagan,
Christopher L. Buckley,
Maxwell J. D. Ramstead
Abstract:
Recent advances in theoretical biology suggest that basal cognition and sentient behaviour are emergent properties of in vitro cell cultures and neuronal networks, respectively. Such neuronal networks spontaneously learn structured behaviours in the absence of reward or reinforcement. In this paper, we characterise this kind of self-organisation through the lens of the free energy principle, i.e.,…
▽ More
Recent advances in theoretical biology suggest that basal cognition and sentient behaviour are emergent properties of in vitro cell cultures and neuronal networks, respectively. Such neuronal networks spontaneously learn structured behaviours in the absence of reward or reinforcement. In this paper, we characterise this kind of self-organisation through the lens of the free energy principle, i.e., as self-evidencing. We do this by first discussing the definitions of reactive and sentient behaviour in the setting of active inference, which describes the behaviour of agents that model the consequences of their actions. We then introduce a formal account of intentional behaviour, that describes agents as driven by a preferred endpoint or goal in latent state-spaces. We then investigate these forms of (reactive, sentient, and intentional) behaviour using simulations. First, we simulate the aforementioned in vitro experiments, in which neuronal cultures spontaneously learn to play Pong, by implementing nested, free energy minimising processes. The simulations are then used to deconstruct the ensuing predictive behaviour, leading to the distinction between merely reactive, sentient, and intentional behaviour, with the latter formalised in terms of inductive planning. This distinction is further studied using simple machine learning benchmarks (navigation in a grid world and the Tower of Hanoi problem), that show how quickly and efficiently adaptive behaviour emerges under an inductive form of active inference.
△ Less
Submitted 16 December, 2023; v1 submitted 6 December, 2023;
originally announced December 2023.
-
From task structures to world models: What do LLMs know?
Authors:
Ilker Yildirim,
L. A. Paul
Abstract:
In what sense does a large language model have knowledge? The answer to this question extends beyond the capabilities of a particular AI system, and challenges our assumptions about the nature of knowledge and intelligence. We answer by granting LLMs "instrumental knowledge"; knowledge defined by a certain set of abilities. We then ask how such knowledge is related to the more ordinary, "worldly"…
▽ More
In what sense does a large language model have knowledge? The answer to this question extends beyond the capabilities of a particular AI system, and challenges our assumptions about the nature of knowledge and intelligence. We answer by granting LLMs "instrumental knowledge"; knowledge defined by a certain set of abilities. We then ask how such knowledge is related to the more ordinary, "worldly" knowledge exhibited by human agents, and explore this in terms of the degree to which instrumental knowledge can be said to incorporate the structured world models of cognitive science. We discuss ways LLMs could recover degrees of worldly knowledge, and suggest such recovery will be governed by an implicit, resource-rational tradeoff between world models and task demands.
△ Less
Submitted 6 October, 2023;
originally announced October 2023.
-
Exploring Diverse Coping Mechanisms in 2023: A Comprehensive Survey Across Backgrounds and Cultures
Authors:
Abhijit Paul,
Rony Majumder
Abstract:
This study presents a pioneering investigation into the wide array of coping mechanisms employed by individuals in the year 2023, with a focus on data collected through the popular social media platform TikTok. Coping mechanisms are essential strategies that people adopt to navigate the challenges and stressors of everyday life, yet little research has been conducted on their comprehensive compila…
▽ More
This study presents a pioneering investigation into the wide array of coping mechanisms employed by individuals in the year 2023, with a focus on data collected through the popular social media platform TikTok. Coping mechanisms are essential strategies that people adopt to navigate the challenges and stressors of everyday life, yet little research has been conducted on their comprehensive compilation across different backgrounds, countries, and experiences.
Using TikTok as a data collection tool allowed us to access a diverse and extensive pool of participants, representing various cultural, social, and demographic backgrounds. Our study collates coping mechanisms reported by users from different parts of the world, facilitating the identification of both universal and culture-specific strategies.
This research contributes to the existing literature by providing a holistic view of coping mechanisms without being limited to specific fields or populations. By analyzing the coping methods shared on TikTok, we reveal a comprehensive list of strategies employed by people from diverse walks of life. The findings of this study not only shed light on how individuals cope with challenges in the modern era but also offer insights into the evolving coping trends and the role of social media in disseminating coping strategies. Understanding these coping mechanisms can have implications for mental health professionals, practitioners, and policymakers seeking to provide support and resources to individuals facing different stressors and hardships.
△ Less
Submitted 4 September, 2023;
originally announced September 2023.
-
Bengali Document Layout Analysis -- A YOLOV8 Based Ensembling Approach
Authors:
Nazmus Sakib Ahmed,
Saad Sakib Noor,
Ashraful Islam Shanto Sikder,
Abhijit Paul
Abstract:
This paper focuses on enhancing Bengali Document Layout Analysis (DLA) using the YOLOv8 model and innovative post-processing techniques. We tackle challenges unique to the complex Bengali script by employing data augmentation for model robustness. After meticulous validation set evaluation, we fine-tune our approach on the complete dataset, leading to a two-stage prediction strategy for accurate e…
▽ More
This paper focuses on enhancing Bengali Document Layout Analysis (DLA) using the YOLOv8 model and innovative post-processing techniques. We tackle challenges unique to the complex Bengali script by employing data augmentation for model robustness. After meticulous validation set evaluation, we fine-tune our approach on the complete dataset, leading to a two-stage prediction strategy for accurate element segmentation. Our ensemble model, combined with post-processing, outperforms individual base architectures, addressing issues identified in the BaDLAD dataset. By leveraging this approach, we aim to advance Bengali document analysis, contributing to improved OCR and document comprehension and BaDLAD serves as a foundational resource for this endeavor, aiding future research in the field. Furthermore, our experiments provided key insights to incorporate new strategies into the established solution.
△ Less
Submitted 29 April, 2024; v1 submitted 2 September, 2023;
originally announced September 2023.
-
Few-shot Diagnosis of Chest x-rays Using an Ensemble of Random Discriminative Subspaces
Authors:
Kshitiz,
Garvit Garg,
Angshuman Paul
Abstract:
Due to the scarcity of annotated data in the medical domain, few-shot learning may be useful for medical image analysis tasks. We design a few-shot learning method using an ensemble of random subspaces for the diagnosis of chest x-rays (CXRs). Our design is computationally efficient and almost 1.8 times faster than method that uses the popular truncated singular value decomposition (t-SVD) for sub…
▽ More
Due to the scarcity of annotated data in the medical domain, few-shot learning may be useful for medical image analysis tasks. We design a few-shot learning method using an ensemble of random subspaces for the diagnosis of chest x-rays (CXRs). Our design is computationally efficient and almost 1.8 times faster than method that uses the popular truncated singular value decomposition (t-SVD) for subspace decomposition. The proposed method is trained by minimizing a novel loss function that helps create well-separated clusters of training data in discriminative subspaces. As a result, minimizing the loss maximizes the distance between the subspaces, making them discriminative and assisting in better classification. Experiments on large-scale publicly available CXR datasets yield promising results. Code for the project will be available at https://github.com/Few-shot-Learning-on-chest-x-ray/fsl_subspace.
△ Less
Submitted 31 August, 2023;
originally announced September 2023.
-
A Data-Theoretic Approach to Identifying Violent Facial Expressions in Social Crime Contexts
Authors:
Arindam Kumar Paul
Abstract:
Human Facial Expressions plays an important role in identifying human actions or intention. Facial expressions can represent any specific action of any person and the pattern of violent behavior of any person strongly depends on the geographic region. Here we have designed an automated system by using a Convolutional Neural Network which can detect whether a person has any intention to commit any…
▽ More
Human Facial Expressions plays an important role in identifying human actions or intention. Facial expressions can represent any specific action of any person and the pattern of violent behavior of any person strongly depends on the geographic region. Here we have designed an automated system by using a Convolutional Neural Network which can detect whether a person has any intention to commit any crime or not. Here we proposed a new method that can identify criminal intentions or violent behavior of any person before executing crimes more efficiently by using very little data on facial expressions before executing a crime or any violent tasks. Instead of using image features which is a time-consuming and faulty method we used an automated feature selector Convolutional Neural Network model which can capture exact facial expressions for training and then can predict that target facial expressions more accurately. Here we used only the facial data of a specific geographic region which can represent the violent and before-crime before-crime facial patterns of the people of the whole region.
△ Less
Submitted 19 December, 2023; v1 submitted 16 August, 2023;
originally announced August 2023.
-
On efficient computation in active inference
Authors:
Aswin Paul,
Noor Sajid,
Lancelot Da Costa,
Adeel Razi
Abstract:
Despite being recognized as neurobiologically plausible, active inference faces difficulties when employed to simulate intelligent behaviour in complex environments due to its computational cost and the difficulty of specifying an appropriate target distribution for the agent. This paper introduces two solutions that work in concert to address these limitations. First, we present a novel planning…
▽ More
Despite being recognized as neurobiologically plausible, active inference faces difficulties when employed to simulate intelligent behaviour in complex environments due to its computational cost and the difficulty of specifying an appropriate target distribution for the agent. This paper introduces two solutions that work in concert to address these limitations. First, we present a novel planning algorithm for finite temporal horizons with drastically lower computational complexity. Second, inspired by Z-learning from control theory literature, we simplify the process of setting an appropriate target distribution for new and existing active inference planning schemes. Our first approach leverages the dynamic programming algorithm, known for its computational efficiency, to minimize the cost function used in planning through the Bellman-optimality principle. Accordingly, our algorithm recursively assesses the expected free energy of actions in the reverse temporal order. This improves computational efficiency by orders of magnitude and allows precise model learning and planning, even under uncertain conditions. Our method simplifies the planning process and shows meaningful behaviour even when specifying only the agent's final goal state. The proposed solutions make defining a target distribution from a goal state straightforward compared to the more complicated task of defining a temporally informed target distribution. The effectiveness of these methods is tested and demonstrated through simulations in standard grid-world tasks. These advances create new opportunities for various applications.
△ Less
Submitted 2 July, 2023;
originally announced July 2023.
-
Torrent Driven (TD) Coin: A Crypto Coin with Built In Distributed Data Storage System
Authors:
Anirudha Paul
Abstract:
In recent years decentralized currencies developed through Blockchains are increasingly becoming popular because of their transparent nature and absence of a central controlling authority. Though a lot of computation power, disk space, and energy are being used to run this system, most of these resources are dedicated to just keeping the bad actors away by using Proof of Work, Proof of Stake, Proo…
▽ More
In recent years decentralized currencies developed through Blockchains are increasingly becoming popular because of their transparent nature and absence of a central controlling authority. Though a lot of computation power, disk space, and energy are being used to run this system, most of these resources are dedicated to just keeping the bad actors away by using Proof of Work, Proof of Stake, Proof of Space, etc., consensus. In this paper, we discuss a way to combine those consensus mechanism and modify the defense system to create actual values for the end-users by providing a solution for securely storing their data in a decentralized manner without compromising the integrity of the blockchain.
△ Less
Submitted 12 May, 2023;
originally announced May 2023.
-
A System for Generalized 3D Multi-Object Search
Authors:
Kaiyu Zheng,
Anirudha Paul,
Stefanie Tellex
Abstract:
Searching for objects is a fundamental skill for robots. As such, we expect object search to eventually become an off-the-shelf capability for robots, similar to e.g., object detection and SLAM. In contrast, however, no system for 3D object search exists that generalizes across real robots and environments. In this paper, building upon a recent theoretical framework that exploited the octree struc…
▽ More
Searching for objects is a fundamental skill for robots. As such, we expect object search to eventually become an off-the-shelf capability for robots, similar to e.g., object detection and SLAM. In contrast, however, no system for 3D object search exists that generalizes across real robots and environments. In this paper, building upon a recent theoretical framework that exploited the octree structure for representing belief in 3D, we present GenMOS (Generalized Multi-Object Search), the first general-purpose system for multi-object search (MOS) in a 3D region that is robot-independent and environment-agnostic. GenMOS takes as input point cloud observations of the local region, object detection results, and localization of the robot's view pose, and outputs a 6D viewpoint to move to through online planning. In particular, GenMOS uses point cloud observations in three ways: (1) to simulate occlusion; (2) to inform occupancy and initialize octree belief; and (3) to sample a belief-dependent graph of view positions that avoid obstacles. We evaluate our system both in simulation and on two real robot platforms. Our system enables, for example, a Boston Dynamics Spot robot to find a toy cat hidden underneath a couch in under one minute. We further integrate 3D local search with 2D global search to handle larger areas, demonstrating the resulting system in a 25m$^2$ lobby area.
△ Less
Submitted 17 April, 2023; v1 submitted 6 March, 2023;
originally announced March 2023.
-
Automatically Classifying Emotions based on Text: A Comparative Exploration of Different Datasets
Authors:
Anna Koufakou,
Jairo Garciga,
Adam Paul,
Joseph Morelli,
Christopher Frank
Abstract:
Emotion Classification based on text is a task with many applications which has received growing interest in recent years. This paper presents a preliminary study with the goal to help researchers and practitioners gain insight into relatively new datasets as well as emotion classification in general. We focus on three datasets that were recently presented in the related literature, and we explore…
▽ More
Emotion Classification based on text is a task with many applications which has received growing interest in recent years. This paper presents a preliminary study with the goal to help researchers and practitioners gain insight into relatively new datasets as well as emotion classification in general. We focus on three datasets that were recently presented in the related literature, and we explore the performance of traditional as well as state-of-the-art deep learning models in the presence of different characteristics in the data. We also explore the use of data augmentation in order to improve performance. Our experimental work shows that state-of-the-art models such as RoBERTa perform the best for all cases. We also provide observations and discussion that highlight the complexity of emotion classification in these datasets and test out the applicability of the models to actual social media posts we collected and labeled.
△ Less
Submitted 28 February, 2023;
originally announced February 2023.
-
Understanding the effect of varying amounts of replay per step
Authors:
Animesh Kumar Paul,
Videh Raj Nema
Abstract:
Model-based reinforcement learning uses models to plan, where the predictions and policies of an agent can be improved by using more computation without additional data from the environment, thereby improving sample efficiency. However, learning accurate estimates of the model is hard. Subsequently, the natural question is whether we can get similar benefits as planning with model-free methods. Ex…
▽ More
Model-based reinforcement learning uses models to plan, where the predictions and policies of an agent can be improved by using more computation without additional data from the environment, thereby improving sample efficiency. However, learning accurate estimates of the model is hard. Subsequently, the natural question is whether we can get similar benefits as planning with model-free methods. Experience replay is an essential component of many model-free algorithms enabling sample-efficient learning and stability by providing a mechanism to store past experiences for further reuse in the gradient computational process. Prior works have established connections between models and experience replay by planning with the latter. This involves increasing the number of times a mini-batch is sampled and used for updates at each step (amount of replay per step). We attempt to exploit this connection by doing a systematic study on the effect of varying amounts of replay per step in a well-known model-free algorithm: Deep Q-Network (DQN) in the Mountain Car environment. We empirically show that increasing replay improves DQN's sample efficiency, reduces the variation in its performance, and makes it more robust to change in hyperparameters. Altogether, this takes a step toward a better algorithm for deployment.
△ Less
Submitted 20 February, 2023;
originally announced February 2023.
-
Table Tennis Stroke Detection and Recognition Using Ball Trajectory Data
Authors:
Kaustubh Milind Kulkarni,
Rohan S Jamadagni,
Jeffrey Aaron Paul,
Sucheth Shenoy
Abstract:
In this work, the novel task of detecting and classifying table tennis strokes solely using the ball trajectory has been explored. A single camera setup positioned in the umpire's view has been employed to procure a dataset consisting of six stroke classes executed by four professional table tennis players. Ball tracking using YOLOv4, a traditional object detection model, and TrackNetv2, a tempora…
▽ More
In this work, the novel task of detecting and classifying table tennis strokes solely using the ball trajectory has been explored. A single camera setup positioned in the umpire's view has been employed to procure a dataset consisting of six stroke classes executed by four professional table tennis players. Ball tracking using YOLOv4, a traditional object detection model, and TrackNetv2, a temporal heatmap based model, have been implemented on our dataset and their performances have been benchmarked. A mathematical approach developed to extract temporal boundaries of strokes using the ball trajectory data yielded a total of 2023 valid strokes in our dataset, while also detecting services and missed strokes successfully. The temporal convolutional network developed performed stroke recognition on completely unseen data with an accuracy of 87.155%. Several machine learning and deep learning based model architectures have been trained for stroke recognition using ball trajectory input and benchmarked based on their performances. While stroke recognition in the field of table tennis has been extensively explored based on human action recognition using video data focused on the player's actions, the use of ball trajectory data for the same is an unexplored characteristic of the sport. Hence, the motivation behind the work is to demonstrate that meaningful inferences such as stroke detection and recognition can be drawn using minimal input information.
△ Less
Submitted 19 February, 2023;
originally announced February 2023.
-
High-precision regressors for particle physics
Authors:
Fady Bishara,
Ayan Paul,
Jennifer Dy
Abstract:
Monte Carlo simulations of physics processes at particle colliders like the Large Hadron Collider at CERN take up a major fraction of the computational budget. For some simulations, a single data point takes seconds, minutes, or even hours to compute from first principles. Since the necessary number of data points per simulation is on the order of $10^9$ - $10^{12}$, machine learning regressors ca…
▽ More
Monte Carlo simulations of physics processes at particle colliders like the Large Hadron Collider at CERN take up a major fraction of the computational budget. For some simulations, a single data point takes seconds, minutes, or even hours to compute from first principles. Since the necessary number of data points per simulation is on the order of $10^9$ - $10^{12}$, machine learning regressors can be used in place of physics simulators to significantly reduce this computational burden. However, this task requires high-precision regressors that can deliver data with relative errors of less than $1\%$ or even $0.1\%$ over the entire domain of the function. In this paper, we develop optimal training strategies and tune various machine learning regressors to satisfy the high-precision requirement. We leverage symmetry arguments from particle physics to optimize the performance of the regressors. Inspired by ResNets, we design a Deep Neural Network with skip connections that outperform fully connected Deep Neural Networks. We find that at lower dimensions, boosted decision trees far outperform neural networks while at higher dimensions neural networks perform significantly better. We show that these regressors can speed up simulations by a factor of $10^3$ - $10^6$ over the first-principles computations currently used in Monte Carlo simulations. Additionally, using symmetry arguments derived from particle physics, we reduce the number of regressors necessary for each simulation by an order of magnitude. Our work can significantly reduce the training and storage burden of Monte Carlo simulations at current and future collider experiments.
△ Less
Submitted 2 February, 2023;
originally announced February 2023.
-
Sequential parametrized motion planning and its complexity, II
Authors:
Michael Farber,
Amit Kumar Paul
Abstract:
This is a continuation of our recent paper in which we developed the theory of sequential parametrized motion planning. A sequential parametrized motion planning algorithm produced a motion of the system which is required to visit a prescribed sequence of states, in a certain order, at specified moments of time. In the previous publication we analysed the sequential parametrized topological comple…
▽ More
This is a continuation of our recent paper in which we developed the theory of sequential parametrized motion planning. A sequential parametrized motion planning algorithm produced a motion of the system which is required to visit a prescribed sequence of states, in a certain order, at specified moments of time. In the previous publication we analysed the sequential parametrized topological complexity of the Fadell - Neuwirth fibration which in relevant to the problem of moving multiple robots avoiding collisions with other robots and with obstacles in the Euclidean space. Besides, in the preceeding paper we found the sequential parametrised topological complexity of the Fadell - Neuwirth bundle for the case of the Euclidean space $\Bbb R^d$ of odd dimension as well as the case $d=2$. In the present paper we give the complete answer for an arbitrary $d\ge 2$ even. Moreover, we present an explicit motion planning algorithm for controlling multiple robots in $\Bbb R^d$ having the minimal possible topological complexity; this algorithm is applicable to any number $n$ of robots and any number $m\ge 2$ of obstacles.
△ Less
Submitted 2 December, 2022;
originally announced December 2022.
-
How Real is Real: Evaluating the Robustness of Real-World Super Resolution
Authors:
Athiya Deviyani,
Efe Sinan Hoplamaz,
Alan Savio Paul
Abstract:
Image super-resolution (SR) is a field in computer vision that focuses on reconstructing high-resolution images from the respective low-resolution image. However, super-resolution is a well-known ill-posed problem as most methods rely on the downsampling method performed on the high-resolution image to form the low-resolution image to be known. Unfortunately, this is not something that is availabl…
▽ More
Image super-resolution (SR) is a field in computer vision that focuses on reconstructing high-resolution images from the respective low-resolution image. However, super-resolution is a well-known ill-posed problem as most methods rely on the downsampling method performed on the high-resolution image to form the low-resolution image to be known. Unfortunately, this is not something that is available in real-life super-resolution applications such as increasing the quality of a photo taken on a mobile phone. In this paper we will evaluate multiple state-of-the-art super-resolution methods and gauge their performance when presented with various types of real-life images and discuss the benefits and drawbacks of each method. We also introduce a novel dataset, WideRealSR, containing real images from a wide variety of sources. Finally, through careful experimentation and evaluation, we will present a potential solution to alleviate the generalization problem which is imminent in most state-of-the-art super-resolution models.
△ Less
Submitted 22 October, 2022;
originally announced October 2022.
-
Synthetic Dataset Generation for Adversarial Machine Learning Research
Authors:
Xiruo Liu,
Shibani Singh,
Cory Cornelius,
Colin Busho,
Mike Tan,
Anindya Paul,
Jason Martin
Abstract:
Existing adversarial example research focuses on digitally inserted perturbations on top of existing natural image datasets. This construction of adversarial examples is not realistic because it may be difficult, or even impossible, for an attacker to deploy such an attack in the real-world due to sensing and environmental effects. To better understand adversarial examples against cyber-physical s…
▽ More
Existing adversarial example research focuses on digitally inserted perturbations on top of existing natural image datasets. This construction of adversarial examples is not realistic because it may be difficult, or even impossible, for an attacker to deploy such an attack in the real-world due to sensing and environmental effects. To better understand adversarial examples against cyber-physical systems, we propose approximating the real-world through simulation. In this paper we describe our synthetic dataset generation tool that enables scalable collection of such a synthetic dataset with realistic adversarial examples. We use the CARLA simulator to collect such a dataset and demonstrate simulated attacks that undergo the same environmental transforms and processing as real-world images. Our tools have been used to collect datasets to help evaluate the efficacy of adversarial examples, and can be found at https://github.com/carla-simulator/carla/pull/4992.
△ Less
Submitted 21 July, 2022;
originally announced July 2022.
-
Gait Cycle Reconstruction and Human Identification from Occluded Sequences
Authors:
Abhishek Paul,
Manav Mukesh Jain,
Jinesh Jain,
Pratik Chattopadhyay
Abstract:
Gait-based person identification from videos captured at surveillance sites using Computer Vision-based techniques is quite challenging since these walking sequences are usually corrupted with occlusion, and a complete cycle of gait is not always available. In this work, we propose an effective neural network-based model to reconstruct the occluded frames in an input sequence before carrying out g…
▽ More
Gait-based person identification from videos captured at surveillance sites using Computer Vision-based techniques is quite challenging since these walking sequences are usually corrupted with occlusion, and a complete cycle of gait is not always available. In this work, we propose an effective neural network-based model to reconstruct the occluded frames in an input sequence before carrying out gait recognition. Specifically, we employ LSTM networks to predict an embedding for each occluded frame both from the forward and the backward directions, and next fuse the predictions from the two LSTMs by employing a network of residual blocks and convolutional layers. While the LSTMs are trained to minimize the mean-squared loss, the fusion network is trained to optimize the pixel-wise cross-entropy loss between the ground-truth and the reconstructed samples. Evaluation of our approach has been done using synthetically occluded sequences generated from the OU-ISIR LP and CASIA-B data and real-occluded sequences present in the TUM-IITKGP data. The effectiveness of the proposed reconstruction model has been verified through the Dice score and gait-based recognition accuracy using some popular gait recognition methods. Comparative study with existing occlusion handling methods in gait recognition highlights the superiority of our proposed occlusion reconstruction approach over the others.
△ Less
Submitted 20 June, 2022;
originally announced June 2022.
-
Learning in Feedback-driven Recurrent Spiking Neural Networks using full-FORCE Training
Authors:
Ankita Paul,
Stefan Wagner,
Anup Das
Abstract:
Feedback-driven recurrent spiking neural networks (RSNNs) are powerful computational models that can mimic dynamical systems. However, the presence of a feedback loop from the readout to the recurrent layer de-stabilizes the learning mechanism and prevents it from converging. Here, we propose a supervised training procedure for RSNNs, where a second network is introduced only during the training,…
▽ More
Feedback-driven recurrent spiking neural networks (RSNNs) are powerful computational models that can mimic dynamical systems. However, the presence of a feedback loop from the readout to the recurrent layer de-stabilizes the learning mechanism and prevents it from converging. Here, we propose a supervised training procedure for RSNNs, where a second network is introduced only during the training, to provide hint for the target dynamics. The proposed training procedure consists of generating targets for both recurrent and readout layers (i.e., for a full RSNN system). It uses the recursive least square-based First-Order and Reduced Control Error (FORCE) algorithm to fit the activity of each layer to its target. The proposed full-FORCE training procedure reduces the amount of modifications needed to keep the error between the output and target close to zero. These modifications control the feedback loop, which causes the training to converge. We demonstrate the improved performance and noise robustness of the proposed full-FORCE training procedure to model 8 dynamical systems using RSNNs with leaky integrate and fire (LIF) neurons and rate coding. For energy-efficient hardware implementation, an alternative time-to-first-spike (TTFS) coding is implemented for the full- FORCE training procedure. Compared to rate coding, full-FORCE with TTFS coding generates fewer spikes and facilitates faster convergence to the target dynamics.
△ Less
Submitted 26 May, 2022;
originally announced May 2022.
-
A Design Methodology for Fault-Tolerant Computing using Astrocyte Neural Networks
Authors:
Murat Işık,
Ankita Paul,
M. Lakshmi Varshika,
Anup Das
Abstract:
We propose a design methodology to facilitate fault tolerance of deep learning models. First, we implement a many-core fault-tolerant neuromorphic hardware design, where neuron and synapse circuitries in each neuromorphic core are enclosed with astrocyte circuitries, the star-shaped glial cells of the brain that facilitate self-repair by restoring the spike firing frequency of a failed neuron usin…
▽ More
We propose a design methodology to facilitate fault tolerance of deep learning models. First, we implement a many-core fault-tolerant neuromorphic hardware design, where neuron and synapse circuitries in each neuromorphic core are enclosed with astrocyte circuitries, the star-shaped glial cells of the brain that facilitate self-repair by restoring the spike firing frequency of a failed neuron using a closed-loop retrograde feedback signal. Next, we introduce astrocytes in a deep learning model to achieve the required degree of tolerance to hardware faults. Finally, we use a system software to partition the astrocyte-enabled model into clusters and implement them on the proposed fault-tolerant neuromorphic design. We evaluate this design methodology using seven deep learning inference models and show that it is both area and power efficient.
△ Less
Submitted 6 April, 2022;
originally announced April 2022.
-
Bangla hate speech detection on social media using attention-based recurrent neural network
Authors:
Amit Kumar Das,
Abdullah Al Asif,
Anik Paul,
Md. Nur Hossain
Abstract:
Hate speech has spread more rapidly through the daily use of technology and, most notably, by sharing your opinions or feelings on social media in a negative aspect. Although numerous works have been carried out in detecting hate speeches in English, German, and other languages, very few works have been carried out in the context of the Bengali language. In contrast, millions of people communicate…
▽ More
Hate speech has spread more rapidly through the daily use of technology and, most notably, by sharing your opinions or feelings on social media in a negative aspect. Although numerous works have been carried out in detecting hate speeches in English, German, and other languages, very few works have been carried out in the context of the Bengali language. In contrast, millions of people communicate on social media in Bengali. The few existing works that have been carried out need improvements in both accuracy and interpretability. This article proposed encoder decoder based machine learning model, a popular tool in NLP, to classify user's Bengali comments on Facebook pages. A dataset of 7,425 Bengali comments, consisting of seven distinct categories of hate speeches, was used to train and evaluate our model. For extracting and encoding local features from the comments, 1D convolutional layers were used. Finally, the attention mechanism, LSTM, and GRU based decoders have been used for predicting hate speech categories. Among the three encoder decoder algorithms, the attention-based decoder obtained the best accuracy (77%).
△ Less
Submitted 30 March, 2022;
originally announced March 2022.
-
Interpretable machine learning in Physics
Authors:
Christophe Grojean,
Ayan Paul,
Zhuoni Qian,
Inga Strümke
Abstract:
Adding interpretability to multivariate methods creates a powerful synergy for exploring complex physical systems with higher order correlations while bringing about a degree of clarity in the underlying dynamics of the system.
Adding interpretability to multivariate methods creates a powerful synergy for exploring complex physical systems with higher order correlations while bringing about a degree of clarity in the underlying dynamics of the system.
△ Less
Submitted 2 May, 2022; v1 submitted 11 March, 2022;
originally announced March 2022.
-
Energy-Efficient Respiratory Anomaly Detection in Premature Newborn Infants
Authors:
Ankita Paul,
Md. Abu Saleh Tajin,
Anup Das,
William M. Mongan,
Kapil R. Dandekar
Abstract:
Precise monitoring of respiratory rate in premature infants is essential to initiate medical interventions as required. Wired technologies can be invasive and obtrusive to the patients. We propose a Deep Learning enabled wearable monitoring system for premature newborn infants, where respiratory cessation is predicted using signals that are collected wirelessly from a non-invasive wearable Bellypa…
▽ More
Precise monitoring of respiratory rate in premature infants is essential to initiate medical interventions as required. Wired technologies can be invasive and obtrusive to the patients. We propose a Deep Learning enabled wearable monitoring system for premature newborn infants, where respiratory cessation is predicted using signals that are collected wirelessly from a non-invasive wearable Bellypatch put on infant's body. We propose a five-stage design pipeline involving data collection and labeling, feature scaling, model selection with hyperparameter tuning, model training and validation, model testing and deployment. The model used is a 1-D Convolutional Neural Network (1DCNN) architecture with 1 convolutional layer, 1 pooling layer and 3 fully-connected layers, achieving 97.15% accuracy. To address energy limitations of wearable processing, several quantization techniques are explored and their performance and energy consumption are analyzed. We propose a novel Spiking-Neural-Network(SNN) based respiratory classification solution, which can be implemented on event-driven neuromorphic hardware. We propose an approach to convert the analog operations of our baseline 1DCNN to their spiking equivalent. We perform a design-space exploration using the parameters of the converted SNN to generate inference solutions having different accuracy and energy footprints. We select a solution that achieves 93.33% accuracy with 18 times lower energy compared with baseline 1DCNN model. Additionally the proposed SNN solution achieves similar accuracy but with 4 times less energy.
△ Less
Submitted 21 February, 2022;
originally announced February 2022.
-
Implementing Spiking Neural Networks on Neuromorphic Architectures: A Review
Authors:
Phu Khanh Huynh,
M. Lakshmi Varshika,
Ankita Paul,
Murat Isik,
Adarsha Balaji,
Anup Das
Abstract:
Recently, both industry and academia have proposed several different neuromorphic systems to execute machine learning applications that are designed using Spiking Neural Networks (SNNs). With the growing complexity on design and technology fronts, programming such systems to admit and execute a machine learning application is becoming increasingly challenging. Additionally, neuromorphic systems ar…
▽ More
Recently, both industry and academia have proposed several different neuromorphic systems to execute machine learning applications that are designed using Spiking Neural Networks (SNNs). With the growing complexity on design and technology fronts, programming such systems to admit and execute a machine learning application is becoming increasingly challenging. Additionally, neuromorphic systems are required to guarantee real-time performance, consume lower energy, and provide tolerance to logic and memory failures. Consequently, there is a clear need for system software frameworks that can implement machine learning applications on current and emerging neuromorphic systems, and simultaneously address performance, energy, and reliability. Here, we provide a comprehensive overview of such frameworks proposed for both, platform-based design and hardware-software co-design. We highlight challenges and opportunities that the future holds in the area of system software technology for neuromorphic computing.
△ Less
Submitted 17 February, 2022;
originally announced February 2022.
-
On the Mitigation of Read Disturbances in Neuromorphic Inference Hardware
Authors:
Ankita Paul,
Shihao Song,
Twisha Titirsha,
Anup Das
Abstract:
Non-Volatile Memory (NVM) cells are used in neuromorphic hardware to store model parameters, which are programmed as resistance states. NVMs suffer from the read disturb issue, where the programmed resistance state drifts upon repeated access of a cell during inference. Resistance drifts can lower the inference accuracy. To address this, it is necessary to periodically reprogram model parameters (…
▽ More
Non-Volatile Memory (NVM) cells are used in neuromorphic hardware to store model parameters, which are programmed as resistance states. NVMs suffer from the read disturb issue, where the programmed resistance state drifts upon repeated access of a cell during inference. Resistance drifts can lower the inference accuracy. To address this, it is necessary to periodically reprogram model parameters (a high overhead operation). We study read disturb failures of an NVM cell. Our analysis show both a strong dependency on model characteristics such as synaptic activation and criticality, and on the voltage used to read resistance states during inference. We propose a system software framework to incorporate such dependencies in programming model parameters on NVM cells of a neuromorphic hardware. Our framework consists of a convex optimization formulation which aims to implement synaptic weights that have more activations and are critical, i.e., those that have high impact on accuracy on NVM cells that are exposed to lower voltages during inference. In this way, we increase the time interval between two consecutive reprogramming of model parameters. We evaluate our system software with many emerging inference models on a neuromorphic hardware simulator and show a significant reduction in the system overhead.
△ Less
Submitted 27 January, 2022;
originally announced January 2022.
-
Socioeconomic disparities and COVID-19: the causal connections
Authors:
Tannista Banerjee,
Ayan Paul,
Vishak Srikanth,
Inga Strümke
Abstract:
The analysis of causation is a challenging task that can be approached in various ways. With the increasing use of machine learning based models in computational socioeconomics, explaining these models while taking causal connections into account is a necessity. In this work, we advocate the use of an explanatory framework from cooperative game theory augmented with $do$ calculus, namely causal Sh…
▽ More
The analysis of causation is a challenging task that can be approached in various ways. With the increasing use of machine learning based models in computational socioeconomics, explaining these models while taking causal connections into account is a necessity. In this work, we advocate the use of an explanatory framework from cooperative game theory augmented with $do$ calculus, namely causal Shapley values. Using causal Shapley values, we analyze socioeconomic disparities that have a causal link to the spread of COVID-19 in the USA. We study several phases of the disease spread to show how the causal connections change over time. We perform a causal analysis using random effects models and discuss the correspondence between the two methods to verify our results. We show the distinct advantages a non-linear machine learning models have over linear models when performing a multivariate analysis, especially since the machine learning models can map out non-linear correlations in the data. In addition, the causal Shapley values allow for including the causal structure in the variable importance computed for the machine learning model.
△ Less
Submitted 18 January, 2022;
originally announced January 2022.
-
Nonnegative Matrix Factorization to understand Spatio-Temporal Traffic Pattern Variations during COVID-19: A Case Study
Authors:
Anandkumar Balasubramaniam,
Thirunavukarasu Balasubramaniam,
Rathinaraja Jeyaraj,
Anand Paul,
Richi Nayak
Abstract:
Due to the rapid developments in Intelligent Transportation System (ITS) and increasing trend in the number of vehicles on road, abundant of road traffic data is generated and available. Understanding spatio-temporal traffic patterns from this data is crucial and has been effectively helping in traffic plannings, road constructions, etc. However, understanding traffic patterns during COVID-19 pand…
▽ More
Due to the rapid developments in Intelligent Transportation System (ITS) and increasing trend in the number of vehicles on road, abundant of road traffic data is generated and available. Understanding spatio-temporal traffic patterns from this data is crucial and has been effectively helping in traffic plannings, road constructions, etc. However, understanding traffic patterns during COVID-19 pandemic is quite challenging and important as there is a huge difference in-terms of people's and vehicle's travel behavioural patterns. In this paper, a case study is conducted to understand the variations in spatio-temporal traffic patterns during COVID-19. We apply nonnegative matrix factorization (NMF) to elicit patterns. The NMF model outputs are analysed based on the spatio-temporal pattern behaviours observed during the year 2019 and 2020, which is before pandemic and during pandemic situations respectively, in Great Britain. The outputs of the analysed spatio-temporal traffic pattern variation behaviours will be useful in the fields of traffic management in Intelligent Transportation System and management in various stages of pandemic or unavoidable scenarios in-relation to road traffic.
△ Less
Submitted 5 November, 2021;
originally announced November 2021.
-
Design Technology Co-Optimization for Neuromorphic Computing
Authors:
Ankita Paul,
Shihao Song,
Anup Das
Abstract:
We present a design-technology tradeoff analysis in implementing machine-learning inference on the processing cores of a Non-Volatile Memory (NVM)-based many-core neuromorphic hardware. Through detailed circuit-level simulations for scaled process technology nodes, we show the negative impact of design scaling on read endurance of NVMs, which directly impacts their inference lifetime. At a finer g…
▽ More
We present a design-technology tradeoff analysis in implementing machine-learning inference on the processing cores of a Non-Volatile Memory (NVM)-based many-core neuromorphic hardware. Through detailed circuit-level simulations for scaled process technology nodes, we show the negative impact of design scaling on read endurance of NVMs, which directly impacts their inference lifetime. At a finer granularity, the inference lifetime of a core depends on 1) the resistance state of synaptic weights programmed on the core (design) and 2) the voltage variation inside the core that is introduced by the parasitic components on current paths (technology). We show that such design and technology characteristics can be incorporated in a design flow to significantly improve the inference lifetime.
△ Less
Submitted 15 October, 2021;
originally announced October 2021.
-
Local and Global Context-Based Pairwise Models for Sentence Ordering
Authors:
Ruskin Raj Manku,
Aditya Jyoti Paul
Abstract:
Sentence Ordering refers to the task of rearranging a set of sentences into the appropriate coherent order. For this task, most previous approaches have explored global context-based end-to-end methods using Sequence Generation techniques. In this paper, we put forward a set of robust local and global context-based pairwise ordering strategies, leveraging which our prediction strategies outperform…
▽ More
Sentence Ordering refers to the task of rearranging a set of sentences into the appropriate coherent order. For this task, most previous approaches have explored global context-based end-to-end methods using Sequence Generation techniques. In this paper, we put forward a set of robust local and global context-based pairwise ordering strategies, leveraging which our prediction strategies outperform all previous works in this domain. Our proposed encoding method utilizes the paragraph's rich global contextual information to predict the pairwise order using novel transformer architectures. Analysis of the two proposed decoding strategies helps better explain error propagation in pairwise models. This approach is the most accurate pure pairwise model and our encoding strategy also significantly improves the performance of other recent approaches that use pairwise models, including the previous state-of-the-art, demonstrating the research novelty and generalizability of this work. Additionally, we show how the pre-training task for ALBERT helps it to significantly outperform BERT, despite having considerably lesser parameters. The extensive experimental results, architectural analysis and ablation studies demonstrate the effectiveness and superiority of the proposed models compared to the previous state-of-the-art, besides providing a much better understanding of the functioning of pairwise models.
△ Less
Submitted 20 August, 2022; v1 submitted 8 October, 2021;
originally announced October 2021.
-
Active Inference for Stochastic Control
Authors:
Aswin Paul,
Noor Sajid,
Manoj Gopalkrishnan,
Adeel Razi
Abstract:
Active inference has emerged as an alternative approach to control problems given its intuitive (probabilistic) formalism. However, despite its theoretical utility, computational implementations have largely been restricted to low-dimensional, deterministic settings. This paper highlights that this is a consequence of the inability to adequately model stochastic transition dynamics, particularly w…
▽ More
Active inference has emerged as an alternative approach to control problems given its intuitive (probabilistic) formalism. However, despite its theoretical utility, computational implementations have largely been restricted to low-dimensional, deterministic settings. This paper highlights that this is a consequence of the inability to adequately model stochastic transition dynamics, particularly when an extensive policy (i.e., action trajectory) space must be evaluated during planning. Fortunately, recent advancements propose a modified planning algorithm for finite temporal horizons. We build upon this work to assess the utility of active inference for a stochastic control setting. For this, we simulate the classic windy grid-world task with additional complexities, namely: 1) environment stochasticity; 2) learning of transition dynamics; and 3) partial observability. Our results demonstrate the advantage of using active inference, compared to reinforcement learning, in both deterministic and stochastic settings.
△ Less
Submitted 27 August, 2021;
originally announced August 2021.
-
Machine Learning Advances aiding Recognition and Classification of Indian Monuments and Landmarks
Authors:
Aditya Jyoti Paul,
Smaranjit Ghose,
Kanishka Aggarwal,
Niketha Nethaji,
Shivam Pal,
Arnab Dutta Purkayastha
Abstract:
Tourism in India plays a quintessential role in the country's economy with an estimated 9.2% GDP share for the year 2018. With a yearly growth rate of 6.2%, the industry holds a huge potential for being the primary driver of the economy as observed in the nations of the Middle East like the United Arab Emirates. The historical and cultural diversity exhibited throughout the geography of the nation…
▽ More
Tourism in India plays a quintessential role in the country's economy with an estimated 9.2% GDP share for the year 2018. With a yearly growth rate of 6.2%, the industry holds a huge potential for being the primary driver of the economy as observed in the nations of the Middle East like the United Arab Emirates. The historical and cultural diversity exhibited throughout the geography of the nation is a unique spectacle for people around the world and therefore serves to attract tourists in tens of millions in number every year. Traditionally, tour guides or academic professionals who study these heritage monuments were responsible for providing information to the visitors regarding their architectural and historical significance. However, unfortunately this system has several caveats when considered on a large scale such as unavailability of sufficient trained people, lack of accurate information, failure to convey the richness of details in an attractive format etc. Recently, machine learning approaches revolving around the usage of monument pictures have been shown to be useful for rudimentary analysis of heritage sights. This paper serves as a survey of the research endeavors undertaken in this direction which would eventually provide insights for building an automated decision system that could be utilized to make the experience of tourism in India more modernized for visitors.
△ Less
Submitted 29 July, 2021;
originally announced July 2021.
-
The Need and Status of Sea Turtle Conservation and Survey of Associated Computer Vision Advances
Authors:
Aditya Jyoti Paul
Abstract:
For over hundreds of millions of years, sea turtles and their ancestors have swum in the vast expanses of the ocean. They have undergone a number of evolutionary changes, leading to speciation and sub-speciation. However, in the past few decades, some of the most notable forces driving the genetic variance and population decline have been global warming and anthropogenic impact ranging from large-…
▽ More
For over hundreds of millions of years, sea turtles and their ancestors have swum in the vast expanses of the ocean. They have undergone a number of evolutionary changes, leading to speciation and sub-speciation. However, in the past few decades, some of the most notable forces driving the genetic variance and population decline have been global warming and anthropogenic impact ranging from large-scale poaching, collecting turtle eggs for food, besides dumping trash including plastic waste into the ocean. This leads to severe detrimental effects in the sea turtle population, driving them to extinction. This research focusses on the forces causing the decline in sea turtle population, the necessity for the global conservation efforts along with its successes and failures, followed by an in-depth analysis of the modern advances in detection and recognition of sea turtles, involving Machine Learning and Computer Vision systems, aiding the conservation efforts.
△ Less
Submitted 29 July, 2021;
originally announced July 2021.
-
Extraction of Key-frames of Endoscopic Videos by using Depth Information
Authors:
Pradipta Sasmal,
Avinash Paul,
M. K. Bhuyan,
Yuji Iwahori
Abstract:
A deep learning-based monocular depth estimation (MDE) technique is proposed for selection of most informative frames (key frames) of an endoscopic video. In most of the cases, ground truth depth maps of polyps are not readily available and that is why the transfer learning approach is adopted in our method. An endoscopic modalities generally capture thousands of frames. In this scenario, it is qu…
▽ More
A deep learning-based monocular depth estimation (MDE) technique is proposed for selection of most informative frames (key frames) of an endoscopic video. In most of the cases, ground truth depth maps of polyps are not readily available and that is why the transfer learning approach is adopted in our method. An endoscopic modalities generally capture thousands of frames. In this scenario, it is quite important to discard low-quality and clinically irrelevant frames of an endoscopic video while the most informative frames should be retained for clinical diagnosis. In this view, a key-frame selection strategy is proposed by utilizing the depth information of polyps. In our method, image moment, edge magnitude, and key-points are considered for adaptively selecting the key frames. One important application of our proposed method could be the 3D reconstruction of polyps with the help of extracted key frames. Also, polyps are localized with the help of extracted depth maps.
△ Less
Submitted 30 June, 2021;
originally announced July 2021.
-
Advances in Classifying the Stages of Diabetic Retinopathy Using Convolutional Neural Networks in Low Memory Edge Devices
Authors:
Aditya Jyoti Paul
Abstract:
Diabetic Retinopathy (DR) is a severe complication that may lead to retinal vascular damage and is one of the leading causes of vision impairment and blindness. DR broadly is classified into two stages - non-proliferative (NPDR), where there are almost no symptoms, except a few microaneurysms, and proliferative (PDR) involving a huge number of microaneurysms and hemorrhages, soft and hard exudates…
▽ More
Diabetic Retinopathy (DR) is a severe complication that may lead to retinal vascular damage and is one of the leading causes of vision impairment and blindness. DR broadly is classified into two stages - non-proliferative (NPDR), where there are almost no symptoms, except a few microaneurysms, and proliferative (PDR) involving a huge number of microaneurysms and hemorrhages, soft and hard exudates, neo-vascularization, macular ischemia or a combination of these, making it easier to detect. More specifically, DR is usually classified into five levels, labeled 0-4, from 0 indicating no DR to 4 which is most severe. This paper firstly presents a discussion on the risk factors of the disease, then surveys the recent literature on the topic followed by examining certain techniques which were found to be highly effective in improving the prognosis accuracy. Finally, a convolutional neural network model is proposed to detect all the stages of DR on a low-memory edge microcontroller. The model has a size of just 5.9 MB, accuracy and F1 score both of 94% and an inference speed of about 20 frames per second.
△ Less
Submitted 3 June, 2021;
originally announced June 2021.
-
SkeletonVis: Interactive Visualization for Understanding Adversarial Attacks on Human Action Recognition Models
Authors:
Haekyu Park,
Zijie J. Wang,
Nilaksh Das,
Anindya S. Paul,
Pruthvi Perumalla,
Zhiyan Zhou,
Duen Horng Chau
Abstract:
Skeleton-based human action recognition technologies are increasingly used in video based applications, such as home robotics, healthcare on aging population, and surveillance. However, such models are vulnerable to adversarial attacks, raising serious concerns for their use in safety-critical applications. To develop an effective defense against attacks, it is essential to understand how such att…
▽ More
Skeleton-based human action recognition technologies are increasingly used in video based applications, such as home robotics, healthcare on aging population, and surveillance. However, such models are vulnerable to adversarial attacks, raising serious concerns for their use in safety-critical applications. To develop an effective defense against attacks, it is essential to understand how such attacks mislead the pose detection models into making incorrect predictions. We present SkeletonVis, the first interactive system that visualizes how the attacks work on the models to enhance human understanding of attacks.
△ Less
Submitted 26 January, 2021;
originally announced January 2021.
-
A General Framework Combining Generative Adversarial Networks and Mixture Density Networks for Inverse Modeling in Microstructural Materials Design
Authors:
Zijiang Yang,
Dipendra Jha,
Arindam Paul,
Wei-keng Liao,
Alok Choudhary,
Ankit Agrawal
Abstract:
Microstructural materials design is one of the most important applications of inverse modeling in materials science. Generally speaking, there are two broad modeling paradigms in scientific applications: forward and inverse. While the forward modeling estimates the observations based on known parameters, the inverse modeling attempts to infer the parameters given the observations. Inverse problems…
▽ More
Microstructural materials design is one of the most important applications of inverse modeling in materials science. Generally speaking, there are two broad modeling paradigms in scientific applications: forward and inverse. While the forward modeling estimates the observations based on known parameters, the inverse modeling attempts to infer the parameters given the observations. Inverse problems are usually more critical as well as difficult in scientific applications as they seek to explore the parameters that cannot be directly observed. Inverse problems are used extensively in various scientific fields, such as geophysics, healthcare and materials science. However, it is challenging to solve inverse problems, because they usually need to learn a one-to-many non-linear mapping, and also require significant computing time, especially for high-dimensional parameter space. Further, inverse problems become even more difficult to solve when the dimension of input (i.e. observation) is much lower than that of output (i.e. parameters). In this work, we propose a framework consisting of generative adversarial networks and mixture density networks for inverse modeling, and it is evaluated on a materials science dataset for microstructural materials design. Compared with baseline methods, the results demonstrate that the proposed framework can overcome the above-mentioned challenges and produce multiple promising solutions in an efficient manner.
△ Less
Submitted 25 January, 2021;
originally announced January 2021.
-
CutLang V2: towards a unified Analysis Description Language
Authors:
G. Unel,
S. Sekmen,
A. M. Toon,
B. Gokturk,
B. Orgen,
A. Paul,
N. Ravel,
J. Setpal
Abstract:
We will present the latest developments in CutLang, the runtime interpreter of a recently-developed analysis description language (ADL) for collider data analysis. ADL is a domain-specific, declarative language that describes the contents of an analysis in a standard and unambiguous way, independent of any computing framework. In ADL, analyses are written in human-readable plain text files, separa…
▽ More
We will present the latest developments in CutLang, the runtime interpreter of a recently-developed analysis description language (ADL) for collider data analysis. ADL is a domain-specific, declarative language that describes the contents of an analysis in a standard and unambiguous way, independent of any computing framework. In ADL, analyses are written in human-readable plain text files, separating object, variable and event selection definitions in blocks, with a syntax that includes mathematical and logical operations, comparison and optimisation operators, reducers, four-vector algebra and commonly used functions. Adopting ADLs would bring numerous benefits to the LHC experimental and phenomenological communities, ranging from analysis preservation beyond the lifetimes of experiments or analysis software to facilitating the abstraction, design, visualization, validation, combination, reproduction, interpretation and overall communication of the analysis contents. Since their initial release, ADL and CutLang have been used for implementing and running numerous LHC analyses. In this process, the original syntax from CutLang v1 has been modified for better ADL compatibility, and the interpreter has been adapted to work with that syntax, resulting in the current release v2. Furthermore, CutLang has been enhanced to handle object combinatorics, to include tables and weights, to save events at any analysis stage, to benefit from multi-core/multi-CPU hardware among other improvements. In this contribution, these and other enhancements are discussed in details. In addition, real life examples from LHC analyses are presented together with a user manual.
△ Less
Submitted 28 July, 2021; v1 submitted 22 January, 2021;
originally announced January 2021.
-
Clustering with Semidefinite Programming and Fixed Point Iteration
Authors:
Pedro Felzenszwalb,
Caroline Klivans,
Alice Paul
Abstract:
We introduce a novel method for clustering using a semidefinite programming (SDP) relaxation of the Max k-Cut problem. The approach is based on a new methodology for rounding the solution of an SDP relaxation using iterated linear optimization. We show the vertices of the Max k-Cut relaxation correspond to partitions of the data into at most k sets. We also show the vertices are attractive fixed p…
▽ More
We introduce a novel method for clustering using a semidefinite programming (SDP) relaxation of the Max k-Cut problem. The approach is based on a new methodology for rounding the solution of an SDP relaxation using iterated linear optimization. We show the vertices of the Max k-Cut relaxation correspond to partitions of the data into at most k sets. We also show the vertices are attractive fixed points of iterated linear optimization. Each step of this iterative process solves a relaxation of the closest vertex problem and leads to a new clustering problem where the underlying clusters are more clearly defined. Our experiments show that using fixed point iteration for rounding the Max k-Cut SDP relaxation leads to significantly better results when compared to randomized rounding.
△ Less
Submitted 5 July, 2022; v1 submitted 16 December, 2020;
originally announced December 2020.
-
An Efficient Analyses of the Behavior of One Dimensional Chaotic Maps using 0-1 Test and Three State Test
Authors:
Joan S. Muthu,
Aditya Jyoti Paul,
P. Murali
Abstract:
In this paper, a rigorous analysis of the behavior of the standard logistic map, Logistic Tent system (LTS), Logistic-Sine system (LSS) and Tent-Sine system (TSS) is performed using 0-1 test and three state test (3ST). In this work, it has been proved that the strength of the chaotic behavior is not uniform. Through extensive experiment and analysis, the strong and weak chaotic regions of LTS, LSS…
▽ More
In this paper, a rigorous analysis of the behavior of the standard logistic map, Logistic Tent system (LTS), Logistic-Sine system (LSS) and Tent-Sine system (TSS) is performed using 0-1 test and three state test (3ST). In this work, it has been proved that the strength of the chaotic behavior is not uniform. Through extensive experiment and analysis, the strong and weak chaotic regions of LTS, LSS and TSS have been identified. This would enable researchers using these maps, to have better choices of control parameters as key values, for stronger encryption. In addition, this paper serves as a precursor to stronger testing practices in cryptosystem research, as Lyapunov exponent alone has been shown to fail as a true representation of the chaotic nature of a map.
△ Less
Submitted 13 February, 2021; v1 submitted 7 December, 2020;
originally announced December 2020.
-
Iterated Linear Optimization
Authors:
Pedro Felzenszwalb,
Caroline Klivans,
Alice Paul
Abstract:
We introduce a fixed point iteration process built on optimization of a linear function over a compact domain. We prove the process always converges to a fixed point and explore the set of fixed points in various convex sets. In particular, we consider elliptopes and derive an algebraic characterization of their fixed points. We show that the attractive fixed points of an elliptope are exactly its…
▽ More
We introduce a fixed point iteration process built on optimization of a linear function over a compact domain. We prove the process always converges to a fixed point and explore the set of fixed points in various convex sets. In particular, we consider elliptopes and derive an algebraic characterization of their fixed points. We show that the attractive fixed points of an elliptope are exactly its vertices. Finally, we discuss how fixed point iteration can be used for rounding the solution of a semidefinite programming relaxation.
△ Less
Submitted 16 March, 2021; v1 submitted 3 December, 2020;
originally announced December 2020.
-
A Tiny CNN Architecture for Medical Face Mask Detection for Resource-Constrained Endpoints
Authors:
Puranjay Mohan,
Aditya Jyoti Paul,
Abhay Chirania
Abstract:
The world is going through one of the most dangerous pandemics of all time with the rapid spread of the novel coronavirus (COVID-19). According to the World Health Organisation, the most effective way to thwart the transmission of coronavirus is to wear medical face masks. Monitoring the use of face masks in public places has been a challenge because manual monitoring could be unsafe. This paper p…
▽ More
The world is going through one of the most dangerous pandemics of all time with the rapid spread of the novel coronavirus (COVID-19). According to the World Health Organisation, the most effective way to thwart the transmission of coronavirus is to wear medical face masks. Monitoring the use of face masks in public places has been a challenge because manual monitoring could be unsafe. This paper proposes an architecture for detecting medical face masks for deployment on resource-constrained endpoints having extremely low memory footprints. A small development board with an ARM Cortex-M7 microcontroller clocked at 480 Mhz and having just 496 KB of framebuffer RAM, has been used for the deployment of the model. Using the TensorFlow Lite framework, the model is quantized to further reduce its size. The proposed model is 138 KB post quantization and runs at the inference speed of 30 FPS.
△ Less
Submitted 3 June, 2021; v1 submitted 30 November, 2020;
originally announced November 2020.
-
Rethinking Generalization in American Sign Language Prediction for Edge Devices with Extremely Low Memory Footprint
Authors:
Aditya Jyoti Paul,
Puranjay Mohan,
Stuti Sehgal
Abstract:
Due to the boom in technical compute in the last few years, the world has seen massive advances in artificially intelligent systems solving diverse real-world problems. But a major roadblock in the ubiquitous acceptance of these models is their enormous computational complexity and memory footprint. Hence efficient architectures and training techniques are required for deployment on extremely low…
▽ More
Due to the boom in technical compute in the last few years, the world has seen massive advances in artificially intelligent systems solving diverse real-world problems. But a major roadblock in the ubiquitous acceptance of these models is their enormous computational complexity and memory footprint. Hence efficient architectures and training techniques are required for deployment on extremely low resource inference endpoints. This paper proposes an architecture for detection of alphabets in American Sign Language on an ARM Cortex-M7 microcontroller having just 496 KB of framebuffer RAM. Leveraging parameter quantization is a common technique that might cause varying drops in test accuracy. This paper proposes using interpolation as augmentation amongst other techniques as an efficient method of reducing this drop, which also helps the model generalize well to previously unseen noisy data. The proposed model is about 185 KB post-quantization and inference speed is 20 frames per second.
△ Less
Submitted 13 February, 2021; v1 submitted 27 November, 2020;
originally announced November 2020.
-
Recent Advances in Selective Image Encryption and its Indispensability due to COVID-19
Authors:
Aditya Jyoti Paul
Abstract:
The COVID-19 pandemic serves as a grim reminder of the unexpected nature of these outbreaks and gives rise to a unique set of research challenges in a variety of fields. As people all over the world adjust to this new 'normal', with most workplaces, from companies to educational institutions shifting online, enormous surges in the transmission of images and videos have been observed, creating reco…
▽ More
The COVID-19 pandemic serves as a grim reminder of the unexpected nature of these outbreaks and gives rise to a unique set of research challenges in a variety of fields. As people all over the world adjust to this new 'normal', with most workplaces, from companies to educational institutions shifting online, enormous surges in the transmission of images and videos have been observed, creating record-breaking stresses on the internet backbone. At the same time, maintaining the privacy and security of the users' data is of immense importance, this is where fast and efficient image encryption algorithms play a vital role. This paper discusses the calamitous effects of the pandemic on the world population and how their changes in multimedia consumption have led to an urgent need for the development and deployment of secure and fast image encryption, especially selective image encryption techniques. It carefully surveys the most recent advances in this field, discusses their real-world effects and finally explores some future research avenues, to provide swift relief and recover from the disastrous effects of the pandemic.
△ Less
Submitted 13 February, 2021; v1 submitted 27 November, 2020;
originally announced November 2020.
-
Randomized fast no-loss expert system to play tic tac toe like a human
Authors:
Aditya Jyoti Paul
Abstract:
This paper introduces a blazingly fast, no-loss expert system for Tic Tac Toe using Decision Trees called T3DT, that tries to emulate human gameplay as closely as possible. It does not make use of any brute force, minimax or evolutionary techniques, but is still always unbeatable. In order to make the gameplay more human-like, randomization is prioritized and T3DT randomly chooses one of the multi…
▽ More
This paper introduces a blazingly fast, no-loss expert system for Tic Tac Toe using Decision Trees called T3DT, that tries to emulate human gameplay as closely as possible. It does not make use of any brute force, minimax or evolutionary techniques, but is still always unbeatable. In order to make the gameplay more human-like, randomization is prioritized and T3DT randomly chooses one of the multiple optimal moves at each step. Since it does not need to analyse the complete game tree at any point, T3DT is exceptionally faster than any brute force or minimax algorithm, this has been shown theoretically as well as empirically from clock-time analyses in this paper. T3DT also doesn't need the data sets or the time to train an evolutionary model, making it a practical no-loss approach to play Tic Tac Toe.
△ Less
Submitted 10 November, 2020; v1 submitted 23 September, 2020;
originally announced September 2020.
-
Multifaceted COVID-19 Outbreak
Authors:
Aneesh Mathews Paul,
Sinnu Susan Thomas
Abstract:
The time when everyone is struggling in the cruel hands of COVID19, we present the holistic view on the effects of this pandemic in certain aspects of life. A lot of literature exists in COVID-19, but most of them talk about the social and psychological side of the COVID problems. COVID-19 has affected our day-to-day life and its effects are extensive. Most of the literature presents the adverse e…
▽ More
The time when everyone is struggling in the cruel hands of COVID19, we present the holistic view on the effects of this pandemic in certain aspects of life. A lot of literature exists in COVID-19, but most of them talk about the social and psychological side of the COVID problems. COVID-19 has affected our day-to-day life and its effects are extensive. Most of the literature presents the adverse effect of the pandemic, but there are very few state-of-the-art approaches that discuss its beneficial effects. We see the multiple faces of the pandemic in this paper. To the best of our knowledge, this is the first review that presents the pros and cons of the pandemic. We present a survey that surrounds over effects on education, environment, and religion. The positive side of COVID-19 raises an alarm for us to wake up and work in that direction.
△ Less
Submitted 23 June, 2022; v1 submitted 26 August, 2020;
originally announced August 2020.
-
I/Q Imbalance Aware Nonlinear Wireless-Powered Relaying of B5G Networks: Security and Reliability Analysis
Authors:
Xingwang Li,
Mengyan Huang,
Yuanwei Liu,
Varun G Menon,
Anand Paul,
Zhiguo Ding
Abstract:
Physical layer security is known as a promising paradigm to ensure security for the beyond 5G (B5G) networks in the presence of eavesdroppers. In this paper, we elaborate on a tractable analysis framework to evaluate the reliability and security of wireless-powered decode-and-forward (DF) multi-relay networks. The nonlinear energy harvesters, in-phase and quadrature-phase imbalance (IQI) and chann…
▽ More
Physical layer security is known as a promising paradigm to ensure security for the beyond 5G (B5G) networks in the presence of eavesdroppers. In this paper, we elaborate on a tractable analysis framework to evaluate the reliability and security of wireless-powered decode-and-forward (DF) multi-relay networks. The nonlinear energy harvesters, in-phase and quadrature-phase imbalance (IQI) and channel estimation errors (CEEs) are taken into account in the considered system. To further improve the secure performance, two relay selection strategies are presented: 1) suboptimal relay selection (SRS); 2) optimal relay selection (ORS). Specifically, exact analytical expressions for the outage probability (OP) and the intercept probability (IP) are derived in closed-form. For the IP, we consider that the eavesdropper can wiretap the signal from the source or the relay. In order to obtain more useful insights, we carry out the asymptotic analysis and diversity orders for the OP in the high signal-to-noise ratio (SNR) regime under non-ideal and ideal conditions. Numerical results show that: 1) Although the mismatches of amplitude/phase of transmitter (TX)/receiver (RX) limit the OP performance, it can enhance IP performance; 2) Large number of relays yields better OP performance; 3) There are error floors for the OP because of the CEEs; 4) There is a trade-off for the OP and IO to obtain the balance between reliability and security.
△ Less
Submitted 6 June, 2020;
originally announced June 2020.