-
TAGMol: Target-Aware Gradient-guided Molecule Generation
Authors:
Vineeth Dorna,
D. Subhalingam,
Keshav Kolluru,
Shreshth Tuli,
Mrityunjay Singh,
Saurabh Singal,
N. M. Anoop Krishnan,
Sayan Ranu
Abstract:
3D generative models have shown significant promise in structure-based drug design (SBDD), particularly in discovering ligands tailored to specific target binding sites. Existing algorithms often focus primarily on ligand-target binding, characterized by binding affinity. Moreover, models trained solely on target-ligand distribution may fall short in addressing the broader objectives of drug discovery, such as the development of novel ligands with desired properties like drug-likeness and synthesizability, underscoring the multifaceted nature of the drug design process. To overcome these challenges, we decouple the problem into molecular generation and property prediction. The latter synergistically guides the diffusion sampling process, facilitating guided diffusion and resulting in the creation of meaningful molecules with the desired properties. We call this guided molecular generation process TAGMol. Through experiments on benchmark datasets, TAGMol demonstrates superior performance compared to state-of-the-art baselines, achieving a 22% improvement in average Vina Score and yielding favorable outcomes in essential auxiliary properties. This establishes TAGMol as a comprehensive framework for drug generation.
Submitted 3 June, 2024;
originally announced June 2024.
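The abstract above describes a property predictor whose gradients steer diffusion sampling. As a rough illustration only, the sketch below shows one deterministic guided update on a toy vector: the denoised mean is shifted along the gradient of a property score. The names `guided_step`, `toy_denoise`, and `toy_property_grad`, the shrink factor, and the quadratic property are all hypothetical stand-ins, not the TAGMol model or its update rule.

```python
from typing import Callable, List

Vector = List[float]


def guided_step(
    x: Vector,
    denoise: Callable[[Vector], Vector],
    property_grad: Callable[[Vector], Vector],
    guidance_scale: float,
) -> Vector:
    """One noise-free guided update: denoised mean plus a scaled property gradient.

    Assumption: the gradient is evaluated at the current sample x, as in
    classifier-style guidance. The sampling noise term is omitted so the
    example stays deterministic.
    """
    mean = denoise(x)
    grad = property_grad(x)
    return [m + guidance_scale * g for m, g in zip(mean, grad)]


def toy_denoise(x: Vector) -> Vector:
    # Hypothetical denoiser: shrink every coordinate toward zero.
    return [0.9 * v for v in x]


def toy_property_grad(x: Vector) -> Vector:
    # Gradient of the toy score -0.5 * (v - 2.0)**2, which peaks at v = 2.0.
    return [2.0 - v for v in x]


if __name__ == "__main__":
    print(guided_step([1.0, 3.0], toy_denoise, toy_property_grad, 0.5))
```

With `guidance_scale` set to 0 the update reduces to plain denoising; larger values pull samples harder toward the region the toy property prefers.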
-
PhyPlan: Generalizable and Rapid Physical Task Planning with Physics Informed Skill Networks for Robot Manipulators
Authors:
Mudit Chopra,
Abhinav Barnawal,
Harshil Vagadia,
Tamajit Banerjee,
Shreshth Tuli,
Souvik Chakraborty,
Rohan Paul
Abstract:
Given the task of positioning a ball-like object to a goal region beyond direct reach, humans can often throw, slide, or rebound objects against the wall to attain the goal. However, enabling robots to reason similarly is non-trivial. Existing methods for physical reasoning are data-hungry and struggle with complexity and uncertainty inherent in the real world. This paper presents PhyPlan, a novel physics-informed planning framework that combines physics-informed neural networks (PINNs) with modified Monte Carlo Tree Search (MCTS) to enable embodied agents to perform dynamic physical tasks. PhyPlan leverages PINNs to simulate and predict outcomes of actions in a fast and accurate manner and uses MCTS for planning. It dynamically determines whether to consult a PINN-based simulator (coarse but fast) or engage directly with the actual environment (fine but slow) to determine optimal policy. Given an unseen task, PhyPlan can infer the sequence of actions and learn the latent parameters, resulting in a generalizable approach that can rapidly learn to perform novel physical tasks. Evaluation with robots in simulated 3D environments demonstrates the ability of our approach to solve 3D-physical reasoning tasks involving the composition of dynamic skills. Quantitatively, PhyPlan excels in several aspects: (i) it achieves lower regret when learning novel tasks compared to the state-of-the-art, (ii) it expedites skill learning and enhances the speed of physical reasoning, (iii) it demonstrates higher data efficiency compared to a physics un-informed approach.
Submitted 22 April, 2024;
originally announced June 2024.
-
DynaMo: Accelerating Language Model Inference with Dynamic Multi-Token Sampling
Authors:
Shikhar Tuli,
Chi-Heng Lin,
Yen-Chang Hsu,
Niraj K. Jha,
Yilin Shen,
Hongxia Jin
Abstract:
Traditional language models operate autoregressively, i.e., they predict one token at a time. Rapid explosion in model sizes has resulted in high inference times. In this work, we propose DynaMo, a suite of multi-token prediction language models that reduce net inference times. Our models $\textit{dynamically}$ predict multiple tokens based on their confidence in the predicted joint probability distribution. We propose a lightweight technique to train these models, leveraging the weights of traditional autoregressive counterparts. Moreover, we propose novel ways to enhance the estimated joint probability to improve text generation quality, namely co-occurrence weighted masking and adaptive thresholding. We also propose systematic qualitative and quantitative methods to rigorously test the quality of generated text for non-autoregressive generation. One of the models in our suite, DynaMo-7.3B-T3, achieves same-quality generated text as the baseline (Pythia-6.9B) while achieving 2.57$\times$ speed-up with only 5.87% and 2.67% parameter and training time overheads, respectively.
Submitted 1 May, 2024;
originally announced May 2024.
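The DynaMo abstract says the models emit a variable number of tokens depending on confidence in the predicted joint probability. A minimal sketch of one such acceptance rule follows, assuming the joint probability is approximated by the product of per-token probabilities and compared against a fixed threshold. The function name `accept_tokens`, the independence approximation, and the fixed threshold are illustrative assumptions; the abstract does not specify the actual rule, and its co-occurrence weighted masking and adaptive thresholding are not modeled here.

```python
from typing import List, Tuple


def accept_tokens(candidates: List[Tuple[str, float]], threshold: float) -> List[str]:
    """Keep leading candidate tokens while the running joint probability
    stays at or above the threshold.

    candidates: non-empty list of (token, probability) pairs, one per
    future position, in generation order.
    """
    if not candidates:
        raise ValueError("candidates must be non-empty")
    # The first token is always emitted, as in an ordinary autoregressive step.
    first_token, joint = candidates[0]
    accepted = [first_token]
    for token, prob in candidates[1:]:
        joint *= prob  # independence approximation (assumption)
        if joint < threshold:
            break
        accepted.append(token)
    return accepted


if __name__ == "__main__":
    step = [("the", 0.9), ("cat", 0.8), ("sat", 0.3)]
    print(accept_tokens(step, threshold=0.5))
```

A higher threshold makes the rule fall back to one token per step, while a lower one accepts longer runs at the cost of less certain text.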
-
Uncertainty-aware Active Learning of NeRF-based Object Models for Robot Manipulators using Visual and Re-orientation Actions
Authors:
Saptarshi Dasgupta,
Akshat Gupta,
Shreshth Tuli,
Rohan Paul
Abstract:
Manipulating unseen objects is challenging without a 3D representation, as objects generally have occluded surfaces. This requires physical interaction with objects to build their internal representations. This paper presents an approach that enables a robot to rapidly learn the complete 3D model of a given object for manipulation in unfamiliar orientations. We use an ensemble of partially constructed NeRF models to quantify model uncertainty to determine the next action (a visual or re-orientation action) by optimizing informativeness and feasibility. Further, our approach determines when and how to grasp and re-orient an object given its partial NeRF model and re-estimates the object pose to rectify misalignments introduced during the interaction. Experiments with a simulated Franka Emika Robot Manipulator operating in a tabletop environment with benchmark objects demonstrate an improvement of (i) 14% in visual reconstruction quality (PSNR), (ii) 20% in the geometric/depth reconstruction of the object surface (F-score) and (iii) 71% in the task success rate of manipulating objects in a priori unseen orientations/stable configurations in the scene, over current methods. The project page can be found here: https://actnerf.github.io.
Submitted 2 April, 2024;
originally announced April 2024.
-
PhyPlan: Compositional and Adaptive Physical Task Reasoning with Physics-Informed Skill Networks for Robot Manipulators
Authors:
Harshil Vagadia,
Mudit Chopra,
Abhinav Barnawal,
Tamajit Banerjee,
Shreshth Tuli,
Souvik Chakraborty,
Rohan Paul
Abstract:
Given the task of positioning a ball-like object to a goal region beyond direct reach, humans can often throw, slide, or rebound objects against the wall to attain the goal. However, enabling robots to reason similarly is non-trivial. Existing methods for physical reasoning are data-hungry and struggle with complexity and uncertainty inherent in the real world. This paper presents PhyPlan, a novel physics-informed planning framework that combines physics-informed neural networks (PINNs) with modified Monte Carlo Tree Search (MCTS) to enable embodied agents to perform dynamic physical tasks. PhyPlan leverages PINNs to simulate and predict outcomes of actions in a fast and accurate manner and uses MCTS for planning. It dynamically determines whether to consult a PINN-based simulator (coarse but fast) or engage directly with the actual environment (fine but slow) to determine optimal policy. Evaluation with robots in simulated 3D environments demonstrates the ability of our approach to solve 3D-physical reasoning tasks involving the composition of dynamic skills. Quantitatively, PhyPlan excels in several aspects: (i) it achieves lower regret when learning novel tasks compared to state-of-the-art, (ii) it expedites skill learning and enhances the speed of physical reasoning, (iii) it demonstrates higher data efficiency compared to a physics un-informed approach.
Submitted 24 February, 2024;
originally announced February 2024.
-
Comprehensive Forecasting of California's Energy Consumption: A Multi-Source and Sectoral Analysis Using ARIMA and ARIMAX Models
Authors:
Zahra Moslemi,
Logan Clark,
Sarah Kernal,
Samantha Rehome,
Scott Sprengel,
Ahoora Tamizifar,
Shawna Tuli,
Vish Chokshi,
Mo Nomeli,
Ella Liang,
Moury Bidgoli,
Jeff Lu,
Manish Dasaur,
Marty Hodgett
Abstract:
California's significant role as the second-largest consumer of energy in the United States underscores the importance of accurate energy consumption predictions. With a thriving industrial sector, a burgeoning population, and ambitious environmental goals, the state's energy landscape is dynamic and complex. This paper presents a comprehensive analysis of California's energy consumption trends and provides detailed forecasting models for different energy sources and sectors. The study leverages ARIMA and ARIMAX models, considering both historical consumption data and exogenous variables. We address the unique challenges posed by the COVID-19 pandemic and the limited data for 2022, highlighting the resilience of these models in the face of uncertainty. Our analysis reveals that while fossil fuels continue to dominate California's energy landscape, renewable energy sources, particularly solar and biomass, are experiencing substantial growth. Hydroelectric power, while sensitive to precipitation, remains a significant contributor to renewable energy consumption. Furthermore, we anticipate ongoing efforts to reduce fossil fuel consumption. The forecasts for energy consumption by sector suggest continued growth in the commercial and residential sectors, reflecting California's expanding economy and population. In contrast, the industrial sector is expected to experience more moderate changes, while the transportation sector remains the largest energy consumer.
Submitted 6 February, 2024;
originally announced February 2024.
-
BREATHE: Second-Order Gradients and Heteroscedastic Emulation based Design Space Exploration
Authors:
Shikhar Tuli,
Niraj K. Jha
Abstract:
Researchers constantly strive to explore larger and more complex search spaces in various scientific studies and physical experiments. However, such investigations often involve sophisticated simulators or time-consuming experiments that make exploring and observing new design samples challenging. Previous works that target such applications are typically sample-inefficient and restricted to vector search spaces. To address these limitations, this work proposes a constrained multi-objective optimization (MOO) framework, called BREATHE, that searches not only traditional vector-based design spaces but also graph-based design spaces to obtain best-performing graphs. It leverages second-order gradients and actively trains a heteroscedastic surrogate model for sample-efficient optimization. In a single-objective vector optimization application, it leads to 64.1% higher performance than the next-best baseline, random forest regression. In graph-based search, BREATHE outperforms the next-best baseline, i.e., a graphical version of Gaussian-process-based Bayesian optimization, with up to 64.9% higher performance. In a MOO task, it achieves up to 21.9$\times$ higher hypervolume than the state-of-the-art method, multi-objective Bayesian optimization (MOBOpt). BREATHE also outperforms the baseline methods on most standard MOO benchmark applications.
Submitted 16 August, 2023;
originally announced August 2023.
-
TransCODE: Co-design of Transformers and Accelerators for Efficient Training and Inference
Authors:
Shikhar Tuli,
Niraj K. Jha
Abstract:
Automated co-design of machine learning models and evaluation hardware is critical for efficiently deploying such models at scale. Despite the state-of-the-art performance of transformer models, they are not yet ready for execution on resource-constrained hardware platforms. High memory requirements and low parallelizability of the transformer architecture exacerbate this problem. Recently-proposed accelerators attempt to optimize the throughput and energy consumption of transformer models. However, such works are either limited to a one-sided search of the model architecture or a restricted set of off-the-shelf devices. Furthermore, previous works only accelerate model inference and not training, which incurs substantially higher memory and compute resources, making the problem even more challenging. To address these limitations, this work proposes a dynamic training framework, called DynaProp, that speeds up the training process and reduces memory consumption. DynaProp is a low-overhead pruning method that prunes activations and gradients at runtime. To effectively execute this method on hardware for a diverse set of transformer architectures, we propose ELECTOR, a framework that simulates transformer inference and training on a design space of accelerators. We use this simulator in conjunction with the proposed co-design technique, called TransCODE, to obtain the best-performing models with high accuracy on the given task and minimize latency, energy consumption, and chip area. The obtained transformer-accelerator pair achieves 0.3% higher accuracy than the state-of-the-art pair while incurring 5.2$\times$ lower latency and 3.0$\times$ lower energy consumption.
Submitted 26 March, 2023;
originally announced March 2023.
-
EdgeTran: Co-designing Transformers for Efficient Inference on Mobile Edge Platforms
Authors:
Shikhar Tuli,
Niraj K. Jha
Abstract:
Automated design of efficient transformer models has recently attracted significant attention from industry and academia. However, most works only focus on certain metrics while searching for the best-performing transformer architecture. Furthermore, running traditional, complex, and large transformer models on low-compute edge platforms is a challenging problem. In this work, we propose a framework, called ProTran, to profile the hardware performance measures for a design space of transformer architectures and a diverse set of edge devices. We use this profiler in conjunction with the proposed co-design technique to obtain the best-performing models that have high accuracy on the given task and minimize latency, energy consumption, and peak power draw to enable edge deployment. We refer to our framework for co-optimizing accuracy and hardware performance measures as EdgeTran. It searches for the best transformer model and edge device pair. Finally, we propose GPTran, a multi-stage block-level grow-and-prune post-processing step that further improves accuracy in a hardware-aware manner. The obtained transformer model is 2.8$\times$ smaller and has a 0.8% higher GLUE score than the baseline (BERT-Base). Inference with it on the selected edge device enables 15.0% lower latency, 10.0$\times$ lower energy, and 10.8$\times$ lower peak power draw compared to an off-the-shelf GPU.
Submitted 23 March, 2023;
originally announced March 2023.
-
AccelTran: A Sparsity-Aware Accelerator for Dynamic Inference with Transformers
Authors:
Shikhar Tuli,
Niraj K. Jha
Abstract:
Self-attention-based transformer models have achieved tremendous success in the domain of natural language processing. Despite their efficacy, accelerating the transformer is challenging due to its quadratic computational complexity and large activation sizes. Existing transformer accelerators attempt to prune its tokens to reduce memory access, albeit with high compute overheads. Moreover, previous works directly operate on large matrices involved in the attention operation, which limits hardware utilization. In order to address these challenges, this work proposes a novel dynamic inference scheme, DynaTran, which prunes activations at runtime with low overhead, substantially reducing the number of ineffectual operations. This improves the throughput of transformer inference. We further propose tiling the matrices in transformer operations along with diverse dataflows to improve data reuse, thus enabling higher energy efficiency. To effectively implement these methods, we propose AccelTran, a novel accelerator architecture for transformers. Extensive experiments with different models and benchmarks demonstrate that DynaTran achieves higher accuracy than the state-of-the-art top-k hardware-aware pruning strategy while attaining up to 1.2$\times$ higher sparsity. One of our proposed accelerators, AccelTran-Edge, achieves 330K$\times$ higher throughput with 93K$\times$ lower energy requirement when compared to a Raspberry Pi device. On the other hand, AccelTran-Server achieves 5.73$\times$ higher throughput and 3.69$\times$ lower energy consumption compared to the state-of-the-art transformer co-processor, Energon. The simulation source code is available at https://github.com/jha-lab/acceltran.
Submitted 1 May, 2023; v1 submitted 28 February, 2023;
originally announced February 2023.
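The AccelTran abstract describes DynaTran as pruning activations at runtime to cut ineffectual operations. One simple way to picture this is magnitude thresholding, sketched below: entries whose absolute value falls under a threshold are zeroed, and the resulting sparsity is reported. The function name `prune_activations`, the threshold `tau`, and the nested-list matrix representation are illustrative assumptions; the abstract does not give the pruning criterion, so this is not a statement of how DynaTran is implemented.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def prune_activations(matrix: Matrix, tau: float) -> Tuple[Matrix, float]:
    """Zero every entry with magnitude below tau and return the pruned
    matrix together with the fraction of entries that are now zero."""
    pruned = [[0.0 if abs(v) < tau else v for v in row] for row in matrix]
    total = sum(len(row) for row in pruned)
    zeros = sum(1 for row in pruned for v in row if v == 0.0)
    sparsity = zeros / total if total else 0.0
    return pruned, sparsity


if __name__ == "__main__":
    activations = [[0.05, -0.5], [0.2, -0.01]]
    pruned, sparsity = prune_activations(activations, tau=0.1)
    print(pruned, sparsity)
```

Multiplications involving the zeroed entries can then be skipped by sparsity-aware hardware, which is the kind of saving the abstract attributes to runtime pruning.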
-
CILP: Co-simulation based Imitation Learner for Dynamic Resource Provisioning in Cloud Computing Environments
Authors:
Shreshth Tuli,
Giuliano Casale,
Nicholas R. Jennings
Abstract:
Intelligent Virtual Machine (VM) provisioning is central to cost- and resource-efficient computation in cloud computing environments. As bootstrapping VMs is time-consuming, a key challenge for latency-critical tasks is to predict future workload demands to provision VMs proactively. However, existing AI-based solutions tend not to holistically consider all crucial aspects such as provisioning overheads, heterogeneous VM costs and Quality of Service (QoS) of the cloud system. To address this, we propose a novel method, called CILP, that formulates the VM provisioning problem as two sub-problems of prediction and optimization, where the provisioning plan is optimized based on predicted workload demands. CILP leverages a neural network as a surrogate model to predict future workload demands with a co-simulated digital twin of the infrastructure to compute QoS scores. We extend the neural network to also act as an imitation learner that dynamically decides the optimal VM provisioning plan. A transformer-based neural model reduces training and inference overheads while our novel two-phase decision-making loop facilitates informed provisioning decisions. Crucially, we address limitations of prior work by including resource utilization, deployment costs and provisioning overheads to inform the provisioning decisions in our imitation learning framework. Experiments with three public benchmarks demonstrate that CILP gives up to 22% higher resource utilization, 14% higher QoS scores and 44% lower execution costs compared to the current online and offline optimization-based state-of-the-art methods.
Submitted 16 April, 2023; v1 submitted 11 February, 2023;
originally announced February 2023.
-
Simulation-Driven Automated End-to-End Test and Oracle Inference
Authors:
Shreshth Tuli,
Kinga Bojarczuk,
Natalija Gucevska,
Mark Harman,
Xiao-Yu Wang,
Graham Wright
Abstract:
This is the first work to report on inferential testing at scale in industry. Specifically, it reports the experience of automated testing of integrity systems at Meta. We built an internal tool called ALPACAS for automated inference of end-to-end integrity tests. Integrity tests are designed to keep users safe online by checking that interventions take place when harmful behaviour occurs on a platform. ALPACAS infers not only the test input, but also the oracle, by observing production interventions to prevent harmful behaviour. This approach allows Meta to automate the process of generating integrity tests for its platforms, such as Facebook and Instagram, which consist of hundreds of millions of lines of production code. We outline the design and deployment of ALPACAS, and report results for its coverage, number of tests produced at each stage of the test inference process, and their pass rates. Specifically, we demonstrate that using ALPACAS significantly improves coverage from a manual test design for the particular aspect of integrity end-to-end testing it was applied to. Further, from a pool of 3 million data points, ALPACAS automatically yields 39 production-ready end-to-end integrity tests. We also report that the ALPACAS-inferred test suite enjoys exceptionally low flakiness for end-to-end testing with its average in-production pass rate of 99.84%.
Submitted 5 February, 2023;
originally announced February 2023.
-
CODEBench: A Neural Architecture and Hardware Accelerator Co-Design Framework
Authors:
Shikhar Tuli,
Chia-Hao Li,
Ritvik Sharma,
Niraj K. Jha
Abstract:
Recently, automated co-design of machine learning (ML) models and accelerator architectures has attracted significant attention from both the industry and academia. However, most co-design frameworks either explore a limited search space or employ suboptimal exploration techniques for simultaneous design decision investigations of the ML model and the accelerator. Furthermore, training the ML model and simulating the accelerator performance is computationally expensive. To address these limitations, this work proposes a novel neural architecture and hardware accelerator co-design framework, called CODEBench. It is composed of two new benchmarking sub-frameworks, CNNBench and AccelBench, which explore expanded design spaces of convolutional neural networks (CNNs) and CNN accelerators. CNNBench leverages an advanced search technique, BOSHNAS, to efficiently train a neural heteroscedastic surrogate model to converge to an optimal CNN architecture by employing second-order gradients. AccelBench performs cycle-accurate simulations for a diverse set of accelerator architectures in a vast design space. With the proposed co-design method, called BOSHCODE, our best CNN-accelerator pair achieves 1.4% higher accuracy on the CIFAR-10 dataset compared to the state-of-the-art pair, while enabling 59.1% lower latency and 60.8% lower energy consumption. On the ImageNet dataset, it achieves 3.7% higher Top1 accuracy at 43.8% lower latency and 11.2% lower energy consumption. CODEBench outperforms the state-of-the-art framework, i.e., Auto-NBA, by achieving 1.5% higher accuracy and 34.7x higher throughput, while enabling 11.0x lower energy-delay product (EDP) and 4.0x lower chip area on CIFAR-10.
Submitted 7 December, 2022;
originally announced December 2022.
-
DeepFT: Fault-Tolerant Edge Computing using a Self-Supervised Deep Surrogate Model
Authors:
Shreshth Tuli,
Giuliano Casale,
Ludmila Cherkasova,
Nicholas R. Jennings
Abstract:
The emergence of latency-critical AI applications has been supported by the evolution of the edge computing paradigm. However, edge solutions are typically resource-constrained, posing reliability challenges due to heightened contention for compute and communication capacities and faulty application behavior in the presence of overload conditions. Although a large amount of generated log data can be mined for fault prediction, labeling this data for training is a manual process and thus a limiting factor for automation. Due to this, many companies resort to unsupervised fault-tolerance models. Yet, failure models of this kind can incur a loss of accuracy when they need to adapt to non-stationary workloads and diverse host characteristics. To cope with this, we propose a novel modeling approach, called DeepFT, to proactively avoid system overloads and their adverse effects by optimizing the task scheduling and migration decisions. DeepFT uses a deep surrogate model to accurately predict and diagnose faults in the system and co-simulation-based self-supervised learning to dynamically adapt the model in volatile settings. It offers a highly scalable solution as the model size scales by only 3 and 1 percent per unit increase in the number of active tasks and hosts, respectively. Extensive experimentation on a Raspberry-Pi based edge cluster with DeFog benchmarks shows that DeepFT can outperform state-of-the-art baseline methods in fault-detection and QoS metrics. Specifically, DeepFT gives the highest F1 scores for fault-detection, reducing service deadline violations by up to 37% while also improving response time by up to 9%.
Submitted 2 December, 2022;
originally announced December 2022.
-
DRAGON: Decentralized Fault Tolerance in Edge Federations
Authors:
Shreshth Tuli,
Giuliano Casale,
Nicholas R. Jennings
Abstract:
Edge Federation is a new computing paradigm that seamlessly interconnects the resources of multiple edge service providers. A key challenge in such systems is the deployment of latency-critical and AI based resource-intensive applications in constrained devices. To address this challenge, we propose a novel memory-efficient deep learning based model, namely generative optimization networks (GON). Unlike GANs, GONs use a single network to both discriminate input and generate samples, significantly reducing their memory footprint. Leveraging the low memory footprint of GONs, we propose a decentralized fault-tolerance method called DRAGON that runs simulations (as per a digital modeling twin) to quickly predict and optimize the performance of the edge federation. Extensive experiments with real-world edge computing benchmarks on multiple Raspberry-Pi based federated edge configurations show that DRAGON can outperform the baseline methods in fault-detection and Quality of Service (QoS) metrics. Specifically, the proposed method gives higher F1 scores for fault-detection than the best deep learning (DL) method, while consuming lower memory than the heuristic methods. This allows for improvement in energy consumption, response time and service level agreement violations by up to 74, 63 and 82 percent, respectively.
Submitted 16 August, 2022;
originally announced August 2022.
-
AI Augmented Edge and Fog Computing: Trends and Challenges
Authors:
Shreshth Tuli,
Fatemeh Mirhakimi,
Samodha Pallewatta,
Syed Zawad,
Giuliano Casale,
Bahman Javadi,
Feng Yan,
Rajkumar Buyya,
Nicholas R. Jennings
Abstract:
In recent years, the landscape of computing paradigms has witnessed a gradual yet remarkable shift from monolithic computing to distributed and decentralized paradigms such as Internet of Things (IoT), Edge, Fog, Cloud, and Serverless. The frontiers of these computing technologies have been boosted by the shift from manually encoded algorithms to Artificial Intelligence (AI)-driven autonomous systems for optimum and reliable management of distributed computing resources. Prior work focuses on improving existing systems using AI across a wide range of domains, such as efficient resource provisioning, application deployment, task placement, and service management. This survey reviews the evolution of data-driven AI-augmented technologies and their impact on computing systems. We demystify new techniques and draw key insights in Edge, Fog and Cloud resource management-related uses of AI methods and also look at how AI can innovate traditional applications for enhanced Quality of Service (QoS) in the presence of a continuum of resources. We present the latest trends and impact areas such as optimizing AI models that are deployed on or for computing systems. We lay out a roadmap for future research directions in areas such as resource management for QoS optimization and service reliability. Finally, we discuss blue-sky ideas and envision this work as an anchor point for future research on AI-driven computing systems.
Submitted 14 April, 2023; v1 submitted 1 August, 2022;
originally announced August 2022.
-
ToolTango: Common sense Generalization in Predicting Sequential Tool Interactions for Robot Plan Synthesis
Authors:
Shreshth Tuli,
Rajas Bansal,
Rohan Paul,
Mausam
Abstract:
Robots assisting us in environments such as factories or homes must learn to make use of objects as tools to perform tasks, for instance using a tray to carry objects. We consider the problem of learning commonsense knowledge of when a tool may be useful and how its use may be composed with other tools to accomplish a high-level task instructed by a human. Specifically, we introduce a novel neural model, termed TOOLTANGO, that first predicts the next tool to be used, and then uses this information to predict the next action. We show that this joint model can inform learning of a fine-grained policy enabling the robot to use a particular tool in sequence and adds significant value in making the model more accurate. TOOLTANGO encodes the world state, comprising objects and symbolic relationships between them, using a graph neural network and is trained using demonstrations from human teachers instructing a virtual robot in a physics simulator. The model learns to attend over the scene using knowledge of the goal and the action history, finally decoding the symbolic action to execute. Crucially, we address generalization to unseen environments where some known tools are missing, but alternative unseen tools are present. We show that by augmenting the representation of the environment with pre-trained embeddings derived from a knowledge-base, the model can generalize effectively to novel environments. Experimental results show at least 48.8-58.1% absolute improvement over the baselines in predicting successful symbolic plans for a simulated mobile manipulator in novel environments with unseen objects. This work takes a step in the direction of enabling robots to rapidly synthesize robust plans for complex tasks, particularly in novel settings.
Submitted 18 June, 2022;
originally announced June 2022.
-
RadNet: Incident Prediction in Spatio-Temporal Road Graph Networks Using Traffic Forecasting
Authors:
Shreshth Tuli,
Matthew R. Wilkinson,
Chris Kettell
Abstract:
Efficient and accurate incident prediction in spatio-temporal systems is critical to minimize service downtime and optimize performance. This work aims to utilize historic data to predict and diagnose incidents using spatio-temporal forecasting. We consider the specific use case of road traffic systems where incidents take the form of anomalous events, such as accidents or broken-down vehicles. To tackle this, we develop a neural model, called RadNet, which forecasts system parameters such as average vehicle speeds for a future timestep. As such systems largely follow daily or weekly periodicity, we compare RadNet's predictions against historical averages to label incidents. Unlike prior work, RadNet infers spatial and temporal trends in both permutations, finally combining the dense representations before forecasting. This facilitates informed inference and more accurate incident detection. Experiments with two publicly available and a new road traffic dataset demonstrate that the proposed model gives up to 8% higher prediction F1 scores compared to the state-of-the-art methods.
Submitted 11 June, 2022;
originally announced June 2022.
-
FlexiBERT: Are Current Transformer Architectures too Homogeneous and Rigid?
Authors:
Shikhar Tuli,
Bhishma Dedhia,
Shreshth Tuli,
Niraj K. Jha
Abstract:
The existence of a plethora of language models makes the problem of selecting the best one for a custom task challenging. Most state-of-the-art methods leverage transformer-based models (e.g., BERT) or their variants. Training such models and exploring their hyperparameter space, however, is computationally expensive. Prior work proposes several neural architecture search (NAS) methods that employ performance predictors (e.g., surrogate models) to address this issue; however, analysis has been limited to homogeneous models that use fixed dimensionality throughout the network. This leads to sub-optimal architectures. To address this limitation, we propose a suite of heterogeneous and flexible models, namely FlexiBERT, that have varied encoder layers with a diverse set of possible operations and different hidden dimensions. For better-posed surrogate modeling in this expanded design space, we propose a new graph-similarity-based embedding scheme. We also propose a novel NAS policy, called BOSHNAS, that leverages this new scheme, Bayesian modeling, and second-order optimization, to quickly train and use a neural surrogate model to converge to the optimal architecture. A comprehensive set of experiments shows that the proposed policy, when applied to the FlexiBERT design space, pushes the performance frontier upwards compared to traditional models. FlexiBERT-Mini, one of our proposed models, has 3% fewer parameters than BERT-Mini and achieves 8.9% higher GLUE score. A FlexiBERT model with equivalent performance as the best homogeneous model achieves 2.6x smaller size. FlexiBERT-Large, another proposed model, achieves state-of-the-art results, outperforming the baseline models by at least 5.7% on the GLUE benchmark.
Submitted 23 May, 2022;
originally announced May 2022.
-
MetaNet: Automated Dynamic Selection of Scheduling Policies in Cloud Environments
Authors:
Shreshth Tuli,
Giuliano Casale,
Nicholas R. Jennings
Abstract:
Task scheduling is a well-studied problem in the context of optimizing the Quality of Service (QoS) of cloud computing environments. In order to sustain the rapid growth of computational demands, one of the most important QoS metrics for cloud schedulers is the execution cost. In this regard, several data-driven deep neural network (DNN) based schedulers have been proposed in recent years to allow scalable and efficient resource management in dynamic workload settings. However, optimal scheduling frequently relies on sophisticated DNNs with high computational needs, implying higher execution costs. Further, even in non-stationary environments, sophisticated schedulers might not always be required and we could briefly rely on low-cost schedulers in the interest of cost-efficiency. Therefore, this work aims to solve the non-trivial meta problem of online dynamic selection of a scheduling policy using a surrogate model called MetaNet. Unlike traditional solutions with a fixed scheduling policy, MetaNet chooses a scheduler on the fly from a large set of DNN based methods to optimize task scheduling and execution costs in tandem. Compared to state-of-the-art DNN schedulers, this allows for improvement in execution costs, energy consumption, response time and service level agreement violations by up to 11, 43, 8 and 13 percent, respectively.
Submitted 21 May, 2022;
originally announced May 2022.
-
Learning to Dynamically Select Cost Optimal Schedulers in Cloud Computing Environments
Authors:
Shreshth Tuli,
Giuliano Casale,
Nicholas R. Jennings
Abstract:
The operational cost of a cloud computing platform is one of the most significant Quality of Service (QoS) criteria for schedulers, crucial to keep up with the growing computational demands. Several data-driven deep neural network (DNN)-based schedulers have been proposed in recent years that outperform alternative approaches by providing scalable and effective resource management for dynamic workloads. However, state-of-the-art schedulers rely on advanced DNNs with high computational requirements, implying high scheduling costs. In non-stationary contexts, the most sophisticated schedulers may not always be required, and it may be sufficient to rely on low-cost schedulers to temporarily save operational costs. In this work, we propose MetaNet, a surrogate model that predicts the operational costs and scheduling overheads of a large number of DNN-based schedulers and chooses one on-the-fly to jointly optimize job scheduling and execution costs. This facilitates improvements in execution costs, energy usage and service level agreement violations of up to 11%, 43% and 13% compared to the state-of-the-art methods.
Submitted 21 May, 2022;
originally announced May 2022.
-
SplitPlace: AI Augmented Splitting and Placement of Large-Scale Neural Networks in Mobile Edge Environments
Authors:
Shreshth Tuli,
Giuliano Casale,
Nicholas R. Jennings
Abstract:
In recent years, deep learning models have become ubiquitous in industry and academia alike. Deep neural networks can solve some of the most complex pattern-recognition problems today, but come at the price of massive compute and memory requirements. This makes the problem of deploying such large-scale neural networks challenging in resource-constrained mobile edge computing platforms, specifically in mission-critical domains like surveillance and healthcare. To solve this, a promising solution is to split resource-hungry neural networks into lightweight disjoint smaller components for pipelined distributed processing. At present, there are two main approaches to do this: semantic and layer-wise splitting. The former partitions a neural network into parallel disjoint models that produce a part of the result, whereas the latter partitions into sequential models that produce intermediate results. However, there is no intelligent algorithm that decides which splitting strategy to use and places such modular splits on edge nodes for optimal performance. To combat this, this work proposes a novel AI-driven online policy, SplitPlace, that uses Multi-Armed-Bandits to intelligently decide between layer and semantic splitting strategies based on the input task's service deadline demands. SplitPlace places such neural network split fragments on mobile edge devices using decision-aware reinforcement learning for efficient and scalable computing. Moreover, SplitPlace fine-tunes its placement engine to adapt to volatile environments. Our experiments on physical mobile-edge environments with real-world workloads show that SplitPlace can significantly improve the state-of-the-art in terms of average response time, deadline violation rate, inference accuracy, and total reward by up to 46, 69, 3 and 12 percent, respectively.
Submitted 21 May, 2022;
originally announced May 2022.
-
GoalNet: Inferring Conjunctive Goal Predicates from Human Plan Demonstrations for Robot Instruction Following
Authors:
Shreya Sharma,
Jigyasa Gupta,
Shreshth Tuli,
Rohan Paul,
Mausam
Abstract:
Our goal is to enable a robot to learn how to sequence its actions to perform tasks specified as natural language instructions, given successful demonstrations from a human partner. The ability to plan high-level tasks can be factored as (i) inferring specific goal predicates that characterize the task implied by a language instruction for a given world state and (ii) synthesizing a feasible goal-reaching action-sequence with such predicates. For the former, we leverage a neural network prediction model, while utilizing a symbolic planner for the latter. We introduce a novel neuro-symbolic model, GoalNet, for contextual and task dependent inference of goal predicates from human demonstrations and linguistic task descriptions. GoalNet combines (i) learning, where dense representations are acquired for language instruction and the world state that enables generalization to novel settings and (ii) planning, where the cause-effect modeling by the symbolic planner eschews irrelevant predicates facilitating multi-stage decision making in large domains. GoalNet demonstrates a significant improvement (51%) in the task completion rate in comparison to a state-of-the-art rule-based approach on a benchmark data set displaying linguistic variations, particularly for multi-stage instructions.
Submitted 14 May, 2022;
originally announced May 2022.
-
CAROL: Confidence-Aware Resilience Model for Edge Federations
Authors:
Shreshth Tuli,
Giuliano Casale,
Nicholas R. Jennings
Abstract:
In recent years, the deployment of large-scale Internet of Things (IoT) applications has given rise to edge federations that seamlessly interconnect and leverage resources from multiple edge service providers. The requirement of supporting both latency-sensitive and compute-intensive IoT tasks necessitates service resilience, especially for the broker nodes in typical broker-worker deployment designs. Existing fault-tolerance or resilience schemes often lack robustness and generalization capability in non-stationary workload settings. This is typically due to the expensive periodic fine-tuning of models required to adapt them in dynamic scenarios. To address this, we present a confidence aware resilience model, CAROL, that utilizes a memory-efficient generative neural network to predict the Quality of Service (QoS) for a future state and a confidence score for each prediction. Thus, whenever a broker fails, we quickly recover the system by executing a local-search over the broker-worker topology space and optimize future QoS. The confidence score enables us to keep track of the prediction performance and run parsimonious neural network fine-tuning to avoid excessive overheads, further improving the QoS of the system. Experiments on a Raspberry-Pi based edge testbed with IoT benchmark applications show that CAROL outperforms state-of-the-art resilience schemes by reducing the energy consumption, deadline violation rates and resilience overheads by up to 16, 17 and 36 percent, respectively.
Submitted 14 March, 2022;
originally announced March 2022.
-
TranAD: Deep Transformer Networks for Anomaly Detection in Multivariate Time Series Data
Authors:
Shreshth Tuli,
Giuliano Casale,
Nicholas R. Jennings
Abstract:
Efficient anomaly detection and diagnosis in multivariate time-series data is of great importance for modern industrial applications. However, building a system that is able to quickly and accurately pinpoint anomalous observations is a challenging problem. This is due to the lack of anomaly labels, high data volatility and the demands of ultra-low inference times in modern applications. Despite the recent developments of deep learning approaches for anomaly detection, only a few of them can address all of these challenges. In this paper, we propose TranAD, a deep transformer network based anomaly detection and diagnosis model which uses attention-based sequence encoders to swiftly perform inference with the knowledge of the broader temporal trends in the data. TranAD uses focus score-based self-conditioning to enable robust multi-modal feature extraction and adversarial training to gain stability. Additionally, model-agnostic meta learning (MAML) allows us to train the model using limited data. Extensive empirical studies on six publicly available datasets demonstrate that TranAD can outperform state-of-the-art baseline methods in detection and diagnosis performance with data and time-efficient training. Specifically, TranAD increases F1 scores by up to 17%, reducing training times by up to 99% compared to the baselines.
Submitted 14 May, 2022; v1 submitted 18 January, 2022;
originally announced January 2022.
-
GOSH: Task Scheduling Using Deep Surrogate Models in Fog Computing Environments
Authors:
Shreshth Tuli,
Giuliano Casale,
Nicholas R. Jennings
Abstract:
Recently, intelligent scheduling approaches using surrogate models have been proposed to efficiently allocate volatile tasks in heterogeneous fog environments. Advances like deterministic surrogate models, deep neural networks (DNN) and gradient-based optimization allow low energy consumption and response times to be reached. However, deterministic surrogate models, which estimate objective values for optimization, do not consider the uncertainties in the distribution of the Quality of Service (QoS) objective function that can lead to high Service Level Agreement (SLA) violation rates. Moreover, the brittle nature of DNN training prevents such models from reaching minimal energy or response times. To overcome these difficulties, we present a novel scheduler: GOSH, i.e., Gradient Based Optimization using Second Order derivatives and Heteroscedastic Deep Surrogate Models. GOSH uses a second-order gradient based optimization approach to obtain better QoS and reduce the number of iterations to converge to a scheduling decision, subsequently lowering the scheduling time. Instead of a vanilla DNN, GOSH uses a Natural Parameter Network to approximate objective scores. Further, a Lower Confidence Bound optimization approach allows GOSH to find an optimal trade-off between greedy minimization of the mean latency and uncertainty reduction by employing error-based exploration. Thus, GOSH and its co-simulation based extension, GOSH*, can adapt quickly and reach better objective scores than baseline methods. We show that GOSH* reaches better objective scores than GOSH, but it is suitable only for high resource availability settings, whereas GOSH is apt for limited resource settings. Real system experiments for both GOSH and GOSH* show significant improvements against the state-of-the-art in terms of energy consumption, response time and SLA violations by up to 18, 27 and 82 percent, respectively.
Submitted 16 December, 2021;
originally announced December 2021.
-
MCDS: AI Augmented Workflow Scheduling in Mobile Edge Cloud Computing Systems
Authors:
Shreshth Tuli,
Giuliano Casale,
Nicholas R. Jennings
Abstract:
Workflow scheduling is a long-studied problem in parallel and distributed computing (PDC), aiming to efficiently utilize compute resources to meet users' service requirements. Recently proposed scheduling methods leverage the low response times of edge computing platforms to optimize application Quality of Service (QoS). However, scheduling workflow applications in mobile edge-cloud systems is challenging due to computational heterogeneity, changing latencies of mobile devices and the volatile nature of workload resource requirements. To overcome these difficulties, it is essential, but at the same time challenging, to develop a long-sighted optimization scheme that efficiently models the QoS objectives. In this work, we propose MCDS: Monte Carlo Learning using Deep Surrogate Models to efficiently schedule workflow applications in mobile edge-cloud computing systems. MCDS is an Artificial Intelligence (AI) based scheduling approach that uses a tree-based search strategy and a deep neural network-based surrogate model to estimate the long-term QoS impact of immediate actions for robust optimization of scheduling decisions. Experiments on physical and simulated edge-cloud testbeds show that MCDS can improve over the state-of-the-art methods in terms of energy consumption, response time, SLA violations and cost by at least 6.13, 4.56, 45.09 and 30.71 percent, respectively.
Submitted 14 December, 2021;
originally announced December 2021.
-
PreGAN: Preemptive Migration Prediction Network for Proactive Fault-Tolerant Edge Computing
Authors:
Shreshth Tuli,
Giuliano Casale,
Nicholas R. Jennings
Abstract:
Building a fault-tolerant edge system that can quickly react to node overloads or failures is challenging due to the unreliability of edge devices and the strict service deadlines of modern applications. Moreover, unnecessary task migrations can stress the system network, giving rise to the need for a smart and parsimonious failure recovery scheme. Prior approaches often fail to adapt to highly volatile workloads or accurately detect and diagnose faults for optimal remediation. There is thus a need for a robust and proactive fault-tolerance mechanism to meet service level objectives. In this work, we propose PreGAN, a composite AI model using a Generative Adversarial Network (GAN) to predict preemptive migration decisions for proactive fault-tolerance in containerized edge deployments. PreGAN uses co-simulations in tandem with a GAN to learn a few-shot anomaly classifier and proactively predict migration decisions for reliable computing. Extensive experiments on a Raspberry-Pi based edge environment show that PreGAN can outperform state-of-the-art baseline methods in fault-detection, diagnosis and classification, thus achieving high quality of service. PreGAN accomplishes this with 5.1% more accurate fault detection, higher diagnosis scores and 23.8% lower overheads compared to the best method among the considered baselines.
Submitted 4 December, 2021;
originally announced December 2021.
-
START: Straggler Prediction and Mitigation for Cloud Computing Environments using Encoder LSTM Networks
Authors:
Shreshth Tuli,
Sukhpal Singh Gill,
Peter Garraghan,
Rajkumar Buyya,
Giuliano Casale,
Nicholas R. Jennings
Abstract:
Modern large-scale computing systems distribute jobs into multiple smaller tasks which execute in parallel to accelerate job completion rates and reduce energy consumption. However, a common performance problem in such systems is dealing with straggler tasks that are slow running instances that increase the overall response time. Such tasks can significantly impact the system's Quality of Service (QoS) and the Service Level Agreements (SLA). To combat this issue, there is a need for automatic straggler detection and mitigation mechanisms that execute jobs without violating the SLA. Prior work typically builds reactive models that focus first on detection and then mitigation of straggler tasks, which leads to delays. Other works use prediction based proactive mechanisms, but ignore heterogeneous host or volatile task characteristics. In this paper, we propose a Straggler Prediction and Mitigation Technique (START) that is able to predict which tasks might be stragglers and dynamically adapt scheduling to achieve lower response times. Our technique analyzes all tasks and hosts based on compute and network resource consumption using an Encoder Long-Short-Term-Memory (LSTM) network. The output of this network is then used to predict and mitigate expected straggler tasks. This reduces the SLA violation rate and execution time without compromising QoS. Specifically, we use the CloudSim toolkit to simulate START in a cloud environment and compare it with state-of-the-art techniques (IGRU-SD, SGC, Dolly, GRASS, NearestFit and Wrangler) in terms of QoS parameters such as energy consumption, execution time, resource contention, CPU utilization and SLA violation rate. Experiments show that START reduces execution time, resource contention, energy and SLA violations by 13%, 11%, 16% and 19%, respectively, compared to the state-of-the-art approaches.
Submitted 19 November, 2021;
originally announced November 2021.
-
HUNTER: AI based Holistic Resource Management for Sustainable Cloud Computing
Authors:
Shreshth Tuli,
Sukhpal Singh Gill,
Minxian Xu,
Peter Garraghan,
Rami Bahsoon,
Schahram Dustdar,
Rizos Sakellariou,
Omer Rana,
Rajkumar Buyya,
Giuliano Casale,
Nicholas R. Jennings
Abstract:
The worldwide adoption of cloud data centers (CDCs) has given rise to the ubiquitous demand for hosting application services on the cloud. Further, contemporary data-intensive industries have seen a sharp upsurge in the resource requirements of modern applications. This has led to the provisioning of an increased number of cloud servers, giving rise to higher energy consumption and, consequently, sustainability concerns. Traditional heuristics and reinforcement learning based algorithms for energy-efficient cloud resource management address the scalability and adaptability related challenges to a limited extent. Existing work often fails to capture dependencies across thermal characteristics of hosts, resource consumption of tasks and the corresponding scheduling decisions. This leads to poor scalability and an increase in the compute resource requirements, particularly in environments with non-stationary resource demands. To address these limitations, we propose an artificial intelligence (AI) based holistic resource management technique for sustainable cloud computing called HUNTER. The proposed model formulates the goal of optimizing energy efficiency in data centers as a multi-objective scheduling problem, considering three important models: energy, thermal and cooling. HUNTER utilizes a Gated Graph Convolution Network as a surrogate model for approximating the Quality of Service (QoS) for a system state and generating optimal scheduling decisions. Experiments on simulated and physical cloud environments using the CloudSim toolkit and the COSCO framework show that HUNTER outperforms state-of-the-art baselines in terms of energy consumption, SLA violation, scheduling time, cost and temperature by up to 12, 35, 43, 54 and 3 percent respectively.
Submitted 28 October, 2021; v1 submitted 11 October, 2021;
originally announced October 2021.
-
SplitPlace: Intelligent Placement of Split Neural Nets in Mobile Edge Environments
Authors:
Shreshth Tuli
Abstract:
In recent years, deep learning models have become ubiquitous in industry and academia alike. Modern deep neural networks can solve some of the most complex problems today, but at the price of massive compute and storage requirements. This makes deploying such massive neural networks challenging in the mobile edge computing paradigm, where edge nodes are resource-constrained, hence limiting the input analysis power of such frameworks. Semantic and layer-wise splitting of neural networks for distributed processing shows some hope in this direction. However, there are no intelligent algorithms that place such modular splits on edge nodes for optimal performance. This work proposes a novel placement policy, SplitPlace, for the placement of such neural network split fragments on mobile edge hosts for efficient and scalable computing.
Submitted 10 October, 2021;
originally announced October 2021.
-
Generative Optimization Networks for Memory Efficient Data Generation
Authors:
Shreshth Tuli,
Shikhar Tuli,
Giuliano Casale,
Nicholas R. Jennings
Abstract:
In standard generative deep learning models, such as autoencoders or GANs, the size of the parameter set is proportional to the complexity of the generated data distribution. A significant challenge is to deploy resource-hungry deep learning models in devices with limited memory to prevent system upgrade costs. To combat this, we propose a novel framework called generative optimization networks (GON) that is similar to GANs, but does not use a generator, significantly reducing its memory footprint. GONs use a single discriminator network and run optimization in the input space to generate new data samples, achieving an effective compromise between training time and memory consumption. GONs are most suited for data generation problems in limited memory settings. Here we illustrate their use for the problem of anomaly detection in memory-constrained edge devices arising from attacks or intrusion events. Specifically, we use a GON to calculate a reconstruction-based anomaly score for input time-series windows. Experiments on a Raspberry-Pi testbed with two existing and a new suite of datasets show that our framework gives up to 32% higher detection F1 scores and 58% lower memory consumption, with only 5% higher training overheads compared to the state-of-the-art.
Submitted 28 October, 2021; v1 submitted 6 October, 2021;
originally announced October 2021.
-
Are Convolutional Neural Networks or Transformers more like human vision?
Authors:
Shikhar Tuli,
Ishita Dasgupta,
Erin Grant,
Thomas L. Griffiths
Abstract:
Modern machine learning models for computer vision exceed humans in accuracy on specific visual recognition tasks, notably on datasets like ImageNet. However, high accuracy can be achieved in many ways. The particular decision function found by a machine learning system is determined not only by the data to which the system is exposed, but also the inductive biases of the model, which are typically harder to characterize. In this work, we follow a recent trend of in-depth behavioral analyses of neural network models that go beyond accuracy as an evaluation metric by looking at patterns of errors. Our focus is on comparing a suite of standard Convolutional Neural Networks (CNNs) and a recently-proposed attention-based network, the Vision Transformer (ViT), which relaxes the translation-invariance constraint of CNNs and therefore represents a model with a weaker set of inductive biases. Attention-based networks have previously been shown to achieve higher accuracy than CNNs on vision tasks, and we demonstrate, using new metrics for examining error consistency with more granularity, that their errors are also more consistent with those of humans. These results have implications both for building more human-like vision models, as well as for understanding visual object recognition in humans.
Submitted 1 July, 2021; v1 submitted 15 May, 2021;
originally announced May 2021.
-
TANGO: Commonsense Generalization in Predicting Tool Interactions for Mobile Manipulators
Authors:
Shreshth Tuli,
Rajas Bansal,
Rohan Paul,
Mausam
Abstract:
Robots assisting us in factories or homes must learn to make use of objects as tools to perform tasks, e.g., a tray for carrying objects. We consider the problem of learning commonsense knowledge of when a tool may be useful and how its use may be composed with other tools to accomplish a high-level task instructed by a human. We introduce a novel neural model, termed TANGO, for predicting task-specific tool interactions, trained using demonstrations from human teachers instructing a virtual robot. TANGO encodes the world state, comprising objects and symbolic relationships between them, using a graph neural network. The model learns to attend over the scene using knowledge of the goal and the action history, finally decoding the symbolic action to execute. Crucially, we address generalization to unseen environments where some known tools are missing, but alternative unseen tools are present. We show that by augmenting the representation of the environment with pre-trained embeddings derived from a knowledge-base, the model can generalize effectively to novel environments. Experimental results show a 60.5-78.9% absolute improvement over the baseline in predicting successful symbolic plans in unseen settings for a simulated mobile manipulator.
Submitted 23 May, 2021; v1 submitted 5 May, 2021;
originally announced May 2021.
-
COSCO: Container Orchestration using Co-Simulation and Gradient Based Optimization for Fog Computing Environments
Authors:
Shreshth Tuli,
Shivananda Poojara,
Satish N. Srirama,
Giuliano Casale,
Nicholas R. Jennings
Abstract:
Intelligent task placement and management of tasks in large-scale fog platforms is challenging due to the highly volatile nature of modern workload applications and sensitive user requirements of low energy consumption and response time. Container orchestration platforms have emerged to alleviate this problem with prior art either using heuristics to quickly reach scheduling decisions or AI driven methods like reinforcement learning and evolutionary approaches to adapt to dynamic scenarios. The former often fail to quickly adapt in highly dynamic environments, whereas the latter have run-times that are slow enough to negatively impact response time. Therefore, there is a need for scheduling policies that are both reactive to work efficiently in volatile environments and have low scheduling overheads. To achieve this, we propose a Gradient Based Optimization Strategy using Back-propagation of gradients with respect to Input (GOBI). Further, we leverage the accuracy of predictive digital-twin models and simulation capabilities by developing a Coupled Simulation and Container Orchestration Framework (COSCO). Using this, we create a hybrid simulation driven decision approach, GOBI*, to optimize Quality of Service (QoS) parameters. Co-simulation and the back-propagation approaches allow these methods to adapt quickly in volatile environments. Experiments conducted using real-world data on fog applications using the GOBI and GOBI* methods, show a significant improvement in terms of energy consumption, response time, Service Level Objective and scheduling time by up to 15, 40, 4, and 82 percent respectively when compared to the state-of-the-art algorithms.
Submitted 9 July, 2021; v1 submitted 29 April, 2021;
originally announced April 2021.
-
Dynamic Scheduling for Stochastic Edge-Cloud Computing Environments using A3C learning and Residual Recurrent Neural Networks
Authors:
Shreshth Tuli,
Shashikant Ilager,
Kotagiri Ramamohanarao,
Rajkumar Buyya
Abstract:
The ubiquitous adoption of Internet-of-Things (IoT) based applications has resulted in the emergence of the Fog computing paradigm, which allows seamlessly harnessing both mobile-edge and cloud resources. Efficient scheduling of application tasks in such environments is challenging due to constrained resource capabilities, mobility factors in IoT, resource heterogeneity, network hierarchy, and stochastic behaviors. Existing heuristics and Reinforcement Learning based approaches lack generalizability and quick adaptability, thus failing to tackle this problem optimally. They are also unable to utilize the temporal workload patterns and are suitable only for centralized setups. However, Asynchronous-Advantage-Actor-Critic (A3C) learning is known to quickly adapt to dynamic scenarios with less data and Residual Recurrent Neural Network (R2N2) to quickly update model parameters. Thus, we propose an A3C based real-time scheduler for stochastic Edge-Cloud environments allowing decentralized learning, concurrently across multiple agents. We use the R2N2 architecture to capture a large number of host and task parameters together with temporal patterns to provide efficient scheduling decisions. The proposed model is adaptive and able to tune different hyper-parameters based on the application requirements. We explicate our choice of hyper-parameters through sensitivity analysis. The experiments conducted on a real-world data set show a significant improvement in terms of energy consumption, response time, Service-Level-Agreement and running cost by 14.4%, 7.74%, 31.9%, and 4.64%, respectively, when compared to the state-of-the-art algorithms.
Submitted 1 September, 2020;
originally announced September 2020.
-
ToolNet: Using Commonsense Generalization for Predicting Tool Use for Robot Plan Synthesis
Authors:
Rajas Bansal,
Shreshth Tuli,
Rohan Paul,
Mausam
Abstract:
A robot working in a physical environment (like home or factory) needs to learn to use various available tools for accomplishing different tasks, for instance, a mop for cleaning and a tray for carrying objects. The number of possible tools is large and it may not be feasible to demonstrate usage of each individual tool during training. Can a robot learn commonsense knowledge and adapt to novel settings where some known tools are missing, but alternative unseen tools are present? We present a neural model that predicts the best tool from the available objects for achieving a given declarative goal. This model is trained by user demonstrations, which we crowd-source through humans instructing a robot in a physics simulator. This dataset maintains user plans involving multi-step object interactions along with symbolic state changes. Our neural model, ToolNet, combines a graph neural network to encode the current environment state, and goal-conditioned spatial attention to predict the appropriate tool. We find that providing metric and semantic properties of objects, and pre-trained object embeddings derived from a commonsense knowledge repository such as ConceptNet, significantly improves the model's ability to generalize to unseen tools. The model makes accurate and generalizable tool predictions. When compared to a graph neural network baseline, it achieves 14-27% accuracy improvement for predicting known tools from new world scenes, and 44-67% improvement in generalization for novel objects not encountered during training.
Submitted 17 September, 2021; v1 submitted 9 June, 2020;
originally announced June 2020.
-
AVAC: A Machine Learning based Adaptive RRAM Variability-Aware Controller for Edge Devices
Authors:
Shikhar Tuli,
Shreshth Tuli
Abstract:
Recently, the Edge Computing paradigm has gained significant popularity both in industry and academia. Researchers now increasingly aim to improve the performance and reduce the energy consumption of such devices. Some recent efforts focus on using emerging RRAM technologies for improving energy efficiency, thanks to their no-leakage property and high integration density. As the complexity and dynamism of applications supported by such devices escalate, it has become difficult to maintain ideal performance with static RRAM controllers. Machine Learning provides a promising solution for this, and hence, this work focuses on extending such controllers to allow dynamic parameter updates. In this work we propose an Adaptive RRAM Variability-Aware Controller, AVAC, which periodically updates Wait Buffer and batch sizes using on-the-fly learning models and gradient ascent. AVAC allows Edge devices to adapt to different applications and their stages, to improve computation performance and reduce energy consumption. Simulations demonstrate that the proposed model can provide up to 29% increase in performance and 19% decrease in energy, compared to static controllers, using traces of real-life healthcare applications on a Raspberry-Pi based Edge deployment.
Submitted 6 May, 2020;
originally announced May 2020.
-
ThermoSim: Deep Learning based Framework for Modeling and Simulation of Thermal-aware Resource Management for Cloud Computing Environments
Authors:
Sukhpal Singh Gill,
Shreshth Tuli,
Adel Nadjaran Toosi,
Felix Cuadrado,
Peter Garraghan,
Rami Bahsoon,
Hanan Lutfiyya,
Rizos Sakellariou,
Omer Rana,
Schahram Dustdar,
Rajkumar Buyya
Abstract:
Current cloud computing frameworks host millions of physical servers that utilize cloud computing resources in the form of different virtual machines (VM). Cloud Data Center (CDC) infrastructures require significant amounts of energy to deliver large scale computational services. Computing nodes generate large volumes of heat, requiring cooling units in turn to eliminate the effect of this heat. Thus, the overall energy consumption of the CDC increases tremendously for servers as well as for cooling units. However, current workload allocation policies do not take into account the effect on temperature and it is challenging to simulate the thermal behavior of CDCs. There is a need for a thermal-aware framework to simulate and model the behavior of nodes and measure the important performance parameters which can be affected by its temperature. In this paper, we propose a lightweight framework, ThermoSim, for modeling and simulation of thermal-aware resource management for cloud computing environments. This work presents a Recurrent Neural Network based deep learning temperature predictor for CDCs which is utilized by ThermoSim for lightweight resource management in constrained cloud environments. ThermoSim extends the CloudSim toolkit helping to analyze the performance of various key parameters such as energy consumption, SLA violation rate, number of VM migrations and temperature during the management of cloud resources for execution of workloads. Further, different energy-aware and thermal-aware resource management techniques are tested using the proposed ThermoSim framework in order to validate it against the existing framework. The experimental results demonstrate the proposed framework is capable of modeling and simulating the thermal behavior of a CDC and the ThermoSim framework is better than Thas in terms of energy consumption, cost, time, memory usage & prediction accuracy.
Submitted 8 May, 2020; v1 submitted 17 April, 2020;
originally announced April 2020.
-
iGateLink: A Gateway Library for Linking IoT, Edge, Fog and Cloud Computing Environments
Authors:
Riccardo Mancini,
Shreshth Tuli,
Tommaso Cucinotta,
Rajkumar Buyya
Abstract:
In recent years, the Internet of Things (IoT) has been growing in popularity, along with the increasingly important role played by IoT gateways, mediating the interactions among a plethora of heterogeneous IoT devices and cloud services. In this paper, we present iGateLink, an open-source Android library easing the development of Android applications acting as a gateway between IoT devices and Edge/Fog/Cloud Computing environments. Thanks to its pluggable design, modules providing connectivity with a number of devices acting as data sources or Fog/Cloud frameworks can be easily reused for different applications. Using iGateLink in two case-studies replicating previous works in the healthcare and image processing domains, the library proved to be effective in adapting to different scenarios and speeding up the development of gateway applications, as compared to the use of conventional methods.
Submitted 16 November, 2019;
originally announced November 2019.
-
HealthFog: An Ensemble Deep Learning based Smart Healthcare System for Automatic Diagnosis of Heart Diseases in Integrated IoT and Fog Computing Environments
Authors:
Shreshth Tuli,
Nipam Basumatary,
Sukhpal Singh Gill,
Mohsen Kahani,
Rajesh Chand Arya,
Gurpreet Singh Wander,
Rajkumar Buyya
Abstract:
Cloud computing provides resources over the Internet and allows a plethora of applications to be deployed to provide services for different industries. The major bottleneck currently faced in these cloud frameworks is their limited scalability and hence inability to cater to the requirements of centralized Internet of Things (IoT) based compute environments. The main reason for this is that latency-sensitive applications like health monitoring and surveillance systems now require computation over large amounts of data (Big Data) transferred to a centralized database and from the database to cloud data centers, which leads to a drop in the performance of such systems. The new paradigms of fog and edge computing provide innovative solutions by bringing resources closer to the user and provide low-latency and energy-efficient solutions for data processing compared to cloud domains. Still, the current fog models have many limitations and focus from a limited perspective on either accuracy of results or reduced response time but not both. We proposed a novel framework called HealthFog for integrating ensemble deep learning in Edge computing devices and deployed it for a real-life application of automatic Heart Disease analysis. HealthFog delivers healthcare as a fog service using IoT devices and efficiently manages the data of heart patients, which comes as user requests. The fog-enabled cloud framework FogBus is used to deploy and test the performance of the proposed model in terms of power consumption, network bandwidth, latency, jitter, accuracy and execution time. HealthFog is configurable to various operation modes that provide the best Quality of Service or prediction accuracy, as required, in diverse fog computation scenarios and for different user requirements.
Submitted 15 November, 2019;
originally announced November 2019.
-
Transformative effects of IoT, Blockchain and Artificial Intelligence on cloud computing: Evolution, vision, trends and open challenges
Authors:
Sukhpal Singh Gill,
Shreshth Tuli,
Minxian Xu,
Inderpreet Singh,
Karan Vijay Singh,
Dominic Lindsay,
Shikhar Tuli,
Daria Smirnova,
Manmeet Singh,
Udit Jain,
Haris Pervaiz,
Bhanu Sehgal,
Sukhwinder Singh Kaila,
Sanjay Misra,
Mohammad Sadegh Aslanpour,
Harshit Mehta,
Vlado Stankovski,
Peter Garraghan
Abstract:
Cloud computing plays a critical role in modern society and enables a range of applications from infrastructure to social media. Such systems must cope with varying load and evolving usage reflecting society's interaction with and dependency on automated computing systems whilst satisfying Quality of Service (QoS) guarantees. Enabling these systems is a cohort of conceptual technologies, synthesized to meet the demands of evolving computing applications. In order to understand current and future challenges of such systems, there is a need to identify key technologies enabling future applications. In this study, we aim to explore how three emerging paradigms (Blockchain, IoT and Artificial Intelligence) will influence future cloud computing systems. Further, we identify several technologies driving these paradigms and invite international experts to discuss the current status and future directions of cloud computing. Finally, we propose a conceptual model for cloud futurology to explore the influence of emerging paradigms and technologies on the evolution of cloud computing.
Submitted 21 October, 2019;
originally announced November 2019.
-
APEX: Adaptive Ext4 File System for Enhanced Data Recoverability in Edge Devices
Authors:
Shreshth Tuli,
Shikhar Tuli,
Udit Jain,
Rajkumar Buyya
Abstract:
Recently, the Edge Computing paradigm has gained significant popularity both in industry and academia. With its increased usage in real-life scenarios, security, privacy and integrity of data in such environments have become critical. Malicious deletion of mission-critical data due to ransomware, trojans and viruses has been a huge menace and recovering such lost data is an active field of research. As most Edge computing devices have compute and storage limitations, difficult constraints arise in providing an optimal scheme for data protection. These devices mostly use Linux/Unix based operating systems. Hence, this work focuses on extending the Ext4 file system to APEX (Adaptive Ext4): a file system based on a novel on-the-fly learning model that provides an Adaptive Recoverability-Aware file allocation platform for efficient post-deletion data recovery, thereby maintaining data integrity. Our recovery model and its lightweight implementation allow significant improvement in recoverability of lost data with lower compute, space, time, and cost overheads compared to other methods. We demonstrate the effectiveness of APEX through a case study of overwriting surveillance videos by CryPy malware on a Raspberry-Pi based Edge deployment and show 678% and 32% higher recovery than Ext4 and current state-of-the-art File Systems. We also evaluate the overhead characteristics and experimentally show that they are lower than other related works.
Submitted 3 October, 2019;
originally announced October 2019.
-
EdgeLens: Deep Learning based Object Detection in Integrated IoT, Fog and Cloud Computing Environments
Authors:
Shreshth Tuli,
Nipam Basumatary,
Rajkumar Buyya
Abstract:
Data-intensive applications are growing at an increasing rate and there is a growing need to solve scalability and high-performance issues in them. With the advent of the Cloud computing paradigm, it became possible to harness remote resources to build and deploy these applications. In recent years, a new set of applications and services based on the Internet of Things (IoT) paradigm has emerged that must process large amounts of data in very little time. Among them, surveillance and object detection have gained prime importance, but the cloud is unable to bring down the network latencies to meet the response time requirements. This problem is solved by Fog computing, which harnesses resources at the edge of the network along with remote cloud resources as required. However, there is still a lack of frameworks that are successfully able to integrate sophisticated software and applications, especially deep learning, with fog and cloud computing environments. In this work, we propose a framework to deploy deep learning-based applications in fog-cloud environments to harness edge and cloud resources to provide better service quality for such applications. Our proposed framework, called EdgeLens, adapts to the application or user requirements to provide high accuracy or low latency modes of services. We also tested the performance of the software in terms of accuracy, response time, jitter, network bandwidth and power consumption and show how EdgeLens adapts to different service requirements.
Submitted 26 June, 2019;
originally announced June 2019.
-
FogBus: A Blockchain-based Lightweight Framework for Edge and Fog Computing
Authors:
Shreshth Tuli,
Redowan Mahmud,
Shikhar Tuli,
Rajkumar Buyya
Abstract:
The requirement of supporting both latency-sensitive and computing-intensive Internet of Things (IoT) applications is consistently boosting the necessity for integrating Edge, Fog and Cloud infrastructure. Although a number of real-world frameworks attempt to support such integration, they have many limitations from various perspectives including platform independence, security, resource management and multi-application assistance. To address these limitations, we propose a simplified but effective framework, named FogBus, for facilitating end-to-end IoT-Fog(Edge)-Cloud integration. FogBus offers a platform-independent interface to IoT applications and computing instances for execution and interaction. It not only assists developers in building applications but also helps users in running multiple applications at a time and service providers to manage their resources. In addition, FogBus applies Blockchain, authentication and encryption techniques to secure operations on sensitive data. Because of its lightweight and cross-platform software systems, it is easy to deploy, scalable and cost efficient. We demonstrate the effectiveness of our framework by creating a computing environment with it that integrates finger pulse oximeters as IoT devices with a Smartphone-based gateway and Raspberry Pi-based Fog nodes for Sleep Apnea analysis. We also run several experiments on this computing environment varying FogBus settings. The experimental results show that different FogBus settings can improve latency, energy, network and CPU usage of the computing infrastructure.
Submitted 29 November, 2018;
originally announced November 2018.
-
Beam-energy and centrality dependence of direct-photon emission from ultra-relativistic heavy-ion collisions
Authors:
A. Adare,
S. Afanasiev,
C. Aidala,
N. N. Ajitanand,
Y. Akiba,
R. Akimoto,
H. Al-Bataineh,
J. Alexander,
M. Alfred,
A. Al-Jamel,
H. Al-Ta'ani,
A. Angerami,
K. Aoki,
N. Apadula,
L. Aphecetche,
Y. Aramaki,
R. Armendariz,
S. H. Aronson,
J. Asai,
H. Asano,
E. C. Aschenauer,
E. T. Atomssa,
R. Averbeck,
T. C. Awes,
B. Azmoun
, et al. (648 additional authors not shown)
Abstract:
The PHENIX collaboration presents first measurements of low-momentum ($0.4<p_T<3$ GeV/$c$) direct-photon yields from Au$+$Au collisions at $\sqrt{s_{_{NN}}}$=39 and 62.4 GeV. For both beam energies the direct-photon yields are substantially enhanced with respect to expectations from prompt processes, similar to the yields observed in Au$+$Au collisions at $\sqrt{s_{_{NN}}}$=200 GeV. Analyzing the photon yield as a function of the experimental observable $dN_{\rm ch}/dη$ reveals that the low-momentum ($>$1\,GeV/$c$) direct-photon yield $dN_γ^{\rm dir}/dη$ is a smooth function of $dN_{\rm ch}/dη$ and can be well described as proportional to $(dN_{\rm ch}/dη)^α$ with $α{\approx}1.25$. This scaling behavior holds for a wide range of beam energies at the Relativistic Heavy Ion Collider and the Large Hadron Collider, for centrality-selected samples, as well as for different $A$$+$$A$ collision systems. At a given beam energy the scaling also holds for high $p_T$ ($>5$\,GeV/$c$), but when results from different collision energies are compared, an additional $\sqrt{s_{_{NN}}}$-dependent multiplicative factor is needed to describe the integrated-direct-photon yield.
Submitted 5 June, 2019; v1 submitted 10 May, 2018;
originally announced May 2018.
-
Transverse energy production and charged-particle multiplicity at midrapidity in various systems from $\sqrt{s_{NN}}=7.7$ to 200 GeV
Authors:
A. Adare,
S. Afanasiev,
C. Aidala,
N. N. Ajitanand,
Y. Akiba,
R. Akimoto,
H. Al-Bataineh,
J. Alexander,
M. Alfred,
A. Al-Jamel,
H. Al-Ta'ani,
A. Angerami,
K. Aoki,
N. Apadula,
L. Aphecetche,
Y. Aramaki,
R. Armendariz,
S. H. Aronson,
J. Asai,
H. Asano,
E. C. Aschenauer,
E. T. Atomssa,
R. Averbeck,
T. C. Awes,
B. Azmoun
, et al. (681 additional authors not shown)
Abstract:
Measurements of midrapidity charged particle multiplicity distributions, $dN_{\rm ch}/dη$, and midrapidity transverse-energy distributions, $dE_T/dη$, are presented for a variety of collision systems and energies. Included are distributions for Au$+$Au collisions at $\sqrt{s_{_{NN}}}=200$, 130, 62.4, 39, 27, 19.6, 14.5, and 7.7 GeV, Cu$+$Cu collisions at $\sqrt{s_{_{NN}}}=200$ and 62.4 GeV, Cu$+$Au collisions at $\sqrt{s_{_{NN}}}=200$ GeV, U$+$U collisions at $\sqrt{s_{_{NN}}}=193$ GeV, $d$$+$Au collisions at $\sqrt{s_{_{NN}}}=200$ GeV, $^{3}$He$+$Au collisions at $\sqrt{s_{_{NN}}}=200$ GeV, and $p$$+$$p$ collisions at $\sqrt{s_{_{NN}}}=200$ GeV. Centrality-dependent distributions at midrapidity are presented in terms of the number of nucleon participants, $N_{\rm part}$, and the number of constituent quark participants, $N_{q{\rm p}}$. For all $A$$+$$A$ collisions down to $\sqrt{s_{_{NN}}}=7.7$ GeV, it is observed that the midrapidity data are better described by scaling with $N_{q{\rm p}}$ than scaling with $N_{\rm part}$. Also presented are estimates of the Bjorken energy density, $\varepsilon_{\rm BJ}$, and the ratio of $dE_T/dη$ to $dN_{\rm ch}/dη$, the latter of which is seen to be constant as a function of centrality for all systems.
Submitted 23 February, 2016; v1 submitted 22 September, 2015;
originally announced September 2015.
-
Systematic Study of Azimuthal Anisotropy in Cu$+$Cu and Au$+$Au Collisions at $\sqrt{s_{_{NN}}} = 62.4$ and 200 GeV
Authors:
A. Adare,
S. Afanasiev,
C. Aidala,
N. N. Ajitanand,
Y. Akiba,
H. Al-Bataineh,
A. Al-Jamel,
J. Alexander,
K. Aoki,
L. Aphecetche,
R. Armendariz,
S. H. Aronson,
J. Asai,
E. T. Atomssa,
R. Averbeck,
T. C. Awes,
B. Azmoun,
V. Babintsev,
G. Baksay,
L. Baksay,
A. Baldisseri,
K. N. Barish,
P. D. Barnes,
B. Bassalleck,
S. Bathe
, et al. (399 additional authors not shown)
Abstract:
We have studied the dependence of azimuthal anisotropy $v_2$ for inclusive and identified charged hadrons in Au$+$Au and Cu$+$Cu collisions on collision energy, species, and centrality. The values of $v_2$ as a function of transverse momentum $p_T$ and centrality in Au$+$Au collisions at $\sqrt{s_{_{NN}}}$=200 GeV and 62.4 GeV are the same within uncertainties. However, in Cu$+$Cu collisions we observe a decrease in $v_2$ values as the collision energy is reduced from 200 to 62.4 GeV. The decrease is larger in the more peripheral collisions. By examining both Au$+$Au and Cu$+$Cu collisions we find that $v_2$ depends both on eccentricity and the number of participants, $N_{\rm part}$. We observe that $v_2$ divided by eccentricity ($\varepsilon$) monotonically increases with $N_{\rm part}$ and scales as ${N_{\rm part}^{1/3}}$. The Cu$+$Cu data at 62.4 GeV falls below the other scaled $v_{2}$ data. For identified hadrons, $v_2$ divided by the number of constituent quarks $n_q$ is independent of hadron species as a function of transverse kinetic energy $KE_T=m_T-m$ between $0.1<KE_T/n_q<1$ GeV. Combining all of the above scaling and normalizations, we observe a near-universal scaling, with the exception of the Cu$+$Cu data at 62.4 GeV, of $v_2/(n_q\cdot\varepsilon\cdot N^{1/3}_{\rm part})$ vs $KE_T/n_q$ for all measured particles.
Submitted 18 September, 2015; v1 submitted 2 December, 2014;
originally announced December 2014.
-
Transverse-energy distributions at midrapidity in $p$$+$$p$, $d$$+$Au, and Au$+$Au collisions at $\sqrt{s_{_{NN}}}=62.4$--200~GeV and implications for particle-production models
Authors:
S. S. Adler,
S. Afanasiev,
C. Aidala,
N. N. Ajitanand,
Y. Akiba,
A. Al-Jamel,
J. Alexander,
K. Aoki,
L. Aphecetche,
R. Armendariz,
S. H. Aronson,
R. Averbeck,
T. C. Awes,
B. Azmoun,
V. Babintsev,
A. Baldisseri,
K. N. Barish,
P. D. Barnes,
B. Bassalleck,
S. Bathe,
S. Batsouli,
V. Baublis,
F. Bauer,
A. Bazilevsky,
S. Belikov
, et al. (366 additional authors not shown)
Abstract:
Measurements of the midrapidity transverse energy distribution, $dE_{T}/dη$, are presented for $p$$+$$p$, $d$$+$Au, and Au$+$Au collisions at $\sqrt{s_{_{NN}}}=200$ GeV and additionally for Au$+$Au collisions at $\sqrt{s_{_{NN}}}=62.4$ and 130 GeV. The $dE_{T}/dη$ distributions are first compared with the number of nucleon participants $N_{\rm part}$, number of binary collisions $N_{\rm coll}$, and number of constituent-quark participants $N_{qp}$ calculated from a Glauber model based on the nuclear geometry. For Au$+$Au, $\langle dE_{T}/dη\rangle/N_{\rm part}$ increases with $N_{\rm part}$, while $\langle dE_{T}/dη\rangle/N_{qp}$ is approximately constant for all three energies. This indicates that the two component ansatz, $dE_{T}/dη\propto (1-x) N_{\rm part}/2 + x N_{\rm coll}$, which has been used to represent $E_T$ distributions, is simply a proxy for $N_{qp}$, and that the $N_{\rm coll}$ term does not represent a hard-scattering component in $E_T$ distributions. The $dE_{T}/dη$ distributions of Au$+$Au and $d$$+$Au are then calculated from the measured $p$$+$$p$ $E_T$ distribution using two models that both reproduce the Au$+$Au data. However, while the number-of-constituent-quark-participant model agrees well with the $d$$+$Au data, the additive-quark model does not.
Submitted 23 December, 2013;
originally announced December 2013.
-
Measurement of Direct Photons in Au+Au Collisions at sqrt(s_NN) = 200 GeV
Authors:
S. Afanasiev,
C. Aidala,
N. N. Ajitanand,
Y. Akiba,
A. Al-Jamel,
J. Alexander,
K. Aoki,
L. Aphecetche,
R. Armendariz,
S. H. Aronson,
R. Averbeck,
T. C. Awes,
B. Azmoun,
V. Babintsev,
A. Baldisseri,
K. N. Barish,
P. D. Barnes,
B. Bassalleck,
S. Bathe,
S. Batsouli,
V. Baublis,
F. Bauer,
A. Bazilevsky,
S. Belikov,
R. Bennett
, et al. (321 additional authors not shown)
Abstract:
We report the measurement of direct photons at midrapidity in Au+Au collisions at sqrt{s_NN} = 200 GeV. The direct photon signal was extracted for the transverse-momentum range of 4 GeV/c < p_T < 22 GeV/c, using a statistical method to subtract decay photons from the inclusive-photon sample. The direct-photon nuclear-modification factor R_AA was calculated as a function of p_T for different Au+Au collision centralities using the measured p+p direct-photon spectrum and compared to theoretical predictions. R_AA was found to be consistent with unity for all centralities over the entire measured p_T range. Theoretical models that account for modifications of initial-direct-photon production due to modified-parton-distribution functions in Au and the different isospin composition of the nuclei, predict a modest change of R_AA from unity and are consistent with the data. Models with compensating effects of the quark-gluon plasma on high-energy photons, such as suppression of jet-fragmentation photons and induced-photon bremsstrahlung from partons traversing the medium, are also consistent with this measurement.
Submitted 25 May, 2012;
originally announced May 2012.