-
Many-Shot In-Context Learning
Authors:
Rishabh Agarwal,
Avi Singh,
Lei M. Zhang,
Bernd Bohnet,
Luis Rosias,
Stephanie Chan,
Biao Zhang,
Ankesh Anand,
Zaheer Abbas,
Azade Nova,
John D. Co-Reyes,
Eric Chu,
Feryal Behbahani,
Aleksandra Faust,
Hugo Larochelle
Abstract:
Large language models (LLMs) excel at few-shot in-context learning (ICL) -- learning from a few examples provided in context at inference, without any weight updates. Newly expanded context windows allow us to investigate ICL with hundreds or thousands of examples -- the many-shot regime. Going from few-shot to many-shot, we observe significant performance gains across a wide variety of generative and discriminative tasks. While promising, many-shot ICL can be bottlenecked by the available amount of human-generated examples. To mitigate this limitation, we explore two new settings: Reinforced and Unsupervised ICL. Reinforced ICL uses model-generated chain-of-thought rationales in place of human examples. Unsupervised ICL removes rationales from the prompt altogether, and prompts the model only with domain-specific questions. We find that both Reinforced and Unsupervised ICL can be quite effective in the many-shot regime, particularly on complex reasoning tasks. Finally, we demonstrate that, unlike few-shot learning, many-shot learning is effective at overriding pretraining biases, can learn high-dimensional functions with numerical inputs, and performs comparably to fine-tuning. Our analysis also reveals the limitations of next-token prediction loss as an indicator of downstream ICL performance.
Submitted 22 May, 2024; v1 submitted 16 April, 2024;
originally announced April 2024.
-
Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context
Authors:
Gemini Team,
Petko Georgiev,
Ving Ian Lei,
Ryan Burnell,
Libin Bai,
Anmol Gulati,
Garrett Tanzer,
Damien Vincent,
Zhufeng Pan,
Shibo Wang,
Soroosh Mariooryad,
Yifan Ding,
Xinyang Geng,
Fred Alcober,
Roy Frostig,
Mark Omernick,
Lexi Walker,
Cosmin Paduraru,
Christina Sorokin,
Andrea Tacchetti,
Colin Gaffney,
Samira Daruki,
Olcan Sercinoglu,
Zach Gleicher,
Juliette Love
, et al. (1092 additional authors not shown)
Abstract:
In this report, we introduce the Gemini 1.5 family of models, representing the next generation of highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. The family includes two new models: (1) an updated Gemini 1.5 Pro, which exceeds the February version on the great majority of capabilities and benchmarks; (2) Gemini 1.5 Flash, a more lightweight variant designed for efficiency with minimal regression in quality. Gemini 1.5 models achieve near-perfect recall on long-context retrieval tasks across modalities, improve the state-of-the-art in long-document QA, long-video QA and long-context ASR, and match or surpass Gemini 1.0 Ultra's state-of-the-art performance across a broad set of benchmarks. Studying the limits of Gemini 1.5's long-context ability, we find continued improvement in next-token prediction and near-perfect retrieval (>99%) up to at least 10M tokens, a generational leap over existing models such as Claude 3.0 (200k) and GPT-4 Turbo (128k). Finally, we highlight real-world use cases, such as Gemini 1.5 collaborating with professionals on completing their tasks achieving 26 to 75% time savings across 10 different job categories, as well as surprising new capabilities of large language models at the frontier; when given a grammar manual for Kalamang, a language with fewer than 200 speakers worldwide, the model learns to translate English to Kalamang at a similar level to a person who learned from the same content.
Submitted 14 June, 2024; v1 submitted 8 March, 2024;
originally announced March 2024.
-
Energy Efficiency Maximization in IRS-enabled Phase Cooperative PS-SWIPT based Self-sustainable IoT Network
Authors:
Haleema Sadia,
Ahmad Kamal Hassan,
Ziaul Haq Abbas,
Ghulam Abbas,
Thar Baker
Abstract:
Power splitting based simultaneous wireless information and power transfer (PS-SWIPT) appears to be a promising solution to support future self-sustainable Internet of Things (SS-IoT) networks. However, the performance of these networks is constrained by radio frequency signal strength and channel impairments. To address this challenge, intelligent reflecting surfaces (IRSs) are introduced in PS-SWIPT based SS-IoT networks to improve network efficiency by controlling signal reflections. In this article, an IRS-enabled phase cooperative framework is proposed to improve energy efficiency (EE) of the IoT network $({\mathtt{I}}^{net})$ using phase shifts of the user network $({\mathtt{U}}^{net})$, without constraining hardware resources at ${\mathtt{U}}^{net}$. We exploit transmit beamforming (BF) at access points (APs) and phase shifts optimization at the IRS end with phase effective cooperation between APs to enhance ${\mathtt{I}}^{net}$ EE performance. The maximization problem turns out to be NP-hard, so first, an alternating optimization (AO) is solved for the ${\mathtt{U}}^{net}$ using low computational complexity heuristic BF approaches, namely, transmit minimum-mean-square-error and zero-forcing BF, and phase optimization is performed using semidefinite relaxation (SDR) approach. To combat the computational complexity of AO, we also propose an alternative solution by exploiting heuristic BF schemes and an iterative algorithm, i.e., the element-wise block-coordinate descent method for phase shifts optimization. Next, EE maximization is solved for the ${\mathtt{I}}^{net}$ by optimizing the PS ratio and active BF vectors by exploiting optimal phase shifts of the ${\mathtt{U}}^{net}$. Simulation results confirm that employing IRS phase cooperation in PS-SWIPT based SS-IoT networks can significantly improve EE performance of ${\mathtt{I}}^{net}$ without constraining resources.
Submitted 10 January, 2024;
originally announced January 2024.
-
Gemini: A Family of Highly Capable Multimodal Models
Authors:
Gemini Team,
Rohan Anil,
Sebastian Borgeaud,
Jean-Baptiste Alayrac,
Jiahui Yu,
Radu Soricut,
Johan Schalkwyk,
Andrew M. Dai,
Anja Hauth,
Katie Millican,
David Silver,
Melvin Johnson,
Ioannis Antonoglou,
Julian Schrittwieser,
Amelia Glaese,
Jilin Chen,
Emily Pitler,
Timothy Lillicrap,
Angeliki Lazaridou,
Orhan Firat,
James Molloy,
Michael Isard,
Paul R. Barham,
Tom Hennigan,
Benjamin Lee
, et al. (1325 additional authors not shown)
Abstract:
This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding. The Gemini family consists of Ultra, Pro, and Nano sizes, suitable for applications ranging from complex reasoning tasks to on-device memory-constrained use-cases. Evaluation on a broad range of benchmarks shows that our most-capable Gemini Ultra model advances the state of the art in 30 of 32 of these benchmarks - notably being the first model to achieve human-expert performance on the well-studied exam benchmark MMLU, and improving the state of the art in every one of the 20 multimodal benchmarks we examined. We believe that the new capabilities of the Gemini family in cross-modal reasoning and language understanding will enable a wide variety of use cases. We discuss our approach toward post-training and deploying Gemini models responsibly to users through services including Gemini, Gemini Advanced, Google AI Studio, and Cloud Vertex AI.
Submitted 17 June, 2024; v1 submitted 18 December, 2023;
originally announced December 2023.
-
Towards model-free RL algorithms that scale well with unstructured data
Authors:
Joseph Modayil,
Zaheer Abbas
Abstract:
Conventional reinforcement learning (RL) algorithms exhibit broad generality in their theoretical formulation and high performance on several challenging domains when combined with powerful function approximation. However, developing RL algorithms that perform well across problems with unstructured observations at scale remains challenging because most function approximation methods rely on externally provisioned knowledge about the structure of the input for good performance (e.g. convolutional networks, graph neural networks, tile-coding). A common practice in RL is to evaluate algorithms on a single problem, or on problems with limited variation in the observation scale. RL practitioners lack a systematic way to study how well a single RL algorithm performs when instantiated across a range of problem scales, and they lack function approximation techniques that scale well with unstructured observations.
We address these limitations by providing environments and algorithms to study scaling for unstructured observation vectors and flat action spaces. We introduce a family of combinatorial RL problems with an exponentially large state space and high-dimensional dynamics but where linear computation is sufficient to learn a (nonlinear) value function estimate for performant control. We provide an algorithm that constructs reward-relevant general value function (GVF) questions to find and exploit predictive structure directly from the experience stream. In an empirical evaluation of the approach on synthetic problems, we observe a sample complexity that scales linearly with the observation size. The proposed algorithm reliably outperforms a conventional deep RL algorithm on these scaling problems, and it exhibits several desirable auxiliary properties. These results suggest new algorithmic mechanisms by which algorithms can learn at scale from unstructured data.
Submitted 3 November, 2023;
originally announced November 2023.
-
Enhancing ML model accuracy for Digital VLSI circuits using diffusion models: A study on synthetic data generation
Authors:
Prasha Srivastava,
Pawan Kumar,
Zia Abbas
Abstract:
Generative AI has seen remarkable growth over the past few years, with diffusion models being state-of-the-art for image generation. This study investigates the use of diffusion models to generate artificial data for electronic circuits, with the aim of enhancing the accuracy of subsequent machine learning models in tasks such as performance assessment, design, and testing, where training data is usually very limited. We utilize simulations in the HSPICE design environment with 22nm CMOS technology nodes to obtain representative real training data for our proposed diffusion model. Our results demonstrate the close resemblance of the synthetic data generated by the diffusion model to real data. We validate the quality of the generated data and demonstrate that data augmentation is effective in predictive analysis of VLSI design for digital circuits.
Submitted 15 October, 2023;
originally announced October 2023.
-
Enforcing Data Geolocation Policies in Public Clouds using Trusted Computing
Authors:
Zair Abbas,
Mudassar Aslam
Abstract:
As technology advances, cloud computing continues to offer solutions that automate and simplify complex computational tasks. Advantages such as no maintenance cost, accessibility, data backup, pay-per-use models, unlimited storage, and processing power encourage individuals and businesses to migrate their workloads to the cloud. Despite these advantages, the geolocation of data in the cloud environment is a major concern because it affects performance and determines which government legislation applies to the data. Uncertainty about data geolocation can cause compliance concerns. In this work, we present a technique that allows users to restrict the geolocation of their data in the cloud environment. We use trusted computing mechanisms to remotely attest the host and its geolocation. In this model, the user uploads data whose decryption key is shared only with a third-party attestation server. After successful attestation, which guarantees the authorized geolocation and platform state, the decryption key is sealed to the TPM of the host.
Submitted 14 June, 2023;
originally announced June 2023.
-
Loss of Plasticity in Continual Deep Reinforcement Learning
Authors:
Zaheer Abbas,
Rosie Zhao,
Joseph Modayil,
Adam White,
Marlos C. Machado
Abstract:
The ability to learn continually is essential in a complex and changing world. In this paper, we characterize the behavior of canonical value-based deep reinforcement learning (RL) approaches under varying degrees of non-stationarity. In particular, we demonstrate that deep RL agents lose their ability to learn good policies when they cycle through a sequence of Atari 2600 games. This phenomenon is alluded to in prior work under various guises -- e.g., loss of plasticity, implicit under-parameterization, primacy bias, and capacity loss. We investigate this phenomenon closely at scale and analyze how the weights, gradients, and activations change over time in several experiments with varying dimensions (e.g., similarity between games, number of games, number of frames per game), with some experiments spanning 50 days and 2 billion environment interactions. Our analysis shows that the activation footprint of the network becomes sparser, contributing to the diminishing gradients. We investigate a remarkably simple mitigation strategy -- Concatenated ReLUs (CReLUs) activation function -- and demonstrate its effectiveness in facilitating continual learning in a changing environment.
Submitted 13 March, 2023;
originally announced March 2023.
-
Enhanced entanglement and controlling quantum steering in a Laguerre-Gaussian cavity optomechanical system with two rotating mirrors
Authors:
Amjad Sohail,
Zaheer Abbas,
Rizwan Ahmed,
Aamir Shahzad,
Naeem Akhtar,
Jia-Xing Peng
Abstract:
Gaussian quantum steering is a type of quantum correlation in which two entangled states exhibit asymmetry. We present an efficient theoretical scheme for controlling quantum steering and enhancing entanglement in a Laguerre-Gaussian (LG) rotating cavity optomechanical system with an optical parametric amplifier (OPA) driven by coherent light. The numerical simulation results show that manipulating system parameters such as parametric gain $χ$, parametric phase $θ$, and rotating mirror frequency, among others, significantly improves mirror-mirror and mirror-cavity entanglement. In addition to bipartite entanglement, we achieve mirror-cavity-mirror tripartite entanglement. Another intriguing discovery is the control of quantum steering, for which we obtained several results by investigating it for various system parameters. We show that the steering directivity is primarily determined by the frequency of two rotating mirrors. Furthermore, for two rotating mirrors, quantum steering is found to be asymmetric both one-way and two-way. As a result, we can assert that the current proposal may help in the understanding of non-local correlations and entanglement verification tasks.
Submitted 12 March, 2023;
originally announced March 2023.
-
Qualitative Data Augmentation for Performance Prediction in VLSI circuits
Authors:
Prasha Srivastava,
Pawan Kumar,
Zia Abbas
Abstract:
Various studies have shown the advantages of using Machine Learning (ML) techniques for analog and digital IC design automation and optimization. Data scarcity is still an issue for electronic designs when training highly accurate ML models. This work proposes generating and evaluating artificial data using generative adversarial networks (GANs) for circuit data to aid and improve the accuracy of ML models trained with a small training data set. The training data is obtained by various simulations in the Cadence Virtuoso, HSPICE, and Microcap design environments with TSMC 180nm and 22nm CMOS technology nodes. The artificial data is generated and tested for an appropriate set of analog and digital circuits. The experimental results show that the proposed artificial data generation significantly improves ML models that were previously trained with insufficient data, reducing the percentage error by more than 50% of the original percentage error. Furthermore, this research aims to contribute to the extensive application of AI/ML in the field of VLSI design and technology by relieving the challenges related to training data availability.
Submitted 15 February, 2023;
originally announced February 2023.
-
Urdu News Article Recommendation Model using Natural Language Processing Techniques
Authors:
Syed Zain Abbas,
Dr. Arif ur Rahman,
Abdul Basit Mughal,
Syed Mujtaba Haider
Abstract:
There are several online newspapers in Urdu, but users find it difficult to locate the content they are looking for because most of these sites contain irrelevant data, and most users do not retrieve what they want. Our proposed framework helps predict Urdu news that matches users' interests and reduces the time users spend searching for news. For this purpose, NLP techniques are used for pre-processing, and then TF-IDF with cosine similarity is used to find the most similar articles and recommend news based on user preferences. Moreover, the BERT language model is also used to compute similarity; similarity increases with the BERT model compared to TF-IDF, so the approach works better with the BERT language model and recommends news matching the user's interests. News is recommended when the similarity of the articles is above 60 percent.
Submitted 29 May, 2022;
originally announced June 2022.
-
Investigating the Properties of Neural Network Representations in Reinforcement Learning
Authors:
Han Wang,
Erfan Miahi,
Martha White,
Marlos C. Machado,
Zaheer Abbas,
Raksha Kumaraswamy,
Vincent Liu,
Adam White
Abstract:
In this paper we investigate the properties of representations learned by deep reinforcement learning systems. Much of the early work on representations for reinforcement learning focused on designing fixed-basis architectures to achieve properties thought to be desirable, such as orthogonality and sparsity. In contrast, the idea behind deep reinforcement learning methods is that the agent designer should not encode representational properties, but rather that the data stream should determine the properties of the representation -- good representations emerge under appropriate training schemes. In this paper we bring these two perspectives together, empirically investigating the properties of representations that support transfer in reinforcement learning. We introduce and measure six representational properties over more than 25 thousand agent-task settings. We consider Deep Q-learning agents with different auxiliary losses in a pixel-based navigation environment, with source and transfer tasks corresponding to different goal locations. We develop a method to better understand why some representations work better for transfer, through a systematic approach varying task similarity and measuring and correlating representation properties with transfer performance. We demonstrate the generality of the methodology by investigating representations learned by a Rainbow agent that successfully transfer across game modes in Atari 2600.
Submitted 5 May, 2023; v1 submitted 29 March, 2022;
originally announced March 2022.
-
AI/ML Algorithms and Applications in VLSI Design and Technology
Authors:
Deepthi Amuru,
Harsha V. Vudumula,
Pavan K. Cherupally,
Sushanth R. Gurram,
Amir Ahmad,
Andleeb Zahra,
Zia Abbas
Abstract:
An evident challenge ahead for the integrated circuit (IC) industry in the nanometer regime is the investigation and development of methods that can reduce the design complexity ensuing from growing process variations and curtail the turnaround time of chip manufacturing. Conventional methodologies employed for such tasks are largely manual; thus, time-consuming and resource-intensive. In contrast, the unique learning strategies of artificial intelligence (AI) provide numerous exciting automated approaches for handling complex and data-intensive tasks in very-large-scale integration (VLSI) design and testing. Employing AI and machine learning (ML) algorithms in VLSI design and manufacturing reduces the time and effort for understanding and processing the data within and across different abstraction levels via automated learning algorithms. It, in turn, improves the IC yield and reduces the manufacturing turnaround time. This paper thoroughly reviews the AI/ML automated approaches introduced in the past towards VLSI design and manufacturing. Moreover, we discuss the scope of AI/ML applications in the future at various abstraction levels to revolutionize the field of VLSI design, aiming for high-speed, highly intelligent, and efficient implementations.
Submitted 15 February, 2023; v1 submitted 21 February, 2022;
originally announced February 2022.
-
Effect of Measurement Errors on the Multivariate CUSUM CoDa Control Chart for the Manufacturing Process
Authors:
Muhammad Imran,
Jinsheng Sun,
Fatima Sehar Zaidi,
Zameer Abbas,
Hafiz Zafar Nazir
Abstract:
Control charts, one of the main tools in Statistical Process Control (SPC), have been widely adopted in manufacturing sectors as an effective strategy for malfunction detection throughout the previous decades. Measurement errors (M.E's) are involved in the quality characteristic of interest. The authors explored the impact of a linear covariate error model on the multivariate cumulative sum (CUSUM) control charts for a specific kind of data known as compositional data (CoDa). The average run length (ARL) is used to assess the performance of the proposed chart. The results indicate that M.E's significantly affect the multivariate CUSUM-CoDa control charts. The authors have used the Markov chain method to study the impact of the different parameters involved, using four different cases for the variance-covariance matrix (i.e., uncorrelated with equal variances, negatively correlated with equal variances, uncorrelated with unequal variances, positively correlated with unequal variances). The authors concluded that the ARL of the multivariate CUSUM-CoDa chart increases with an increase in the value of the error variance-covariance matrix, while the ARL decreases with an increase in the subgroup size m or the constant powering b. To illustrate the implementation of the proposal, two examples are reported for multivariate CUSUM-CoDa control charts in the presence of M.E's. One deals with the manufacturing process of uncoated aspirin tablets, and the other is based on monitoring machines in the muesli manufacturing process.
Submitted 10 February, 2022; v1 submitted 26 January, 2022;
originally announced January 2022.
-
From Eye-blinks to State Construction: Diagnostic Benchmarks for Online Representation Learning
Authors:
Banafsheh Rafiee,
Zaheer Abbas,
Sina Ghiassian,
Raksha Kumaraswamy,
Richard Sutton,
Elliot Ludvig,
Adam White
Abstract:
We present three new diagnostic prediction problems inspired by classical-conditioning experiments to facilitate research in online prediction learning. Experiments in classical conditioning show that animals such as rabbits, pigeons, and dogs can make long temporal associations that enable multi-step prediction. To replicate this remarkable ability, an agent must construct an internal state representation that summarizes its interaction history. Recurrent neural networks can automatically construct state and learn temporal associations. However, the current training methods are prohibitively expensive for online prediction -- continual learning on every time step -- which is the focus of this paper. Our proposed problems test the learning capabilities that animals readily exhibit and highlight the limitations of the current recurrent learning methods. While the proposed problems are nontrivial, they are still amenable to extensive testing and analysis in the small-compute regime, thereby enabling researchers to study issues in isolation, ultimately accelerating progress towards scalable online representation learning methods.
Submitted 10 October, 2022; v1 submitted 9 November, 2020;
originally announced November 2020.
-
Selective Dyna-style Planning Under Limited Model Capacity
Authors:
Zaheer Abbas,
Samuel Sokota,
Erin J. Talvitie,
Martha White
Abstract:
In model-based reinforcement learning, planning with an imperfect model of the environment has the potential to harm learning progress. But even when a model is imperfect, it may still contain information that is useful for planning. In this paper, we investigate the idea of using an imperfect model selectively. The agent should plan in parts of the state space where the model would be helpful but refrain from using the model where it would be harmful. An effective selective planning mechanism requires estimating predictive uncertainty, which arises out of aleatoric uncertainty, parameter uncertainty, and model inadequacy, among other sources. Prior work has focused on parameter uncertainty for selective planning. In this work, we emphasize the importance of model inadequacy. We show that heteroscedastic regression can signal predictive uncertainty arising from model inadequacy that is complementary to that which is detected by methods designed for parameter uncertainty, indicating that considering both parameter uncertainty and model inadequacy may be a more promising direction for effective selective planning than either in isolation.
Submitted 7 March, 2021; v1 submitted 5 July, 2020;
originally announced July 2020.
-
Planning with Expectation Models
Authors:
Yi Wan,
Zaheer Abbas,
Adam White,
Martha White,
Richard S. Sutton
Abstract:
Distribution and sample models are two popular model choices in model-based reinforcement learning (MBRL). However, learning these models can be intractable, particularly when the state and action spaces are large. Expectation models, on the other hand, are relatively easier to learn due to their compactness and have also been widely used for deterministic environments. For stochastic environments, it is not obvious how expectation models can be used for planning as they only partially characterize a distribution. In this paper, we propose a sound way of using approximate expectation models for MBRL. In particular, we 1) show that planning with an expectation model is equivalent to planning with a distribution model if the state value function is linear in state features, 2) analyze two common parametrization choices for approximating the expectation: linear and non-linear expectation models, 3) propose a sound model-based policy evaluation algorithm and present its convergence results, and 4) empirically demonstrate the effectiveness of the proposed planning algorithm.
Submitted 29 July, 2020; v1 submitted 1 April, 2019;
originally announced April 2019.
-
General Value Function Networks
Authors:
Matthew Schlegel,
Andrew Jacobsen,
Zaheer Abbas,
Andrew Patterson,
Adam White,
Martha White
Abstract:
State construction is important for learning in partially observable environments. A general purpose strategy for state construction is to learn the state update using a Recurrent Neural Network (RNN), which updates the internal state using the current internal state and the most recent observation. This internal state provides a summary of the observed sequence, to facilitate accurate predictions and decision-making. At the same time, specifying and training RNNs is notoriously tricky, particularly as the common strategy to approximate gradients back in time, called truncated Back-prop Through Time (BPTT), can be sensitive to the truncation window. Further, domain-expertise--which can usually help constrain the function class and so improve trainability--can be difficult to incorporate into complex recurrent units used within RNNs. In this work, we explore how to use multi-step predictions to constrain the RNN and incorporate prior knowledge. In particular, we revisit the idea of using predictions to construct state and ask: does constraining (parts of) the state to consist of predictions about the future improve RNN trainability? We formulate a novel RNN architecture, called a General Value Function Network (GVFN), where each internal state component corresponds to a prediction about the future represented as a value function. We first provide an objective for optimizing GVFNs, and derive several algorithms to optimize this objective. We then show that GVFNs are more robust to the truncation level, in many cases only requiring one-step gradient updates.
Submitted 2 February, 2021; v1 submitted 17 July, 2018;
originally announced July 2018.
-
Remark on stabilization of second order evolution equations by unbounded dynamic feedbacks and applications
Authors:
Zainab Abbas,
Kais Ammari,
Denis Mercier
Abstract:
In this paper we consider second order evolution equations with unbounded dynamic feedbacks. Under a regularity assumption we show that observability properties for the undamped problem imply decay estimates for the damped problem. We consider both uniform and non uniform decay properties.
Submitted 11 July, 2014;
originally announced July 2014.
-
M-ATTEMPT: A New Energy-Efficient Routing Protocol for Wireless Body Area Sensor Networks
Authors:
N. Javaid,
Z. Abbas,
M. S. Fareed,
Z. A. Khan,
N. Alrajeh
Abstract:
In this paper, we propose a new routing protocol for heterogeneous Wireless Body Area Sensor Networks (WBASNs): Mobility-supporting Adaptive Threshold-based Thermal-aware Energy-efficient Multi-hop ProTocol (M-ATTEMPT). A prototype is defined for employing heterogeneous sensors on the human body. Direct communication is used for real-time traffic (critical data) or on-demand data, while Multi-hop communication is used for normal data delivery. One of the prime challenges in WBASNs is sensing the heat generated by the implanted sensor nodes. The proposed routing algorithm is thermal-aware: it senses link Hot-spots and routes data away from these links. Continuous mobility of the human body causes disconnection between previously established links, so mobility support and energy management are introduced to overcome the problem. A Linear Programming (LP) model for maximum information extraction and minimum energy consumption is presented in this study. MATLAB simulations of the proposed routing algorithm are performed for lifetime and successful packet delivery in comparison with Multi-hop communication. The results show that the proposed routing algorithm consumes less energy and is more reliable than Multi-hop communication.
Submitted 21 March, 2013;
originally announced March 2013.
-
EAPESS: An Adaptive Transmission Scheme in Wireless Sensor Networks
Authors:
Z. Abbas,
N. Javaid,
A. Javaid,
Z. A. Khan,
M. A. Khan,
U. Qasim
Abstract:
Reducing energy consumption in sensor nodes is one of the major challenges in Wireless Sensor Network (WSN) deployment. In this regard, Error Control Coding (ECC) is one of the techniques used for energy optimization in WSNs. Similarly, critical distance is another concept used for energy efficiency which, when used with ECC, provides better energy savings. In this paper, three different critical distance values are used against different coding gains for the sake of energy saving. If the distance lies below the critical distance values, then particular encoders are selected with respect to their particular coding gains. Coding gains are used to estimate the critical distances of all encoders. This adaptive encoder and transmit power selection scheme, based on coding gain, results in significant energy savings in a WSN environment. Simulations show the improved energy savings achieved by this adaptive scheme.
Submitted 19 March, 2013;
originally announced March 2013.
-
Decreasing defect rate of test cases by designing and analysis for recursive modules of a program structure: Improvement in test cases
Authors:
Muhammad Javed,
Bashir Ahmad,
Zaffar Abbas,
Allah Nawaz,
Muhammad Ali Abid,
Ihsan Ullah
Abstract:
Designing and analyzing test cases is a challenging task for testers, especially those who test the structure of a program. Recently, programmers are showing a valuable trend towards the implementation of recursive modules in a program structure. In the testing phase of the software development life cycle, test cases help the tester to test the structure and flow of a program. The implementation of well-designed test cases for a program reduces the defect rate and the effort needed for corrective maintenance. In this paper, the authors propose a strategy to design and analyze test cases for a program structure of recursive modules. This strategy leads to validation of the program structure, besides reducing the defect rate and corrective maintenance effort.
Submitted 26 August, 2012;
originally announced August 2012.
-
Simulation Analysis of IEEE 802.15.4 Non-beacon Mode at Varying Data Rates
Authors:
Z. Abbas,
N. Javaid,
M. A. Khan,
S. Ahmed,
U. Qasim,
Z. A. Khan
Abstract:
The IEEE 802.15.4 standard is designed for low-power and low-data-rate applications with high reliability. It operates in beacon-enabled and non-beacon-enabled modes. In this work, we analyze delay, throughput, load, and end-to-end delay of the non-beacon-enabled mode. Analysis of these parameters is performed at varying data rates. Evaluation of the non-beacon-enabled mode is done in a 10-node network. We limit our analysis to the non-beacon or unslotted version because it performs better than the other. Protocol performance is examined by changing different Medium Access Control (MAC) parameters. We consider a full-size MAC packet with a payload size of 114 bytes. In this paper we show that maximum throughput and lowest delay are achieved at the highest data rate.
Submitted 12 August, 2012;
originally announced August 2012.
-
Automatic Vehicle Checking Agent (VCA)
Authors:
Bashir Ahmad,
Shakeel Ahmad,
Shahid Hussain,
Muhammad Zaheer Aslam,
Zafar Abbas
Abstract:
A definition of intelligence is given in terms of performance that can be quantitatively measured. In this study, we present a conceptual model of an Intelligent Agent System for an Automatic Vehicle Checking Agent (VCA). To achieve this goal, we introduce several kinds of agents that exhibit intelligent features. These are the Management agent, Internal agent, External agent, Watcher agent and Report agent. Metrics and measurements are suggested for evaluating the performance of the Automatic Vehicle Checking Agent (VCA). Calibrate data and test facilities are suggested to facilitate the development of intelligent systems.
Submitted 3 December, 2011; v1 submitted 9 April, 2011;
originally announced April 2011.
-
Molecular dynamics simulation of nanocolloidal amorphous silica particles: Part II
Authors:
S. Jenkins,
S. R. Kirk,
M. Persson,
J. Carlen,
Z. Abbas
Abstract:
Explicit molecular dynamics simulations were applied to a pair of amorphous silica nanoparticles of diameter 3.2 nm immersed in a background electrolyte. Mean forces acting between the pair of silica nanoparticles were extracted at four different background electrolyte concentrations. Dependence of the inter-particle potential of mean force on the separation and the silicon to sodium ratio, as well as on the background electrolyte concentration, are demonstrated. The pH was indirectly accounted for via the ratio of silicon to sodium used in the simulations. The nature of the interaction of the counter-ions with charged silica surface sites (deprotonated silanols) was also investigated. The effect of the sodium double layer on the water ordering was investigated for three Si:Na+ ratios. The number of water molecules trapped inside the nanoparticles was investigated as the Si:Na+ ratio was varied. Differences in this number between the two nanoparticles in the simulations are attributed to differences in the calculated electric dipole moment. The implications of the form of the potentials for aggregation are also discussed.
Submitted 11 September, 2007; v1 submitted 19 August, 2007;
originally announced August 2007.
-
Molecular dynamics simulation of nanocolloidal amorphous silica particles: Part I
Authors:
S. Jenkins,
S. R. Kirk,
M. Persson,
J. Carlen,
Z. Abbas
Abstract:
Explicit molecular dynamics simulations were applied to a pair of amorphous silica nanoparticles in aqueous solution, of diameter 4.4 nm with four different background electrolyte concentrations, to extract the mean force acting between the pair of silica nanoparticles. Dependences of the interparticle forces with separation and the background electrolyte concentration were demonstrated. The nature of the interaction of the counter-ions with charged silica surface sites (deprotonated silanols) was investigated. A 'patchy' double layer of adsorbed sodium counter-ions was observed. Dependences of the interparticle potential of mean force with separation and the background electrolyte concentration were demonstrated. Direct evidence of the solvation forces is presented in terms of changes of the water ordering at the surfaces of the isolated and double nanoparticles. The nature of the interaction of the counter-ions with charged silica surface sites (deprotonated silanols) was investigated in terms of quantifying the effects of the number of water molecules separately inside each of the pair of nanoparticles by defining an impermeability measure. A direct correlation was found between impermeability (related to the silica surface 'hairiness') and the disruption of water ordering. Differences in the impermeability between the two nanoparticles are attributed to differences in the calculated electric dipole moment.
Submitted 16 September, 2007; v1 submitted 19 August, 2007;
originally announced August 2007.
-
A Semantic Grid-based E-Learning Framework (SELF)
Authors:
Zaheer Abbas,
Muhammad Umer,
Mohammed Odeh,
Richard McClatchey,
Arshad Ali,
Farooq Ahmad
Abstract:
E-learning can be loosely defined as a wide set of applications and processes which use available electronic media (and tools) to deliver vocational education and training. With its increasing recognition as a ubiquitous mode of instruction and interaction in the academic as well as corporate world, the need for a scalable and realistic model is becoming important. In this paper we introduce SELF, a Semantic grid-based E-Learning Framework. SELF aims to identify the key enablers in a practical grid-based E-learning environment and to minimize technological reworking by proposing a well-defined interaction plan among currently available tools and technologies. We define a dichotomy with E-learning specific application layers on top and semantic grid-based support layers underneath. We also map the latest open and freeware technologies with various components in SELF.
Submitted 9 February, 2005;
originally announced February 2005.