Shalabh Bhatnagar
2020 – today
- 2024
- [j83]Akash Mondal, Prashanth L. A., Shalabh Bhatnagar:
Truncated Cauchy random perturbations for smoothed functional-based stochastic optimization. Autom. 162: 111528 (2024) - [j82]Arghyadeep Barat, Prabuchandran K. J., Shalabh Bhatnagar:
Energy Management in a Cooperative Energy Harvesting Wireless Sensor Network. IEEE Commun. Lett. 28(1): 243-247 (2024) - [j81]Lakshmi Mandal, Raghuram Bharadwaj Diddigi, Shalabh Bhatnagar:
Variance-Reduced Deep Actor-Critic With an Optimally Subsampled Actor Recursion. IEEE Trans. Artif. Intell. 5(7): 3607-3623 (2024) - [c87]Mizhaan Prajit Maniyar, Prashanth L. A., Akash Mondal, Shalabh Bhatnagar:
A Cubic-regularized Policy Newton Algorithm for Reinforcement Learning. AISTATS 2024: 4708-4716 - [c86]Ashish Srivastava, Shalabh Bhatnagar, M. Narasimha Murty, Jagannathan Ramanujam:
Learning Dynamic Representations in Large Language Models for Evolving Data Streams. ICPR (5) 2024: 239-253 - [c85]V. P. Vivek, Raghuram Bharadwaj Diddigi, Shalabh Bhatnagar:
Dynamic Energy Management in Competing Microgrids using Reinforcement Learning. ISGT 2024: 1-5 - [c84]Joji Joseph, Bharadwaj Amrutur, Shalabh Bhatnagar:
Segmentation of 3D Gaussians using Masked Gradients. SIGGRAPH Asia Posters 2024: 77:1-77:2 - [i85]Prashansa Panda, Shalabh Bhatnagar:
Critic-Actor for Average Reward MDPs with Function Approximation: A Finite-Time Analysis. CoRR abs/2402.01371 (2024) - [i84]Joji Joseph, Bharadwaj Amrutur, Shalabh Bhatnagar:
Gradient-Driven 3D Segmentation and Affordance Transfer in Gaussian Splatting Using 2D Masks. CoRR abs/2409.11681 (2024) - 2023
- [j80]Shalabh Bhatnagar, Vivek S. Borkar, Soumyajit Guin:
Actor-Critic or Critic-Actor? A Tale of Two Time Scales. IEEE Control. Syst. Lett. 7: 2671-2676 (2023) - [c83]Soumyajit Guin, Shalabh Bhatnagar:
A Policy Gradient Approach for Finite Horizon Constrained Markov Decision Processes. CDC 2023: 3353-3359 - [c82]Shalabh Bhatnagar, Prashanth L. A.:
Generalized Simultaneous Perturbation Stochastic Approximation with Reduced Estimator Bias. CISS 2023: 1-6 - [c81]Naman Saxena, Subhojyoti Khastagir, Shishir Kolathaya, Shalabh Bhatnagar:
Off-Policy Average Reward Actor-Critic with Deterministic Policy Search. ICML 2023: 30130-30203 - [c80]Sambhu H. Karumanchi, Raghuram Bharadwaj Diddigi, Prabuchandran K. J., Shalabh Bhatnagar:
Autonomous UAV Navigation in Complex Environments using Human Feedback. RO-MAN 2023: 499-506 - [i83]Lakshmi Mandal, Shalabh Bhatnagar:
n-Step Temporal Difference Learning with Optimal n. CoRR abs/2303.07068 (2023) - [i82]Mizhaan Prajit Maniyar, Akash Mondal, Prashanth L. A., Shalabh Bhatnagar:
A Cubic-regularized Policy Newton Algorithm for Reinforcement Learning. CoRR abs/2304.10951 (2023) - [i81]Arunselvan Ramaswamy, Shalabh Bhatnagar, Naman Saxena:
A Framework for Provably Stable and Consistent Training of Deep Feedforward Networks. CoRR abs/2305.12125 (2023) - [i80]Naman Saxena, Subhojyoti Khastagir, Shishir Kolathaya, Shalabh Bhatnagar:
Off-Policy Average Reward Actor-Critic with Deterministic Policy Search. CoRR abs/2305.12239 (2023) - [i79]Shalabh Bhatnagar:
The Reinforce Policy Gradient Algorithm Revisited. CoRR abs/2310.05000 (2023) - [i78]Arghyadeep Barat, Prabuchandran K. J., Shalabh Bhatnagar:
Energy Management in a Cooperative Energy Harvesting Wireless Sensor Network. CoRR abs/2310.05911 (2023) - [i77]Prashansa Panda, Shalabh Bhatnagar:
Finite Time Analysis of Constrained Actor Critic and Constrained Natural Actor Critic Algorithms. CoRR abs/2310.16363 (2023) - [i76]Lakshmi Mandal, Chandrashekar Lakshminarayanan, Shalabh Bhatnagar:
Approximate Linear Programming and Decentralized Policy Improvement in Cooperative Multi-agent Markov Decision Processes. CoRR abs/2311.11789 (2023) - 2022
- [j79]Arunselvan Ramaswamy, Shalabh Bhatnagar:
Analyzing Approximate Value Iteration Algorithms. Math. Oper. Res. 47(3): 2138-2159 (2022) - [j78]Chandramouli Kamanchi, Raghuram Bharadwaj Diddigi, Shalabh Bhatnagar:
Generalized Second-Order Value Iteration in Markov Decision Processes. IEEE Trans. Autom. Control. 67(8): 4241-4247 (2022) - [j77]Raghuram Bharadwaj Diddigi, Chandramouli Kamanchi, Shalabh Bhatnagar:
A Generalized Minimax Q-Learning Algorithm for Two-Player Zero-Sum Stochastic Games. IEEE Trans. Autom. Control. 67(9): 4816-4823 (2022) - [c79]Rohan Deb, Shalabh Bhatnagar:
Gradient Temporal Difference with Momentum: Stability and Convergence. AAAI 2022: 6488-6496 - [c78]Rohan Deb, Meet Gandhi, Shalabh Bhatnagar:
Schedule Based Temporal Difference Algorithms. Allerton 2022: 1-6 - [c77]Priya Shanmugasundaram, Shalabh Bhatnagar:
Co-operative Multi-agent Twin Delayed DDPG for Robust Phase Duration Optimization of Large Road Networks. ICAART (Revised Selected Papers) 2022: 122-142 - [c76]Priya Shanmugasundaram, Shalabh Bhatnagar:
Robust Traffic Signal Timing Control using Multiagent Twin Delayed Deep Deterministic Policy Gradients. ICAART (2) 2022: 477-485 - [c75]Utkarsh A. Mishra, Soumya R. Samineni, Prakhar Goel, Chandravaran Kunjeti, Himanshu Lodha, Aman Singh, Aditya Sagi, Shalabh Bhatnagar, Shishir Kolathaya:
Dynamic Mirror Descent based Model Predictive Control for Accelerating Robot Learning. ICRA 2022: 1631-1637 - [c74]Raghuram Bharadwaj Diddigi, Prateek Jain, Prabuchandran K. J., Shalabh Bhatnagar:
Neural Network Compatible Off-Policy Natural Actor-Critic Algorithm. IJCNN 2022: 1-10 - [c73]Ashish Kumar Jayant, Shalabh Bhatnagar:
Model-based Safe Deep Reinforcement Learning via a Constrained Proximal Policy Optimization Algorithm. NeurIPS 2022 - [c72]Sindhu Padakandla, Prabuchandran K. J., Sourav Ganguly, Shalabh Bhatnagar:
Data Efficient Safe Reinforcement Learning. SMC 2022: 1167-1172 - [i75]Arun Raman, Keerthan Shagrithaya, Shalabh Bhatnagar:
Reinforcement Learning for Task Specifications with Action-Constraints. CoRR abs/2201.00286 (2022) - [i74]Akash Mondal, Prashanth L. A., Shalabh Bhatnagar:
A Gradient Smoothed Functional Algorithm with Truncated Cauchy Random Perturbations for Stochastic Optimization. CoRR abs/2208.00290 (2022) - [i73]Shalabh Bhatnagar, Vivek S. Borkar, Soumyajit Guin:
Actor-Critic or Critic-Actor? A Tale of Two Time Scales. CoRR abs/2210.04470 (2022) - [i72]Soumyajit Guin, Shalabh Bhatnagar:
A policy gradient approach for Finite Horizon Constrained Markov Decision Processes. CoRR abs/2210.04527 (2022) - [i71]Ashish Kumar Jayant, Shalabh Bhatnagar:
Model-based Safe Deep Reinforcement Learning via a Constrained Proximal Policy Optimization Algorithm. CoRR abs/2210.07573 (2022) - [i70]Shalabh Bhatnagar, Prashanth L. A.:
Generalized Simultaneous Perturbation Stochastic Approximation with Reduced Estimator Bias. CoRR abs/2212.10477 (2022) - 2021
- [j76]Prabuchandran K. J., Santosh Penubothula, Chandramouli Kamanchi, Shalabh Bhatnagar:
Novel First Order Bayesian Optimization with an Application to Reinforcement Learning. Appl. Intell. 51(3): 1565-1579 (2021) - [j75]Prasenjit Karmakar, Shalabh Bhatnagar:
On tight bounds for function approximation error in risk-sensitive reinforcement learning. Syst. Control. Lett. 150: 104899 (2021) - [j74]Arunselvan Ramaswamy, Shalabh Bhatnagar, Daniel E. Quevedo:
Asynchronous Stochastic Approximations With Asymptotically Biased Errors and Deep Multiagent Learning. IEEE Trans. Autom. Control. 66(9): 3969-3983 (2021) - [j73]Prasenjit Karmakar, Shalabh Bhatnagar:
Stochastic Approximation With Iterate-Dependent Markov Noise Under Verifiable Conditions in Compact State Space With the Stability of Iterates Not Ensured. IEEE Trans. Autom. Control. 66(12): 5941-5954 (2021) - [j72]Abhik Singla, Sindhu Padakandla, Shalabh Bhatnagar:
Memory-Based Deep Reinforcement Learning for Obstacle Avoidance in UAV With Limited Environment Knowledge. IEEE Trans. Intell. Transp. Syst. 22(1): 107-118 (2021) - [c71]P. Parnika, Raghuram Bharadwaj Diddigi, Sai Koti Reddy Danda, Shalabh Bhatnagar:
Attention Actor-Critic Algorithm for Multi-Agent Constrained Co-operative Reinforcement Learning. AAMAS 2021: 1616-1618 - [i69]P. Parnika, Raghuram Bharadwaj Diddigi, Sai Koti Reddy Danda, Shalabh Bhatnagar:
Attention Actor-Critic algorithm for Multi-Agent Constrained Co-operative Reinforcement Learning. CoRR abs/2101.02349 (2021) - [i68]Raghuram Bharadwaj Diddigi, Prateek Jain, Prabuchandran K. J., Shalabh Bhatnagar:
Neural Network Compatible Off-Policy Natural Actor-Critic Algorithm. CoRR abs/2110.10017 (2021) - [i67]Vivek VP, Shalabh Bhatnagar:
Finite Horizon Q-learning: Stability, Convergence and Simulations. CoRR abs/2110.15093 (2021) - [i66]Rohan Deb, Shalabh Bhatnagar:
Gradient Temporal Difference with Momentum: Stability and Convergence. CoRR abs/2111.11004 (2021) - [i65]Rohan Deb, Meet Gandhi, Shalabh Bhatnagar:
Schedule Based Temporal Difference Algorithms. CoRR abs/2111.11768 (2021) - [i64]Utkarsh A. Mishra, Soumya R. Samineni, Prakhar Goel, Chandravaran Kunjeti, Himanshu Lodha, Aman Singh, Aditya Sagi, Shalabh Bhatnagar, Shishir Kolathaya:
Dynamic Mirror Descent based Model Predictive Control for Accelerating Robot Learning. CoRR abs/2112.02999 (2021) - [i63]Rohan Deb, Shalabh Bhatnagar:
N-Timescale Stochastic Approximation: Stability and Convergence. CoRR abs/2112.03515 (2021) - 2020
- [j71]Sindhu Padakandla, Prabuchandran K. J., Shalabh Bhatnagar:
Reinforcement learning algorithm for non-stationary environments. Appl. Intell. 50(11): 3590-3606 (2020) - [j70]Chandramouli Kamanchi, Raghuram Bharadwaj Diddigi, Shalabh Bhatnagar:
Successive Over-Relaxation $Q$-Learning. IEEE Control. Syst. Lett. 4(1): 55-60 (2020) - [j69]Indu John, Chandramouli Kamanchi, Shalabh Bhatnagar:
Generalized Speedy Q-Learning. IEEE Control. Syst. Lett. 4(3): 524-529 (2020) - [j68]Vinayaka G. Yaji, Shalabh Bhatnagar:
Stochastic Recursive Inclusions in Two Timescales with Nonadditive Iterate-Dependent Markov Noise. Math. Oper. Res. 45(4): 1405-1444 (2020) - [j67]Vinayaka G. Yaji, Shalabh Bhatnagar:
Analysis of Stochastic Approximation Schemes With Set-Valued Maps in the Absence of a Stability Guarantee and Their Stabilization. IEEE Trans. Autom. Control. 65(3): 1100-1115 (2020) - [j66]Prashanth L. A., Shalabh Bhatnagar, Nirav Bhavsar, Michael C. Fu, Steven I. Marcus:
Random Directions Stochastic Approximation With Deterministic Perturbations. IEEE Trans. Autom. Control. 65(6): 2450-2465 (2020) - [c70]Akshay Dharmavaram, Matthew Riemer, Shalabh Bhatnagar:
Hierarchical Average Reward Policy Gradient Algorithms (Student Abstract). AAAI 2020: 13777-13778 - [c69]Kartik Paigwar, Lokesh Krishna, Sashank Tirumala, Naman Khetan, Aditya Varma, Ashish Joglekar, Shalabh Bhatnagar, Ashitava Ghosal, Bharadwaj Amrutur, Shishir Kolathaya:
Robust Quadrupedal Locomotion on Sloped Terrains: A Linear Policy Approach. CoRL 2020: 2257-2267 - [c68]Raghuram Bharadwaj Diddigi, Chandramouli Kamanchi, Shalabh Bhatnagar:
A Convergent Off-Policy Temporal Difference Algorithm. ECAI 2020: 1103-1110 - [c67]Indu John, Shalabh Bhatnagar:
Deep Reinforcement Learning with Successive Over-Relaxation and its Application in Autoscaling Cloud Resources. IJCNN 2020: 1-6 - [c66]Shravan Nayak, Chanakya Ajit Ekbote, Annanya Pratap Singh Chauhan, Raghuram Bharadwaj Diddigi, Prishita Ray, Abhinava Sikdar, Sai Koti Reddy Danda, Shalabh Bhatnagar:
Stochastic Game Frameworks for Efficient Energy Management in Microgrid Networks. ISGT-Europe 2020: 116-120 - [c65]Sindhu Padakandla, Shilpa Rao, Shalabh Bhatnagar:
Learning-Based Resource Allocation in Industrial IoT Systems. PIMRC 2020: 1-7 - [c64]Sashank Tirumala, Sagar Venkatesh Gubbi, Kartik Paigwar, Aditya Sagi, Ashish Joglekar, Shalabh Bhatnagar, Ashitava Ghosal, Bharadwaj Amrutur, Shishir Kolathaya:
Learning Stable Manoeuvres in Quadruped Robots from Expert Demonstrations. RO-MAN 2020: 1107-1112 - [i62]Shravan Nayak, Chanakya Ajit Ekbote, Annanya Pratap Singh Chauhan, Raghuram Bharadwaj Diddigi, Prishita Ray, Abhinava Sikdar, Sai Koti Reddy Danda, Shalabh Bhatnagar:
A Stochastic Game Framework for Efficient Energy Management in Microgrid Networks. CoRR abs/2002.02084 (2020) - [i61]Sashank Tirumala, Sagar Venkatesh Gubbi, Kartik Paigwar, Aditya Sagi, Ashish Joglekar, Shalabh Bhatnagar, Ashitava Ghosal, Bharadwaj Amrutur, Shishir Kolathaya:
Learning Stable Manoeuvres in Quadruped Robots from Expert Demonstrations. CoRR abs/2007.14290 (2020) - [i60]Meet Gandhi, Atreyee Kundu, Shalabh Bhatnagar:
A reinforcement learning approach to hybrid control design. CoRR abs/2009.00821 (2020) - [i59]Dhuruva Priyan G. M, Abhik Singla, Shalabh Bhatnagar:
Hindsight Experience Replay with Kronecker Product Approximate Curvature. CoRR abs/2010.06142 (2020) - [i58]Kartik Paigwar, Lokesh Krishna, Sashank Tirumala, Naman Khetan, Aditya Sagi, Ashish Joglekar, Shalabh Bhatnagar, Ashitava Ghosal, Bharadwaj Amrutur, Shishir Kolathaya:
Robust Quadrupedal Locomotion on Sloped Terrains: A Linear Policy Approach. CoRR abs/2010.16342 (2020)
2010 – 2019
- 2019
- [j65]Chandramouli Kamanchi, Raghuram Bharadwaj Diddigi, Prabuchandran K. J., Shalabh Bhatnagar:
An Online Sample-Based Method for Mode Estimation Using ODE Analysis of Stochastic Approximation Algorithms. IEEE Control. Syst. Lett. 3(3): 697-702 (2019) - [j64]Arunselvan Ramaswamy, Shalabh Bhatnagar:
Stability of Stochastic Approximations With "Controlled Markov" Noise and Temporal Difference Learning. IEEE Trans. Autom. Control. 64(6): 2614-2620 (2019) - [c63]Ajin George Joseph, Shalabh Bhatnagar:
Stochastic Approximation Trackers for Model-Based Search. Allerton 2019: 741-748 - [c62]Raghuram Bharadwaj Diddigi, Sai Koti Reddy Danda, Prabuchandran K. J., Shalabh Bhatnagar:
Actor-Critic Algorithms for Constrained Multi-agent Reinforcement Learning. AAMAS 2019: 1931-1933 - [c61]Ajin George Joseph, Shalabh Bhatnagar:
An Adaptive and Incremental Approach to Quantile Estimation. CDC 2019: 6025-6031 - [c60]Indu John, Shalabh Bhatnagar:
Efficient Budget Allocation and Task Assignment in Crowdsourcing. COMAD/CODS 2019: 318-321 - [c59]Indu John, Ravikumar Karumanchi, Shalabh Bhatnagar:
Predictive and Prescriptive Analytics for Performance Optimization: Framework and a Case Study on a Large-Scale Enterprise System. ICMLA 2019: 876-881 - [c58]Abhik Singla, Shounak Bhattacharya, Dhaivat Dholakiya, Shalabh Bhatnagar, Ashitava Ghosal, Bharadwaj Amrutur, Shishir Kolathaya:
Realizing Learned Quadruped Locomotion Behaviors through Kinematic Motion Primitives. ICRA 2019: 7434-7440 - [c57]Shounak Bhattacharya, Abhik Singla, Abhimanyu, Dhaivat Dholakiya, Shalabh Bhatnagar, Bharadwaj Amrutur, Ashitava Ghosal, Shishir Kolathaya:
Learning Active Spine Behaviors for Dynamic and Efficient Locomotion in Quadruped Robots. RO-MAN 2019: 1-6 - [c56]Shishir Kolathaya, Ashitava Ghosal, Bharadwaj Amrutur, Ashish Joglekar, Suhan Shetty, Dhaivat Dholakiya, Abhimanyu, Aditya Sagi, Shounak Bhattacharya, Abhik Singla, Shalabh Bhatnagar:
Trajectory based Deep Policy Search for Quadrupedal Walking. RO-MAN 2019: 1-6 - [c55]Indu John, Aiswarya Sreekantan, Shalabh Bhatnagar:
Efficient Adaptive Resource Provisioning for Cloud Applications using Reinforcement Learning. FAS*W@SASO/ICAC 2019: 271-272 - [i57]Dhaivat Dholakiya, Shounak Bhattacharya, Ajay Gunalan, Abhik Singla, Shalabh Bhatnagar, Bharadwaj Amrutur, Ashitava Ghosal, Shishir Kolathaya:
Design, Development and Experimental Realization of a Quadrupedal Research Platform: Stoch. CoRR abs/1901.00697 (2019) - [i56]Chandramouli K, Raghuram Bharadwaj Diddigi, Prabuchandran K. J., Shalabh Bhatnagar:
An Online Sample Based Method for Mode Estimation using ODE Analysis of Stochastic Approximation Algorithms. CoRR abs/1902.03806 (2019) - [i55]Chandramouli Kamanchi, Raghuram Bharadwaj Diddigi, Shalabh Bhatnagar:
Successive Over Relaxation Q-Learning. CoRR abs/1903.03812 (2019) - [i54]Raghuram Bharadwaj Diddigi, Sai Koti Reddy Danda, Prabuchandran K. J., Shalabh Bhatnagar:
Actor-Critic Algorithms for Constrained Multi-agent Reinforcement Learning. CoRR abs/1905.02907 (2019) - [i53]Chandramouli Kamanchi, Raghuram Bharadwaj Diddigi, Shalabh Bhatnagar:
Second Order Value Iteration in Reinforcement Learning. CoRR abs/1905.03927 (2019) - [i52]Sindhu Padakandla, Prabuchandran K. J., Shalabh Bhatnagar:
Reinforcement Learning in Non-Stationary Environments. CoRR abs/1905.03970 (2019) - [i51]Shounak Bhattacharya, Abhik Singla, Abhimanyu, Dhaivat Dholakiya, Shalabh Bhatnagar, Bharadwaj Amrutur, Ashitava Ghosal, Shishir Kolathaya:
Learning Active Spine Behaviors for Dynamic and Efficient Locomotion in Quadruped Robots. CoRR abs/1905.06077 (2019) - [i50]Raghuram Bharadwaj Diddigi, Chandramouli Kamanchi, Shalabh Bhatnagar:
Solution of Two-Player Zero-Sum Game by Successive Relaxation. CoRR abs/1906.06659 (2019) - [i49]Indu John, Chandramouli Kamanchi, Shalabh Bhatnagar:
Generalized Speedy Q-learning. CoRR abs/1911.00397 (2019) - [i48]Raghuram Bharadwaj Diddigi, Chandramouli Kamanchi, Shalabh Bhatnagar:
A Convergent Off-Policy Temporal Difference Algorithm. CoRR abs/1911.05697 (2019) - [i47]Akshay Dharmavaram, Matthew Riemer, Shalabh Bhatnagar:
Hierarchical Average Reward Policy Gradient Algorithms. CoRR abs/1911.08826 (2019) - [i46]Sashank Tirumala, Aditya Sagi, Kartik Paigwar, Ashish Joglekar, Shalabh Bhatnagar, Ashitava Ghosal, Bharadwaj Amrutur, Shishir Kolathaya:
Gait Library Synthesis for Quadruped Robots via Augmented Random Search. CoRR abs/1912.12907 (2019) - 2018
- [j63]Enlu Zhou, Shalabh Bhatnagar:
Gradient-Based Adaptive Stochastic Search for Simulation Optimization Over Continuous Space. INFORMS J. Comput. 30(1): 154-167 (2018) - [j62]Ajin George Joseph, Shalabh Bhatnagar:
An incremental off-policy search in a model-free Markov decision process using a single sample path. Mach. Learn. 107(6): 969-1011 (2018) - [j61]Ajin George Joseph, Shalabh Bhatnagar:
An online prediction algorithm for reinforcement learning with linear function approximation using cross entropy method. Mach. Learn. 107(8-10): 1385-1429 (2018) - [j60]Prasenjit Karmakar, Shalabh Bhatnagar:
Two Time-Scale Stochastic Approximation with Controlled Markov Noise and Off-Policy Temporal-Difference Learning. Math. Oper. Res. 43(1): 130-151 (2018) - [j59]Chandrashekar Lakshminarayanan, Shalabh Bhatnagar, Csaba Szepesvári:
A Linearly Relaxed Approximate Linear Program for Markov Decision Processes. IEEE Trans. Autom. Control. 63(4): 1185-1191 (2018) - [j58]Arunselvan Ramaswamy, Shalabh Bhatnagar:
Analysis of Gradient Descent Methods With Nondiminishing Bounded Errors. IEEE Trans. Autom. Control. 63(5): 1465-1471 (2018) - [j57]Shalabh Bhatnagar, Sanjeev Patel, Karmeshu:
A stochastic approximation approach to active queue management. Telecommun. Syst. 68(1): 89-104 (2018) - [j56]Raghuram Bharadwaj Diddigi, Prabuchandran K. J., Shalabh Bhatnagar:
Novel Sensor Scheduling Scheme for Intruder Tracking in Energy Efficient Sensor Networks. IEEE Wirel. Commun. Lett. 7(5): 712-715 (2018) - [c54]Chandramouli K, Prabuchandran K. J., Sai Koti Reddy Danda, Shalabh Bhatnagar:
Generalized Deterministic Perturbations For Stochastic Gradient Search. CDC 2018: 5734-5739 - [c53]Raghuram Bharadwaj Diddigi, Sai Koti Reddy Danda, Shalabh Bhatnagar:
A unified decision making framework for supply and demand management in microgrid networks. SmartGridComm 2018: 1-7 - [i45]Ajin George Joseph, Shalabh Bhatnagar:
An Incremental Off-policy Search in a Model-free Markov Decision Process Using a Single Sample Path. CoRR abs/1801.10287 (2018) - [i44]Ajin George Joseph, Shalabh Bhatnagar:
A Cross Entropy based Optimization Algorithm with Global Convergence Guarantees. CoRR abs/1801.10291 (2018) - [i43]Ajin George Joseph, Shalabh Bhatnagar:
An Online Prediction Algorithm for Reinforcement Learning with Linear Function Approximation using Cross Entropy Method. CoRR abs/1806.06720 (2018) - [i42]Prashanth L. A., Shalabh Bhatnagar, Nirav Bhavsar, Michael C. Fu, Steven I. Marcus:
Random directions stochastic approximation with deterministic perturbations. CoRR abs/1808.02871 (2018) - [i41]Abhik Singla, Shounak Bhattacharya, Dhaivat Dholakiya, Shalabh Bhatnagar, Ashitava Ghosal, Bharadwaj Amrutur, Shishir Kolathaya:
Realizing Learned Quadruped Locomotion Behaviors through Kinematic Motion Primitives. CoRR abs/1810.03842 (2018) - [i40]Abhik Singla, Sindhu Padakandla, Shalabh Bhatnagar:
Memory-based Deep Reinforcement Learning for Obstacle Avoidance in UAV with Limited Environment Knowledge. CoRR abs/1811.03307 (2018) - 2017
- [j55]Chandrashekar Lakshminarayanan, Shalabh Bhatnagar:
A stability criterion for two timescale stochastic approximation schemes. Autom. 79: 108-114 (2017) - [j54]K. Lakshmanan, Shalabh Bhatnagar:
Quasi-Newton smoothed functional algorithms for unconstrained and constrained simulation optimization. Comput. Optim. Appl. 66(3): 533-556 (2017) - [j53]Arunselvan Ramaswamy, Shalabh Bhatnagar:
A Generalization of the Borkar-Meyn Theorem for Stochastic Recursive Inclusions. Math. Oper. Res. 42(3): 648-661 (2017) - [j52]Prashanth L. A., Shalabh Bhatnagar, Michael C. Fu, Steven I. Marcus:
Adaptive System Optimization Using Random Directions Stochastic Approximation. IEEE Trans. Autom. Control. 62(5): 2223-2238 (2017) - [j51]Karmeshu, Sanjeev Patel, Shalabh Bhatnagar:
Adaptive mean queue size and its rate of change: queue management with random dropping. Telecommun. Syst. 65(2): 281-295 (2017) - [c52]Sandeep Kumar, Sindhu Padakandla, Chandrashekar Lakshminarayanan, Priyank Parihar, K. Gopinath, Shalabh Bhatnagar:
Scalable Performance Tuning of Hadoop MapReduce: A Noisy Gradient Approach. CLOUD 2017: 375-382 - [c51]Ajin George Joseph, Shalabh Bhatnagar:
A model based search method for prediction in model-free Markov decision process. IJCNN 2017: 170-177 - [c50]Ajin George Joseph, Shalabh Bhatnagar:
Bounds for off-policy prediction in reinforcement learning. IJCNN 2017: 3991-3997 - [c49]Ajin George Joseph, Shalabh Bhatnagar:
An Incremental Fast Policy Search Using a Single Sample Path. PReMI 2017: 3-10 - [i39]Vinayaka G. Yaji, Shalabh Bhatnagar:
Analysis of stochastic approximation schemes with set-valued maps in the absence of a stability guarantee and their stabilization. CoRR abs/1701.07590 (2017) - [i38]Chandramouli K, Sai Koti Reddy Danda, Shalabh Bhatnagar:
Deterministic Perturbations For Simultaneous Perturbation Methods Using Circulant Matrices. CoRR abs/1702.06250 (2017) - [i37]Chandrashekar Lakshminarayanan, Shalabh Bhatnagar, Csaba Szepesvári:
A Linearly Relaxed Approximate Linear Program for Markov Decision Processes. CoRR abs/1704.02544 (2017) - [i36]Raghuram Bharadwaj Diddigi, Sai Koti Reddy Danda, Shalabh Bhatnagar:
Multi-Agent Q-Learning for Minimizing Demand-Supply Power Deficit in Microgrids. CoRR abs/1708.07732 (2017) - [i35]Raghuram Bharadwaj Diddigi, Prabuchandran K. J., Shalabh Bhatnagar:
Novel Sensor Scheduling Scheme for Intruder Tracking in Energy Efficient Sensor Networks. CoRR abs/1708.08113 (2017) - [i34]Arunselvan Ramaswamy, Shalabh Bhatnagar:
Conditions for Stability and Convergence of Set-Valued Stochastic Approximations: Applications to Approximate Value and Fixed point Iterations with Noise. CoRR abs/1709.04673 (2017) - [i33]Raghuram Bharadwaj Diddigi, Sai Koti Reddy Danda, Krishnasuri Narayanam, Shalabh Bhatnagar:
A unified decision making framework for supply and demand management in microgrid networks. CoRR abs/1711.05078 (2017) - [i32]Jayvant Anantpur, Nagendra Dwarakanath Gulur, Shivaram Kalyanakrishnan, Shalabh Bhatnagar, R. Govindarajan:
RLWS: A Reinforcement Learning based GPU Warp Scheduler. CoRR abs/1712.04303 (2017) - 2016
- [j50]Shalabh Bhatnagar, K. Lakshmanan:
Multiscale Q-learning with linear function approximation. Discret. Event Dyn. Syst. 26(3): 477-509 (2016) - [j49]Prashanth L. A., H. L. Prasad, Shalabh Bhatnagar, Prakash Chandra:
A constrained optimization perspective on actor-critic algorithms and application to network routing. Syst. Control. Lett. 92: 46-51 (2016) - [j48]Prabuchandran K. J., Shalabh Bhatnagar, Vivek S. Borkar:
Actor-Critic Algorithms with Online Feature Adaptation. ACM Trans. Model. Comput. Simul. 26(4): 24:1-24:26 (2016) - [c48]Sai Koti Reddy Danda, Prashanth L. A., Shalabh Bhatnagar:
Improved Hessian estimation for adaptive random directions stochastic approximation. CDC 2016: 3682-3687 - [c47]Ajin George Joseph, Shalabh Bhatnagar:
Revisiting the Cross Entropy Method with Applications in Stochastic Global Optimization and Reinforcement Learning. ECAI 2016: 1026-1034 - [c46]Raj Kumar Maity, Chandrashekar Lakshminarayanan, Sindhu Padakandla, Shalabh Bhatnagar:
Shaping Proto-Value Functions Using Rewards. ECAI 2016: 1690-1691 - [c45]Ranganath B. N., Shalabh Bhatnagar:
Scalable focussed entity resolution. IJCNN 2016: 3570-3577 - [c44]Ajin George Joseph, Shalabh Bhatnagar:
A randomized algorithm for continuous optimization. WSC 2016: 907-918 - [i31]Karmeshu, Sanjeev Patel, Shalabh Bhatnagar:
Adaptive Mean Queue Size and Its Rate of Change: Queue Management with Random Dropping. CoRR abs/1602.02241 (2016) - [i30]Arunselvan Ramaswamy, Shalabh Bhatnagar:
Gradient-based learning algorithms with constant-error estimators: stability and convergence. CoRR abs/1604.00151 (2016) - [i29]Prasenjit Karmakar, Raj Kumar Maity, Shalabh Bhatnagar:
On a convergent off-policy temporal difference learning algorithm in on-line learning environment. CoRR abs/1605.06076 (2016) - [i28]Vinayaka Yaji, Shalabh Bhatnagar:
Stochastic Recursive Inclusions with Non-Additive Iterate-Dependent Markov Noise. CoRR abs/1607.04735 (2016) - [i27]Ajin George Joseph, Shalabh Bhatnagar:
A Cross Entropy based Stochastic Approximation Algorithm for Reinforcement Learning with Linear Function Approximation. CoRR abs/1609.09449 (2016) - [i26]Vinayaka Yaji, Shalabh Bhatnagar:
Stochastic Recursive Inclusions in two timescales with non-additive iterate dependent Markov noise. CoRR abs/1611.05961 (2016) - [i25]Sandeep Kumar, Sindhu Padakandla, Chandrashekar Lakshminarayanan, Priyank Parihar, K. Gopinath, Shalabh Bhatnagar:
Performance Tuning of Hadoop MapReduce: A Noisy Gradient Approach. CoRR abs/1611.10052 (2016) - [i24]Prasenjit Karmakar, Shalabh Bhatnagar:
A note on the function approximation error bound for risk-sensitive reinforcement learning. CoRR abs/1612.07562 (2016) - 2015
- [j47]Shalabh Bhatnagar, Prashanth L. A.:
Simultaneous Perturbation Newton Algorithms for Simulation Optimization. J. Optim. Theory Appl. 164(2): 621-643 (2015) - [j46]Vinayaka Yaji, Shalabh Bhatnagar:
Necessary and sufficient conditions for optimality in constrained general sum stochastic games. Syst. Control. Lett. 85: 8-15 (2015) - [j45]Prashanth L. A., H. L. Prasad, Nirmit Desai, Shalabh Bhatnagar, Gargi Dasgupta:
Simultaneous perturbation methods for adaptive labor staffing in service systems. Simul. 91(5): 432-455 (2015) - [j44]Sindhu Padakandla, Prabuchandran K. J., Shalabh Bhatnagar:
Energy Sharing for Multiple Sensor Nodes With Finite Buffers. IEEE Trans. Commun. 63(5): 1811-1823 (2015) - [c43]Chandrashekar Lakshminarayanan, Shalabh Bhatnagar:
A Generalized Reduced Linear Program for Markov Decision Processes. AAAI 2015: 2722-2728 - [c42]H. L. Prasad, Prashanth L. A., Shalabh Bhatnagar:
Two-Timescale Algorithms for Learning Nash Equilibria in General-Sum Stochastic Games. AAMAS 2015: 1371-1379 - [c41]Prabuchandran K. J., Hemanth Kumar A. N, Shalabh Bhatnagar:
Decentralized learning for traffic signal control. COMSNETS 2015: 1-6 - [c40]Ajin George Joseph, Shalabh Bhatnagar:
A Stochastic Approximation Algorithm for Quantile Estimation. ICONIP (2) 2015: 311-319 - [i23]Arunselvan Ramaswamy, Shalabh Bhatnagar:
A Generalization of the Borkar-Meyn Theorem for Stochastic Recursive Inclusions. CoRR abs/1502.01953 (2015) - [i22]Arunselvan Ramaswamy, Shalabh Bhatnagar:
Stochastic recursive inclusions with two timescales. CoRR abs/1502.01956 (2015) - [i21]Prashanth L. A., Shalabh Bhatnagar:
Adaptive system optimization using (simultaneous) random directions stochastic approximation. CoRR abs/1502.05577 (2015) - [i20]Sindhu Padakandla, Prabuchandran K. J., Shalabh Bhatnagar:
Energy Sharing for Multiple Sensor Nodes with Finite Buffers. CoRR abs/1503.04964 (2015) - [i19]Prasenjit Karmakar, Shalabh Bhatnagar:
Two Timescale Stochastic Approximation with Controlled Markov noise. CoRR abs/1503.09105 (2015) - [i18]Arunselvan Ramaswamy, Shalabh Bhatnagar:
Stability of Stochastic Approximations with 'Controlled Markov' Noise and Temporal Difference Learning. CoRR abs/1504.06043 (2015) - [i17]Vinayaka Yaji, Shalabh Bhatnagar:
A bi-convex optimization problem to compute Nash equilibrium in n-player games and an algorithm. CoRR abs/1504.06828 (2015) - [i16]H. L. Prasad, Shalabh Bhatnagar:
A Study of Gradient Descent Schemes for General-Sum Stochastic Games. CoRR abs/1507.00093 (2015) - [i15]Prashanth L. A., H. L. Prasad, Shalabh Bhatnagar, Prakash Chandra:
A constrained optimization perspective on actor critic algorithms and application to network routing. CoRR abs/1507.07984 (2015) - [i14]Chandrashekar Lakshmi Narayanan, Raj Kumar Maity, Shalabh Bhatnagar:
Shaping Proto-Value Functions via Rewards. CoRR abs/1511.08589 (2015) - 2014
- [j43]Debarghya Ghoshdastidar, Ambedkar Dukkipati, Shalabh Bhatnagar:
Newton-based stochastic optimization using q-Gaussian smoothed functional algorithms. Autom. 50(10): 2606-2614 (2014) - [j42]Saswata Chakravarty, Sindhu Padakandla, Shalabh Bhatnagar:
A simulation-based algorithm for optimal pricing policy under demand uncertainty. Int. Trans. Oper. Res. 21(5): 737-760 (2014) - [j41]Debarghya Ghoshdastidar, Ambedkar Dukkipati, Shalabh Bhatnagar:
Smoothed Functional Algorithms for Stochastic Optimization Using q-Gaussian Distributions. ACM Trans. Model. Comput. Simul. 24(3): 17:1-17:26 (2014) - [j40]Prashanth L. A., Abhranil Chatterjee, Shalabh Bhatnagar:
Two timescale convergent Q-learning for sleep-scheduling in wireless sensor networks. Wirel. Networks 20(8): 2589-2604 (2014) - [c39]Chandrashekar Lakshminarayanan, Shalabh Bhatnagar:
Approximate Dynamic Programming with (min; +) linear function approximation for Markov decision processes. CDC 2014: 1588-1593 - [c38]Prabuchandran K. J., Shalabh Bhatnagar, Vivek S. Borkar:
An actor critic algorithm based on Grassmanian search. CDC 2014: 3597-3602 - [c37]Prashanth L. A., Abhranil Chatterjee, Shalabh Bhatnagar:
Adaptive sleep-wake control using reinforcement learning in sensor networks. COMSNETS 2014: 1-8 - [c36]Chandrashekar Lakshminarayanan, Ayush Dubey, Shalabh Bhatnagar, Chithralekha Balamurugan:
A Markov Decision Process Framework for Predictable Job Completion Times on Crowdsourcing Platforms. HCOMP 2014: 34-35 - [c35]Prabuchandran K. J., Hemanth Kumar A. N, Shalabh Bhatnagar:
Multi-agent reinforcement learning for traffic signal control. ITSC 2014: 2529-2534 - [c34]Hengshuai Yao, Csaba Szepesvári, Richard S. Sutton, Joseph Modayil, Shalabh Bhatnagar:
Universal Option Models. NIPS 2014: 990-998 - [c33]Enlu Zhou, Shalabh Bhatnagar, Xi Chen:
Simulation optimization via gradient-based stochastic search. WSC 2014: 3869-3879 - [i13]H. L. Prasad, Prashanth L. A., Shalabh Bhatnagar:
Algorithms for Nash Equilibria in General-Sum Stochastic Games. CoRR abs/1401.2086 (2014) - [i12]Chandrashekar Lakshminarayanan, Shalabh Bhatnagar:
Approximate Dynamic Programming based on Projection onto the (min, +) subsemimodule. CoRR abs/1403.4175 (2014) - [i11]Chandrashekar Lakshminarayanan, Shalabh Bhatnagar:
Approximate dynamic programming with $(\min, +)$ linear function approximation for Markov decision processes. CoRR abs/1403.4179 (2014) - [i10]Chandrashekar Lakshminarayanan, Shalabh Bhatnagar:
A Generalized Reduced Linear Program for Markov Decision Processes. CoRR abs/1409.3536 (2014) - 2013
- [j39]Shalabh Bhatnagar, Vivek S. Borkar, Prabuchandran K. J.:
Feature Search in the Grassmanian in Online Reinforcement Learning. IEEE J. Sel. Top. Signal Process. 7(5): 746-758 (2013) - [j38]Prabuchandran K. J., Sunil Kumar Meena, Shalabh Bhatnagar:
Q-Learning Based Energy Management Policies for a Single Sensor Node with Finite Buffer. IEEE Wirel. Commun. Lett. 2(1): 82-85 (2013) - [c32]Prashanth Lakshmanrao Ananthapadmanabharao, Horabailu Laxminarayana Prasad, Nirmit Desai, Shalabh Bhatnagar:
Mechanisms for hostile agents with capacity constraints. AAMAS 2013: 659-666 - [i9]Debarghya Ghoshdastidar, Ambedkar Dukkipati, Shalabh Bhatnagar:
Newton based Stochastic Optimization using q-Gaussian Smoothed Functional Algorithms. CoRR abs/1311.2296 (2013) - [i8]Prashanth Lakshmanrao Ananthapadmanabharao, Abhranil Chatterjee, Shalabh Bhatnagar:
Reinforcement Learning for Sleep-Wake Scheduling in Sensor Networks. CoRR abs/1312.7292 (2013) - [i7]Prashanth L. A., H. L. Prasad, Nirmit Desai, Shalabh Bhatnagar, Gargi Dasgupta:
Simultaneous Perturbation Methods for Adaptive Labor Staffing in Service Systems. CoRR abs/1312.7430 (2013) - 2012
- [j37]H. L. Prasad, Shalabh Bhatnagar:
General-sum stochastic games: Verifiability conditions for Nash equilibria. Autom. 48(11): 2923-2930 (2012) - [j36]Koteswara Rao Vemu, Shalabh Bhatnagar, N. Hemachandra:
Optimal multi-layered congestion based pricing schemes for enhanced QoS. Comput. Networks 56(4): 1249-1262 (2012) - [j35]Shalabh Bhatnagar, K. Lakshmanan:
An Online Actor-Critic Algorithm with Function Approximation for Constrained Markov Decision Processes. J. Optim. Theory Appl. 153(3): 688-708 (2012) - [j34]Prashanth L. A., Shalabh Bhatnagar:
Threshold Tuning Using Stochastic Optimization for Graded Signal Control. IEEE Trans. Veh. Technol. 61(9): 3865-3880 (2012) - [c31]K. Lakshmanan, Shalabh Bhatnagar:
A novel Q-learning algorithm with function approximation for constrained Markov decision processes. Allerton Conference 2012: 400-405 - [c30]Debarghya Ghoshdastidar, Ambedkar Dukkipati, Shalabh Bhatnagar:
q-Gaussian based Smoothed Functional algorithms for stochastic optimization. ISIT 2012: 1059-1063 - [i6]Debarghya Ghoshdastidar, Ambedkar Dukkipati, Shalabh Bhatnagar:
q-Gaussian based Smoothed Functional Algorithm for Stochastic Optimization. CoRR abs/1202.5665 (2012) - [i5]Debarghya Ghoshdastidar, Ambedkar Dukkipati, Shalabh Bhatnagar:
Smoothed Functional Algorithms for Stochastic Optimization using q-Gaussian Distributions. CoRR abs/1206.4832 (2012) - 2011
- [j33]Shalabh Bhatnagar:
The Borkar-Meyn theorem for asynchronous stochastic approximations. Syst. Control. Lett. 60(7): 472-478 (2011) - [j32]Shalabh Bhatnagar, Vivek Kumar Mishra, N. Hemachandra:
Stochastic Algorithms for Discrete Parameter Simulation Optimization. IEEE Trans Autom. Sci. Eng. 8(4): 780-793 (2011) - [j31]Karmeshu, Shalabh Bhatnagar, Vivek Kumar Mishra:
An Optimized SDE Model for Slotted Aloha. IEEE Trans. Commun. 59(6): 1502-1508 (2011) - [j30]Prashanth L. A., Shalabh Bhatnagar:
Reinforcement Learning With Function Approximation for Traffic Signal Control. IEEE Trans. Intell. Transp. Syst. 12(2): 412-421 (2011) - [j29]Shalabh Bhatnagar, N. Hemachandra, Vivek Kumar Mishra:
Stochastic approximation algorithms for constrained optimization via simulation. ACM Trans. Model. Comput. Simul. 21(3): 15:1-15:22 (2011) - [c29]K. Lakshmanan, Shalabh Bhatnagar:
Smoothed Functional and Quasi-Newton Algorithms for Routing in Multi-stage Queueing Network with Constraints. ICDCIT 2011: 175-186 - [c28]Prashanth L. A., H. L. Prasad, Nirmit Desai, Shalabh Bhatnagar, Gargi Banerjee Dasgupta:
Stochastic Optimization for Adaptive Labor Staffing in Service Systems. ICSOC 2011: 487-494 - [c27]Prashanth L. A., Shalabh Bhatnagar:
Reinforcement learning with average cost for adaptive control of traffic lights at intersections. ITSC 2011: 1640-1645 - 2010
- [j28]Shalabh Bhatnagar:
An actor-critic algorithm with function approximation for discounted cost constrained Markov decision processes. Syst. Control. Lett. 59(12): 760-766 (2010) - [j27]Anshuk Chakraborty, Shalabh Bhatnagar:
Optimized Policies for the Retransmission Probabilities in Slotted Aloha. Simul. 86(4): 247-261 (2010) - [j26]G. Ramana Reddy, Shalabh Bhatnagar, V. Rakesh, Vijay Prakash Chaturvedi:
An efficient algorithm for scheduling in bluetooth piconets and scatternets. Wirel. Networks 16(7): 1799-1816 (2010) - [c26]Hamid Reza Maei, Csaba Szepesvári, Shalabh Bhatnagar, Richard S. Sutton:
Toward Off-Policy Learning Control with Function Approximation. ICML 2010: 719-726
2000 – 2009
- 2009
- [j25]Shalabh Bhatnagar, Richard S. Sutton, Mohammad Ghavamzadeh, Mark Lee:
Natural actor-critic algorithms. Autom. 45(11): 2471-2482 (2009) - [j24]Shalabh Bhatnagar, Rajesh Kumar Patro:
A proof of convergence of the B-RED and P-RED algorithms for random early detection. IEEE Commun. Lett. 13(10): 809-811 (2009) - [j23]Rajesh Kumar Patro, Shalabh Bhatnagar:
A probabilistic constrained nonlinear optimization framework to optimize RED parameters. Perform. Evaluation 66(2): 81-104 (2009) - [j22]Shalabh Bhatnagar, Karmeshu, Vivek Kumar Mishra:
Optimal parameter trajectory estimation in parameterized SDEs: An algorithmic procedure. ACM Trans. Model. Comput. Simul. 19(2): 8:1-8:27 (2009) - [c25]Hengshuai Yao, Shalabh Bhatnagar, Csaba Szepesvári:
LMS-2: Towards an algorithm that is as cheap as LMS and almost as efficient as RLS. CDC 2009: 1181-1188 - [c24]Richard S. Sutton, Hamid Reza Maei, Doina Precup, Shalabh Bhatnagar, David Silver, Csaba Szepesvári, Eric Wiewiora:
Fast gradient-descent methods for temporal-difference learning with linear function approximation. ICML 2009: 993-1000 - [c23]Hamid Reza Maei, Csaba Szepesvári, Shalabh Bhatnagar, Doina Precup, David Silver, Richard S. Sutton:
Convergent Temporal-Difference Learning with Arbitrary Smooth Function Approximation. NIPS 2009: 1204-1212 - [c22]Hengshuai Yao, Richard S. Sutton, Shalabh Bhatnagar, Diao Dongcui, Csaba Szepesvári:
Multi-Step Dyna Planning for Policy Evaluation and Control. NIPS 2009: 2187-2195 - [r1]P. Viswanath, M. Narasimha Murty, Shalabh Bhatnagar:
Pattern Synthesis for Nonparametric Pattern Recognition. Encyclopedia of Data Warehousing and Mining 2009: 1511-1516 - 2008
- [j21]Shalabh Bhatnagar, K. Mohan Babu:
New algorithms of the Q-learning type. Autom. 44(4): 1111-1119 (2008) - [j20]Sudha Velusamy, Lakshmi Gopal, Shalabh Bhatnagar, Sridhar Varadarajan:
An efficient ad recommendation system for TV programs. Multim. Syst. 14(2): 73-87 (2008) - [j19]Shalabh Bhatnagar, Mohammed Shahid Abdulla:
Simulation-Based Optimization Algorithms for Finite-Horizon Markov Decision Processes. Simul. 84(12): 577-600 (2008) - [c21]Sudha Velusamy, Shalabh Bhatnagar, S. V. Basavaraja, V. Sridhar:
SPSA based feature relevance estimation for video retrieval. MMSP 2008: 598-603 - [c20]Sudha Rani Kolavali, Shalabh Bhatnagar:
Ant Colony Optimization Algorithms for Shortest Path Problems. NET-COOP 2008: 37-44 - 2007
- [j18]Mohammed Shahid Abdulla, Shalabh Bhatnagar:
Reinforcement Learning Based Algorithms for Average Cost Markov Decision Processes. Discret. Event Dyn. Syst. 17(1): 23-52 (2007) - [j17]Ambedkar Dukkipati, Shalabh Bhatnagar, M. Narasimha Murty:
Gelfand-Yaglom-Perez theorem for generalized relative entropy functionals. Inf. Sci. 177(24): 5707-5714 (2007) - [j16]Shalabh Bhatnagar:
Adaptive Newton-based multivariate smoothed functional algorithms for simulation optimization. ACM Trans. Model. Comput. Simul. 18(1): 2:1-2:35 (2007) - [c19]Mohammed Shahid Abdulla, Shalabh Bhatnagar:
Parametrized Actor-Critic Algorithms for Finite-Horizon MDPs. ACC 2007: 534-539 - [c18]Mohammed Shahid Abdulla, Shalabh Bhatnagar:
Solving MDPs using Two-timescale Simulated Annealing with Multiplicative Weights. ACC 2007: 2428-2433 - [c17]Koteswara Rao Vemu, Shalabh Bhatnagar, N. Hemachandra:
Link route pricing for enhanced QoS. CDC 2007: 1504-1509 - [c16]Vivek Kumar Mishra, Shalabh Bhatnagar, N. Hemachandra:
Discrete parameter simulation optimization algorithms with applications to admission control with dependent service times. CDC 2007: 2986-2991 - [c15]Mohammed Shahid Abdulla, Shalabh Bhatnagar:
Network flow-control using asynchronous stochastic approximation. CDC 2007: 5857-5862 - [c14]Sudha Velusamy, Lakshmi Gopal, Sridhar Varadarajan, Shalabh Bhatnagar:
Fuzzy Clustering Based Ad Recommendation for TV Programs. EuroITV 2007: 175-184 - [c13]Vijay Prakash Chaturvedi, V. Rakesh, Shalabh Bhatnagar:
An Efficient and Optimized Bluetooth Scheduling Algorithm for Piconets. ICDCIT 2007: 19-30 - [c12]Koteswara Rao Vemu, Shalabh Bhatnagar, N. Hemachandra:
An Optimal Weighted-Average Congestion Based Pricing Scheme for Enhanced QoS. ICDCIT 2007: 135-145 - [c11]Shalabh Bhatnagar, Richard S. Sutton, Mohammad Ghavamzadeh, Mark Lee:
Incremental Natural Actor-Critic Algorithms. NIPS 2007: 105-112 - 2006
- [j15]Shalabh Bhatnagar, J. Ranjan Panigrahi:
Actor-critic algorithms for hierarchical Markov decision processes. Autom. 42(4): 637-644 (2006) - [j14]Shalabh Bhatnagar, Vivek S. Borkar, Madhukar Akarapu:
A Simulation-Based Algorithm for Ergodic Control of Markov Chains Conditioned on Rare Events. J. Mach. Learn. Res. 7: 1937-1962 (2006) - [j13]P. Viswanath, M. Narasimha Murty, Shalabh Bhatnagar:
Partition based pattern synthesis technique with efficient algorithms for nearest neighbor classification. Pattern Recognit. Lett. 27(14): 1714-1724 (2006) - [j12]Rahul Vaidya, Shalabh Bhatnagar:
Robust optimization of Random Early Detection. Telecommun. Syst. 33(4): 291-316 (2006) - [c10]Rajesh Kumar Patro, Shalabh Bhatnagar:
A Four-Timescale Algorithm for Constrained Stochastic Optimization of RED. CDC 2006: 1930-1935 - [c9]Shalabh Bhatnagar, Mohammed Shahid Abdulla:
A Reinforcement Learning Based Algorithm for Finite Horizon Markov Decision Processes. CDC 2006: 5519-5524 - [c8]Mohammed Shahid Abdulla, Shalabh Bhatnagar:
SPSA algorithms with measurement reuse. WSC 2006: 320-328 - [i4]Ambedkar Dukkipati, M. Narasimha Murty, Shalabh Bhatnagar:
On Measure Theoretic definitions of Generalized Information Measures and Maximum Entropy Prescriptions. CoRR abs/cs/0601080 (2006) - 2005
- [j11]P. Viswanath, M. Narasimha Murty, Shalabh Bhatnagar:
Overlap pattern synthesis with an efficient nearest neighbor classifier. Pattern Recognit. 38(8): 1187-1195 (2005) - [j10]Shalabh Bhatnagar, Hemant J. Kowshik:
A Discrete Parameter Stochastic Approximation Algorithm for Simulation Optimization. Simul. 81(11): 757-772 (2005) - [j9]Shalabh Bhatnagar, I. Bala Bhaskar Reddy:
Optimal Threshold Policies for Admission Control in Communication Networks via Discrete Parameter Stochastic Approximation. Telecommun. Syst. 29(1): 9-31 (2005) - [j8]Shalabh Bhatnagar:
Adaptive multivariate three-timescale stochastic approximation algorithms for simulation based optimization. ACM Trans. Model. Comput. Simul. 15(1): 74-107 (2005) - [c7]Ambedkar Dukkipati, M. Narasimha Murty, Shalabh Bhatnagar:
Information theoretic justification of Boltzmann selection and its generalization to Tsallis case. Congress on Evolutionary Computation 2005: 1667-1674 - [c6]Mohammed Shahid Abdulla, Shalabh Bhatnagar:
Solution of Mdps Using Simulation-Based Value Iteration. AIAI 2005: 765-775 - [c5]Ambedkar Dukkipati, Narasimha Murty Musti, Shalabh Bhatnagar:
Properties of Kullback-Leibler cross-entropy minimization in nonextensive framework. ISIT 2005: 2374-2378 - [i3]Ambedkar Dukkipati, M. Narasimha Murty, Shalabh Bhatnagar:
Uniqueness of Nonextensive entropy under Renyi's Recipe. CoRR abs/cs/0511078 (2005) - 2004
- [j7]P. Viswanath, M. Narasimha Murty, Shalabh Bhatnagar:
Fusion of multiple approximate nearest neighbor classifiers for fast and efficient classification. Inf. Fusion 5(4): 239-250 (2004) - [j6]Shalabh Bhatnagar, Shishir Kumar:
A simultaneous perturbation stochastic approximation-based actor-critic algorithm for Markov decision processes. IEEE Trans. Autom. Control. 49(4): 592-598 (2004) - [c4]Jnana Ranjan Panigrahi, Shalabh Bhatnagar:
Hierarchical decision making in semiconductor fabs using multi-time scale Markov decision processes. CDC 2004: 4387-4392 - [c3]Ambedkar Dukkipati, M. Narasimha Murty, Shalabh Bhatnagar:
Cauchy annealing schedule: an annealing schedule for Boltzmann selection scheme in evolutionary algorithms. IEEE Congress on Evolutionary Computation 2004: 55-62 - [c2]P. Viswanath, M. Narasimha Murty, Shalabh Bhatnagar:
A Pattern Synthesis Technique with an Efficient Nearest Neighbor Classifier for Binary Pattern Recognition. ICPR (4) 2004: 416-419 - [i2]Ambedkar Dukkipati, M. Narasimha Murty, Shalabh Bhatnagar:
Generalized Evolutionary Algorithm based on Tsallis Statistics. CoRR cs.AI/0407037 (2004) - [i1]Ambedkar Dukkipati, M. Narasimha Murty, Shalabh Bhatnagar:
Cauchy Annealing Schedule: An Annealing Schedule for Boltzmann Selection Scheme in Evolutionary Algorithms. CoRR cs.AI/0408055 (2004) - 2003
- [j5]Shalabh Bhatnagar, Vivek S. Borkar:
Multiscale Chaotic SPSA and Smoothed Functional Algorithms for Simulation Optimization. Simul. 79(10): 568-580 (2003) - [j4]Shalabh Bhatnagar, Michael C. Fu, Steven I. Marcus, I-Jeng Wang:
Two-timescale simultaneous perturbation stochastic approximation using deterministic perturbation sequences. ACM Trans. Model. Comput. Simul. 13(2): 180-209 (2003) - [c1]Ambedkar Dukkipati, M. Narasimha Murty, Shalabh Bhatnagar:
Quotient evolutionary space: abstraction of evolutionary process w.r.t macroscopic properties. IEEE Congress on Evolutionary Computation 2003: 846-853 - 2002
- [j3]Xi-Ren Cao, Zhiyuan Ren, Shalabh Bhatnagar, Michael C. Fu, Steven I. Marcus:
A time aggregation approach to Markov decision processes. Autom. 38(6): 929-943 (2002) - 2001
- [j2]Shalabh Bhatnagar, Michael C. Fu, Steven I. Marcus, Pedram Jaefari Fard:
Optimal structured feedback policies for ABR flow control using two-timescale SPSA. IEEE/ACM Trans. Netw. 9(4): 479-491 (2001)
1990 – 1999
- 1995
- [j1]Shalabh Bhatnagar, Vivek S. Borkar:
A Convex Analytic Framework for Ergodic Control of Semi-Markov Processes. Math. Oper. Res. 20(4): 923-936 (1995)
last updated on 2024-12-23 19:33 CET by the dblp team
all metadata released as open data under CC0 1.0 license