TR2017-075
Towards Stability in Learning-based Control: A Bayesian Optimization-based Adaptive Controller
- "Towards Stability in Learning-based Control: A Bayesian Optimization-based Adaptive Controller", The Multi-disciplinary Conference on Reinforcement Learning and Decision Making, June 2017.
@inproceedings{Farahmand2017jun,
  author = {Farahmand, Amir-massoud and Benosman, Mouhacine},
  title = {Towards Stability in Learning-based Control: A Bayesian Optimization-based Adaptive Controller},
  booktitle = {The Multi-disciplinary Conference on Reinforcement Learning and Decision Making},
  year = 2017,
  month = jun,
  url = {https://www.merl.com/publications/TR2017-075}
}
Research Areas:
Artificial Intelligence, Control, Optimization, Dynamical Systems, Machine Learning
Abstract:
We propose to merge together techniques from control theory and machine learning to design a stable learning-based controller for a class of nonlinear systems. We adopt a modular adaptive control design approach that has two components. The first is a model-based robust nonlinear state feedback, which guarantees stability during learning by rendering the closed-loop system input-to-state stable (ISS). The input is the error in the estimation of the uncertain parameters of the dynamics, and the state is the closed-loop output tracking error. The second component is a data-driven Bayesian optimization method for estimating the uncertain parameters of the dynamics and improving the overall performance of the closed-loop system. In particular, we suggest using the Gaussian Process Upper Confidence Bound (GP-UCB) algorithm, a method for trading off exploration and exploitation in continuous-armed bandits. GP-UCB searches the space of uncertain parameters and gradually finds the parameters that maximize the performance of the closed-loop system. Together, these two components ensure that we have a stable learning-based control algorithm.
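
To make the role of GP-UCB concrete, the following is a minimal sketch of how the uncertain-parameter search could look in Python. It assumes a hypothetical evaluate_closed_loop(theta) routine that runs the ISS-stabilized closed loop with parameter estimate theta and returns a scalar performance score; the parameter bounds, candidate grid, kernel, and beta schedule are illustrative assumptions, not details from the paper.

```python
# Minimal GP-UCB parameter-search sketch (illustrative assumptions only).
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF


def evaluate_closed_loop(theta):
    # Placeholder for the real closed-loop simulation/experiment:
    # here we simply score closeness to an (unknown) "true" parameter.
    theta_true = 1.3
    return -(theta - theta_true) ** 2


# Candidate values of the uncertain dynamics parameter (1-D for clarity).
candidates = np.linspace(0.0, 2.0, 200).reshape(-1, 1)

gp = GaussianProcessRegressor(kernel=RBF(length_scale=0.2), alpha=1e-4)

# Start from one arbitrary evaluation.
thetas = [np.array([1.0])]
scores = [evaluate_closed_loop(thetas[0][0])]

for t in range(1, 30):
    gp.fit(np.vstack(thetas), np.array(scores))
    mu, sigma = gp.predict(candidates, return_std=True)

    # UCB acquisition: posterior mean plus an exploration bonus whose
    # weight grows slowly with the iteration count (a common schedule).
    beta = 2.0 * np.log(t + 1)
    ucb = mu + np.sqrt(beta) * sigma

    theta_next = candidates[np.argmax(ucb)]
    thetas.append(theta_next)
    scores.append(evaluate_closed_loop(theta_next[0]))

best = thetas[int(np.argmax(scores))]
print("best parameter estimate:", best[0])
```

In this sketch the ISS feedback is what keeps each closed-loop evaluation safe to run, so the bandit search can explore parameter values freely while the tracking error remains bounded.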