TR2022-108
Learning Optimization-based Control Policies Directly from Digital Twin Simulations
- Menner, M., Chakrabarty, A., Berntorp, K., Di Cairano, S., "Learning Optimization-based Control Policies Directly from Digital Twin Simulations", IEEE Conference on Control Technology and Applications (CCTA), DOI: 10.1109/CCTA49430.2022.9966077, August 2022, pp. 895-900.
@inproceedings{Menner2022aug,
  author = {Menner, Marcel and Chakrabarty, Ankush and Berntorp, Karl and Di Cairano, Stefano},
  title = {Learning Optimization-based Control Policies Directly from Digital Twin Simulations},
  booktitle = {IEEE Conference on Control Technology and Applications (CCTA)},
  year = 2022,
  pages = {895--900},
  month = aug,
  doi = {10.1109/CCTA49430.2022.9966077},
  url = {https://www.merl.com/publications/TR2022-108}
}
Abstract:
This paper proposes using a digital twin of a dynamical system directly for optimization-based control. It presents an algorithm based on an Unscented Kalman Filter (UKF) to solve optimization-based control problems in which the system dynamics are encoded in the digital twin. The UKF-based algorithm uses simulations of the digital twin directly to optimize the control policy and does not require gradients to be computed, which makes it suitable for differential-algebraic constraints, where gradients may be inaccessible. The algorithm requires explicit knowledge of neither the internal model of the digital twin nor the control map; that is, it is a purely data-driven approach based on simulations. The main advantage is that a high-precision, simulation-oriented digital twin can approximate the physical dynamical system more accurately than an analytical control-oriented model and can thus improve the performance of the controller. The approach is evaluated in two case studies. First, a controller for a pendulum on a cart is optimized to swing up and stabilize the pendulum. Second, a crane controller is optimized to avoid oscillations of the load.
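The abstract does not give the algorithm's details, so the following Python sketch is only an illustration of the general idea, not the authors' method. It treats the policy parameters as the state of a UKF, treats black-box rollouts as the measurement function, and uses a desired trajectory as the measurement, so no gradients of the simulator are needed. Everything specific here is an assumption made for the example: the scalar toy system standing in for a digital twin (`simulate_twin`), the linear policy `u = -theta * x`, the zero reference trajectory, and the noise covariances `Q` and `R`.

```python
import numpy as np


def simulate_twin(theta, a=1.2, b=1.0, x0=1.0, horizon=10):
    """Hypothetical stand-in for a black-box digital twin.

    Rolls out x[k+1] = a*x[k] + b*u[k] under the policy u = -theta[0]*x
    and returns the simulated states x[1..horizon].
    """
    x, traj = x0, []
    for _ in range(horizon):
        u = -theta[0] * x
        x = a * x + b * u
        traj.append(x)
    return np.array(traj)


def ukf_policy_step(theta, P, y_ref, simulate, Q, R, kappa=1.0):
    """One UKF measurement update on the policy parameters (assumed form)."""
    n = theta.size
    # Sigma points: the mean plus/minus the columns of chol((n + kappa) * P).
    L = np.linalg.cholesky((n + kappa) * P)
    sigma = np.vstack([theta, theta + L.T, theta - L.T])
    w = np.full(2 * n + 1, 1.0 / (2.0 * (n + kappa)))
    w[0] = kappa / (n + kappa)

    # Only simulator evaluations are used; no derivatives are computed.
    Y = np.array([simulate(s) for s in sigma])
    y_hat = w @ Y
    dY = Y - y_hat
    dT = sigma - theta

    P_yy = dY.T @ (w[:, None] * dY) + R   # predicted output covariance
    P_ty = dT.T @ (w[:, None] * dY)       # parameter/output cross-covariance
    K = np.linalg.solve(P_yy, P_ty.T).T   # gain K = P_ty @ inv(P_yy)

    theta_new = theta + K @ (y_ref - y_hat)
    P_new = P - K @ P_yy @ K.T + Q        # Q keeps the search from freezing
    return theta_new, 0.5 * (P_new + P_new.T)


def cost(theta):
    """Sum of squared simulated states (deviation from the zero reference)."""
    return float(np.sum(simulate_twin(theta) ** 2))


horizon = 10
theta = np.array([0.0])          # initial policy: no feedback (unstable loop)
P = 0.01 * np.eye(1)
Q = 1e-4 * np.eye(1)
R = 1e-2 * np.eye(horizon)
y_ref = np.zeros(horizon)        # desired trajectory: regulate to the origin

initial_cost = cost(theta)
for _ in range(50):
    theta, P = ukf_policy_step(theta, P, y_ref, simulate_twin, Q, R)
final_cost = cost(theta)

print("initial cost:", initial_cost)
print("final cost:", final_cost, "gain:", theta[0])
```

In this toy setting the update behaves much like a damped Gauss-Newton step built from simulated sigma points. Replacing `simulate_twin` with a call into an actual simulator would leave `ukf_policy_step` unchanged, which is the property the abstract emphasizes.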