Lectures: Joschka Boedecker and Moritz Diehl
Guest Lectures: Sebastien Gros (NTNU Trondheim) and Sergey Levine (UC Berkeley)
Exercises: Katrin Baumgärtner and Jasper Hoffmann
University of Freiburg, July 26 to August 4, 2021
(available online; all times are Central European Summer Time)
This block course of 8 days duration is intended for master's and PhD students from engineering, computer science, mathematics, physics, and other mathematical sciences. The aim is that participants understand the main concepts of model predictive control (MPC) and reinforcement learning (RL) as well as the similarities and differences between the two approaches. In hands-on exercises and project work, they learn to apply the methods to practical optimal control problems from science and engineering.
The course consists of lectures, exercises, and project work. The lectures in the first week will be given by Prof. Dr. Joschka Boedecker and Prof. Dr. Moritz Diehl from the University of Freiburg. In the second week of the course, invited guest lectures will be given by Prof. Dr. Sebastien Gros from NTNU Trondheim (Norway) and Prof. Dr. Sergey Levine from UC Berkeley (California, US).
Topics include
- Optimal Control Problem (OCP) formulations - constrained, infinite horizon, discrete time, stochastic, robust
- Markov Decision Processes (MDP)
- From continuous to discrete: discretization in space and time
- Dynamic Programming (DP) concepts and algorithms - value iteration and policy iteration (see the sketch after this list)
- Linear Quadratic Regulator (LQR) and Riccati equations
- Convexity considerations in DP for constrained linear systems
- Model predictive control (MPC) formulations and stability guarantees
- MPC algorithms - quadratic programming, direct multiple shooting, Gauss-Newton, real-time iterations
- Differential Dynamic Programming (DDP) for the solution of unconstrained MPC problems
- Reinforcement Learning (RL) formulations and approaches
- Model-free RL: Monte Carlo, temporal differences, model learning, direct policy search
- RL with function approximation
- Model-based RL and combinations of model-based and model-free methods
- Similarities and differences between MPC and RL
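As a small taste of the dynamic programming material, here is a minimal value iteration sketch for a tabular MDP. It is purely illustrative: the transition probabilities, rewards, and discount factor are arbitrary placeholders, not taken from the course material.

```python
import numpy as np

# Value iteration on a small random tabular MDP (illustrative placeholder data).
n_states, n_actions = 3, 2
rng = np.random.default_rng(0)
P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))  # P[s, a, s'], row-stochastic
R = rng.standard_normal((n_states, n_actions))                    # R[s, a], expected reward
gamma = 0.9                                                       # discount factor

V = np.zeros(n_states)
for _ in range(1000):
    # Bellman optimality backup: Q(s, a) = R(s, a) + gamma * sum_s' P(s, a, s') V(s')
    Q = R + gamma * P @ V
    V_new = Q.max(axis=1)
    if np.max(np.abs(V_new - V)) < 1e-8:  # converged to the fixed point
        break
    V = V_new

print("optimal values:", V_new)
print("greedy policy: ", Q.argmax(axis=1))
```

Policy iteration, LQR, and the MPC formulations covered in the course replace this exhaustive tabular backup with structure-exploiting computations.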
The course will be conducted as a mixed in-person and virtual event if the COVID-19 situation in July 2021 allows, and otherwise fully virtually. All lecture videos and exercises will be made openly available. Each course day starts at 9:00 and ends at 17:00 (5 p.m.) Freiburg time (CEST). Lectures are typically followed by computer exercises in Python. A mandatory requirement for officially passing the course is successful participation in an online microexam on Friday, July 30, 2021, at 9:00.

In the second week, on August 2-4, 2021, participants will work on application projects in which they apply at least one of the MPC and RL methods to a self-chosen problem from any area of science or engineering. The projects, which can be carried out in teams of one or (preferably) two people, will be presented in a public poster session on the last day of the course and summarized in a short report to be submitted two weeks after the course. The report determines the final grade of the course.
This course has received funding from the European Union's Horizon 2020 research and innovation programme under the Marie Sklodowska-Curie grant agreement No. 953348.
The course can be followed at different levels of participation:
Participation Level | Passing Requirements | Certificate | Max. No. of Participants
---|---|---|---
Level A - Online Listening | - | - | unlimited
Level B - First Week with Exam | exercises and microexam on July 30 | "Certificate B", without grade | 120
Level C - Full Time with Project | Level B requirements plus project presentation on Aug. 4 | "Certificate C", without grade | 90
Level D - Full Time with Report | Level C requirements plus report on Aug. 18 | "Certificate D", with or without grade, 3 ECTS | 60
The maximum number of on-site participants in Freiburg is 24. For Level D participation, priority will be given to students of the University of Freiburg.
Registration for the course is closed!
Any questions on the course can be addressed to Katrin Baumgärtner (katrin.baumgaertner@imtek.uni-freiburg.de).
Certificates:
If you have participated in the course and would like to receive a certificate, please fill out the form by the 20th of August.
Microexam:
Projects: guidelines, teams, slides
Please upload your project reports here by the 18th of August.
Preparation for the exercises:
For the exercises you need:
- An editor that can handle Jupyter Notebooks.
- A way to install Python packages, e.g., pip/pip3. The required packages are listed in requirements.txt.
To achieve this, it is helpful to first familiarize yourself with how to install Python packages. Then download the requirements file linked above and install the packages as described here. This should also install Jupyter Lab, an editor that runs in your browser. To start Jupyter Lab, type jupyter lab in your terminal. Further instructions for Jupyter Lab can be found here.
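For reference, a minimal terminal session (assuming requirements.txt has been downloaded to the current directory) could look like this:

```
pip3 install -r requirements.txt   # installs the required packages, including Jupyter Lab
jupyter lab                        # starts Jupyter Lab in your browser
```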
Optional: There is an additional exercise using the open-source software acados for high-performance embedded NMPC, which is developed in the group of Prof. Diehl. If you would like to try it, please follow the installation instructions here.
Exercises:
Monday: exercise01 (solution), exercise02 (solution)
Tuesday: exercise03 (solution), exercise04 (solution)
Wednesday: exercise05 (solution), exercise06 (solution)
Thursday: exercise07 (solution), exercise08 (solution)
Lecture location:
HS 1015, Kollegiengebäude I, Platz der Universität 3, D-79098 Freiburg, Germany
At the lecture location, there will be basic catering such as coffee, water, and fruit.
All lectures and exercise sessions are broadcast via Zoom:
Join Zoom Meeting
https://uni-freiburg.zoom.us/j/65439158240?pwd=OHJrb2p2U2xDUEpVQTRBWWlvZGo0UT09
Meeting ID: 654 3915 8240
Passcode: 0uu1tq0ek
Get-to-know-each-other session:
There will be a get-to-know-each-other session on the first Monday. Participants taking part via Zoom will meet on wonder.me under the following link:
https://www.wonder.me/r?id=20f5f8da-411c-4c79-8560-05fbe4475ebf
For everyone else, we will just meet in the lecture room.
Schedule
Please note that the lecture recordings might not show video if you open them in your browser. Downloading the recordings and watching them locally should work fine.
Monday:
- Lecture 1 - Introduction - Joschka Boedecker and Moritz Diehl
- Exercise 1 - Dynamic System Simulation - Katrin Baumgärtner and Jasper Hoffmann
- Exercise 2 - Numerical Optimization - Katrin Baumgärtner

Tuesday:
- Exercise 3 - Dynamic Programming and LQR - Katrin Baumgärtner
- Lecture 6 - Monte Carlo RL, Temporal Difference and Q-Learning - Joschka Boedecker
- Exercise 4 - Q-Learning - Jasper Hoffmann

Wednesday:
- Lecture 7 - Numerical Optimal Control - Moritz Diehl
- Exercise 5 - Numerical Optimal Control - Katrin Baumgärtner
- Lecture 8 - MPC Stability Theory - Moritz Diehl (blackboard)
- Lecture 9 - MPC Algorithms - Moritz Diehl (cancelled)
- Exercise 6 - Model Predictive Control - Katrin Baumgärtner

Thursday:
- Lecture 10 - On-policy RL with Function Approximation - Joschka Boedecker
- Lecture 11 - Off-policy RL with Function Approximation - Joschka Boedecker
- Exercise 7 - RL with Function Approximators - Jasper Hoffmann
- Lecture 12 - Policy Gradient Methods - Joschka Boedecker
- Lecture 13 - Advanced Value-based Methods - Joschka Boedecker

Friday:
- Lecture 14 - Recent Algorithms for Nonlinear and Robust MPC - Moritz Diehl. Slides: PartA1, PartA2, PartB
- Lecture 15 - Planning and Learning - Joschka Boedecker
- Lecture 16 - Differences and Similarities of MPC and RL - Joschka Boedecker and Moritz Diehl
- Extra: LP formulation of DP - Moritz Diehl

Monday (second week):
- Guest Lecture - Sebastien Gros: Adaptation of MPC via RL: fundamental principles
- Guest Lecture - Sebastien Gros: RL and MPC: safety, stability, and some more recent results
- Guest Lecture - Sergey Levine: Model-Free and Model-Based Reinforcement Learning from Offline Data