Reinforcement Learning

Price: $2,000.00


Available to Vector Institute sponsors only.

Registration closes on January 31, 2021

Date: March 16 - May 4, 2021
Time: Lectures, Tuesdays 2 - 4 PM. Tutorials, TBA
Location: This course will be delivered online via D2L Brightspace, through live lectures and tutorials
Instructors: Pascal Poupart

Fees: $2,000 (regular price $5,000)

Please read: Terms and Conditions

Reinforcement learning is a powerful paradigm for modeling autonomous, intelligent agents that interact with their environment. It is relevant to a wide range of tasks, including robotics, game playing, consumer modeling and healthcare. This course provides an introduction to reinforcement learning, which focuses on the study and design of agents that interact with a complex, uncertain world to achieve a goal. We will study agents that can make near-optimal decisions in a timely manner with incomplete information and limited computational resources.

The course will cover Markov decision processes, reinforcement learning, planning, and function approximation (online supervised learning). The course will take an information-processing approach to the concept of mind and briefly touch on perspectives from psychology, neuroscience, and philosophy.

List of Topics covered in this course (expected)
With a focus on AI as the design of agents that learn from experience to predict and control their environment, topics will include:

  • Markov decision processes
  • Planning by approximate dynamic programming
  • Monte Carlo and Temporal Difference Learning for prediction
  • Monte Carlo, Sarsa, and Q-learning for control
  • Dyna and planning with a learned model
  • Prediction and control with function approximation
  • Policy gradient methods
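
As a rough illustration of the "Q-learning for control" topic above, the following is a minimal sketch of tabular Q-learning. It is not taken from the course materials: the 5-state chain environment, the `step` and `q_learning` helpers, and all hyperparameter values are assumptions chosen only to keep the example small.

```python
import random

# Hypothetical 5-state chain: states 0..4, actions 0 (left) and 1 (right).
# Entering state 4 yields reward 1 and ends the episode.
N_STATES = 5
ACTIONS = (0, 1)
GOAL = N_STATES - 1

def step(state, action):
    """Deterministic dynamics: return (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else min(GOAL, state + 1)
    done = next_state == GOAL
    return next_state, (1.0 if done else 0.0), done

def q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore with probability epsilon, or when both actions are tied.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q[state][a])
            next_state, reward, done = step(state, action)
            # Off-policy target: bootstrap from the best next action.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q

q_table = q_learning()
# Greedy action in each non-terminal state.
greedy_policy = [max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(GOAL)]
```

In this toy problem the greedy policy read off the learned table moves right in every non-terminal state. Sarsa would differ only in the target, which bootstraps from the action actually taken next rather than from the maximum.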

Who Should Attend

  • Vector sponsor employees only; further details TBA
  • If you wish to discuss requirements with a member of the Vector Institute, please contact:

At the end of this course, you will have gained both the knowledge and the system-building skills to:

  • Define the key features of reinforcement learning that distinguish it from AI and non-interactive machine learning (as assessed by the exam)
  • Given an application problem (e.g. from computer vision, robotics, etc.), decide whether it should be formulated as an RL problem; if so, define it formally (in terms of the state space, action space, dynamics and reward model), state which algorithm (from class) is best suited to addressing it, and justify your answer (as assessed by the project and the exam)
  • Implement common RL algorithms in code (as assessed by the homeworks)
  • Describe (list and define) multiple criteria for analyzing RL algorithms and evaluate algorithms on these metrics, e.g. regret, sample complexity, computational complexity, empirical performance, convergence, etc. (as assessed by homeworks and the exam)
  • Describe the exploration vs exploitation challenge and compare and contrast at least two approaches for addressing this challenge (in terms of performance, scalability, complexity of implementation, and theoretical guarantees) (as assessed by an assignment and the exam)
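
To make the exploration vs exploitation outcome above more concrete, here is a small sketch that compares two common approaches, epsilon-greedy and UCB1, by their regret on a multi-armed bandit. It is an illustration only and not part of the course materials: the arm probabilities, the `run_bandit` helper, the step count and the value of epsilon are all assumptions.

```python
import math
import random

# Hypothetical 3-armed Bernoulli bandit; the success probabilities are made up.
ARM_MEANS = (0.2, 0.5, 0.8)

def run_bandit(choose_arm, steps=5000, seed=0):
    """Play the bandit and return (pull counts per arm, expected regret)."""
    rng = random.Random(seed)
    counts = [0] * len(ARM_MEANS)
    values = [0.0] * len(ARM_MEANS)  # running mean reward per arm
    regret = 0.0
    for t in range(1, steps + 1):
        arm = choose_arm(t, counts, values, rng)
        reward = 1.0 if rng.random() < ARM_MEANS[arm] else 0.0
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]
        # Regret is measured against always pulling the best arm.
        regret += max(ARM_MEANS) - ARM_MEANS[arm]
    return counts, regret

def epsilon_greedy(t, counts, values, rng, epsilon=0.1):
    """Pull a random arm with probability epsilon, else the best estimate."""
    if rng.random() < epsilon:
        return rng.randrange(len(values))
    return max(range(len(values)), key=lambda a: values[a])

def ucb1(t, counts, values, rng):
    """Pull each arm once, then maximise estimate plus confidence bonus."""
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    return max(range(len(values)),
               key=lambda a: values[a] + math.sqrt(2.0 * math.log(t) / counts[a]))

eg_counts, eg_regret = run_bandit(epsilon_greedy)
ucb_counts, ucb_regret = run_bandit(ucb1)
print("epsilon-greedy:", eg_counts, round(eg_regret, 1))
print("UCB1:          ", ucb_counts, round(ucb_regret, 1))
```

With a fixed epsilon, epsilon-greedy keeps exploring at a constant rate, so its regret grows linearly in the number of steps. UCB1's confidence bonus shrinks as an arm is pulled more often, which gives logarithmic regret in this setting at the cost of slightly more bookkeeping.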

Course Load

Participants can expect to spend approximately 10-15 hours per week reading and engaging with the material, attending Tutorials, and completing assignments.

Pascal Poupart
Pascal Poupart is a Professor in the David R. Cheriton School of Computer Science at the University of Waterloo, Waterloo (Canada). He is also a Canada CIFAR AI Chair at the Vector Institute and a member of the Waterloo AI Institute. He served as Research Director and Principal Research Scientist at the Waterloo Borealis AI Research Lab funded by the Royal Bank of Canada (2018-2020). He also served as scientific advisor for ProNavigator (2017-2019), ElementAI (2017-2018) and DialPad (2017-2018). He received the B.Sc. in Mathematics and Computer Science at McGill University, Montreal (Canada) in 1998, the M.Sc. in Computer Science at the University of British Columbia, Vancouver (Canada) in 2000 and the Ph.D. in Computer Science at the University of Toronto, Toronto (Canada) in 2005. His research focuses on the development of algorithms for Machine Learning with application to Natural Language Processing, Health Informatics, Computational Finance, Telecommunication Networks and Sports Analytics. He is best known for his contributions to the development of Reinforcement Learning algorithms. Notable projects that his research team is currently working on include probabilistic deep learning, robust machine learning, data efficient reinforcement learning, conversational agents, automated document editing, adaptive satisfiability, sports analytics and knowledge graphs.
Pascal Poupart received a Canada CIFAR AI Chair (2018-2021), a Cheriton Faculty Fellowship (2015-2018), a best student paper honourable mention (SAT-2017), a silver medal at the SAT-2017 competition, a top reviewer award (ICML-2016), a gold medal at the SAT-2016 competition, a best reviewer award (NIPS-2015), an Early Researcher Award from the Ontario Ministry of Research and Innovation (2008), two Google research awards (2007-2008), a best paper award runner up (UAI-2008) and the IAPR best paper award (ICVS-2007). He serves as a member of the editorial board of the Journal of Machine Learning Research (JMLR) (2009 - present) and as guest editor for the Machine Learning Journal (MLJ) (2012 - present), and served as associate editor of the Journal of Artificial Intelligence Research (JAIR) (2017-2019). He routinely serves as area chair or senior program committee member for NeurIPS, ICML, AISTATS, ICLR, IJCAI, AAAI and UAI. His research collaborators include Microsoft, RBC Borealis AI, Google, Intel, Ford, ProNavigator, SportLogic, Scribendi, Kik Interactive, In the Chat, Slyce, HockeyTech, the Alzheimer Association, the UW-Schlegel Research Institute for Aging, Sunnybrook Health Science Centre and the Toronto Rehabilitation Institute.

All course Units will be released online according to the schedule below, and supplemented by online tutorials. All lectures and tutorials will be recorded, though we encourage you to attend the sessions live where possible.


  • Lectures - Tuesdays 2 - 4 PM
  • Tutorials - TBA 2h/week
  • TA Office Hour - 1h/week

Week 1: Week of Mar 15, 2021

  • Lecture 1
  • Lecture 2
  • Tutorial 1 + TA office hour


Week 2: Week of Mar 22, 2021

  • Lecture 3
  • Lecture 4
  • Tutorial 2 + TA office hour


Week 3: Week of Mar 29, 2021

  • Lecture 5
  • Lecture 6
  • Tutorial 3 + TA office hour


Week 4: Week of Apr 5, 2021

  • Lecture 7
  • Lecture 8
  • Tutorial 4 + TA office hour


Week 5: Week of Apr 12, 2021

  • Lecture 9
  • Lecture 10
  • Tutorial 5 + TA office hour


Week 6: Week of Apr 19, 2021

  • Lecture 11
  • Lecture 12
  • Tutorial 6 + TA office hour


Week 7: Week of Apr 26, 2021

  • Lecture 13
  • Lecture 14
  • Tutorial 7 + TA office hour


Week 8: Week of May 3, 2021

  • Lecture 15
  • Lecture 16
  • Tutorial 8 + TA office hour