This course introduces students to recent developments in learning-based robotics. It begins with an instructor-led overview of background material from the relevant sub-fields: computer vision, machine learning, reinforcement learning, control theory, and robotics. This is followed by a discussion of advanced techniques for obtaining robot policies, such as model learning, model-based RL with learned models, imitation learning, inverse reinforcement learning, self-supervised learning, exploration, and hierarchical reinforcement learning, and of how these concepts apply to robot navigation and manipulation. This second part of the course is covered through student-led discussion of recent research papers that develop and validate these techniques. The course also includes open-ended project work that gives students a flavor of how to conduct research in this emerging area.
After taking this course you will be able to:
This is an advanced graduate course aimed at graduate students conducting research in relevant areas. The course will largely cover papers published within the last few years in computer vision, robotics, and machine learning. Students should be familiar with reading and critiquing research papers, and should have a basic understanding of concepts in artificial intelligence and machine learning. Students must have taken at least one of the following (or equivalent) courses: ECE 448 / CS 440 (Introduction to Artificial Intelligence), ECE 544NA (Pattern Recognition), ECE 549 / CS 543 (Computer Vision). If you are not sure whether you meet the prerequisites, talk to the instructor after the first class or in office hours.
Here is a tentative syllabus for the course. Readings marked TBD will be filled in over time, at least 2 weeks before the corresponding class.
|Date||Topic||Material / Readings|
|Aug 27||Introduction and Course Overview|
|Part 1: Background|
|Aug 29||Computer Vision Review||3D Reconstruction, Recognition, CNNs for Recognition. See also: Szeliski Chapters 4, 7, 14.|
|Sep 03||Robotics Review||Configuration Space, Forward Kinematics, Inverse Kinematics, Motion Planning, Optimal Control. See also: Modern Robotics Chapters 2, 4, 6, 10.|
|Sep 05||MDP Review||Terminology, Policy Evaluation, Policy Improvement, Policy Iteration, Value Iteration. Class slides. See also: David Silver’s slides here and here, and Sutton and Barto Chapters 3 and 4.|
|Sep 10||MDP Review||Model-Free Reinforcement Learning: Monte-Carlo and Temporal Difference Learning and Control, Off-policy learning. Class slides. See also: David Silver’s slides here and here, Sutton and Barto Chapters 5, 6, and 7, and Playing Atari with Deep Reinforcement Learning.|
|Sep 12||MDP Review||Model-Free Reinforcement Learning: Deep Q-learning, Policy gradients. See also: David Silver’s slides here, and Sutton and Barto Chapter 13.|
|Sep 17||Deep RL||Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor; Mastering the game of Go without human knowledge|
|Part 2: Alternatives to Solving Unknown MDPs|
|Sep 19||Model Building||PILCO: A Model-Based and Data-Efficient Approach to Policy Search; Deep Reinforcement Learning in a Handful of Trials using Probabilistic Dynamics Models|
|Sep 24||Model Building||Visual Foresight: Model-Based Deep Reinforcement Learning for Vision-Based Robotic Control; SPNets: Differentiable Fluid Dynamics for Deep Neural Networks|
|Sep 26||Imitation Learning||A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning; End-to-End Training of Deep Visuomotor Policies|
|Oct 01||Inverse Reinforcement Learning||Maximum Entropy Inverse Reinforcement Learning; Apprenticeship Learning via Inverse Reinforcement Learning|
|Oct 03||Self-Supervised and Unsupervised Learning in Computer Vision||Learning Features by Watching Objects Move; Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks|
|Oct 08||Self-Supervision in Robotics||Visual Reinforcement Learning with Imagined Goals; Supersizing Self-supervision: Learning to Grasp from 50K Tries and 700 Robot Hours|
|Oct 10||Exploration||Curiosity-driven Exploration by Self-supervised Prediction; Diversity is All You Need: Learning Skills without a Reward Function|
|Oct 15||Exploration||Intrinsic Motivation and Automatic Curricula via Asymmetric Self-Play; Intrinsic Motivation for Encouraging Synergistic Behavior|
|Oct 17||Hierarchies||Feudal Reinforcement Learning; Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning|
|Oct 22||Social Learning||Time-Contrastive Networks: Self-Supervised Learning from Video; Learning Navigation Subroutines by Watching Videos|
|Part 3: Case Studies|
|Oct 24||Navigation||Cognitive Mapping and Planning for Visual Navigation; Beauty and the Beast: Optimal Methods Meet Learning for Drone Racing|
|Oct 29||Navigation||Semi-parametric Topological Memory for Navigation; Bayesian Relational Memory for Semantic Visual Navigation|
|Oct 31||Manipulation||Dex-Net 2.0: Deep Learning to Plan Robust Grasps with Synthetic Point Clouds and Analytic Grasp Metrics; More Than a Feeling: Learning to Grasp and Regrasp using Vision and Touch|
|Nov 05||Manipulation||Dexterous Manipulation with Deep Reinforcement Learning: Efficient, General, and Low-Cost; Learning Task-Oriented Grasping for Tool Manipulation with Simulated Self-Supervision|
|Nov 07||Hardware and Sensors||A Soft Robot that Navigates its Environment through Growth|
|Nov 12||Multi-task Learning||Task2Vec: Task Embedding for Meta-Learning; Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks; Towards Generalization and Simplicity in Continuous Control|
|Nov 19||Lessons from Cognitive Science and Psychology||The Development of Embodied Cognition: Six Lessons from Babies|
|Nov 21||Big Data vs Clever Algorithms||The Bitter Lesson, Re: A Bitter Lesson; Intelligence without Representation|
|Dec 03||Modern Deep RL vs Classical Control||A Tour of Reinforcement Learning: The View from Continuous Control; Towards Generalization and Simplicity in Continuous Control|
|Dec 05||Project Presentations|
|Dec 10||Project Presentations|
Tentative; certain details may be adjusted based on how the class size evolves.