Full Stack Deep Learning

Research Areas

Professor Pieter Abbeel covers state-of-the-art deep learning methods that are just now becoming usable in production.

Last updated 4 years ago

Summary

Few-Shot Learning

  • Model-Agnostic Meta-Learning (MAML) learns, end to end, a parameter vector that serves as a good initialization for fine-tuning on many tasks.

  • It works well for classification problems, optimization tasks, and generative models.
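
As a rough illustration of the "good initialization" idea, here is a minimal MAML-style sketch on a toy problem. Everything specific in it is an assumption made for the example: the task family (fit y = a * x for a slope a drawn from [1, 3]), the single scalar parameter w, and the step sizes alpha and beta. The model is kept scalar so the second-order term of the meta-gradient can be written out by hand.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_task():
    """Hypothetical task family: y = a * x, with slope a drawn from [1, 3]."""
    return rng.uniform(1.0, 3.0)

def sample_batch(a, n=10):
    """Draw n input/target pairs for the task with slope a."""
    x = rng.uniform(-1.0, 1.0, size=n)
    return x, a * x

def loss(w, x, y):
    """Mean squared error of the scalar model y_hat = w * x."""
    return np.mean((w * x - y) ** 2)

def grad(w, x, y):
    """Derivative of the loss with respect to w."""
    return np.mean(2.0 * (w * x - y) * x)

def hessian(x):
    """Second derivative of the loss with respect to w."""
    return np.mean(2.0 * x ** 2)

def meta_train(steps=2000, tasks_per_step=8, alpha=0.1, beta=0.05):
    w = 0.0  # the shared initialization being meta-learned
    for _ in range(steps):
        meta_grad = 0.0
        for _ in range(tasks_per_step):
            a = sample_task()
            xs, ys = sample_batch(a)  # support set: used for the inner step
            xq, yq = sample_batch(a)  # query set: used for the outer loss
            # Inner loop: one gradient step of task-specific fine-tuning.
            w_adapted = w - alpha * grad(w, xs, ys)
            # Outer gradient via the chain rule:
            # d(w_adapted)/dw = 1 - alpha * hessian(xs).
            meta_grad += grad(w_adapted, xq, yq) * (1.0 - alpha * hessian(xs))
        w -= beta * meta_grad / tasks_per_step
    return w

w_meta = meta_train()

# Fine-tune the learned initialization on a new task with one gradient step.
x_new, y_new = sample_batch(3.0)
w_finetuned = w_meta - 0.1 * grad(w_meta, x_new, y_new)
```

In this toy setting the learned initialization settles near the middle of the slope range, which is the point from which one fine-tuning step does best on average across tasks.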

Reinforcement Learning

  • Compared to supervised learning, reinforcement learning has additional challenges in terms of credit assignment, stability, and exploration.

  • Success stories of reinforcement learning are predominantly in the domains of game-playing and robotics.

  • Meta-Reinforcement Learning helps us develop "fast" reinforcement learning algorithms that can adapt to real-world scenarios.

  • The multi-armed bandit problem is a solid evaluation setting for reinforcement learning methods.

  • Contextual bandits are a simpler version of reinforcement learning in which actions do not affect future states.

  • In the real world, reinforcement learning works well whenever we have a great simulator/demonstration of the environment and an inexpensive data collection process.
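
To make the bandit setting concrete, the following is a small epsilon-greedy sketch for a multi-armed bandit. The arm payout probabilities, the value of epsilon, and the function name are assumptions chosen for illustration. It shows the exploration side of the problem in isolation: there is no state, so the agent only has to balance trying arms against exploiting its current estimates.

```python
import numpy as np

def run_epsilon_greedy(arm_probs, steps=5000, epsilon=0.1, seed=0):
    """Play a Bernoulli multi-armed bandit with an epsilon-greedy agent."""
    rng = np.random.default_rng(seed)
    n_arms = len(arm_probs)
    counts = np.zeros(n_arms)   # number of pulls per arm
    values = np.zeros(n_arms)   # running mean reward per arm
    total_reward = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = int(rng.integers(n_arms))   # explore: pick a random arm
        else:
            arm = int(np.argmax(values))      # exploit: pick the best estimate
        # Bernoulli reward: 1 with the arm's payout probability, else 0.
        reward = float(rng.random() < arm_probs[arm])
        counts[arm] += 1
        # Incremental update of the running mean for the pulled arm.
        values[arm] += (reward - values[arm]) / counts[arm]
        total_reward += reward
    return values, counts, total_reward

# Hypothetical payout probabilities for three arms.
values, counts, total_reward = run_epsilon_greedy([0.2, 0.5, 0.8])
best_arm = int(np.argmax(values))
```

A contextual bandit would extend this by conditioning the value estimates on an observed context, still without the actions changing what the agent sees next.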

Imitation Learning

  • In the imitation learning paradigm, we collect many demonstrations and turn them into a policy that can interact with the environment.

  • One-Shot Imitation Learning only needs a single demonstration of a new task to figure out the next action.

  • In the real world, imitation learning works well whenever we have access to previous data and it's easy to predict what happens next.
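
The simplest way to turn demonstrations into a policy is behavioral cloning, which treats the demonstrations as a supervised learning dataset. The sketch below assumes a hypothetical linear expert (action = K @ state) and fits a linear policy by least squares; the expert gain, the state dimension, and the noise level are all made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical expert controller: action = EXPERT_GAIN @ state.
EXPERT_GAIN = np.array([[-1.0, -0.5]])

def expert_policy(state):
    return EXPERT_GAIN @ state

# 1. Collect demonstrations as (state, action) pairs.
states = rng.normal(size=(200, 2))
actions = np.array([expert_policy(s) for s in states])   # shape (200, 1)
actions += 0.01 * rng.normal(size=actions.shape)         # small demonstration noise

# 2. Fit the policy with supervised learning (ordinary least squares).
weights, *_ = np.linalg.lstsq(states, actions, rcond=None)  # shape (2, 1)

def cloned_policy(state):
    """Policy recovered from the demonstrations."""
    return weights.T @ state

test_state = np.array([1.0, 2.0])
cloned_action = cloned_policy(test_state)
expert_action = expert_policy(test_state)
```

Real systems replace the linear fit with a neural network, but the data flow (demonstrations in, policy out) stays the same.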

Domain Randomization

  • The motivating question: how can we learn useful real-world skills in a simulator?

  • Domain randomization operates under the assumption that if the model sees enough simulated variation, the real world may look like just another variation of the simulator.

  • A well-known example is OpenAI's robot hand that solves a Rubik's Cube.
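
In code, domain randomization mostly amounts to resampling the simulator's settings before every training episode. The sketch below shows only that sampling step. The parameter names and ranges are hypothetical placeholders, not the values used in any published system, and a real setup would pass each sampled configuration to its simulator before rolling out the policy.

```python
import random

# Hypothetical simulator parameters and the ranges to randomize over.
PARAM_RANGES = {
    "friction": (0.5, 1.5),
    "object_mass_kg": (0.05, 0.5),
    "light_intensity": (0.2, 1.0),
    "camera_offset_cm": (-2.0, 2.0),
}

def sample_sim_params(rng):
    """Draw one simulator configuration uniformly from the ranges above."""
    return {name: rng.uniform(lo, hi) for name, (lo, hi) in PARAM_RANGES.items()}

def randomized_episodes(n_episodes, seed=0):
    """Yield a freshly randomized configuration for every training episode."""
    rng = random.Random(seed)
    for _ in range(n_episodes):
        yield sample_sim_params(rng)

episodes = list(randomized_episodes(1000))
```

The wider the ranges, the more likely the real world falls inside the training distribution, at the cost of a harder learning problem.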

Architecture Search

  • The idea behind architecture search is to use search algorithms to determine the optimal architecture for our neural networks.

  • We can use reinforcement learning to perform this search process.

  • Small architectures sometimes match the performance of the bigger architectures.

  • Furthermore, we can use reinforcement learning to design the right data augmentation scheme to maximize performance.
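
A toy version of reinforcement-learning-based search can be sketched with a REINFORCE controller that samples an architecture, receives a reward, and shifts probability toward choices that scored well. The search space (layer counts and widths) and the reward table below are invented for the example. In particular, `toy_reward` is a stand-in for the expensive step of training the sampled network and measuring its validation accuracy.

```python
import numpy as np

LAYER_CHOICES = [1, 2, 3]
WIDTH_CHOICES = [16, 32, 64]

# Hypothetical scores standing in for measured validation accuracy.
LAYER_SCORE = {1: 0.2, 2: 0.5, 3: 0.3}
WIDTH_SCORE = {16: 0.1, 32: 0.4, 64: 0.2}

def toy_reward(layers, width):
    """Placeholder for 'train the child network and evaluate it'."""
    return LAYER_SCORE[layers] + WIDTH_SCORE[width]

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def search(steps=3000, lr=0.2, seed=0):
    rng = np.random.default_rng(seed)
    # One vector of logits per architectural decision.
    logits = [np.zeros(len(LAYER_CHOICES)), np.zeros(len(WIDTH_CHOICES))]
    baseline = None  # running mean reward, used to reduce gradient variance
    for _ in range(steps):
        probs = [softmax(z) for z in logits]
        picks = [int(rng.choice(len(p), p=p)) for p in probs]
        reward = toy_reward(LAYER_CHOICES[picks[0]], WIDTH_CHOICES[picks[1]])
        if baseline is None:
            baseline = reward
        advantage = reward - baseline
        baseline += 0.05 * (reward - baseline)
        for z, p, k in zip(logits, probs, picks):
            # Gradient of log softmax probability of the sampled choice k.
            grad_log_prob = -p
            grad_log_prob[k] += 1.0
            z += lr * advantage * grad_log_prob  # REINFORCE update (in place)
    best = (
        LAYER_CHOICES[int(np.argmax(logits[0]))],
        WIDTH_CHOICES[int(np.argmax(logits[1]))],
    )
    return best, logits

best_arch, final_logits = search()
```

Published systems use a recurrent controller and far larger search spaces; the policy-gradient update is the part this sketch is meant to show.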

Unsupervised Learning

  • Unsupervised learning deals with unlabeled data. We can learn the network that embeds the data or learn the weights of the network architecture.

  • The main families of models include Variational Autoencoders, Generative Adversarial Networks, Exact Likelihood Models, and "puzzle"-based models.

  • Contrastive Predictive Coding is an unsupervised learning scheme that breaks up the input into pieces, removes specific pieces, and asks the network to fill those pieces back in.

  • A well-known example is OpenAI's GPT-2 system that can generate text.
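
The "remove pieces and fill them back in" setup described above needs no labels, because the training targets come from the data itself. The sketch below only builds such an input/target pair from a 1-D sequence. The function name, piece length, and mask value are assumptions for the example, and it does not implement the contrastive loss or the network that would do the filling in.

```python
import numpy as np

def make_fill_in_example(sequence, piece_len, hidden_pieces, mask_value=0.0):
    """Split a sequence into pieces, hide some, and return (input, targets).

    The sequence length must be divisible by piece_len.
    """
    seq = np.asarray(sequence, dtype=float)
    pieces = seq.reshape(-1, piece_len)      # one row per piece
    masked = pieces.copy()
    masked[hidden_pieces] = mask_value       # remove the chosen pieces
    targets = pieces[hidden_pieces]          # what the network must reconstruct
    return masked.reshape(-1), targets

x = np.arange(1, 13)  # toy "unlabeled" sequence: 1, 2, ..., 12
masked_input, targets = make_fill_in_example(x, piece_len=3, hidden_pieces=[1, 3])
```

A model trained on many such pairs has to learn the structure of the data in order to predict the hidden pieces, which is what makes the resulting representation useful downstream.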

Overall Research Theme

  • Researchers use more computing power to get better results.

  • Focus on problem areas where progress comes from large amounts of data and compute rather than from human ingenuity.

Bridge The Research and Real-World Gap

  • Research is about going from 0 to 1.

  • In real-world applications, often 90% performance is not enough.

How To Keep Up

  • Learn to read academic papers.

  • Helpful sources for finding papers include the Import AI newsletter, Arxiv Sanity, Twitter, Facebook groups, and the Machine Learning subreddit.

  • Form a reading group.