Full Stack Deep Learning
Web Deployment
How to deploy your models to the web?
Summary
For web deployment, you need to be familiar with the concept of a REST API.
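As a minimal sketch of what such an API looks like, here is a prediction endpoint built with only Python's standard library (a real service would more likely use Flask or FastAPI). The `/predict` route and the stand-in model are illustrative assumptions, not part of the lecture.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def predict(features):
    # Stand-in for a real model: "score" is just the sum of the inputs.
    return {"score": sum(features)}


class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/predict":
            self.send_error(404)
            return
        # Read the JSON request body, run the model, return JSON back.
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length))
        body = json.dumps(predict(payload["features"])).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # silence per-request logging


def run(port=8000):
    # Blocking call; in production you would run behind a real WSGI/ASGI server.
    HTTPServer(("", port), PredictHandler).serve_forever()
```

A client would then `POST {"features": [1, 2, 3]}` to `/predict` and get back `{"score": 6}`.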
- You can deploy the code to virtual machines, and then scale by adding instances.
- You can deploy the code as containers, and then scale via orchestration.
- You can deploy the code as a "serverless" function.
- You can deploy the code via a model serving solution.
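The "serverless function" option can be sketched as an AWS Lambda-style handler. The event shape and the `load_model` helper below are illustrative assumptions; the key idea is that the model is loaded once per container at cold start and reused across invocations.

```python
import json


def load_model():
    # Stand-in for loading real weights (e.g. from S3 at cold start).
    return lambda features: sum(features)


# Module-level: runs once per container, not once per request.
model = load_model()


def handler(event, context):
    # Lambda-style entry point: `event["body"]` carries the HTTP request body.
    features = json.loads(event["body"])["features"]
    return {
        "statusCode": 200,
        "body": json.dumps({"score": model(features)}),
    }
```

The upside is that scaling and server management are handled by the platform; the downsides (noted in the lecture's framing of the trade-off) are cold starts and CPU-only execution.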
If you are doing CPU inference, you can get away with scaling by launching more servers (Docker), or going serverless (AWS Lambda). If you are doing GPU inference, tools like TF Serving and Ray Serve become useful, with features such as adaptive batching.
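To show what adaptive batching means, here is a toy sketch (not TF Serving's or Ray Serve's actual implementation): incoming requests are buffered, and the model runs on a batch once it fills up or a short timeout expires, which amortizes the fixed cost of a GPU forward pass across many requests.

```python
import queue
import threading
import time


class AdaptiveBatcher:
    """Toy adaptive batcher: collect requests until max_batch_size is
    reached or max_wait_s elapses, then run the model on the whole batch."""

    def __init__(self, model_fn, max_batch_size=8, max_wait_s=0.01):
        self.model_fn = model_fn          # takes a list of inputs, returns a list of outputs
        self.max_batch_size = max_batch_size
        self.max_wait_s = max_wait_s
        self.requests = queue.Queue()
        threading.Thread(target=self._loop, daemon=True).start()

    def predict(self, x):
        # Called by request handlers; blocks until the batch containing x runs.
        slot = {"input": x, "event": threading.Event()}
        self.requests.put(slot)
        slot["event"].wait()
        return slot["output"]

    def _loop(self):
        while True:
            batch = [self.requests.get()]  # block for the first request
            deadline = time.monotonic() + self.max_wait_s
            # Keep collecting until the batch is full or the deadline passes.
            while len(batch) < self.max_batch_size:
                remaining = deadline - time.monotonic()
                if remaining <= 0:
                    break
                try:
                    batch.append(self.requests.get(timeout=remaining))
                except queue.Empty:
                    break
            outputs = self.model_fn([s["input"] for s in batch])
            for slot, out in zip(batch, outputs):
                slot["output"] = out
                slot["event"].set()
```

Under light load a batch contains a single request (the timeout fires first); under heavy load batches fill up, so GPU utilization rises exactly when it matters.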