How to monitor your machine learning system?


  • It is crucial to monitor serving systems, training pipelines, and input data. A typical monitoring system raises alarms when something goes wrong and keeps the records needed to diagnose problems and tune the system.

  • Cloud providers have decent monitoring solutions.

  • Anything that can be logged can be monitored: dependency changes, distribution shift in data, model instabilities, etc.

  • Data distribution monitoring is an underserved need!

  • It is important to monitor how the business actually uses the model, not just the model's own statistics. It is equally important to be able to feed failure cases back into the dataset.
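
As one illustration of the data distribution monitoring mentioned above, here is a minimal sketch that compares a live sample of a single numeric feature against a reference (training-time) sample using the Population Stability Index (PSI). PSI is only one of several possible drift scores and is a choice made for this example, not something the notes prescribe. The function names, the equal-width binning, and the 0.2 alert threshold are all assumptions for illustration; the threshold is a rule of thumb that should be tuned per feature.

```python
import math
import random


def population_stability_index(reference, live, n_bins=10, eps=1e-6):
    """PSI between a reference sample and a live sample of one numeric feature."""
    lo, hi = min(reference), max(reference)
    # Equal-width bins over the reference range; fall back to 1.0 if the range is zero.
    width = (hi - lo) / n_bins or 1.0

    def bin_fractions(values):
        counts = [0] * n_bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), n_bins - 1)  # clamp out-of-range values into the edge bins
            counts[idx] += 1
        # Floor each fraction at eps so the logarithm below is always defined.
        return [max(c / len(values), eps) for c in counts]

    ref_frac = bin_fractions(reference)
    live_frac = bin_fractions(live)
    return sum((l - r) * math.log(l / r) for r, l in zip(ref_frac, live_frac))


def check_drift(reference, live, threshold=0.2):
    """Return the PSI and whether it exceeds the (assumed) alert threshold."""
    psi = population_stability_index(reference, live)
    return {"psi": psi, "alert": psi > threshold}


# Synthetic data for illustration only: a reference sample, a live sample from
# the same distribution, and a live sample whose mean has shifted.
rng = random.Random(0)
reference_sample = [rng.gauss(0.0, 1.0) for _ in range(5000)]
stable_sample = [rng.gauss(0.0, 1.0) for _ in range(5000)]
shifted_sample = [rng.gauss(1.5, 1.0) for _ in range(5000)]

stable_report = check_drift(reference_sample, stable_sample)
shifted_report = check_drift(reference_sample, shifted_sample)
```

In practice the resulting score would be logged per feature on a schedule, with the alert flag wired into whatever alarm mechanism the monitoring system already provides.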
