# Overview

{% embed url="<https://www.youtube.com/watch?v=xz-Uzcpc4AE>" %}
Overview - Data Management
{% endembed %}

## Summary

* Data science has always been less about machine learning than about cleaning, shaping, and moving data from place to place.
* Here are the important concepts in data management:
  * **Sources -** how to get training data
  * **Labeling -** how to label proprietary data at scale
  * **Storage -** how to store data and metadata appropriately
  * **Versioning -** how to track changes to data as it is updated through user activity or additional labeling
  * **Processing -** how to aggregate and convert raw data and metadata
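
To make the five concepts concrete, here is a minimal toy sketch in Python that walks a tiny dataset through each stage. Everything in it is an illustrative assumption rather than a prescribed design: the function names (`load_source`, `label`, `store`, `dataset_version`, `process`), the in-memory dictionary standing in for real storage, and the content-hash versioning scheme are all hypothetical.

```python
import hashlib
import json


def load_source():
    # Sources: stand-in for fetching raw, unlabeled examples (hypothetical data).
    return [
        {"id": 1, "text": "great product"},
        {"id": 2, "text": "arrived broken"},
    ]


def label(records, labels):
    # Labeling: attach a label to each record, looked up by record id.
    return [{**r, "label": labels[r["id"]]} for r in records]


def dataset_version(records):
    # Versioning: derive a short identifier from the dataset contents,
    # so any change to the data or labels yields a different version.
    payload = json.dumps(sorted(records, key=lambda r: r["id"]), sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:12]


def store(records, storage):
    # Storage: keep each dataset snapshot under its version identifier.
    # A plain dict stands in for a database or object store here.
    version = dataset_version(records)
    storage[version] = records
    return version


def process(records):
    # Processing: aggregate the labeled records into per-label counts.
    counts = {}
    for r in records:
        counts[r["label"]] = counts.get(r["label"], 0) + 1
    return counts


storage = {}

# First labeling pass.
records = label(load_source(), {1: "positive", 2: "negative"})
v1 = store(records, storage)

# Additional labeling changes one label, which produces a new version.
relabeled = label(load_source(), {1: "positive", 2: "positive"})
v2 = store(relabeled, storage)

print(v1, process(storage[v1]))
print(v2, process(storage[v2]))
```

Because the version is computed from the contents, relabeling a single example yields a new snapshot alongside the old one instead of overwriting it, which is one simple way to keep earlier training sets reproducible.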
