# Overview

{% embed url="<https://www.youtube.com/watch?v=xz-Uzcpc4AE>" %}
Overview - Data Management
{% endembed %}

## Summary

* Data science has always been less about machine learning than about cleaning, shaping, and moving data from place to place.
* Here are the important concepts in data management:
  * **Sources -** how to get training data
  * **Labeling -** how to label proprietary data at scale
  * **Storage -** how to store data and metadata appropriately
  * **Versioning -** how to track data as it is updated through user activity or additional labeling
  * **Processing -** how to aggregate and convert raw data and metadata
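To make the versioning idea concrete, here is a minimal sketch that fingerprints a dataset by hashing its contents, so any added or changed label yields a new version identifier. The function name `dataset_version`, the record fields `path` and `label`, and the file paths are hypothetical and chosen only for illustration; real projects usually rely on a dedicated data versioning tool rather than hand-rolled hashing.

```python
import hashlib
import json


def dataset_version(records):
    """Return a short, deterministic fingerprint of JSON-serializable records."""
    # Serialize each record with sorted keys, then sort the serialized records,
    # so neither key order nor record order affects the fingerprint.
    canonical = sorted(json.dumps(record, sort_keys=True) for record in records)
    digest = hashlib.sha256("\n".join(canonical).encode("utf-8")).hexdigest()
    return digest[:12]


# Hypothetical labeled examples: each record stores a raw-data path plus label metadata.
v1_records = [
    {"path": "images/0001.png", "label": "cat"},
    {"path": "images/0002.png", "label": "dog"},
]

# Additional labeling adds a record, which should produce a new dataset version.
v2_records = v1_records + [{"path": "images/0003.png", "label": "cat"}]

v1 = dataset_version(v1_records)
v2 = dataset_version(v2_records)
print("v1:", v1)
print("v2:", v2)
```

Storing the fingerprint alongside a trained model is one simple way to record which snapshot of the data the model saw.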