Why is data management important?
Overview - Data Management
- Data science has always been less about machine learning than about cleaning, shaping, and moving data from place to place.
- Here are the important concepts in data management:
- Sources - how to get training data
- Labeling - how to label proprietary data at scale
- Storage - how to store data and metadata appropriately
- Versioning - how to version data as it is updated through user activity or additional labeling
- Processing - how to aggregate and convert raw data and metadata