Team Structure

How should a Machine Learning team be structured inside an organization?

Summary

  • The Nascent and Ad-Hoc Machine Learning organization:

    • No one is doing Machine Learning, or Machine Learning is done on an ad-hoc basis.

    • There is often low-hanging fruit for Machine Learning.

    • But there is little support for Machine Learning projects and it’s very difficult to hire and retain good talent.

  • The Research and Development Machine Learning organization:

    • Machine Learning efforts are centered in the R&D arm of the organization. These organizations often hire Machine Learning researchers and doctoral students with experience publishing papers.

    • They can hire experienced researchers and work on long-term business priorities to get big wins.

    • However, it is very difficult to get quality data. This type of research work rarely translates into actual business value, so the amount of investment usually remains small.

  • The Business and Product Embedded Machine Learning organization:

    • Certain product teams or business units have Machine Learning expertise alongside their software or analytics talent. These Machine Learning individuals report up to the team’s engineering/tech lead.

    • Machine Learning improvements are likely to lead to business value. Furthermore, there is a tight feedback cycle between idea iteration and product improvement.

    • Unfortunately, it is still very hard to hire and develop top talent, and access to data & compute resources can lag. There are also potential conflicts between Machine Learning project cycles and engineering management, so long-term Machine Learning projects can be hard to justify.

  • The Independent Machine Learning organization:

    • The Machine Learning division reports directly to senior leadership. Machine Learning Product Managers work with Researchers and Engineers to build Machine Learning into client-facing products. They can sometimes publish long-term research.

    • Talent density allows them to hire and train top practitioners. Senior leaders can marshal data and compute resources. This gives the organization the opportunity to invest in tooling, practices, and culture around Machine Learning development.

    • A disadvantage is that model handoffs to different business lines can be challenging, since users need to buy into the benefits of Machine Learning and be educated on how to use the models. Also, feedback cycles can be slow.

  • The Machine Learning First organization:

    • The CEO invests in Machine Learning, and there are experts across the business focusing on quick wins. The Machine Learning division works on challenging, long-term projects.

    • They have the best data access (data thinking permeates the organization), the most attractive recruiting funnel (challenging Machine Learning problems tend to attract top talent), and the easiest deployment procedure (product teams understand Machine Learning well enough).

    • This organizational archetype is hard to implement in practice, since it is culturally difficult to embed Machine Learning thinking everywhere.

  • Organizational design choices fall into 3 broad categories:

    • Software Engineering vs Research: To what extent is the Machine Learning team responsible for building or integrating with software? How important are Software Engineering skills on the team?

    • Data Ownership: How much control does the Machine Learning team have over data collection, warehousing, labeling, and pipelining?

    • Model Ownership: Is the Machine Learning team responsible for deploying models into production? Who maintains the deployed models?
