Netflix open-sources Maestro, a large-scale workflow management system



Netflix has open-sourced Maestro, its in-house workflow orchestrator. Maestro is in active use at Netflix, where it manages large-scale workflows such as data pipelines and machine learning pipelines.

Maestro: Netflix's Workflow Orchestrator | by Netflix Technology Blog | Jul, 2024 | Netflix TechBlog
https://netflixtechblog.com/maestro-netflixs-workflow-orchestrator-ee13a06f9c78



Netflix uses machine learning to predict what users will watch next. At the time of writing, the company launches thousands of machine learning workflow instances, running an average of 500,000 jobs per day and about 2 million jobs on busy days. Because the workflows running at Netflix are so large, the approach of 'dividing workflows into small groups and managing them on separate clusters' causes problems such as 'increased complexity,' 'the need for additional mechanisms to coordinate fragmented workflows,' and 'a degraded user experience.' For this reason, Netflix manages all of its workflows with a single workflow orchestrator.

Netflix had been using its own orchestrator, 'Meson,' to manage machine learning workflows since around 2016. As the number of running machine learning workflow instances continued to grow, a limitation became apparent: Meson could only be scaled vertically, and Netflix was approaching the upper limit of AWS instance sizes. For this reason, Netflix developed 'Maestro,' a new orchestrator that can scale horizontally, and began operating it in 2020.

On July 23, 2024, Netflix announced that it had open-sourced Maestro under the Apache-2.0 license. The Maestro source code is publicly available in the following GitHub repository:

GitHub - Netflix/maestro: Maestro: Netflix's Workflow Orchestrator
https://github.com/Netflix/maestro



Maestro is a horizontally scalable workflow orchestrator designed to manage large-scale workflows such as data pipelines and machine learning model training pipelines. Users can package their business logic in formats such as 'Docker images,' 'notebooks,' 'bash scripts,' 'SQL,' and 'Python,' and write workflow definitions in JSON. An example workflow definition is available at the following link.

Workflow definition example · Netflix/maestro Wiki · GitHub
https://github.com/Netflix/maestro/wiki/Workflow-definition-example
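To give a rough feel for what a JSON workflow definition involves, here is a minimal Python sketch. It is not Maestro's actual schema: the field names ("id," "steps," "type," "depends_on") and the step names are hypothetical, chosen only for illustration, and the idea that steps declare dependencies on one another is an assumption about how an orchestrator of this kind typically works. The sketch serializes a small definition to JSON, reads it back, and computes an order in which the steps could run.

```python
import json
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical, simplified workflow definition. The field names below
# are illustrative assumptions, NOT Maestro's real schema.
workflow = {
    "id": "demo-training-pipeline",
    "steps": [
        {"id": "extract", "type": "sql", "depends_on": []},
        {"id": "train", "type": "notebook", "depends_on": ["extract"]},
        {"id": "publish", "type": "bash", "depends_on": ["train"]},
    ],
}


def execution_order(definition: dict) -> list[str]:
    """Return step ids in an order that respects the declared dependencies."""
    # Map each step id to the set of steps that must finish before it.
    graph = {step["id"]: set(step["depends_on"]) for step in definition["steps"]}
    return list(TopologicalSorter(graph).static_order())


# Round-trip through JSON, as a definition stored in a file would be.
definition_json = json.dumps(workflow, indent=2)
order = execution_order(json.loads(definition_json))
print(order)
```

For the real field names and supported step types, refer to the workflow definition example on the Maestro wiki linked above.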



in Software, Posted by log1o_hf