Orchestrating Workflows Is a 'cron-ic' Systems Problem. Airflow Is the Modern Solution.

Short Talk at 12:27 pm

As a developer, DevOps specialist, or SRE, you almost certainly have recurring computational jobs running on your systems. cron is the simple, time-tested sysadmin tool for making a Unix host run a task on a regular schedule.

However, with the ongoing migration to cloud-based microservices and APIs, many computational tasks have a large, complex, and widely distributed graph of upstream dependencies. These dependencies come in many different forms: for example, a file or other resource arrives; a service or API becomes available; a database finishes a maintenance task; the clock strikes midnight.
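To make the idea of a dependency graph concrete, here is a minimal sketch using only Python's standard library (graphlib, Python 3.9+). It is not Airflow code, and the task names are hypothetical, chosen to mirror the kinds of dependencies listed above. It shows only how a run order can be derived from declared dependencies rather than from hand-tuned start times.

```python
from graphlib import TopologicalSorter

# Hypothetical task names for illustration only.
# Each key maps to the set of upstream tasks it must wait for.
dependencies = {
    "file_arrived": {"midnight"},
    "load_report": {"file_arrived", "api_available"},
    "publish_dashboard": {"load_report", "db_maintenance_done"},
}


def execution_order(graph):
    """Return one valid order in which every task runs after its upstreams."""
    return list(TopologicalSorter(graph).static_order())


order = execution_order(dependencies)
print(order)
```

Several orders can be valid for the same graph; the only guarantee is that each task appears after everything it depends on. A workflow orchestrator adds scheduling, retries, and monitoring on top of this basic ordering idea.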

Teams that try to manage such complex dependencies with cron tend to end up writing brittle custom code and scripts to ensure that their jobs execute in the correct order.

This raises the question: how can a team more effectively define, manage, visualize, and monitor such complex workflows? An increasingly popular answer is Apache Airflow, an open-source system for workflow orchestration.

In this talk, you will learn about the use cases for Airflow, walk through some introductory examples of the Python code that defines workflows, and watch these workflows operating in real time in the web UI.

Presented by

Jack Bennett