We all have workflows, from simple ones in our personal lives to terribly complex ones at work, and we could all benefit from automating them. Airflow gives us the mechanism to do exactly that. In this talk, we will explore what Airflow is and how we can leverage it to automate some of the tedium out of our daily lives.
Apache Airflow is an open-source Python project that, in the words of Apache, "is a platform to programmatically author, schedule and monitor workflows." In this talk, I will focus on the basics of Airflow: what is it, and what can it do for me? In a nutshell, Airflow is a library for workflow management. The most common uses of Airflow revolve around data pipelines and processing, e.g. ETL pipelines, but in reality it can be used for pretty much any workflow made up of discrete steps that can be performed independently. In addition, Airflow's scheduling capabilities provide a high-powered replacement for cron tables. I will discuss how to set up workflows, connect to other systems, and leverage the automation power of Airflow. Concepts covered will include DAGs, Operators, and Hooks. Finally, I will share some gotchas that I have come across while setting up my own workflows.
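To make the "discrete steps performed independently" idea concrete, here is a minimal sketch of the DAG-of-tasks model in plain standard-library Python — no Airflow install required. The names `tasks`, `deps`, and `run_dag` are illustrative, not Airflow API; in real Airflow, a `DAG` object, Operators, and the scheduler play these roles (plus retries, scheduling, and parallelism that this sketch omits).

```python
from graphlib import TopologicalSorter

# Each "task" is a discrete, independently runnable step; the deps
# mapping says which upstream steps must finish first. This is the
# same shape Airflow captures in a DAG of Operators.
log = []

tasks = {
    "extract":   lambda: log.append("extract"),
    "transform": lambda: log.append("transform"),
    "load":      lambda: log.append("load"),
}

# transform depends on extract; load depends on transform (a tiny ETL).
deps = {"transform": {"extract"}, "load": {"transform"}}

def run_dag(tasks, deps):
    """Run tasks in a dependency-respecting order -- a bare-bones stand-in
    for what Airflow's scheduler does."""
    for name in TopologicalSorter(deps).static_order():
        tasks[name]()

run_dag(tasks, deps)
print(log)  # tasks ran in dependency order: extract, transform, load
```

In Airflow proper, the equivalent wiring is written declaratively (e.g. `extract >> transform >> load`), and the scheduler decides when and where each task actually runs.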
Leo has recently been sucked into the realm of Python, finally realizing what many others already have: Python will fix everything wrong in this world. Hoping to speed up this process, he has started focusing on how to evangelize the rest of the world to this fact. Prior to this revelation, he spent several years focused on full-stack development within the JVM and Angular/React ecosystems. Leo has recently joined the ranks of Fuse by Cardinal Health, having had previous stops at Fusion Alliance and OCLC. Leo holds a Bachelor of Science in Computer Science from Ohio University and a Master of Software Engineering from Regis University.