gusty allows you to control your Airflow DAGs, Task Groups, and Tasks with greater ease. gusty manages collections of tasks, represented as any number of YAML, Python, SQL, Jupyter Notebook, or R Markdown files. A directory of task files is instantly rendered into a DAG by passing a file path to gusty's `create_dag` function.
gusty also manages dependencies (within one DAG) and external dependencies (dependencies on tasks in other DAGs) for each task file you define. All you have to do is provide a list of `dependencies` or `external_dependencies` inside of a task file, and gusty will automatically set each task's dependencies and create external task sensors for any external dependencies listed.
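To make the dependency wiring concrete, here is a minimal sketch in plain Python of how per-task `dependencies` lists imply an execution order. It uses only the standard library and is not gusty's implementation. The task names (`extract`, `transform`, `load`) and the `run_order` helper are hypothetical, and external dependencies are not modeled.

```python
from graphlib import TopologicalSorter

# Hypothetical task specs, as they might be read from task files.
# Each task lists the ids of the tasks (in the same DAG) it depends on.
task_specs = {
    "extract": {"dependencies": []},
    "transform": {"dependencies": ["extract"]},
    "load": {"dependencies": ["transform"]},
}


def run_order(specs):
    """Return task ids in an order that respects every dependency."""
    # TopologicalSorter expects a mapping of node -> set of predecessors.
    graph = {
        task_id: set(spec.get("dependencies", []))
        for task_id, spec in specs.items()
    }
    return list(TopologicalSorter(graph).static_order())


order = run_order(task_specs)
print(order)
```

In a real DAG, Airflow's scheduler handles the ordering once the dependencies are set; the sketch only shows that a list of upstream task ids per file is enough information to recover it.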
gusty works with all major Airflow versions and has even more features, all of which aim to make the creation, management, and iteration of DAGs more fluid, so that you can intuitively design your DAG and build your tasks.
The official documentation for gusty is hosted here: https://pipeline-tools.github.io/gusty-docs/
gusty will turn every file in a DAG directory into a task. By default gusty supports five different file types, which offer convenient ways to specify an operator and operator parameters for task creation.

| File Type | How It Works |
| --- | --- |
| `.yml` | Declare an `operator` and pass any operator parameters using YAML. |
| `.py` | Simply write Python code and by default gusty will execute your file using a `PythonOperator`. Other options are available. |
| `.sql` | Declare an `operator` in a YAML header, then write SQL in the main `.sql` file. The SQL automatically gets sent to the operator. |
| `.ipynb` | Put a YAML block at the top of your notebook and specify an `operator` that renders your Jupyter Notebook. |
| `.Rmd` | Use the YAML block at the top of your notebook and specify an `operator` that renders your R Markdown Document. |

Here is a quick example of a YAML task file, which might be called something like `hello_world.yml`:
operator: airflow.providers.standard.operators.bash.BashOperator
bash_command: echo hello world
The resulting task would be a `BashOperator` with the task id `hello_world`.
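As the example suggests, the task id comes from the file name. The following is a rough, standard-library-only sketch of that idea: scan a directory, keep files with a supported extension, and use each file's stem as its task id. This is an illustration rather than gusty's own code. The `discover_tasks` helper, the file names other than `hello_world.yml`, and the exact suffix set (inferred from the file types named above) are assumptions.

```python
import tempfile
from pathlib import Path

# Assumed extensions for the five file types: YAML, Python, SQL,
# Jupyter Notebook, and R Markdown.
SUPPORTED_SUFFIXES = {".yml", ".py", ".sql", ".ipynb", ".Rmd"}


def discover_tasks(dag_dir):
    """Map each task id (file name without extension) to its task file."""
    tasks = {}
    for path in sorted(Path(dag_dir).iterdir()):
        if path.is_file() and path.suffix in SUPPORTED_SUFFIXES:
            tasks[path.stem] = path.name
    return tasks


# Build a throwaway DAG directory containing hypothetical task files.
with tempfile.TemporaryDirectory() as dag_dir:
    for name in ("hello_world.yml", "clean_data.py", "report.sql", "notes.txt"):
        (Path(dag_dir) / name).write_text("")
    tasks = discover_tasks(dag_dir)

print(tasks)
```

Here `notes.txt` is skipped because its extension is not in the supported set, while `hello_world.yml` yields the task id `hello_world`.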
Here is the same approach using a Python file instead, named `hello_world.py`, which gusty will automatically turn into a `PythonOperator` by default:
phrase = "hello world"
print(phrase)
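A `PythonOperator` needs a callable (its `python_callable` argument), while the file above is a plain script. One way such a script could be wrapped into a callable is sketched below using the standard library's `runpy`. gusty's actual mechanism may differ; the `make_callable` helper is hypothetical and exists only for illustration.

```python
import contextlib
import io
import runpy
import tempfile
from pathlib import Path


def make_callable(script_path):
    """Wrap a plain Python script in a zero-argument function that runs it."""

    def run_script():
        # run_path executes the file top to bottom, like `python script.py`.
        runpy.run_path(str(script_path), run_name="__main__")

    return run_script


with tempfile.TemporaryDirectory() as dag_dir:
    # Recreate the hello_world.py task file from the example above.
    script = Path(dag_dir) / "hello_world.py"
    script.write_text('phrase = "hello world"\nprint(phrase)\n')
    task_callable = make_callable(script)

    # Capture stdout so the script's output can be inspected.
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        task_callable()
    output = buffer.getvalue()

print(output, end="")
```

The point of the sketch is only that a task file needs no function definitions or Airflow imports of its own; the wrapping step supplies the callable.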
Lastly, here's a slightly different `.sql` example: