Applicable Versions
All
Problem
Worker pod hits a DagBag import timeout error while trying to parse a DAG that uses dbt Cosmos. The error logs will show:

```
[2023-12-01 20:15:16,455: WARNING/ForkPoolWorker-4] [2023-12-01T20:15:16.452+0000] {dagbag.py:346} ERROR - (astronomer-cosmos) - Failed to import: /usr/local/airflow/dags/dag_app_fraudsight_v3.py
...
airflow.exceptions.AirflowTaskTimeout: DagBag import timeout for /usr/local/airflow/dags/dag_app_fraudsight_v3.py after 30.0s.
```
Explanation
By default, the Cosmos `DbtDag` uses the "automatic" manifest parsing method, which first tries to find a user-supplied `manifest.json` file. If it can't find one, it runs `dbt ls` to generate one, and if that fails it falls back to Cosmos' own dbt parser. Since a `manifest.json` file likely doesn't exist in your project, the parser falls back to running `dbt ls`, which can be slow enough to exceed the DagBag import timeout.
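The fallback order described above can be sketched as a small decision function. This is only a conceptual illustration of the documented behavior, not Cosmos source code:

```python
from pathlib import Path
from typing import Optional

def choose_load_mode(manifest_path: Optional[str]) -> str:
    """Conceptual sketch of Cosmos' "automatic" load-mode decision."""
    if manifest_path and Path(manifest_path).is_file():
        # Fast path: parse the pre-built manifest.json
        return "dbt_manifest"
    # Slow path: shell out to `dbt ls` on every DagBag parse,
    # which can easily exceed the 30s import timeout.
    return "dbt_ls"

# No manifest supplied, so the slow path is taken:
print(choose_load_mode(None))  # -> dbt_ls
```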
Solution
When the `astro deploy` command is run, a few other commands are also run, one of them being `astro dev pytest`, which by default runs `tests/test_dag_example.py`. As stated in the Astro Docs:
> This test checks that:
>
> - All Airflow tasks have required arguments.
> - DAG IDs are unique across the Astro project.
> - DAGs have no cycles.
> - There are no general import or syntax errors.
This is where the tests are "failing": the DAG using Cosmos cannot be imported in under 30 seconds, which is what this error line refers to:

```
E       airflow.exceptions.AirflowTaskTimeout: DagBag import timeout for /your/dag/file.py
```
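To check locally whether a DAG file stays within the import budget, you can time a bare module import. This is a simplified sketch of roughly what the DagBag does; the real parser also enforces the timeout and runs Airflow-specific checks:

```python
import importlib.util
import time

def time_dag_import(path: str) -> float:
    """Return how many seconds it takes to import a DAG file as a module."""
    start = time.monotonic()
    spec = importlib.util.spec_from_file_location("candidate_dag", path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)  # raises on import/syntax errors too
    return time.monotonic() - start
```

If the measured time for your Cosmos DAG regularly approaches 30 seconds, the `dbt ls` fallback is the likely culprit.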
There are three approaches to work around or fix this issue. They are ordered from most to least stable, and, not coincidentally, also from greatest to least effort:
1. Refactor your Cosmos DAGs to use a pre-generated `manifest.json` file.
   - Rather than have Cosmos/dbt generate the manifest at runtime, you can generate it ahead of time and then reference it using `LoadMode.DBT_MANIFEST`.
   - You can generate the `manifest.json` file by running the command `dbt ls` as part of your `Dockerfile` image build.
   - Note that any time your dbt models change, you need to do a full image deploy with `astro deploy --image` to pick up the new manifest.
   - See: Cosmos Documentation - dbt_manifest
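As a minimal configuration sketch, option 1 in DAG code might look like the following. This assumes `astronomer-cosmos` is installed; the DAG ID, project path, manifest path, and profile details are all placeholders to adapt to your own image layout:

```python
from datetime import datetime

from cosmos import DbtDag, ProfileConfig, ProjectConfig, RenderConfig
from cosmos.constants import LoadMode

# Placeholder path: adjust to wherever your Dockerfile copies the dbt
# project and the pre-generated manifest.json.
DBT_PROJECT_DIR = "/usr/local/airflow/dags/dbt/my_project"

my_cosmos_dag = DbtDag(
    dag_id="my_dbt_dag",  # placeholder
    start_date=datetime(2023, 12, 1),
    schedule="@daily",
    project_config=ProjectConfig(
        dbt_project_path=DBT_PROJECT_DIR,
        # Point Cosmos at the manifest generated during the image build
        manifest_path=f"{DBT_PROJECT_DIR}/target/manifest.json",
    ),
    # Skip the slow `dbt ls` fallback entirely
    render_config=RenderConfig(load_mode=LoadMode.DBT_MANIFEST),
    profile_config=ProfileConfig(
        profile_name="my_profile",  # placeholder
        target_name="dev",  # placeholder
        profiles_yml_filepath=f"{DBT_PROJECT_DIR}/profiles.yml",
    ),
)
```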
2. Increase the `dagbag_import_timeout` to a value that allows enough time for your CI/CD pipeline to parse and build the dbt manifest while using the `dbt ls` load mode.
   - You can gradually increase `dagbag_import_timeout` from its default of 30s, in increments of 30s, until you find a value that allows the DAG to be parsed.
   - Because the timeout occurs on the CI/CD pipeline worker, the environment variable needs to be set there. This can be accomplished a few ways:
     - Bake the env var into the image by adding `ENV AIRFLOW__CORE__DAGBAG_IMPORT_TIMEOUT=60` to your `Dockerfile`. This sets the timeout everywhere the image is used, including local dev environments, your CI/CD pipeline, and Astro.
     - Add the env var to your pipeline's worker env settings. The timeout will then apply only to the CI/CD environment.
     - NOTE: Setting the var in the Astro UI will not apply to your CI/CD pipeline and will not solve this issue. Env vars set in the Astro UI apply only to the Astro cloud environment.
   - See: Airflow Configuration Reference - dagbag_import_timeout
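The variable name above follows Airflow's general convention: any config option `[section] key` can be overridden with an environment variable named `AIRFLOW__{SECTION}__{KEY}`. A tiny illustrative helper (not part of Airflow's API) shows the mapping:

```python
def airflow_config_env_var(section: str, key: str) -> str:
    """Build the env var name Airflow checks for the [section] key option."""
    return f"AIRFLOW__{section.upper()}__{key.upper()}"

print(airflow_config_env_var("core", "dagbag_import_timeout"))
# -> AIRFLOW__CORE__DAGBAG_IMPORT_TIMEOUT
```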
3. Force deploy even when the tests fail.
   - Append `-f` or `--force` to the end of the `astro deploy` command run on the CI/CD pipeline.
   - This may be the quickest option to get your development team unstuck while you implement one of the solutions above.
   - Please be aware that this is a potentially dangerous option that could result in pushing broken DAGs/code to your Astro deployment. It skips any and all errors generated by the tests, including errors for DAGs other than the one we're troubleshooting here.
   - For this reason it's recommended never to use the `--force` option for production deployments.
   - See: Astro Documentation - astro deploy