Hi Adi! I can think of two ways to run validation checks on lakeFS data with Airflow:
1. Add an Airflow step after the lakeFS commit that runs checks on that specific branch. The checks can use whatever tool you see fit, e.g. Spark or Great Expectations.
2. Use the lakeFS CI/CD mechanism called Hooks. Just like with GitHub Actions, you can wire predefined checks (triggered via webhooks or by Airflow) to run on your data pre/post commit/merge, with the option to block bad changes.
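For option (2), a hook is defined in a YAML actions file committed under `_lakefs_actions/` in the repository. Here's a minimal sketch of a pre-merge webhook; the hook `id`, `description`, and URL are placeholders you'd swap for your own validation endpoint:

```yaml
name: pre merge data quality check
on:
  pre-merge:
    branches:
      - main
hooks:
  - id: validate_data          # placeholder id
    type: webhook
    description: Run data quality checks before merging into main
    properties:
      url: "http://your-validation-service/webhooks/validate"  # placeholder URL
```

If the webhook returns a failure, lakeFS rejects the merge, so bad data never reaches `main`.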
The main difference between the two approaches is that (1) runs only after the changes were already committed/merged, and therefore already exposed to external consumers, whereas pre-commit/pre-merge hooks in (2) can block bad data before anyone sees it.
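To make approach (1) concrete, here's a minimal sketch of the validation logic an Airflow `PythonOperator` task could run after the commit task. The field names and checks are purely illustrative; in a real DAG you'd read the committed branch's data (e.g. via Spark or Great Expectations) instead of the inline sample:

```python
def validate_records(records):
    """Return a list of error strings; an empty list means the data passed."""
    errors = []
    for i, rec in enumerate(records):
        # Illustrative checks -- replace with your real data-quality rules.
        if rec.get("user_id") is None:
            errors.append(f"row {i}: missing user_id")
        if rec.get("amount", 0) < 0:
            errors.append(f"row {i}: negative amount")
    return errors


def run_checks_on_branch(branch):
    """Body of the Airflow task: raise to fail the task if checks fail."""
    # In a real DAG, load the data that was committed to `branch`
    # (e.g. spark.read.parquet(f"s3a://repo/{branch}/path/...")).
    sample = [
        {"user_id": 1, "amount": 10.0},
        {"user_id": None, "amount": -5.0},
    ]
    errors = validate_records(sample)
    if errors:
        raise ValueError(f"validation failed on branch {branch}: {errors}")
```

Failing the task this way stops the DAG, but note the caveat above: the commit itself has already happened.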