# help
a
Hi, I have this setup:
• local Airflow
• lakeFS on AWS
• S3
• EMR Serverless Studio
The flow: local Airflow triggers lakeFS to create a new branch to test new logic, then starts an EMR application that leverages this new branch. To make this a complete testing environment, I would like to upload my Python code from local to my lakeFS installation in the cloud using Airflow, or something else that can be automated in Python with the same script. Any idea how to go about it? I couldn't find it in the lakeFS Airflow operators.
a
Not sure if this is relevant, but AFAIK we do have an upload operator in the lakeFS Airflow provider.
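[Editor's note: besides the provider's upload operator, lakeFS also exposes an S3-compatible gateway, so the upload can be automated in plain Python with boto3, which matches the "something else that can be automated in Python" part of the question. In this sketch the endpoint URL, repository, branch, and file paths are all placeholders; credentials are the lakeFS access key/secret supplied the usual boto3 way.]

```python
def lakefs_key(branch: str, path: str) -> str:
    """Build the object key the lakeFS S3 gateway expects:
    the branch name followed by the path inside the branch."""
    return f"{branch.strip('/')}/{path.lstrip('/')}"


def upload_to_lakefs(local_file: str, repo: str, branch: str,
                     path: str, endpoint_url: str) -> None:
    """Upload a local file to a lakeFS branch via the S3 gateway.
    boto3 is imported lazily so the pure helper above stays
    dependency-free."""
    import boto3

    s3 = boto3.client("s3", endpoint_url=endpoint_url)
    # On the gateway, the repository plays the role of the bucket.
    s3.upload_file(local_file, Bucket=repo, Key=lakefs_key(branch, path))


if __name__ == "__main__":
    # Placeholder values -- replace with your lakeFS endpoint and repo.
    upload_to_lakefs("job.py", repo="my-repo", branch="test-branch",
                     path="code/job.py",
                     endpoint_url="https://lakefs.example.com")
```

Note that an upload via the gateway lands as an uncommitted change on the branch; committing it (via the lakeFS UI, API, or the provider's commit operator) is a separate step.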
a
🙈 Oh no, I missed it in the docs. Giving it a try right now!
d
Lmk if you have any other questions about EMR Serverless @Adi Polak - I maintain the emr-serverless-samples repo on GitHub. 🙂 There is also an EMR Serverless Airflow operator in there if it's helpful.
a
You are too awesome @Damon C! I used the operator, and it works with the existing Studio, so thank YOU! Having said that, I'm continuing to build the environment, and there will be more questions soon! Stay tuned 😊
d
Awesome! 👏 Not sure if it's helpful, but I also have a repo with a bunch of GitHub Actions around orchestrating EMR Serverless. I'm also working on a CLI to make deploying/copying to S3 easy, but it's still very, very beta. 😄
a
Thanks, really excited to read that 😎
As of apache-airflow-providers-amazon==5.0.0, the EMR Serverless Operator is now part of the official Apache Airflow Amazon Provider and has been tested with open source Apache Airflow v2.2.2.
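[Editor's note: for anyone landing here, a minimal sketch of wiring that official operator into a DAG task. The application ID, role ARN, and entry-point path are placeholders, and the `jobDriver` payload is built by a small helper so it can be inspected without a running Airflow.]

```python
def spark_submit_driver(entry_point: str, args=None, params: str = "") -> dict:
    """Build the jobDriver payload EMR Serverless expects for a Spark job."""
    driver = {"sparkSubmit": {"entryPoint": entry_point}}
    if args:
        driver["sparkSubmit"]["entryPointArguments"] = list(args)
    if params:
        driver["sparkSubmit"]["sparkSubmitParameters"] = params
    return driver


def start_job_task(application_id: str, role_arn: str, entry_point: str):
    """Create the Airflow task. Airflow is imported lazily so the payload
    helper above can be used on its own."""
    from airflow.providers.amazon.aws.operators.emr import (
        EmrServerlessStartJobOperator,
    )

    return EmrServerlessStartJobOperator(
        task_id="start_emr_serverless_job",
        application_id=application_id,       # placeholder EMR Serverless app
        execution_role_arn=role_arn,         # placeholder IAM role ARN
        job_driver=spark_submit_driver(entry_point),
    )
```

The operator also accepts a `configuration_overrides` argument, e.g. for S3 log delivery.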
d
Heh, yeah, it's been a little awkward. For a while MWAA hasn't supported upgrading the Amazon provider. 🫣 I think it works now, but the internal ticket is still open.