# help
c
Hi, I have lakeFS working locally, and I am:
• integrating with Airflow
• writing a user guide

When working in a pipeline, I would like to be able to push and commit updates to a specific location in a repo, with a commit message, and without cloning that location first. For example, I run a process that generates 1 GB of data for the 10th time into a directory called `results` on branch `pipeline`. I would like to update the `results` folder in `my-repo:pipeline` with the results I have in the currently running Pod, without also cloning the results folder. This way, I can use lakeFS to maintain a record of my pipeline runs.

How would I push an update from a local running context? I see there is an `import` command, but the `ingest` command is now deprecated. I can also push specific files using `lakectl fs`, but I don't have:
• a way to push a directory
• a way to commit with a message
without cloning some prior state.
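One way to cover both of those asks is sketched below with the high-level lakeFS Python SDK (the `lakefs` package): walk the local results directory, upload each file to the branch, then make a single commit with a message; no clone or local checkout of prior state is involved. The endpoint, credentials, repository name, and paths are placeholders, and the exact SDK calls should be verified against the SDK version in use.

```python
import os

import lakefs
from lakefs.client import Client

# Placeholder connection details -- in a Pod these would normally come from
# lakectl-style configuration (LAKECTL_* environment variables) or mounted secrets.
client = Client(
    host="http://lakefs.example.com:8000",
    username="<access-key-id>",
    password="<secret-access-key>",
)

branch = lakefs.repository("my-repo", client=client).branch("pipeline")

local_dir = "/data/results"  # output directory of the current pipeline run

# Upload every file under local_dir to results/ on the pipeline branch.
for root, _, files in os.walk(local_dir):
    for name in files:
        local_path = os.path.join(root, name)
        rel = os.path.relpath(local_path, local_dir).replace(os.sep, "/")
        key = f"results/{rel}"
        with open(local_path, "rb") as f:
            # For very large files a streaming upload may be preferable.
            branch.object(key).upload(data=f.read(), mode="wb")

# A single commit records the whole run, with a message -- no clone needed.
branch.commit(message="pipeline run #10: refresh results/")
```

The same two-step pattern (recursive upload of a directory, then one commit with a message) should also be reachable from the `lakectl` CLI, depending on the lakectl version installed; a CLI variant appears in the KubernetesPodOperator sketch further down.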
a
c
While some of that is useful, our deployment is almost exclusively KubernetesPodOperator-based. Right now, I've chosen to model the lakeFS side of the Airflow Kubernetes Pod as an init container (to fetch the data). Now I am working out how to send the data back. I have faith that it is possible, judging by the documentation, but I'm not seeing any examples integrating lakeFS with the KubernetesPodOperator. Do you know of anything I can review?
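One pattern that fits a KubernetesPodOperator-only deployment is to let the task's pod perform the upload and commit itself with `lakectl`, configured purely through `LAKECTL_*` environment variables, so no clone or prior checkout is needed on the publish side. The sketch below assumes an image that bundles the pipeline output under /data together with the lakectl binary, and a Kubernetes secret holding the lakeFS keys; the image name, namespace, secret names, service URL, and exact lakectl flags are illustrative, and the operator import path depends on the cncf.kubernetes provider version.

```python
import pendulum
from airflow import DAG
from airflow.providers.cncf.kubernetes.operators.pod import KubernetesPodOperator
from kubernetes.client import models as k8s

with DAG(
    dag_id="pipeline_publish_to_lakefs",
    start_date=pendulum.datetime(2024, 1, 1, tz="UTC"),
    schedule=None,
    catchup=False,
) as dag:
    publish = KubernetesPodOperator(
        task_id="publish_results",
        name="publish-results",
        namespace="pipelines",
        # Hypothetical image containing the pipeline output under /data
        # and the lakectl binary on the PATH.
        image="my-registry.example.com/pipeline-with-lakectl:latest",
        cmds=["/bin/sh", "-c"],
        arguments=[
            # Recursive upload of the results directory, then one commit.
            "lakectl fs upload --recursive --source /data/results "
            "lakefs://my-repo/pipeline/results/ && "
            "lakectl commit lakefs://my-repo/pipeline "
            "-m 'pipeline run: refresh results/'"
        ],
        env_vars=[
            k8s.V1EnvVar(
                name="LAKECTL_SERVER_ENDPOINT_URL",
                value="http://lakefs.lakefs.svc.cluster.local:8000",
            ),
            # Credentials taken from a pre-existing Kubernetes secret
            # (secret name and keys are placeholders).
            k8s.V1EnvVar(
                name="LAKECTL_CREDENTIALS_ACCESS_KEY_ID",
                value_from=k8s.V1EnvVarSource(
                    secret_key_ref=k8s.V1SecretKeySelector(
                        name="lakefs-creds", key="access_key_id"))),
            k8s.V1EnvVar(
                name="LAKECTL_CREDENTIALS_SECRET_ACCESS_KEY",
                value_from=k8s.V1EnvVarSource(
                    secret_key_ref=k8s.V1SecretKeySelector(
                        name="lakefs-creds", key="secret_access_key"))),
        ],
        get_logs=True,
    )
```

With this shape, the init container only needs to fetch input data, and the publish step never has to reconstruct prior repository state inside the pod.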
a
c
Ah, looks like I am good to go!
Thank you Amit!
🤘
a
👍