# help
i
Hi all, is it possible to have a different storage account per branch?
a
No, unfortunately; that would go outside the lakeFS model. Because lakeFS branch checkouts and merges are zero-copy, it would also be hard to define what a per-branch storage account would even mean. You might be able to do this for "external" objects, whose lifetime is not controlled by lakeFS. You'd have to use the lakeFS API to link those objects in; the S3 gateway always creates internal lakeFS objects, so it cannot do that.
i
I see. How would you set up lakeFS across dev, acc, and prod environments? I am thinking of applying lakeFS only in development, since we only need zero-copy branching in dev.
We also need data isolation between environments because of security rules and compliance.
a
You can of course always export from lakeFS to external storage as needed. We strongly believe in lakeFS for prod as well, since it allows tracking and undo. If you had all environments on lakeFS, you could use different repositories with separate storage namespaces. That would probably work for separating storage accounts too, and even if it doesn't, I imagine it would be a reasonable feature request. Would that make sense?
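To make the "one repository per environment" idea concrete, here is a minimal sketch that builds (but does not send) one create-repository request per environment, each with its own storage namespace. The server URL, repository names, storage accounts, and containers are all hypothetical placeholders, and the body fields (`name`, `storage_namespace`, `default_branch`) follow my understanding of the lakeFS REST API, so please check them against the API reference for your version.

```python
import json
from urllib.request import Request

# Hypothetical placeholders: server URL, storage accounts, and containers.
LAKEFS_API = "https://lakefs.example.com/api/v1"
ENVIRONMENTS = {
    "dev": "https://devaccount.blob.core.windows.net/lakefs/analytics",
    "acc": "https://accaccount.blob.core.windows.net/lakefs/analytics",
    "prod": "https://prodaccount.blob.core.windows.net/lakefs/analytics",
}


def repo_request(env: str, namespace: str) -> Request:
    """Build (but do not send) a create-repository request for one environment."""
    body = {
        "name": f"analytics-{env}",      # hypothetical naming scheme
        "storage_namespace": namespace,  # one namespace per environment
        "default_branch": "main",
    }
    return Request(
        f"{LAKEFS_API}/repositories",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


requests_by_env = {env: repo_request(env, ns) for env, ns in ENVIRONMENTS.items()}

for env, req in requests_by_env.items():
    print(env, req.get_method(), req.full_url, req.data.decode("utf-8"))
```

Sending these would still need authentication, and whether a single lakeFS installation can reach several storage accounts depends on how its credentials are configured, which this sketch does not cover.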
i
But then would you deploy a lakeFS server per environment? It depends a bit on whether I can solve the concurrency part; otherwise I'd rather write directly to ADLS.
So a different repo is linked to a different storage account?
e
It is very common to start using lakeFS only for dev, by using the "import" functionality from production periodically.
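As a rough illustration of a periodic import, the sketch below only builds the request body for importing a production prefix into a dev repository. The source URL, destination path, and commit message are hypothetical, and the field names (`paths`, `destination`, `type`, `commit`) reflect my reading of the lakeFS import endpoint (`POST /repositories/{repository}/branches/{branch}/import`), so treat them as assumptions and verify them against your version's API reference.

```python
import json


def build_import_body(source_prefix: str, destination: str, message: str) -> dict:
    """Build an import request body (assumed shape) for one production prefix."""
    return {
        "paths": [
            {
                "path": source_prefix,        # where the production data lives
                "destination": destination,   # path inside the lakeFS branch
                "type": "common_prefix",      # import everything under the prefix
            }
        ],
        "commit": {"message": message},
    }


# Hypothetical production location and destination path.
body = build_import_body(
    "https://prodaccount.blob.core.windows.net/data/tables/",
    "tables/",
    "Periodic import from production",
)
print(json.dumps(body, indent=2))
```

An import like this only registers metadata for the existing objects, so the dev lakeFS installation still needs read access to the production storage.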
i
I see. I think that could work for data platform teams. However, our data science work is intertwined with the platform work, so that would be tricky: our dev changes are usually quite far ahead of prod.
i
Check out step 4 here