
CC

12/12/2022, 7:51 PM

Iddo Avneri

12/12/2022, 10:17 PM
Hi @CC, good to hear from you again. I remember discussing this with lakeFS engineering the last time we spoke, but I can’t recall all the nuances, which are very important in this case. Let us regroup and get back to you.

Adi Polak

12/13/2022, 6:42 AM
Hi CC, great questions and diagram! As for number 1: assuming you mean a distributed deployment where the API calls use the same domain name, with a load balancer in front of the lakeFS instances, then yes. For 1-b (different domain name and region), I believe it should also work, as the lakeFS service itself is stateless. The source of truth is saved in DynamoDB, where the locking and synchronization take place. Having said that, I suggest asking in #help to get more information about that. This can also answer number 2. I hope this helps, and welcome to the lake 🤿!
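To make the "stateless service, shared source of truth" point concrete, here is a minimal toy sketch in Python. It is not lakeFS code and does not call DynamoDB: `SharedKV`, `set_if`, and `Instance` are hypothetical names invented for illustration, and the in-memory dict only stands in for a shared store that supports conditional writes. The idea it shows is that instances holding no state of their own can safely update the same branch as long as every update goes through a compare-and-set on the shared store.

```python
import threading


class SharedKV:
    """Hypothetical stand-in for the shared store (DynamoDB in the thread)."""

    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()

    def get(self, key):
        with self._lock:
            return self._data.get(key, ())

    def set_if(self, key, expected, new):
        """Conditional write: store `new` only if the current value is `expected`."""
        with self._lock:
            if self._data.get(key, ()) != expected:
                return False
            self._data[key] = new
            return True


class Instance:
    """A stateless server: it keeps no branch state between calls."""

    def __init__(self, kv):
        self.kv = kv

    def commit(self, branch, commit_id):
        # Retry until the conditional write succeeds against the latest head.
        while True:
            head = self.kv.get(branch)
            if self.kv.set_if(branch, head, head + (commit_id,)):
                return


def run_demo(commits_per_instance=50):
    """Two instances (think: two regions) commit to the same branch concurrently."""
    kv = SharedKV()
    instances = [Instance(kv), Instance(kv)]
    threads = [
        threading.Thread(
            target=lambda inst=inst, n=n: [
                inst.commit("main", f"i{n}-c{i}") for i in range(commits_per_instance)
            ]
        )
        for n, inst in enumerate(instances)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return kv.get("main")
```

Calling `run_demo()` returns the branch history as a tuple; no commit is lost even though neither instance knows about the other, because a write based on a stale head is rejected and retried.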

Ariel Shaqed (Scolnicov)

12/13/2022, 11:52 AM
Hi! As promised... In (1), I assume you are trying to share the same repository across sites. That would work: we do not assume any tight bound on latencies for S3 or DynamoDB, and there is no direct connection between different instances. However, it may be slow, and it may be expensive (depending on how you pay Amazon, pumping data out of a region can cost a lot). I think (2) might work, at least for core functionality. That uses only immutable data operations and never lists objects, so the cache should work. Bear in mind that this is not a tested configuration. Also, some functions require read-after-write consistency and will have some minor failure modes. Things like garbage collection and hooks are probably best configured and run only from the directly connected instances.
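The reasoning behind "immutable data operations and never lists objects, so the cache should work" can be sketched with a toy read-through cache in Python. This is an illustration under stated assumptions, not lakeFS or S3 code: `Origin` and `ImmutableCache` are hypothetical names, and the dict stands in for an object store whose objects are never overwritten once written. Because a key always maps to the same bytes, a cached copy can never be stale, so no invalidation logic is needed; a listing, by contrast, could not be answered from such a cache, since the cache only knows the keys it has already been asked for.

```python
class Origin:
    """Hypothetical remote object store whose objects are written once and never changed."""

    def __init__(self, objects):
        self._objects = dict(objects)
        self.reads = 0  # counts round trips to the remote store

    def get(self, key):
        self.reads += 1
        return self._objects[key]


class ImmutableCache:
    """Read-through cache that relies on objects being immutable."""

    def __init__(self, origin):
        self._origin = origin
        self._store = {}

    def get(self, key):
        # A cached entry is always valid: the object behind `key` never changes.
        if key not in self._store:
            self._store[key] = self._origin.get(key)
        return self._store[key]


origin = Origin({"data/abc123": b"payload"})
cache = ImmutableCache(origin)
first = cache.get("data/abc123")   # fetched from the origin
second = cache.get("data/abc123")  # served locally, no remote read
```

After the two reads above, `origin.reads` is 1: only the first access crosses to the remote store.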

CC

12/13/2022, 3:26 PM
Thanks, Ariel and Adi. My experience has been that (1) works, though I'm not sure about the speed. I'm not sure about (2), because I couldn't get lakeFS to accept a Multi-Region Access Point for S3, and I haven't worked out what to try as a single-region "equivalent" yet.

Ariel Shaqed (Scolnicov)

12/19/2022, 11:04 AM
Can you please open an issue for us about lakeFS not accepting a multi-region S3 access point? I'm not sure why it wouldn't work.