#help

Tongguo Pang

02/14/2024, 3:42 PM
Hi lakeFS team. I have a question about concurrent commits. Your website says "Avoid concurrent commits/merges", but in practice this scenario seems inevitable. Another section of the online documentation seems to suggest that this can be overcome, if retries are allowed, by deploying lakeFS with PostgreSQL instead of the KV store. If so, can we implement a commit/merge hook that handles this on a first-come, first-served basis?

Ariel Shaqed (Scolnicov)

02/14/2024, 3:59 PM
Hi @Tongguo Pang. Obviously lakeFS does the right thing when there are concurrent merges and commits, but that can be slower. The issue is always present in an architecture that uses optimistic concurrency control, which is typical of Git. In fact the pre-KV architecture did use Postgres transactions; this was not faster or cheaper by any means. I would recommend trying it out before optimizing. If you have tens of concurrent commits then you will see performance issues - and I'd really like to hear and talk about them!

Tongguo Pang

02/14/2024, 4:06 PM
Hi @Ariel Shaqed (Scolnicov), thanks for the quick response. Could you be more specific about "I would recommend trying it out before optimizing"? How does it work? Currently my code sleeps for a random 5-10 seconds between attempts and tries up to 5 times before it fails a merge/commit. So far this works well, but I am not sure it is the best way, and in theory it still has a potential race condition.
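The retry approach described in this message can be sketched roughly as follows. This is only an illustration, not lakeFS code: `retry_with_jitter` and its parameters are hypothetical names, and `operation` stands in for whatever call performs the commit or merge in your client. The 5-10 second wait and the 5 attempts are the values mentioned above.

```python
import random
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def retry_with_jitter(
    operation: Callable[[], T],
    attempts: int = 5,
    min_wait: float = 5.0,
    max_wait: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
    retry_on: Tuple[Type[BaseException], ...] = (Exception,),
) -> T:
    """Call `operation`, waiting a random interval before each retry.

    `operation` is a placeholder for the commit/merge call (an assumption,
    not a lakeFS API). The last error is re-raised once attempts run out.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            # A random wait keeps concurrent writers from retrying in lockstep.
            sleep(random.uniform(min_wait, max_wait))
    raise AssertionError("unreachable")
```

Passing a narrower `retry_on` tuple (for example, only the conflict or timeout error your client raises) avoids retrying failures that a second attempt cannot fix.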

Ariel Shaqed (Scolnicov)

02/14/2024, 4:27 PM
You will rarely see a failed commit due to concurrency. I'm not sure I've ever seen one. Concurrent merges may retry. Typically this will happen within the lakeFS server, and you will see no functional difference. It will just be slower. I am not aware of any "race condition" - something that would cause incorrect results - at any concurrency. That is to say, we explicitly designed commits and merges not to have race conditions, and in addition we have never seen a bug or anything else that could cause a race condition. The failing case here is always slowness that can cause a timeout. If you can estimate the distribution of commit actions by size over time then we might be able to see whether there is any worry of failure. If you try many concurrent merges you may be able to push lakeFS into very long operations, and then typically connections will time out. If you need that to happen then I really do want to hear from you! We have a design, and I'm optimistic about it, but we'll obviously not prioritize optimizing performance around a bottleneck that is not relevant to a user.

Tongguo Pang

02/14/2024, 4:29 PM
@Ariel Shaqed (Scolnicov) I will read the document. Thanks again for the response

Ariel Shaqed (Scolnicov)

02/14/2024, 4:34 PM
Basically that doc is a way to improve concurrent merge performance. I put it here as a reference for the kind of thing that we will do... if the current solution isn't good enough. Are you seeing performance trouble with concurrent merges, or are you preparing for possible performance issues in future?

Tongguo Pang

02/14/2024, 4:38 PM
In my situation we have 2 (or more) Airbyte connectors that write to the same Delta table. Sometimes they commit their changes at the same time, and one of them fails. I am looking into this and trying to solve the problem, which is why I retry the commits in an Airbyte connector. Do you see any problem with this approach?

Ariel Shaqed (Scolnicov)

02/14/2024, 4:52 PM
Nope, it should work. I believe what you will see is merely one of the commit operations refusing to create an empty commit, because the other commit took all the objects. If you want to get 2 commits, each containing the objects from one run, you can do that! Create a branch for each run, work on that branch, and merge back.
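The branch-per-run suggestion could look roughly like the sketch below. Everything named here is an assumption made for illustration: `RepoClient` is a hypothetical interface rather than the lakeFS SDK, and `run_on_own_branch`, the `run-<id>` naming, and the `main` default are invented. Map the three methods onto whichever lakeFS client or CLI you actually use.

```python
from typing import Callable, Protocol


class RepoClient(Protocol):
    """Hypothetical interface; adapt it to the lakeFS client you actually use."""

    def create_branch(self, name: str, source: str) -> None: ...

    def commit(self, branch: str, message: str) -> None: ...

    def merge(self, source: str, destination: str) -> None: ...


def run_on_own_branch(
    client: RepoClient,
    run_id: str,
    write: Callable[[str], None],
    main: str = "main",
) -> str:
    """Give one connector run its own branch, then merge it back.

    `write` is a placeholder for the connector's write step and receives
    the branch name to write to.
    """
    branch = f"run-{run_id}"  # assumed naming scheme, one branch per run
    client.create_branch(branch, source=main)
    write(branch)  # this run's objects land only on its own branch
    client.commit(branch, message=f"Airbyte run {run_id}")
    client.merge(source=branch, destination=main)
    return branch
```

With this layout each run commits on its own branch, so two runs never compete for the same uncommitted objects, and any concurrency is confined to the merge step.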

Tongguo Pang

02/14/2024, 5:49 PM
This is very good, and I will follow it. Thanks for the suggestion.