Question about LakeFS and Delta table: can you con...
# help
h
Question about lakeFS and Delta tables: can you confirm whether I understand correctly? Is lakeFS capable of merging two Delta tables? Use case: let's say I have a Delta table in my repo. From branch `main`, `userA` branches out (`branchA`). `userB` branches out from the same commit on `main` to `branchB`. Both users start modifying the Delta table in their own way. `userA` merges her `branchA` to `main` first: no issue, no conflict, nothing fancy. Then `userB` merges his branch `branchB` to `main`. If lakeFS is not aware of Delta tables, you will have a conflict on the table and need to choose which version, `userA`'s or `userB`'s, becomes the final copy. From what I understand, lakeFS is aware of Delta tables. Does that mean lakeFS can merge the Delta tables from `userA` and `userB`, so that any new row in `branchA` and `branchB` will appear in `main`? How do you handle the case where the same row has been modified by both users?
i
Hi @HT, lakeFS is indeed aware of Delta tables. In the scenario you described, you would be able to see the `diff` between the two tables in the lakeFS UI, i.e. a real Delta diff of the tables (e.g. 2 rows added in `branchA`, a column deleted in `branchB`, etc.). Merging the branches is a different story, as we'll need to merge the Delta log that was changed in both branches. It is on our roadmap and planned for Q4 this year.
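To make the "Delta log changed in both branches" point concrete, here is a toy model of an object-level three-way merge. It is only a sketch, not lakeFS's actual implementation: it assumes a branch can be pictured as a dict of path to content, and that each Delta commit writes the next zero-padded JSON file under `_delta_log/`. The function name `three_way_merge`, the `table/` prefix, and the part-file names are made up for illustration.

```python
def three_way_merge(base, ours, theirs):
    """Toy object-level three-way merge over {path: content} dicts.

    Returns (merged, conflicts). A path conflicts when both sides changed it
    relative to base and ended up with different content.
    """
    merged, conflicts = {}, []
    for path in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(path), ours.get(path), theirs.get(path)
        if o == t:        # both sides agree, or neither touched the path
            result = o
        elif o == b:      # only "theirs" changed the path
            result = t
        elif t == b:      # only "ours" changed the path
            result = o
        else:             # both changed it, in different ways
            conflicts.append(path)
            continue
        if result is not None:  # None means the path was deleted
            merged[path] = result
    return merged, conflicts


# Common ancestor commit on main (hypothetical paths and contents).
base = {
    "table/_delta_log/00000000000000000000.json": "create table",
    "table/part-0000.parquet": "original rows",
}

# userA appends rows on branchA: one new data file plus table version 1.
branch_a = {
    **base,
    "table/part-a.parquet": "rows from userA",
    "table/_delta_log/00000000000000000001.json": "add part-a.parquet",
}

# userB does the same on branchB, also producing table version 1.
branch_b = {
    **base,
    "table/part-b.parquet": "rows from userB",
    "table/_delta_log/00000000000000000001.json": "add part-b.parquet",
}

# userA merged first, so main now equals branch_a; then userB merges.
merged, conflicts = three_way_merge(base, branch_a, branch_b)
print(conflicts)
```

In this model the two data files merge cleanly because they live at different paths, while the log entry for version 1 exists on both sides with different content. That is the piece a Delta-aware merge would have to reconcile rather than simply pick one side.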
h
Looking forward to that being implemented 👀 My current solution is to break my rows out into little files! In my use case, we don't have millions of rows ... yet ... Hopefully, by the day we hit a gazillion rows, Delta Lake table merge, or some other form of table merge, will be available in lakeFS.
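A rough sketch of why the small-files layout sidesteps most conflicts, assuming one file per row and assuming conflicts are detected per object path. The `rows/<id>.json` layout and the helpers `changed_paths` and `conflicting_paths` are hypothetical, and this is a toy model rather than a description of how lakeFS works internally.

```python
def changed_paths(base, branch):
    """Paths added, removed, or modified on a branch relative to base."""
    return {p for p in set(base) | set(branch) if base.get(p) != branch.get(p)}


def conflicting_paths(base, a, b):
    """Paths changed on both branches that ended up with different content."""
    both = changed_paths(base, a) & changed_paths(base, b)
    return sorted(p for p in both if a.get(p) != b.get(p))


# Common ancestor: one small file per row (hypothetical layout).
base = {
    "rows/1.json": '{"id": 1, "qty": 5}',
    "rows/2.json": '{"id": 2, "qty": 7}',
}

# Each user adds a new row: different paths, so nothing collides.
branch_a = {**base, "rows/3.json": '{"id": 3, "qty": 1}'}
branch_b = {**base, "rows/4.json": '{"id": 4, "qty": 9}'}
new_rows_only = conflicting_paths(base, branch_a, branch_b)
print(new_rows_only)  # []

# Both users edit row 2: same path, different content, so someone must choose.
branch_a["rows/2.json"] = '{"id": 2, "qty": 8}'
branch_b["rows/2.json"] = '{"id": 2, "qty": 0}'
same_row_edited = conflicting_paths(base, branch_a, branch_b)
print(same_row_edited)  # ['rows/2.json']
```

With this layout, new rows from both branches land in `main` without any table-aware logic, and only a row touched on both sides needs a manual decision. The trade-off is a very large number of small objects once the table grows.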