Hi everyone, I am trying to setup lakefs with delt...
# help
n
Hi everyone, I am trying to set up lakeFS with Delta Lake or Apache Hudi. It seems that when you append to two branches (main and another one) and then try to merge them, it doesn't work. In the case of Delta Lake, it just refuses to merge because there is a conflict between the log files (this is mentioned in the lakeFS docs, but I would like to know if there is a hack to make lakeFS ignore the files). However, in the case of Hudi, it merges successfully, but I noticed that the records appended to the main branch are deleted after the merge. I would like to know if someone here has already encountered this problem when working with lakeFS and how it could be solved. Thanks
i
Hey Nor, sorry that you are encountering issues with the integration.
1. Regarding Delta, currently there isn’t an option to ignore files when you merge between branches. Also, what would be the desired behaviour for the log files in such cases, in your opinion? Ignoring one branch or the other might result in an inconsistent state when you try to read the table again.
2. For Hudi, I’m not sure I understand how these files were deleted. lakeFS doesn’t delete files unless it is specifically requested to do so. Is it possible that Hudi deleted them just before/after the merge?
Generally speaking, we’d love to hear more about your use case and understand how we can improve the integration with the two formats.
Just for future reference: the docs for Delta with lakeFS and the relevant blog post.
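To illustrate why the Delta case conflicts, here is a minimal sketch in plain Python. It does not call lakeFS or Delta Lake; it only models each branch as a dict of path to content and applies a simplified three-way merge rule (a path conflicts if both sides changed it relative to the base, in different ways). The helper names `delta_commit_name` and `three_way_conflicts` and the sample file names are hypothetical. The one fact it relies on is that Delta names each commit file in `_delta_log/` after the next table version, zero-padded to 20 digits, so two branches that each append once both write the same log file name.

```python
def delta_commit_name(version: int) -> str:
    # Delta Lake commit files are named by table version, zero-padded to 20 digits.
    return f"_delta_log/{version:020d}.json"


def three_way_conflicts(base: dict, ours: dict, theirs: dict) -> list:
    # Simplified model (an assumption, not lakeFS's actual implementation):
    # a path conflicts when both sides differ from the base and from each other.
    conflicts = []
    for path in sorted(set(ours) | set(theirs)):
        b, o, t = base.get(path), ours.get(path), theirs.get(path)
        if o != b and t != b and o != t:
            conflicts.append(path)
    return conflicts


# Common ancestor: table created at version 0 with one data file.
base = {
    delta_commit_name(0): "create table",
    "part-0000.parquet": "rows 1-100",
}

# Append on main: new data file plus commit file for version 1.
main = dict(base)
main["part-main.parquet"] = "rows 101-150"
main[delta_commit_name(1)] = "add part-main.parquet"

# Append on the other branch: different data file, but the SAME commit file name.
feature = dict(base)
feature["part-feature.parquet"] = "rows 151-200"
feature[delta_commit_name(1)] = "add part-feature.parquet"

conflicts = three_way_conflicts(base, main, feature)
print(conflicts)
```

In this model the data files merge cleanly because their names differ; only the version 1 log entry collides. It also shows why simply ignoring one side is risky: whichever log entry wins, the other branch's Parquet file ends up in the table directory without any commit referencing it, so a reader would not see those rows.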