# help
Isn’t there file-level deduplication in lakeFS? I thought there might be, but then I uploaded the same file twice and got two different physical addresses.
Hi @Angela Bovo, if you upload the same exact file twice, there will be no uncommitted changes (so of course, data duplication).
Please note that when you upload a file into lakeFS, the file is copied to the repository’s bucket (storage namespace). Alternatively, you can import a file or directory into lakeFS, in which case the files are not copied at all.
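To make the upload/import difference concrete, here is a toy sketch in Python. It is not lakeFS code: `ToyRepo`, `upload`, `import_object`, and the bucket names are hypothetical. It also assumes that every upload is written under a freshly generated address, which would match seeing two different physical addresses for the same file.

```python
import hashlib
import uuid


class ToyRepo:
    """Toy model of a repository; NOT the lakeFS implementation."""

    def __init__(self, storage_namespace):
        self.storage_namespace = storage_namespace
        self.bucket = {}   # physical address -> bytes held in the repo's bucket
        self.entries = {}  # logical path -> (physical address, checksum)

    def upload(self, path, data):
        # Assumption: each upload copies the bytes to a fresh, unique address,
        # so identical content is not deduplicated.
        address = f"{self.storage_namespace}/data/{uuid.uuid4().hex}"
        self.bucket[address] = data
        self.entries[path] = (address, hashlib.sha256(data).hexdigest())
        return address

    def import_object(self, path, external_address, data):
        # Import only records a pointer to the existing object;
        # nothing is copied into the repository's bucket.
        self.entries[path] = (external_address, hashlib.sha256(data).hexdigest())
        return external_address


repo = ToyRepo("s3://example-bucket/repo")  # hypothetical storage namespace
data = b"same content"

first = repo.upload("file.txt", data)
second = repo.upload("file.txt", data)
print("different addresses:", first != second)
print("copies in bucket:", len(repo.bucket))

repo.import_object("imported.txt", "s3://other-bucket/file.txt", data)
print("copies in bucket after import:", len(repo.bucket))
```

In this model the two uploads share a checksum but occupy two addresses, while the import leaves the bucket untouched.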
Thank you, Iddo, for your answer. Importing data is interesting and I will look into it. Meanwhile, with upload, here are the results of my commands; you will see that there are two different physical addresses. How do you explain this?