When writing data to lakeFS, where is the data persisted before a commit is issued?
10/03/2022, 9:44 AM
In the object storage you are using as your underlying storage.
A commit is a metadata operation only.
10/04/2022, 12:36 AM
Thank you. So is it during the commit, as a metadata operation, that the ranges and metaranges are updated with data pointers? Also, can a commit be issued to get a point-in-time view of the data? And if no data changes have happened between two commit commands, will there still be two commits?
10/04/2022, 6:41 AM
1. Pointers to the data are created when the data is saved to the object storage, while it is still uncommitted. Only during a commit are the ranges and metaranges updated to reflect a new version, and a commit ID is created. Commits and merges are metadata operations: they create updated range and metarange objects containing data pointers.
2. Commits are immutable. Once you commit, you get back a commit ID that you can use to view and query the data as it existed when the commit was created.
3. Currently, empty commits are not allowed. If no changes were applied, lakeFS returns an error stating that there are no changes to the target branch.
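To make the three points above concrete, here is a minimal toy model in Python. It is only a sketch of the behavior described in this thread, not the lakeFS API or its real on-disk format: the class `ToyRepo`, its methods, and the `obj-N` addresses are all hypothetical names made up for illustration.

```python
import hashlib


class ToyRepo:
    """Toy in-memory model of the write/commit flow; NOT the lakeFS API."""

    def __init__(self):
        self.object_store = {}  # address -> bytes (stands in for the underlying object storage)
        self.staging = {}       # path -> address (uncommitted pointers)
        self.commits = {}       # commit id -> snapshot {path: address}
        self.head = None        # id of the latest commit, if any

    def write(self, path, data):
        # The data is persisted immediately; only a pointer is staged.
        address = "obj-" + str(len(self.object_store))
        self.object_store[address] = data
        self.staging[path] = address

    def commit(self, message):
        # Metadata only: no object data is copied or moved here.
        if not self.staging:
            raise ValueError("no changes")  # empty commits are rejected
        snapshot = dict(self.commits[self.head]) if self.head else {}
        snapshot.update(self.staging)
        payload = repr((self.head, message, sorted(snapshot.items())))
        commit_id = hashlib.sha256(payload.encode()).hexdigest()[:12]
        self.commits[commit_id] = snapshot
        self.head = commit_id
        self.staging = {}
        return commit_id

    def read(self, commit_id, path):
        # A commit id gives a stable view of the data as of that commit.
        return self.object_store[self.commits[commit_id][path]]


repo = ToyRepo()
repo.write("data/a.csv", b"v1")   # persisted in object_store, not yet committed
first = repo.commit("add a.csv")
repo.write("data/a.csv", b"v2")
second = repo.commit("update a.csv")
print(repo.read(first, "data/a.csv"), repo.read(second, "data/a.csv"))
```

In this sketch, reading through `first` still returns the old content after the second commit, and calling `repo.commit(...)` again with nothing staged raises an error.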
10/04/2022, 12:31 PM
How are multiple updates to a file after a commit tracked?
10/04/2022, 2:07 PM
lakeFS only tracks changes at the object/file level. A change is versioned only when committed, so if several changes were made between two commits, the latest state of the object is what gets recorded in the commit. All other intermediate states between the two commits are not versioned.
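As a rough illustration of that last point, the sketch below folds an ordered list of writes into a snapshot. The function `snapshot_after_commit` and the file name are hypothetical and only model the behavior described here, not how lakeFS is implemented.

```python
def snapshot_after_commit(staged_writes, previous_snapshot):
    """Fold ordered (path, content) writes into a new snapshot dict."""
    snapshot = dict(previous_snapshot)  # leave the earlier commit untouched
    for path, content in staged_writes:
        snapshot[path] = content        # a later write replaces an earlier one
    return snapshot


first = snapshot_after_commit([("report.csv", "v1")], {})
# Three updates to the same object before the next commit:
second = snapshot_after_commit(
    [("report.csv", "v2"), ("report.csv", "v3"), ("report.csv", "v4")],
    first,
)
print(first["report.csv"], second["report.csv"])  # prints: v1 v4
```

The intermediate states `v2` and `v3` appear in neither snapshot, which matches the answer above: only the state at commit time is versioned.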
10/04/2022, 2:21 PM
great, that answers my questions. Thank you so much.