Hello, could I create a policy to prevent a group from reading uncommitted changes?
user
07/27/2022, 5:48 PM
Hi Juan,
Not a beginner question... :-)
You cannot disallow reading committed changes.
You could open a branch, write to that branch, then merge back. This is the lakeFS way of ensuring readers see only complete writes of multiple files.
Does this work for you?
user
07/27/2022, 6:05 PM
You mean uncommitted, right? “You cannot disallow reading committed changes”
user
07/27/2022, 6:06 PM
Yeah, sorry. Long day...
user
07/27/2022, 6:09 PM
Agreed, the branch approach is the right way. But I am a bit confused by this process compared to git. Usually in git, an uncommitted change is local and not visible to others. But with cloud storage, there is not really a “local” concept.
user
07/27/2022, 6:10 PM
So collaborating on the same branch is not recommended with lakeFS?
user
07/27/2022, 6:14 PM
Yes, it's a bit weird. lakeFS has no local storage, only staging. We want it to be an object store, which pretty much requires distributed access.
So think of a Spark job that needs to run on lakeFS. All executors (and the driver) need to "see" the same object store of files. Local storage cannot do this...
user
07/27/2022, 6:16 PM
Makes sense. I could map all staging tables to feature branches, and production tables to the main branch.
user
07/27/2022, 6:25 PM
Yeah!
You could even open a branch per run with a unique component in the name, work on that, merge it back, and forget that branch ever existed. The results of the run will appear atomically on the merge target.
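To make the branch-per-run idea concrete, here is a rough sketch in Python. It is a toy in-memory model, not the lakeFS API: `ToyRepo`, `run_job`, and the `run-` naming scheme are all made up for illustration. In a real setup the same steps would be a branch create, a commit, and a merge (for example through `lakectl branch create`, `lakectl commit`, and `lakectl merge`); the toy skips the commit step.

```python
import uuid


class ToyRepo:
    """In-memory stand-in for a versioned object store (NOT the lakeFS API)."""

    def __init__(self):
        # Each branch maps object paths to contents; "main" starts empty.
        self.branches = {"main": {}}

    def create_branch(self, name, source="main"):
        # A new branch starts as a snapshot of its source.
        self.branches[name] = dict(self.branches[source])

    def write(self, branch, path, data):
        # Writes are visible only to readers of this branch.
        self.branches[branch][path] = data

    def read(self, branch, path):
        # Returns None when the path does not exist on the branch.
        return self.branches[branch].get(path)

    def merge(self, source, target):
        # Build the merged view first, then swap it in with one assignment,
        # so readers of the target see either none or all of the new files.
        merged = {**self.branches[target], **self.branches[source]}
        self.branches[target] = merged

    def delete_branch(self, name):
        del self.branches[name]


def run_job(repo, outputs, target="main"):
    """Write all outputs on a throwaway branch, merge it, then delete it."""
    branch = f"run-{uuid.uuid4().hex[:8]}"  # unique component in the name
    repo.create_branch(branch, source=target)
    try:
        for path, data in outputs.items():
            repo.write(branch, path, data)
        repo.merge(branch, target)
    finally:
        # Forget the branch whether or not the run succeeded.
        repo.delete_branch(branch)
    return branch
```

Calling `run_job(repo, {"tables/a/part-0": b"...", "tables/a/part-1": b"..."})` leaves `main` untouched until the merge, so a reader of `main` never sees only one of the two files. If the run fails before the merge, the branch is dropped and `main` is unchanged.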