Is there any way to restrict a user’s permissions so that they can only view the “HEAD” of a specific branch and no other branches?
I’m using the open-source deployment and have been reading about the ACLs, but with only the 4 groups (Reader, Writer, Super, Admin), I don’t see a way to restrict access that granularly.
The use case is that I wish to version a dataset used in machine learning model development. I want to have lineage of the data transformations from its raw ingest all the way through the data splits into Train/Validation/Test. However, for policy reasons, the developers actually building models cannot have access to anything other than the Train and Validation sets.
I did a quick test of how I thought this might work, and I’m able to make a “Training” branch that only has the training data. However, even as a user with only Reader permissions, I could view the commit history, which traces back to the full dataset before the data splits were performed. That means the test set is visible to a “misbehaving” developer, which is not acceptable for my use case.
My other thought is to give developers access to the underlying S3 objects for the train/validation data, which wouldn’t include the metadata about the versioned history. I’m about to run a test to see what this looks like. I’m still learning lakeFS, so I’m not yet sure what the “underlying S3 objects” look like or whether they’re accessible in this way.
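To make the second idea concrete, here is a rough sketch of what I have in mind, not something I’ve verified against lakeFS. It assumes I can get the physical `s3://` address of each train/validation object from the lakeFS object metadata (I believe the object stat response has a `physical_address` field, but I haven’t confirmed how stable those addresses are). The helper names `split_s3_uri` and `read_only_policy` and the example bucket and keys are made up for illustration. The output is a plain AWS IAM policy document that allows `s3:GetObject` on exactly those objects.

```python
import json
from urllib.parse import urlparse


def split_s3_uri(uri):
    """Split an s3://bucket/key URI into (bucket, key)."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an s3:// URI: {uri!r}")
    return parsed.netloc, parsed.path.lstrip("/")


def read_only_policy(physical_addresses):
    """Build an IAM policy document granting s3:GetObject on the given objects only."""
    resources = sorted(
        f"arn:aws:s3:::{bucket}/{key}"
        for bucket, key in map(split_s3_uri, physical_addresses)
    )
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": resources,
            }
        ],
    }


if __name__ == "__main__":
    # Hypothetical addresses; real ones would come from the lakeFS object metadata.
    addresses = [
        "s3://example-bucket/example-repo/data/train-object",
        "s3://example-bucket/example-repo/data/validation-object",
    ]
    print(json.dumps(read_only_policy(addresses), indent=2))
```

One thing I already suspect is a problem: IAM policies have size limits, so listing objects one by one probably won’t scale to a large dataset, and the policy would need to be regenerated whenever the Train/Validation branches change.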