Niro

09/11/2022, 8:30 AM
Hi all, I'd like to understand how the lakeFS StageObject API (PUT /repositories/{repository}/branches/{branch}/objects) is used. Specifically, are there any use cases that provide a physical address inside the repository storage namespace? We are working on a garbage collection process for uncommitted data and are considering allowing this API only for physical addresses outside the repository namespace.

Oz Katz

09/11/2022, 10:44 AM
Perhaps the lakeFS Hadoop Filesystem is using that API?

Yoni Augarten

09/11/2022, 2:01 PM
@Niro the lakeFS HadoopFS uses linkPhysicalAddress, which AFAIK is almost the same and is going to be unified with StageObject.

Niro

09/11/2022, 2:21 PM
The intention is to allow LinkPhysicalAddress only on addresses previously issued via GetPhysicalAddress. I'm wondering whether there's a different use case for StageObject.
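A rough sketch of the two rules being proposed in this thread, for illustration only. This is not lakeFS code: the function names (`is_inside_namespace`, `may_stage`, `may_link`), the `issued_addresses` set, and the example bucket paths are all hypothetical, and it assumes physical addresses and storage namespaces are plain URIs such as `s3://bucket/prefix`.

```python
from urllib.parse import urlparse


def is_inside_namespace(physical_address: str, storage_namespace: str) -> bool:
    """Return True if physical_address lies under storage_namespace."""
    addr = urlparse(physical_address)
    ns = urlparse(storage_namespace)
    # Different scheme or bucket means the address cannot be inside the namespace.
    if (addr.scheme, addr.netloc) != (ns.scheme, ns.netloc):
        return False
    # Compare on a path-segment boundary so "repo1-old/x" is not under "repo1".
    ns_path = ns.path.rstrip("/") + "/"
    return addr.path.startswith(ns_path)


def may_stage(physical_address: str, storage_namespace: str) -> bool:
    """Hypothetical StageObject rule: only addresses outside the namespace."""
    return not is_inside_namespace(physical_address, storage_namespace)


def may_link(physical_address: str, issued_addresses: set) -> bool:
    """Hypothetical LinkPhysicalAddress rule: only previously issued addresses."""
    return physical_address in issued_addresses


if __name__ == "__main__":
    ns = "s3://bucket/repo1"
    print(may_stage("s3://other-bucket/data/file.parquet", ns))
    print(may_stage("s3://bucket/repo1/data/file.parquet", ns))
```

Under these assumptions, an ingest-style call that points at data in another bucket would pass `may_stage`, while an address inside the repository namespace would have to go through the issue-then-link path.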

Yoni Augarten

09/11/2022, 2:21 PM
I see
Interesting

Ariel Shaqed (Scolnicov)

09/11/2022, 2:50 PM
AFAIK ingest uses StageObject with an address that was not previously received from lakeFS. This is a feature that has seen real use, e.g. we saw someone ingesting data into the playground!
(I know because they opened a bug that involved granting the playground some fairly subtle S3 permissions)