user
04/12/2022, 9:47 PMuser
04/12/2022, 10:02 PMuser
04/12/2022, 10:03 PMuser
04/12/2022, 10:04 PMuser
04/12/2022, 10:07 PMuser
04/12/2022, 10:18 PMuser
04/13/2022, 5:54 AMuser
04/13/2022, 2:35 PM<lakefs://some/path/to/spark/dataframe/>
). A user would need to list the LakeFS files there, retrieve the physical path for one or more of them, and then overwrite or delete each individual file. Going down that path shows a certain intent. Maybe that is enough for us: provide only LakeFS URLs to our Spark jobs and instruct users not to attempt to retrieve the physical paths. Doing so should prevent accidental overwrites and deletions.
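If we wanted something slightly stronger than an instruction, a small guard in the notebook helpers could reject anything that is not a LakeFS URL before it reaches Spark. This is only a rough sketch: `require_lakefs_uri` is a hypothetical helper name, not part of LakeFS or Spark, and it assumes our paths always look like the `lakefs://` URL above.

```python
from urllib.parse import urlparse

# Assumption: every path handed to a Spark job is a lakefs:// URL,
# as in the example URL earlier in this message.
ALLOWED_SCHEME = "lakefs"


def require_lakefs_uri(path: str) -> str:
    """Return `path` unchanged if it is a lakefs:// URL, else raise ValueError.

    Hypothetical helper: it only inspects the string and never contacts
    LakeFS or the underlying object store.
    """
    parsed = urlparse(path)
    if parsed.scheme != ALLOWED_SCHEME:
        raise ValueError(
            f"Expected a {ALLOWED_SCHEME}:// URL, got {path!r}; "
            "physical paths should not be used directly."
        )
    if not parsed.netloc:
        # The part right after lakefs:// is missing.
        raise ValueError(f"Missing repository in {path!r}")
    return path
```

In a notebook, the check would wrap whatever path is passed to the reader or writer, for example `spark.read.parquet(require_lakefs_uri(path))`. It does nothing to stop someone who deliberately looks up a physical path, but it should catch a physical path that was pasted in by mistake.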
FWIW, we use Spark both in interactive notebooks and automated jobs. Automated jobs go through code review, so the real concern for accidents is in the interactive notebooks. If we instruct users to only use the LakeFS paths, then, again, we may be able to avoid accidental deletions.user
04/13/2022, 2:40 PMuser
04/13/2022, 2:41 PMuser
04/17/2022, 6:49 AM