taylor schneider

09/08/2022, 10:04 PM
Hey folks. I am trying to understand the "LakeFS Hadoop Filesystem" concept. Based on this article, it seems like an "overlay" or "meta" filesystem that acts as an abstraction layer between the Hadoop FileSystem API and the underlying lakeFS filesystem (i.e. Graveler etc.). Can anyone help me understand this?
Oz Katz

09/09/2022, 12:10 AM
That's correct. It's a wrapper around existing Hadoop FileSystem implementations that lets Spark users efficiently read and write data managed by lakeFS without the data itself going through the lakeFS server.
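
For anyone reading later, here is a rough sketch of what wiring this up from Spark might look like. It is not taken from the thread: the property names (`fs.lakefs.impl`, `fs.lakefs.endpoint`, and so on) and the class name `io.lakefs.LakeFSFileSystem` are what I recall from the lakeFS Spark docs, so check them against the docs for your version. The helper names `lakefs_hadoop_conf` and `split_lakefs_uri`, the endpoint, and the repo name are all made up for illustration. The second helper only shows how a `lakefs://` URI breaks down into repository, ref, and path.

```python
from urllib.parse import urlparse


def lakefs_hadoop_conf(endpoint, lakefs_key, lakefs_secret, s3a_key, s3a_secret):
    """Build Spark settings for the lakeFS Hadoop FileSystem (hypothetical helper).

    Assumption: property names follow the lakeFS docs as I remember them.
    The lakeFS credentials are for metadata calls to the lakeFS server;
    the s3a credentials are for the wrapped filesystem that moves the data.
    """
    return {
        "spark.hadoop.fs.lakefs.impl": "io.lakefs.LakeFSFileSystem",
        "spark.hadoop.fs.lakefs.endpoint": endpoint,
        "spark.hadoop.fs.lakefs.access.key": lakefs_key,
        "spark.hadoop.fs.lakefs.secret.key": lakefs_secret,
        "spark.hadoop.fs.s3a.access.key": s3a_key,
        "spark.hadoop.fs.s3a.secret.key": s3a_secret,
    }


def split_lakefs_uri(uri):
    """Split lakefs://<repo>/<ref>/<path> into (repo, ref, path)."""
    parsed = urlparse(uri)
    if parsed.scheme != "lakefs":
        raise ValueError(f"not a lakefs URI: {uri}")
    # The first path segment is the branch, tag, or commit; the rest is the object path.
    ref, _, path = parsed.path.lstrip("/").partition("/")
    return parsed.netloc, ref, path


conf = lakefs_hadoop_conf(
    "https://lakefs.example.com/api/v1",  # placeholder endpoint
    "LAKEFS_KEY", "LAKEFS_SECRET",        # placeholder lakeFS credentials
    "S3_KEY", "S3_SECRET",                # placeholder object-store credentials
)
repo, ref, path = split_lakefs_uri("lakefs://example-repo/main/data/file.parquet")
```

Each pair in `conf` could then be passed to `SparkSession.builder.config(key, value)`, after which Spark should be able to read and write `lakefs://` paths, assuming the lakeFS Hadoop client jar is on the classpath.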