# dev
j
Hello everyone! I have 2 questions: In the open source version, is there a way to manage multiple workspaces or user access to repos/branches? And is there a configuration to tune lakeFS so it can query Delta tables, or create a table using an external location (a pointer to the actual Delta files), the way it can with Parquet?
t
Hi @Jacobo Calderon!
> In the open source version, is there a way to manage multiple workspaces or user access to repos/branches?
There is no built-in support for access control in the open source version; you have the option of building and maintaining an ACL server. As for working with Delta tables over lakeFS - absolutely. You have two options:
1. Importing tables into lakeFS (without copying table data)
2. Writing Delta tables directly to lakeFS
You may want to check out our Delta Lake integration docs.
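For concreteness, here is a minimal PySpark sketch of option 2: writing a Delta table straight to a lakeFS branch through the lakeFS S3 gateway. The endpoint, credentials, repository ("example-repo"), branch ("main"), and paths are hypothetical placeholders, and it assumes a delta-spark package matching your Spark version:
from pyspark.sql import SparkSession

# Hedged sketch - endpoint, credentials, repo and branch names are placeholders.
spark = (
    SparkSession.builder
    # Delta Lake package and catalog extensions (the extensions enable SQL DDL):
    .config("spark.jars.packages", "io.delta:delta-spark_2.12:3.1.0")
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
    .config("spark.sql.catalog.spark_catalog",
            "org.apache.spark.sql.delta.catalog.DeltaCatalog")
    # Point the S3A filesystem at the lakeFS S3 gateway:
    .config("spark.hadoop.fs.s3a.endpoint", "https://lakefs.example.com")
    .config("spark.hadoop.fs.s3a.access.key", "<lakeFS access key>")
    .config("spark.hadoop.fs.s3a.secret.key", "<lakeFS secret key>")
    .config("spark.hadoop.fs.s3a.path.style.access", "true")
    .getOrCreate()
)

# lakeFS paths follow s3a://<repo>/<branch>/<path>. Writing a Delta table to a
# branch versions the data files and the _delta_log together.
df = spark.read.parquet("s3a://example-repo/main/raw/events")
df.write.format("delta").mode("overwrite").save("s3a://example-repo/main/tables/events")

# Read it back from the same branch:
events = spark.read.format("delta").load("s3a://example-repo/main/tables/events")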
j
Hi Tal! Thank you so much for your answers! The first one is very insightful! For the second one I have the following issue (or probably this is what you meant): I'm only able to query each partition as-is. Is there a way to create a table out of all the partitions? For example, in Databricks you can create an external table as follows:
CREATE OR REPLACE TABLE [table_name]
USING delta
LOCATION '/path/to/delta_log'
Can we do something similar to point at the Parquet files and create a logical table around them? So that when I query:
select * from [repo].[branch].[table]
I actually query across all partitions (current and incremental)?
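For illustration, a minimal sketch of that pattern over lakeFS, assuming the Spark session, catalog extensions, and hypothetical names from the sketch above, plus a metastore for Spark to register the table in - the registered table then acts as a single logical entry point over all partitions:
# Register one logical table over the Delta files stored in lakeFS.
spark.sql("""
    CREATE TABLE IF NOT EXISTS events
    USING DELTA
    LOCATION 's3a://example-repo/main/tables/events'
""")

# Queries now span every partition under that location, current and future.
spark.sql("SELECT * FROM events LIMIT 10").show()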
t
IIUC you are asking if you can create an external table in the sense that the Delta table data and metadata sit outside of lakeFS? Can you help me understand the use case?
And you think of lakeFS as a catalog? Please correct me if I didn't understand you correctly
j
The external table was just a Databricks Unity Catalog example. In this scenario, I want to consolidate partitioned tables into a single entry point, so downstream processes can query the data with SQL syntax and Spark can load the data as a table instead of as a filesystem. For the second question, yes, I'm trying to use lakeFS as a catalog. Is that a wrong understanding of the platform?
t
Thanks for clarifying!
> For the second question, yes, I'm trying to use lakeFS as a catalog. Is that a wrong understanding of the platform?
lakeFS isn't a catalog, it's a data version control system that manages any type of data (including structured data). We do have an Iceberg REST catalog, but it's for Iceberg. For Delta Lake the case is different - it does not require a catalog, and you can use lakeFS to manage your Delta Lake tables (data + metadata), as you can see in our Delta Lake docs. Hope this helps make things clearer 🙂
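To illustrate the version-control (rather than catalog) angle, a minimal sketch with the same hypothetical setup as above - the same table path read from two branches yields two isolated versions:
# Same table path, two branches of the hypothetical repo:
main_events = spark.read.format("delta").load("s3a://example-repo/main/tables/events")
dev_events = spark.read.format("delta").load("s3a://example-repo/dev/tables/events")

# Writes to the dev branch stay isolated from main until the branch is merged.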
j
Interesting... I was hoping to use it as part of the metadata catalog and access control to data. Thanks for clarifying!
t
What metadata catalog are you using if I may ask?
j
Right now, Athena + Glue, which we are trying to replace with a more robust, integrated solution
t
Got it - you can use lakeFS with Glue and Athena to read versioned tables managed by lakeFS. You may want to check this page out
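As a rough illustration of that setup, a minimal boto3 sketch, assuming a Glue database and table have already been created over data exported from lakeFS as described on that page; the database, table, region, and results bucket below are all hypothetical:
import boto3

athena = boto3.client("athena", region_name="us-east-1")

# Query a hypothetical Glue table assumed to point at data exported from lakeFS.
resp = athena.start_query_execution(
    QueryString="SELECT * FROM events LIMIT 10",
    QueryExecutionContext={"Database": "lakefs_demo"},
    ResultConfiguration={"OutputLocation": "s3://my-athena-results/"},
)
print(resp["QueryExecutionId"])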