user — 08/05/2022, 6:46 AM
df = spark.read.parquet("s3a://my-repo/main-branch/collections/foo/")
I wonder what the implications are of having one branch per asset (table) vs. one centralized prefix per branch. What would you recommend? And how does the branch prefix map to a database schema (for discoverability), e.g. when someone tries to read the data with plain Spark SQL from Databricks' catalog?
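For context, lakeFS serves every branch under its own path prefix through its S3 gateway, so the two layouts differ only in how the branch component of the path is chosen. A minimal sketch of the trade-off (the repo, branch, and prefix names below are hypothetical examples, not anything from the thread):

```python
# Sketch: how a lakeFS S3-gateway path is composed under the two layouts.
# All repo/branch/table names here are hypothetical.

def lakefs_path(repo: str, branch: str, table_prefix: str) -> str:
    """Build an s3a:// path for a table prefix on a given lakeFS branch."""
    return f"s3a://{repo}/{branch}/{table_prefix}/"

# Layout 1: one centralized branch -- all tables live on the same branch,
# so a single merge promotes every table together (atomic, coarse-grained).
central = lakefs_path("my-repo", "main-branch", "collections/foo")

# Layout 2: one branch per asset -- each table has its own branch,
# so tables are versioned and merged independently (fine-grained).
per_asset = lakefs_path("my-repo", "foo-branch", "collections/foo")

print(central)    # s3a://my-repo/main-branch/collections/foo/
print(per_asset)  # s3a://my-repo/foo-branch/collections/foo/
```

Whichever layout you pick, a Spark reader only ever sees a plain path, which is why the metastore mapping discussed below matters for discoverability.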
user — 08/09/2022, 5:26 PM
You can use the lakectl metastore copy command to create a corresponding table pointing at your branch. If you then change the table's schema and want to merge the changes back, run the command again in the opposite direction to update your main schema.
For example, the following command will create the table for you:
lakectl metastore copy --from-schema default --from-table inventory --to-branch example_branch
After you make the changes, running the following command will merge them into your original table:
lakectl metastore copy --from-schema example_branch --from-table inventory --to-branch main --to-schema default
The first command chooses the destination schema and table name according to our suggested model, so be sure to read the docs if you want to override it.
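To illustrate the suggested naming model mentioned above, the defaults can be sketched as follows: the destination schema falls back to the branch name and the destination table to the source table name. This is a hedged sketch of the convention, not lakeFS source code, and default_destination is a hypothetical helper:

```python
# Sketch (not lakeFS code): the suggested naming model for
# `lakectl metastore copy` destinations. Assumption: when not overridden,
# the destination schema defaults to the branch name and the destination
# table defaults to the source table name.
from typing import Optional, Tuple

def default_destination(from_table: str, to_branch: str,
                        to_schema: Optional[str] = None,
                        to_table: Optional[str] = None) -> Tuple[str, str]:
    """Return the (schema, table) the copied metastore entry would get."""
    return (to_schema or to_branch, to_table or from_table)

# Branching out: the table lands in a schema named after the branch.
print(default_destination("inventory", "example_branch"))
# ('example_branch', 'inventory')

# Merging back: override the schema to land in the original one.
print(default_destination("inventory", "main", to_schema="default"))
# ('default', 'inventory')
```

Mapping branch names to schema names this way is what makes branch-scoped tables discoverable from a plain SQL catalog: each branch shows up as its own schema.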
user — 08/09/2022, 5:29 PM
While the lakectl metastore tool provides the basic capabilities, it is still far from perfect, so I'm happy to discuss your use case and see how lakeFS can improve.