user · 11/21/2021, 12:11 PM
3. dbt-chaim is an existing repo in the lakeFS installation
4. main is an existing branch in the dbt-chaim repository
Is this true? Could you maybe share the code that resulted in this error?
user · 11/21/2021, 12:29 PM
CREATE DATABASE tries to ensure that the path it is given is writable: it attempts a write (see the PUT of a 0-byte object on main/ in the error you sent) to the destination you passed. For some reason it gets back a 404 error, which usually happens when writing to a path or repo that doesn't exist.
The s3a driver implementation you're using is the proprietary Databricks one (see spark.hadoop.fs.s3a.impl com.databricks.s3a.S3AFileSystem in the Spark config), so I'm not sure whether there are any good tools to debug it from the CLI.
At the moment I'm trying to recreate this setup and turn on debug logging on a lakeFS server: with logging.level: DEBUG, lakeFS should print a summary of every HTTP request it receives, so hopefully this will give me a clue about what their s3a implementation is attempting to do.
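For reference, that setting lives in the lakeFS server configuration file (a minimal sketch; all other required settings such as the database and blockstore are omitted here):

```yaml
# lakeFS server config file (e.g. the one passed to the lakefs binary); sketch only.
logging:
  level: DEBUG  # log a summary of every HTTP request the server receives
```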
user · 11/21/2021, 12:30 PM
Could you share the hive-site.xml file you're using?
user · 11/21/2021, 3:51 PM
Could you try setting the LOCATION field to s3a://dbt-chaim/main/test1? There may be an edge case related to creating tables or databases at the root of a branch (opening an issue for it regardless).
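Spelled out as a SQL statement, the suggestion above would look like this (a sketch; the database name test1 is taken from the path in the message and not verified against this cluster):

```sql
-- LOCATION points below the branch root (main/test1), not at main/ itself.
CREATE DATABASE test1
LOCATION 's3a://dbt-chaim/main/test1';
```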
user · 11/21/2021, 4:17 PM
…hive-site.xml to get the location (the Hive metastore only validates access to the path).
Databricks uses the s3a configurations.
user · 11/21/2021, 4:19 PM
With spark.hadoop.fs.s3.impl com.databricks.s3a.S3AFileSystem set, the per-bucket configuration is not supported.
Can you please try removing that configuration, just for validation?
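For clarity, these are the two kinds of Spark config entries being discussed, side by side (a sketch; keys are taken from the messages above and the endpoint value is illustrative):

```
# Cluster-wide s3a override (the line to try removing):
spark.hadoop.fs.s3a.impl com.databricks.s3a.S3AFileSystem

# Per-bucket configuration, which the proprietary driver reportedly ignores:
spark.hadoop.fs.s3a.bucket.dbt-chaim.endpoint http://lakefs:8000
```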
user · 11/23/2021, 6:03 PM
You can set the per-bucket endpoint in hive-site.xml to:
<property>
  <name>fs.s3a.bucket.dbt-chaim.endpoint</name>
  <value>http://lakefs:8000</value>
</property>
(with the relevant lakeFS endpoint)
Also, I believe the docs weren't completely clear; I opened an issue to fix that.
Looking forward to hearing whether it worked. 🙂
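If Spark jobs also need to reach lakeFS directly, the hive-site.xml property above has an equivalent Spark config form (a sketch, assuming the standard Hadoop s3a per-bucket mechanism; path-style access is commonly required when pointing s3a at lakeFS):

```
spark.hadoop.fs.s3a.bucket.dbt-chaim.endpoint http://lakefs:8000
spark.hadoop.fs.s3a.bucket.dbt-chaim.path.style.access true
```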