user (11/21/2021, 12:11 PM):
3. dbt-chaim is an existing repo in the lakeFS installation
4. main is an existing branch in the dbt-chaim repository
Is this true? Could you maybe share the code that resulted in this error?
user (11/21/2021, 12:29 PM):
CREATE DATABASE tries to ensure that the path it is given is writable: it attempts a write to the destination you passed (see the PUT of a 0-byte object on main/ in the error you sent). For some reason it gets back a 404, which usually happens when writing to a path or repo that doesn't exist.
The s3a driver implementation you're using is the proprietary Databricks one (see spark.hadoop.fs.s3a.impl com.databricks.s3a.S3AFileSystem in the Spark config), so I'm not sure whether there are any good tools to debug it from the CLI.
At the moment I'm trying to recreate this setup and turn on debug logging on a lakeFS server: with logging.level: DEBUG, lakeFS should print a summary of every HTTP request it receives, so hopefully that will give me a clue about what their s3a implementation is attempting to do.
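For reference, the server-side setting mentioned above is a small change in the lakeFS server's YAML configuration; a minimal sketch (only logging.level comes from the thread, the surrounding layout follows the standard lakeFS config file):

```yaml
# lakeFS server configuration (sketch) - turn on per-request debug logging.
# With level DEBUG, lakeFS logs a summary of every HTTP request it receives,
# which is what we want in order to see what the s3a driver is doing.
logging:
  level: DEBUG
```

After restarting the server with this setting, each request from the Databricks s3a driver should show up in the lakeFS logs.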
user (11/21/2021, 12:30 PM):
Could you share the hive-site.xml file you're using?
user (11/21/2021, 3:51 PM):
Could you try changing the LOCATION field to s3a://dbt-chaim/main/test1? There's perhaps an edge case related to creating tables or databases at the root of a branch (I'm opening an issue for it regardless).
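To make that concrete, a hypothetical statement with LOCATION pointing at a subpath under the branch rather than at the branch root (the database name is made up; the repo dbt-chaim, branch main, and path test1 are from the thread):

```sql
-- Hypothetical: create the database under main/test1 instead of main/
CREATE DATABASE my_db
LOCATION 's3a://dbt-chaim/main/test1';
```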
user (11/21/2021, 4:17 PM):
...hive-site.xml to get the location (the Hive metastore only validates access to the path); Databricks uses the s3a configurations.
user (11/21/2021, 4:19 PM):
With spark.hadoop.fs.s3.impl com.databricks.s3a.S3AFileSystem, the per-bucket configuration is not supported. Can you please try removing that configuration, just for validation?
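For clarity, a per-bucket override of the kind being discussed looks roughly like this in the Spark configuration (the key follows the standard Hadoop s3a per-bucket naming scheme; the actual values in this cluster are not shown in the thread):

```properties
# Spark config (sketch) - a per-bucket s3a override, the kind the proprietary
# Databricks s3a driver reportedly does not honor:
spark.hadoop.fs.s3a.bucket.dbt-chaim.endpoint http://lakefs:8000
```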
user (11/23/2021, 6:03 PM):
Can you try adding this to your hive-site.xml (with the relevant lakeFS endpoint):

<property>
  <name>fs.s3a.bucket.dbt-chaim.endpoint</name>
  <value>http://lakefs:8000</value>
</property>

Also, I believe the docs weren't completely clear; I opened an issue to fix that.
Looking forward to hearing whether it worked. 🙂
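For context, here is how that property might sit in a minimal hive-site.xml. Only the per-bucket endpoint property comes from the thread; the credential keys are standard Hadoop s3a settings added here as an assumption, with placeholder values:

```xml
<?xml version="1.0"?>
<configuration>
  <!-- From the thread: route this repo's bucket name to the lakeFS S3 gateway -->
  <property>
    <name>fs.s3a.bucket.dbt-chaim.endpoint</name>
    <value>http://lakefs:8000</value>
  </property>
  <!-- Assumption: authenticate to lakeFS via the standard s3a credential keys -->
  <property>
    <name>fs.s3a.access.key</name>
    <value>YOUR_LAKEFS_ACCESS_KEY_ID</value>
  </property>
  <property>
    <name>fs.s3a.secret.key</name>
    <value>YOUR_LAKEFS_SECRET_ACCESS_KEY</value>
  </property>
</configuration>
```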