Robin Moffatt
04/25/2023, 5:03 PM
spark.sql.warehouse.dir? I've tried that but getting
org.apache.spark.SparkException: Unable to create database default as failed to create its directory s3://example/main
and no obvious error on the lakeFS side that I can see.
Oz Katz
04/25/2023, 5:23 PM
Amit Kesarwani
04/25/2023, 9:28 PM
Robin Moffatt
04/26/2023, 9:53 AM
s3a not s3.
Is there a useful reference on why the URI is sometimes one or the other?
CREATE DATABASE from Spark SQL it fails:
org.apache.spark.SparkException: Unable to create database default as failed to create its directory s3a://example/main
java.io.FileNotFoundException: PUT 0-byte object on main/: com.amazonaws.services.s3.model.AmazonS3Exception: Not Found (Service: Amazon S3; Status Code: 404; Error Code: 404 Not Found; Request ID: null; S3 Extended Request ID: null; Proxy: null), S3 Extended Request ID: null:404 Not Found
However, if I create a dummy file (e.g. spark.range(0, 1).write.save('s3a://example/main/dummy_file')) then the CREATE DATABASE works fine.
Is there a more elegant/proper way to do this than creating a dummy file first?
Notebook: https://gist.github.com/rmoff/28211a28adf7c55607f7ed7e4c4efc8f
Oz Katz
04/26/2023, 5:07 PM
s3a://example/main/warehouse/ or something of that sort. Otherwise, Spark will attempt to write a zero-length-named file at the root of that location, which is ambiguous with the branch URI.
Robin Moffatt
04/26/2023, 5:07 PM
s3a://example/main worked just fine - it was the s3 prefix that broke things
Ariel Shaqed (Scolnicov)
04/27/2023, 3:44 AM
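
The two points raised in this thread (use the s3a scheme rather than s3, and optionally point the warehouse at a subdirectory below the branch rather than at the branch root) can be sketched as a small helper. This is only an illustration: the function names lakefs_warehouse_uri and spark_conf_for and the default subdirectory "warehouse" are hypothetical, and per Robin's last message the branch root itself also worked once the scheme was s3a, so the subdirectory step is a precaution rather than a requirement.

```python
from urllib.parse import urlsplit, urlunsplit


def lakefs_warehouse_uri(uri, subdir="warehouse"):
    """Return an s3a:// warehouse URI that sits below the branch root.

    `uri` is assumed to look like <scheme>://<repo>/<branch>[/<path>].
    Both the function name and the default `subdir` are hypothetical.
    """
    parts = urlsplit(uri)
    if parts.scheme not in ("s3", "s3a"):
        raise ValueError(f"unsupported scheme: {parts.scheme!r}")

    segments = [s for s in parts.path.split("/") if s]
    if not segments:
        raise ValueError("URI must include a branch, e.g. s3a://example/main")

    if len(segments) == 1:
        # Only the branch was given: append a subdirectory so the
        # warehouse location is not the branch root itself.
        segments.append(subdir)

    path = "/" + "/".join(segments) + "/"
    # Always emit the s3a scheme, whichever of s3/s3a was passed in.
    return urlunsplit(("s3a", parts.netloc, path, "", ""))


def spark_conf_for(uri):
    """Build the Spark setting discussed in the thread as a plain dict."""
    return {"spark.sql.warehouse.dir": lakefs_warehouse_uri(uri)}


if __name__ == "__main__":
    print(spark_conf_for("s3://example/main"))
```

Each key/value pair in the returned dict could then be passed to SparkSession.builder.config(key, value) before the session is created. Endpoint and credential settings for the S3A connector are left out here because the thread does not show them.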