Hi, I’m new to lakeFS. I’m trying to run lakeFS loc...
# help
g
Hi, I’m new to lakeFS. I’m trying to run lakeFS locally and use Spark to write an Iceberg table, but the write failed with:
exception org.apache.spark.sql.connector.catalog.CatalogNotFoundException: Catalog 'lakefs' plugin class not found: spark.sql.catalog.lakefs is not defined
        at org.apache.spark.sql.errors.QueryExecutionErrors$.catalogPluginClassNotFoundError(QueryExecutionErrors.scala:1904)
Any suggestion? Thanks!
My Spark conf:
conf.set("spark.jars.packages", "org.apache.iceberg:iceberg-spark-runtime-3.3_2.12:1.3.0,io.lakefs:lakefs-iceberg:v0.1.2,org.apache.hadoop:hadoop-aws:3.3.3,io.lakefs:hadoop-lakefs-assembly:0.1.13")
conf.set("spark.sql.catalog.lakefs", "org.apache.iceberg.spark.SparkCatalog")
conf.set("spark.sql.catalog.lakefs.catalog-impl", "io.lakefs.iceberg.LakeFSCatalog")
conf.set("spark.sql.catalog.lakefs.warehouse", f"lakefs://quickstart")
conf.set("spark.sql.catalog.lakefs.uri", "http://127.0.0.1:8000")
conf.set("spark.sql.defaultCatalog", "lakefs")
conf.set("spark.hadoop.fs.s3.impl", "org.apache.hadoop.fs.s3a.S3AFileSystem")
conf.set("spark.hadoop.fs.s3a.endpoint", "http://127.0.0.1:8000")
conf.set("spark.hadoop.fs.s3a.access.key", "xxx")
conf.set("spark.hadoop.fs.s3a.secret.key", "xxx")
conf.set("spark.hadoop.fs.s3a.path.style.access", "true")
conf.set("spark.sql.catalog.lakefs.cache-enabled", "false")
a
Don’t use `v` here, in `io.lakefs:lakefs-iceberg:v0.1.2`:
conf.set("spark.jars.packages", "org.apache.iceberg:iceberg-spark-runtime-3.3_2.12:1.3.0,io.lakefs:lakefs-iceberg:0.1.2,org.apache.hadoop:hadoop-aws:3.3.3,io.lakefs:hadoop-lakefs-assembly:0.1.13")
g
OK. Let me give it a try.
Strange thing is, if I remove `v`, it throws an error saying it cannot find the jar:
:: USE VERBOSE OR DEBUG MESSAGE LEVEL FOR MORE DETAILS
Exception in thread "main" java.lang.RuntimeException: [unresolved dependency: io.lakefs#lakefs-iceberg;0.1.2: not found]
        at org.apache.spark.deploy.SparkSubmitUtils$.resolveMavenCoordinates(SparkSubmit.scala:1537)
`v0.1.2` can be found in the central repository:
:: resolving dependencies :: org.apache.spark#spark-submit-parent-1579771d-c966-4f4a-a52e-389072d95ac0;1.0
        confs: [default]
        found org.apache.iceberg#iceberg-spark-runtime-3.3_2.12;1.3.0 in central
        found io.lakefs#lakefs-iceberg;v0.1.2 in central
Is there a specific jar repo we need to add?
a
I used 0.1.1. Check the sample notebooks for the Iceberg demo: https://github.com/treeverse/lakeFS-samples/
g
OK. Thanks! I’m trying out with 0.1.3
it works! Thanks!
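For anyone landing here later, a minimal sketch of the settings from this thread collected in one place, with the `lakefs-iceberg` version bumped to 0.1.3 as reported working above. `build_lakefs_iceberg_conf` is a hypothetical helper, not part of lakeFS or Spark, and the endpoint, repository name and keys are the placeholder values from the original post. Other versions may resolve differently depending on what is published.

```python
# Sketch only: gathers the Spark settings discussed in this thread into a dict.
# build_lakefs_iceberg_conf is a hypothetical helper (an assumption for
# illustration); the default values are the placeholders from the thread.

def build_lakefs_iceberg_conf(lakefs_iceberg_version="0.1.3",
                              endpoint="http://127.0.0.1:8000",
                              repo="quickstart",
                              access_key="xxx",
                              secret_key="xxx"):
    # Maven coordinates are group:artifact:version, joined with commas.
    packages = [
        "org.apache.iceberg:iceberg-spark-runtime-3.3_2.12:1.3.0",
        f"io.lakefs:lakefs-iceberg:{lakefs_iceberg_version}",
        "org.apache.hadoop:hadoop-aws:3.3.3",
        "io.lakefs:hadoop-lakefs-assembly:0.1.13",
    ]
    return {
        "spark.jars.packages": ",".join(packages),
        "spark.sql.catalog.lakefs": "org.apache.iceberg.spark.SparkCatalog",
        "spark.sql.catalog.lakefs.catalog-impl": "io.lakefs.iceberg.LakeFSCatalog",
        "spark.sql.catalog.lakefs.warehouse": f"lakefs://{repo}",
        "spark.sql.catalog.lakefs.uri": endpoint,
        "spark.sql.catalog.lakefs.cache-enabled": "false",
        "spark.sql.defaultCatalog": "lakefs",
        "spark.hadoop.fs.s3.impl": "org.apache.hadoop.fs.s3a.S3AFileSystem",
        "spark.hadoop.fs.s3a.endpoint": endpoint,
        "spark.hadoop.fs.s3a.access.key": access_key,
        "spark.hadoop.fs.s3a.secret.key": secret_key,
        "spark.hadoop.fs.s3a.path.style.access": "true",
    }

settings = build_lakefs_iceberg_conf()

# With PySpark installed, the settings could be applied to an existing
# SparkConf object like this:
#   for key, value in settings.items():
#       conf.set(key, value)
```

Keeping the version as a parameter makes it easy to try another release if dependency resolution fails for a given one.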
a
👍