#help

Constantine

02/14/2024, 4:16 PM
Hi all, I'm new here. I've been testing out the local container, and now I'm going to test it in my own AWS cloud. Today I use Spark, Glue and Iceberg for my data lake. To make Spark use my custom VPC endpoints, I had to use the following config:
conf.set("spark.sql.catalog.spark_catalog","org.apache.iceberg.spark.SparkCatalog")
conf.set("spark.sql.catalog.spark_catalog.io-impl","org.apache.iceberg.aws.s3.S3FileIO")
conf.set("spark.sql.catalog.spark_catalog.glue.endpoint","AWS_VPC_ENDPOINT_FOR_GLUE")
conf.set("spark.sql.catalog.spark_catalog.s3.endpoint","AWS_VPC_ENDPOINT_FOR_S3")
conf.set("spark.sql.catalog.spark_catalog.catalog-impl","org.apache.iceberg.aws.glue.GlueCatalog")
conf.set("spark.sql.catalog.spark_catalog.lock.table","myGlueLockTable")
I noticed that lakeFS has its own jars for this. Does lakeFS still let you use custom / different VPCs in AWS for both Glue and S3 with a config like this:
.config("spark.sql.catalog.lakefs", "org.apache.iceberg.spark.SparkCatalog") \
.config("spark.sql.catalog.lakefs.catalog-impl", "io.lakefs.iceberg.LakeFSCatalog") \
.config("spark.sql.catalog.lakefs.warehouse", f"lakefs://{repo_name}") \
.config("spark.sql.catalog.lakefs.uri", lakefsEndPoint)

Ariel Shaqed (Scolnicov)

02/14/2024, 4:29 PM
Hi @Constantine, This configuration seems good! I'm not sure why VPCs would affect this.
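For reference, the lakeFS catalog settings quoted in the question can be gathered in one place before they are handed to Spark. The following is only a minimal sketch: `lakefs_catalog_conf` is a hypothetical helper, and the repository name and endpoint URL are placeholder assumptions. It contains only the four keys that appear in the thread, and does not show whether the Glue or S3 endpoint keys from the first snippet apply to the lakeFS catalog.

```python
def lakefs_catalog_conf(repo_name, lakefs_endpoint, catalog="lakefs"):
    """Collect the Spark settings quoted in the thread into a dict.

    Hypothetical helper for illustration; the keys and class names are
    copied from the question, the values passed in are placeholders.
    """
    prefix = f"spark.sql.catalog.{catalog}"
    return {
        prefix: "org.apache.iceberg.spark.SparkCatalog",
        f"{prefix}.catalog-impl": "io.lakefs.iceberg.LakeFSCatalog",
        f"{prefix}.warehouse": f"lakefs://{repo_name}",
        f"{prefix}.uri": lakefs_endpoint,
    }


if __name__ == "__main__":
    # Placeholder repository and endpoint, not taken from the thread.
    settings = lakefs_catalog_conf("example-repo", "https://lakefs.example.com")
    for key, value in settings.items():
        print(f"{key}={value}")
```

With PySpark installed, the same dict could be applied in a loop, for example `for k, v in settings.items(): builder = builder.config(k, v)` on a `SparkSession.builder`, which keeps the catalog name and repository in one spot if either changes.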