# help
m:
Hi, has anyone experienced this type of error during a commit? This is using RouterFS and the lakefs-iceberg lib to create a table. It appears as though the metadata file is lost, or otherwise referenced incorrectly, while attempting to rename it to `v1.metadata.json`:
org.apache.iceberg.exceptions.CommitFailedException: Failed to commit changes using rename: s3a://lakefs-poc/main/rl_dev_datastage_01_ma_snapshot/sys_audit_event/metadata/v1.metadata.json
(more stacktrace in reply)
```
org.apache.iceberg.exceptions.CommitFailedException: Failed to commit changes using rename: s3a://lakefs-poc/main/rl_dev_datastage_01_ma_snapshot/sys_audit_event/metadata/v1.metadata.json
	at org.apache.iceberg.hadoop.HadoopTableOperations.renameToFinal(HadoopTableOperations.java:378) ~[iceberg-spark-runtime-3.3_2.12-1.3.1.jar:?]
	at org.apache.iceberg.hadoop.HadoopTableOperations.commit(HadoopTableOperations.java:162) ~[iceberg-spark-runtime-3.3_2.12-1.3.1.jar:?]
	at io.lakefs.iceberg.LakeFSTableOperations.commit(LakeFSTableOperations.java:37) ~[lakefs-iceberg-0.1.3.jar:0.1.3]
	at org.apache.iceberg.BaseTransaction.commitCreateTransaction(BaseTransaction.java:311) ~[iceberg-spark-runtime-3.3_2.12-1.3.1.jar:?]
	at org.apache.iceberg.BaseTransaction.commitTransaction(BaseTransaction.java:290) ~[iceberg-spark-runtime-3.3_2.12-1.3.1.jar:?]
	at org.apache.iceberg.spark.source.StagedSparkTable.commitStagedChanges(StagedSparkTable.java:34) ~[iceberg-spark-runtime-3.3_2.12-1.3.1.jar:?]
...
Caused by: java.io.FileNotFoundException: No such file or directory: lakefs://lakefs-poc/main/rl_dev_datastage_01_ma_snapshot/sys_audit_event/metadata/25193118-7546-49de-b229-ef0f039bc2d9.metadata.json
	at org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:3866) ~[hadoop-aws-3.3.3-amzn-0.jar:?]
	at org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:3688) ~[hadoop-aws-3.3.3-amzn-0.jar:?]
	at org.apache.hadoop.fs.s3a.S3AFileSystem.initiateRename(S3AFileSystem.java:1887) ~[hadoop-aws-3.3.3-amzn-0.jar:?]
	at org.apache.hadoop.fs.s3a.S3AFileSystem.innerRename(S3AFileSystem.java:1988) ~[hadoop-aws-3.3.3-amzn-0.jar:?]
	at org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$rename$7(S3AFileSystem.java:1846) ~[hadoop-aws-3.3.3-amzn-0.jar:?]
	at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.lambda$trackDurationOfOperation$5(IOStatisticsBinding.java:499) ~[hadoop-client-api-3.3.3-amzn-0.jar:?]
	at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.trackDuration(IOStatisticsBinding.java:444) ~[hadoop-client-api-3.3.3-amzn-0.jar:?]
	at org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2337) ~[hadoop-aws-3.3.3-amzn-0.jar:?]
	at org.apache.hadoop.fs.s3a.S3AFileSystem.rename(S3AFileSystem.java:1844) ~[hadoop-aws-3.3.3-amzn-0.jar:?]
	at io.lakefs.routerfs.RouterFileSystem.rename(RouterFileSystem.java:197) ~[hadoop-router-fs-hadoop-2.9.2-assembly-0.1.0.jar:?]
	at org.apache.iceberg.hadoop.HadoopTableOperations.renameToFinal(HadoopTableOperations.java:368) ~[iceberg-spark-runtime-3.3_2.12-1.3.1.jar:?]
```
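For context on the failure mode: Iceberg's Hadoop-backed table operations commit by writing the new table metadata to a temporary, UUID-named file and then renaming it to the next versioned `vN.metadata.json`; the rename is the commit point. A minimal Python sketch of that pattern (the `fs` handle and function are hypothetical; the real logic lives in `HadoopTableOperations.renameToFinal`):

```python
# Sketch of the rename-based commit pattern used by Iceberg's
# HadoopTableOperations (illustrative, not the actual Iceberg code).
import uuid

def commit_metadata(fs, table_location: str, metadata_json: str, next_version: int) -> None:
    # Stage the new metadata under a temporary UUID name...
    tmp = f"{table_location}/metadata/{uuid.uuid4()}.metadata.json"
    final = f"{table_location}/metadata/v{next_version}.metadata.json"
    fs.write(tmp, metadata_json)
    # ...then rename it into place; the rename is the atomic commit point.
    # A missing temp file at this step surfaces exactly like the
    # FileNotFoundException / CommitFailedException pair above.
    if not fs.rename(tmp, final):
        raise RuntimeError(f"Failed to commit changes using rename: {final}")
```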
i:
Hey @Michael Gaebel, can you share your Spark config and a minimal code snippet that causes this error?
m:
```python
#General Spark configs
    ("hive.metastore.client.factory.class", "com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory"),
    ("spark.sql.sources.partitionOverwriteMode", "dynamic"),
    ("spark.sql.legacy.parquet.datetimeRebaseModeInRead", "CORRECTED"),
    #LakeFS configuration for Iceberg
    ("spark.jars.packages", "org.apache.iceberg:iceberg-spark-runtime-3.4_2.12:1.3.1,io.lakefs:lakefs-iceberg:v0.1.3,io.lakefs:hadoop-router-fs-hadoop-2.9.2-assembly:0.1.0"),
    ("spark.sql.catalog.lakefs", "org.apache.iceberg.spark.SparkCatalog"),
    ("spark.sql.catalog.lakefs.catalog-impl", "io.lakefs.iceberg.LakeFSCatalog"),
    ("spark.sql.catalog.lakefs.warehouse", f"lakefs://{lakefs_repo}"),
    ("spark.sql.catalog.lakefs.uri", lakefs_endpoint),
    ("spark.sql.catalog.lakefs.cache-enabled", "false"),
    #LakeFs filesystem
    ("spark.hadoop.fs.s3a.impl", "io.lakefs.routerfs.RouterFileSystem"),
    ("spark.hadoop.routerfs.mapping.s3a.1.replace", f"s3a://{lakefs_repo}"),
    ("spark.hadoop.routerfs.mapping.s3a.1.with", f"lakefs://{lakefs_repo}"),
    ("spark.hadoop.routerfs.default.fs.s3a", "org.apache.hadoop.fs.s3a.S3AFileSystem"),

    ("spark.hadoop.fs.lakefs.impl", "org.apache.hadoop.fs.s3a.S3AFileSystem"),

    #LakeFS S3 access
    (f"spark.hadoop.fs.s3a.bucket.{lakefs_repo}.endpoint", f"{lakefs_endpoint}"),
    (f"spark.hadoop.fs.s3a.bucket.{lakefs_repo}.access.key", lakefs_access_key),
    (f"spark.hadoop.fs.s3a.bucket.{lakefs_repo}.secret.key", lakefs_secret_key),
    (f"spark.hadoop.fs.s3a.bucket.{lakefs_repo}.path.style.access", "true"),

    #Regular S3 access
    ("spark.hadoop.fs.s3a.endpoint.region", "ca-central-1"),
    ("spark.hadoop.fs.s3a.endpoint", "<https://s3.ca-central-1.amazonaws.com>"),
    ("spark.hadoop.fs.s3a.path.style.access", "true"),

    #Configs needed for Iceberg
    ("spark.sql.extensions", "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions"),
The commit is here:
```python
lakefs.commits.commit(repo.id, lakefs_branch, CommitCreation(
        message=f"Initial load table 'lakefs.{lakefs_branch}.{target_database}.{table}' for schemas '{schemaList}'",
        metadata={'author': "glue"}
    ))
```
where `lakefs` is the configured client and the repo is fetched from that client:
```python
lakefs = LakeFSClient(lakefs_config)
repo = lakefs.repositories.get_repository(lakefs_repo)
```
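For completeness, a minimal sketch of how these pieces can be wired together, assuming the config tuples above are collected in a list named `spark_configs` and that the legacy `lakefs_client` Python SDK is in use (names are illustrative):

```python
from pyspark.sql import SparkSession
from lakefs_client import Configuration
from lakefs_client.client import LakeFSClient
from lakefs_client.models import CommitCreation

# Apply the (key, value) config tuples shown above to the session builder.
builder = SparkSession.builder.appName("lakefs-iceberg-poc")  # hypothetical app name
for key, value in spark_configs:
    builder = builder.config(key, value)
spark = builder.getOrCreate()

# lakeFS API client used for the commit call above.
lakefs_config = Configuration(host=lakefs_endpoint)
lakefs_config.username = lakefs_access_key
lakefs_config.password = lakefs_secret_key
lakefs = LakeFSClient(lakefs_config)
repo = lakefs.repositories.get_repository(lakefs_repo)
```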
i:
Thanks. I see that you're running a lakeFS API call (`lakefs.commits.commit`), but the stack trace is Java / Spark / Iceberg. What's the connection between them?
m:
The command is executed in a PySpark job running in AWS Glue. I truncated the stack trace for readability, but I can post the whole thing if you want.
Oh, wait, never mind: it's occurring during the SQL that's creating the table, prior to the lakeFS commit.
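The trace (`BaseTransaction.commitCreateTransaction` via `StagedSparkTable`) does point at a staged CREATE TABLE. The actual SQL wasn't shared in the thread, but a hypothetical CTAS of that shape, with names taken from the paths in the trace, would look like:

```python
# Hypothetical reconstruction of the failing statement; "source_view"
# is a placeholder. Catalog/branch/schema/table names mirror the s3a
# path in the stack trace above.
spark.sql("""
    CREATE TABLE lakefs.main.rl_dev_datastage_01_ma_snapshot.sys_audit_event
    USING iceberg
    AS SELECT * FROM source_view
""")
```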
i:
So I think that this won't work:
`("spark.hadoop.fs.s3a.impl", "io.lakefs.routerfs.RouterFileSystem")`
You should try:
`("spark.hadoop.fs.s3.impl", "org.apache.hadoop.fs.s3a.S3AFileSystem")`
Check this doc for more info.
m:
So, I need the router filesystem to be able to access regular S3 paths outside of lakeFS. This was suggested previously on this channel here: https://lakefs.slack.com/archives/C016726JLJW/p1695999430843489?thread_ts=1695995997.176189&cid=C016726JLJW
Based on the stack trace, it appears to be correctly calling the `S3AFileSystem` after routing...
```
at org.apache.hadoop.fs.s3a.S3AFileSystem.rename(S3AFileSystem.java:1844) ~[hadoop-aws-3.3.3-amzn-0.jar:?]
	at io.lakefs.routerfs.RouterFileSystem.rename(RouterFileSystem.java:197) ~[hadoop-router-fs-hadoop-2.9.2-assembly-0.1.0.jar:?]
	...
```
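For anyone reading along: the `routerfs.mapping.s3a.1.replace` / `.with` settings above amount to a prefix rewrite applied before the call is delegated to the filesystem registered for the target scheme. A toy Python illustration (not RouterFS code):

```python
# Toy model of the configured RouterFS mapping: s3a:// paths under the
# repo are rewritten to the lakefs:// scheme before delegation.
MAPPINGS = [("s3a://lakefs-poc", "lakefs://lakefs-poc")]

def route(path: str) -> str:
    for prefix, replacement in MAPPINGS:
        if path.startswith(prefix):
            return replacement + path[len(prefix):]
    return path  # unmatched paths fall through to the default s3a filesystem

print(route("s3a://lakefs-poc/main/some/table/metadata/v1.metadata.json"))
# -> lakefs://lakefs-poc/main/some/table/metadata/v1.metadata.json
```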
i:
Hmm, I'm not quite sure that the RouterFS and Iceberg integrations work together. Let me verify that and I'll get back to you by tomorrow.
m:
Thanks! I'll also try to poke around more
i:
YW 🙏
j:
Hi @Michael Gaebel, I think that the RouterFS usage is redundant in your case.
> So, I need the router filesystem to be able to access regular S3 paths outside of lakefs.
This is actually done using your other configuration:
```python
#LakeFS S3 access
    (f"spark.hadoop.fs.s3a.bucket.{lakefs_repo}.endpoint", f"{lakefs_endpoint}"),
    (f"spark.hadoop.fs.s3a.bucket.{lakefs_repo}.access.key", lakefs_access_key),
    (f"spark.hadoop.fs.s3a.bucket.{lakefs_repo}.secret.key", lakefs_secret_key),
    (f"spark.hadoop.fs.s3a.bucket.{lakefs_repo}.path.style.access", "true"),
The renaming of the scheme from `s3a` to `lakefs` is what's causing the problem:
```
Caused by: java.io.FileNotFoundException: No such file or directory: lakefs://lakefs-poc/main/rl_dev_datastage_01_ma_snapshot/sys_audit_event/metadata/25193118-7546-49de-b229-ef0f039bc2d9.metadata.json
```
Spark doesn't know what to do with it, because no FileSystem was configured to handle that scheme in the lakeFS catalog's context. That's fine: you don't need RouterFS here. Can you please change your configuration to:
```python
#General Spark configs
    ("hive.metastore.client.factory.class", "com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory"),
    ("spark.sql.sources.partitionOverwriteMode", "dynamic"),
    ("spark.sql.legacy.parquet.datetimeRebaseModeInRead", "CORRECTED"),
    #LakeFS configuration for Iceberg
    ("spark.jars.packages", "org.apache.iceberg:iceberg-spark-runtime-3.4_2.12:1.3.1,io.lakefs:lakefs-iceberg:v0.1.3"),
    ("spark.sql.catalog.lakefs", "org.apache.iceberg.spark.SparkCatalog"),
    ("spark.sql.catalog.lakefs.catalog-impl", "io.lakefs.iceberg.LakeFSCatalog"),
    ("spark.sql.catalog.lakefs.warehouse", f"lakefs://{lakefs_repo}"),
    ("spark.sql.catalog.lakefs.uri", lakefs_endpoint),
    ("spark.sql.catalog.lakefs.cache-enabled", "false"),

    #LakeFS S3 access
    (f"spark.hadoop.fs.s3a.bucket.{lakefs_repo}.endpoint", f"{lakefs_endpoint}"),
    (f"spark.hadoop.fs.s3a.bucket.{lakefs_repo}.access.key", lakefs_access_key),
    (f"spark.hadoop.fs.s3a.bucket.{lakefs_repo}.secret.key", lakefs_secret_key),
    (f"spark.hadoop.fs.s3a.bucket.{lakefs_repo}.path.style.access", "true"),

    #Regular S3 access
    ("spark.hadoop.fs.s3a.endpoint.region", "ca-central-1"),
    ("spark.hadoop.fs.s3a.endpoint", "<https://s3.ca-central-1.amazonaws.com>"),
    ("spark.hadoop.fs.s3a.path.style.access", "true"),

    #Configs needed for Iceberg
    ("spark.sql.extensions", "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions"),
and test again?
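A minimal re-test might look like the sketch below (table, schema, and branch names are placeholders, and the schema is assumed to exist):

```python
# Sketch of a smoke test: create a small Iceberg table through the
# lakeFS catalog, then commit the branch via the lakeFS API.
spark.sql("CREATE TABLE lakefs.main.smoke_test.tbl (id INT) USING iceberg")
spark.sql("INSERT INTO lakefs.main.smoke_test.tbl VALUES (1)")

lakefs.commits.commit(repo.id, "main", CommitCreation(
    message="smoke test: Iceberg table created via the lakeFS catalog",
    metadata={"author": "glue"},
))
```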
m:
That worked! I've got a different error, but it's likely on my side. Thanks! I assumed that to be able to use the other configuration I would need to route it, but what you said makes sense. Thanks again.
j:
Glad to hear that. If you have any other issues, do share.