Joe M
05/19/2024, 4:04 AM

Itai Admi
05/19/2024, 6:46 AM
Unable to copy file <s3a://ct-rawdata-ddd8ba83-f4f5-4f29-a4dd-b428791af2cd/ct-raw-export/badger1/2024/05/15/1716089427/readings.seg1-small.csv.gz> from source <s3://ct-rawdata-ddd8ba83-f4f5-4f29-a4dd-b428791af2cd/backfill/badger1/2024/05/15/readings.seg1-small.csv.gz>: org.apache.hadoop.fs.UnsupportedFileSystemException: No FileSystem for scheme "s3"
Unable to copy file <s3a://ct-rawdata-ddd8ba83-f4f5-4f29-a4dd-b428791af2cd/ct-raw-export/badger1/2024/05/15/1716088435/readings.seg0-small.csv.gz> from source <s3://ct-rawdata-ddd8ba83-f4f5-4f29-a4dd-b428791af2cd/backfill/badger1/2024/05/15/readings.seg0-small.csv.gz>: org.apache.hadoop.fs.UnsupportedFileSystemException: No FileSystem for scheme "s3"
It looks like the imported objects failed to export because the spark-submit runtime doesn't recognize the s3 scheme. Here's the line of code that fails:
org.apache.hadoop.fs.FileUtil.copy(srcPath.getFileSystem(conf), srcPath, dstFS, dstPath, false, conf)
I think the following config should resolve the issue:
spark.hadoop.fs.s3.impl=org.apache.hadoop.fs.s3a.S3AFileSystem
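Assuming the export job is launched via spark-submit, that mapping could be passed on the command line; Spark copies any `spark.hadoop.*` property into the Hadoop configuration, so `fs.s3.impl` would then resolve the bare `s3` scheme. The jar name below is a placeholder, not the actual job artifact:

```shell
# Sketch: map the "s3" scheme to the S3A implementation so that
# srcPath.getFileSystem(conf) can resolve s3:// source URIs.
# "export-job.jar" is a placeholder for the real job jar.
spark-submit \
  --conf spark.hadoop.fs.s3.impl=org.apache.hadoop.fs.s3a.S3AFileSystem \
  export-job.jar
```

The same property can also be set in `spark-defaults.conf` if the fix should apply to every job rather than a single submission.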
Itai Admi
05/19/2024, 6:48 AM

Itai Admi
05/19/2024, 6:53 AM

Itai Admi
05/19/2024, 7:01 AM

Joe M
05/19/2024, 4:55 PM

Joe M
05/21/2024, 2:32 AM

Offir Cohen
05/21/2024, 11:08 AM