Sid Senthilnathan
04/19/2021, 8:10 PM
Guy Hardonag
04/19/2021, 8:32 PM
Sid Senthilnathan
04/19/2021, 8:34 PM
{"action":"delete_objects","file":"lakeFS/pkg/gateway/middleware.go:97","func":"pkg/gateway.EnrichWithOperation.func1.1","level":"debug","message_type":"action","msg":"performing S3 action","time":"2021-04-19T19:48:18Z"}
{"file":"lakeFS/pkg/gateway/operations/deleteobjects.go:75","func":"pkg/gateway/operations.(*DeleteObjects).Handle","host":"origin.s3.lakefs.paigeai.net","key":"branch-1/foo/bar/","level":"debug","method":"POST","msg":"object set for deletion","path":"/?delete","repository":"origin","request_id":"e31e6b81-a3b0-4db0-b4a7-3fe329558be9","service_name":"s3_gateway","time":"2021-04-19T19:48:18Z","user":"svc.emr"}
{"file":"lakeFS/pkg/gateway/operations/deleteobjects.go:75","func":"pkg/gateway/operations.(*DeleteObjects).Handle","host":"origin.s3.lakefs.paigeai.net","key":"branch-1/foo/","level":"debug","method":"POST","msg":"object set for deletion","path":"/?delete","repository":"origin","request_id":"e31e6b81-a3b0-4db0-b4a7-3fe329558be9","service_name":"s3_gateway","time":"2021-04-19T19:48:18Z","user":"svc.emr"}
{"error":"argument path: required value","file":"lakeFS/pkg/gateway/operations/deleteobjects.go:67","func":"pkg/gateway/operations.(*DeleteObjects).Handle","host":"origin.s3.lakefs.paigeai.net","key":"branch-1/","level":"error","method":"POST","msg":"failed deleting object","path":"/?delete","repository":"origin","request_id":"e31e6b81-a3b0-4db0-b4a7-3fe329558be9","service_name":"s3_gateway","time":"2021-04-19T19:48:18Z","user":"svc.emr"}
{"file":"lakeFS/pkg/httputil/logging.go:77","func":"pkg/httputil.DebugLoggingMiddleware.func1","host":"origin.s3.lakefs.paigeai.net","level":"debug","method":"POST","msg":"HTTP call ended","path":"/?delete","repository":"origin","request_id":"e31e6b81-a3b0-4db0-b4a7-3fe329558be9","sent_bytes":249,"service_name":"s3_gateway","status_code":200,"time":"2021-04-19T19:48:18Z","took":23154915,"user":"svc.emr"}
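A small sketch for reading logs like the ones above: it parses one JSON object per line and keeps only the entries for a single request_id, so the error sits next to the debug lines from the same call. The field names (level, msg, key, error, request_id, status_code) are taken from the log lines above. The function name and the shortened sample lines are made up for illustration.

```python
import json


def entries_for_request(lines, request_id):
    """Parse JSON log lines and return the entries that belong to one request."""
    entries = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip anything that is not a JSON log line
        if entry.get("request_id") == request_id:
            entries.append(entry)
    return entries


# Shortened sample lines, trimmed from the gateway log above.
REQUEST_ID = "e31e6b81-a3b0-4db0-b4a7-3fe329558be9"
SAMPLE = [
    '{"key":"branch-1/foo/","level":"debug","msg":"object set for deletion",'
    '"request_id":"e31e6b81-a3b0-4db0-b4a7-3fe329558be9"}',
    '{"error":"argument path: required value","key":"branch-1/","level":"error",'
    '"msg":"failed deleting object","request_id":"e31e6b81-a3b0-4db0-b4a7-3fe329558be9"}',
    '{"level":"debug","msg":"HTTP call ended","status_code":200,'
    '"request_id":"e31e6b81-a3b0-4db0-b4a7-3fe329558be9"}',
]

entries = entries_for_request(SAMPLE, REQUEST_ID)
errors = [e for e in entries if e.get("level") == "error"]
for e in errors:
    print(e["key"], "->", e["error"])
```

On the sample this prints the one failing key, branch-1/, with its error, while the last entry shows the HTTP call itself still ended with status 200.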
Sid Senthilnathan
04/19/2021, 8:35 PM
Guy Hardonag
04/19/2021, 8:48 PM
Sid Senthilnathan
04/19/2021, 8:54 PM
Sid Senthilnathan
04/19/2021, 8:56 PM
Guy Hardonag
04/19/2021, 9:04 PM
Guy Hardonag
04/19/2021, 9:12 PM
Guy Hardonag
04/19/2021, 9:13 PM
Guy Hardonag
04/20/2021, 9:10 AM
Sid Senthilnathan
04/20/2021, 1:07 PM
Sid Senthilnathan
04/20/2021, 2:38 PM
Guy Hardonag
04/20/2021, 2:48 PM
Sid Senthilnathan
04/20/2021, 7:44 PM
INSERT INTO lakefs.stg_image
SELECT * FROM dbt_stg.stg_image
) which takes some data from our regular S3 bucket and writes to the lakefs path. This command works with a few hundred lines but runs indefinitely when it's the whole table (about a million records).
Sid Senthilnathan
04/20/2021, 7:45 PM
_temporary/0 directory is created, and it seems like it wants to delete some files that aren't there
Sid Senthilnathan
04/20/2021, 7:45 PM
{"file":"lakeFS/pkg/logging/logger.go:79","func":"pkg/logging.logrusEntryWrapper.Debug","host":"origin.s3.lakefs.paigeai.net","key":"lakefs_lakefs.db/stg_image/_temporary/0/_temporary/attempt_20210420193535_202823_m_000002_442722/part-00002-ed3e7766-1673-487a-b9a7-69e0de32d18d_00002.c000.snappy.orc","level":"debug","method":"DELETE","msg":"aborted multipart upload","path":"lakefs_lakefs.db/stg_image/_temporary/0/_temporary/attempt_20210420193535_202823_m_000002_442722/part-00002-ed3e7766-1673-487a-b9a7-69e0de32d18d_00002.c000.snappy.orc","qualified_key":"lakefs_lakefs.db/stg_image/_temporary/0/_temporary/attempt_20210420193535_202823_m_000002_442722/part-00002-ed3e7766-1673-487a-b9a7-69e0de32d18d_00002.c000.snappy.orc","qualified_ns":"paige-data-s3-preprod-use1-lakefs-datalake","ref":"master","repository":"origin","request_id":"fc00fb53-de16-4a93-a1d2-6c2f05f55171","service_name":"s3_gateway","time":"2021-04-20T19:44:04Z","upload_id":"Qgj7c3Uio1sJiODl7SnUyoajVgmAXQJaRUGIH5hHdKktsSmXVqvRWcYTVzr2AKJkSZparmreqTCjcXRtd6kHs4EteFpmDNxYcjUkb3O3R.wb32AJ7xYKgYk0JMH_fvrJ","user":"svc.emr"}
{"error":"NoSuchUpload: The specified upload does not exist. The upload ID may be invalid, or the upload may have been aborted or completed.\n\tstatus code: 404, request id: 7AA7FH7WMZKBQRRF, host id: NmJ2kSWHUguuNChVK66er1gDqrstk0hmXNqvRHEoULmhmpXcD9Z4PmtcnDaBjYpNDKohokjdwUA=","file":"lakeFS/pkg/logging/logger.go:95","func":"pkg/logging.logrusEntryWrapper.Error","host":"origin.s3.lakefs.paigeai.net","level":"error","method":"DELETE","msg":"could not abort multipart upload","path":"lakefs_lakefs.db/stg_image/_temporary/0/_temporary/attempt_20210420193535_202823_m_000002_442722/part-00002-ed3e7766-1673-487a-b9a7-69e0de32d18d_00002.c000.snappy.orc","ref":"master","repository":"origin","request_id":"fc00fb53-de16-4a93-a1d2-6c2f05f55171","service_name":"s3_gateway","time":"2021-04-20T19:44:04Z","upload_id":"Qgj7c3Uio1sJiODl7SnUyoajVgmAXQJaRUGIH5hHdKktsSmXVqvRWcYTVzr2AKJkSZparmreqTCjcXRtd6kHs4EteFpmDNxYcjUkb3O3R.wb32AJ7xYKgYk0JMH_fvrJ","user":"svc.emr"}
Guy Hardonag
04/20/2021, 7:56 PM
Guy Hardonag
04/20/2021, 8:20 PM
Sid Senthilnathan
04/20/2021, 9:00 PM
Sid Senthilnathan
04/20/2021, 9:01 PM
Sid Senthilnathan
04/20/2021, 9:02 PM
Guy Hardonag
04/20/2021, 9:20 PM
fs.s3a.connection.timeout to be 600000 (the default is 200000)
Guy Hardonag
04/20/2021, 9:35 PM
Sid Senthilnathan
04/21/2021, 12:32 AM
Sid Senthilnathan
04/21/2021, 12:34 AM
Guy Hardonag
04/21/2021, 12:45 AM
Guy Hardonag
04/21/2021, 11:13 AM
blockstore.s3.streaming_chunk_size (int : 1048576) - Object chunk size to buffer before streaming to blockstore (use a lower value for less reliable networks). Minimum is 8192.
blockstore.s3.streaming_chunk_timeout (time duration : "60s") - Per object chunk timeout for blockstore streaming operations (use a larger value for less reliable networks).
start by setting blockstore.s3.streaming_chunk_size to 262144
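A minimal sketch of what that suggestion amounts to, assuming the dotted option name maps onto nested keys in the lakeFS configuration file and onto an upper-cased environment variable with a LAKEFS_ prefix. Both mappings are assumptions here and should be checked against the lakeFS configuration reference. The helper names are hypothetical; the default (1048576) and minimum (8192) come from the option description above.

```python
MIN_CHUNK_SIZE = 8192          # minimum stated in the option description
DEFAULT_CHUNK_SIZE = 1048576   # default stated in the option description (1 MiB)
OPTION = "blockstore.s3.streaming_chunk_size"


def nest(dotted_key, value):
    """Turn 'a.b.c' plus a value into {'a': {'b': {'c': value}}}."""
    result = value
    for part in reversed(dotted_key.split(".")):
        result = {part: result}
    return result


def env_name(dotted_key, prefix="LAKEFS_"):
    """Assumed convention: prefix + upper-cased key with '.' replaced by '_'."""
    return prefix + dotted_key.upper().replace(".", "_")


def chunk_size_setting(size):
    """Validate the chunk size against the documented minimum and nest it."""
    if size < MIN_CHUNK_SIZE:
        raise ValueError(f"streaming_chunk_size must be at least {MIN_CHUNK_SIZE}")
    return nest(OPTION, size)


suggested = 262144  # 256 KiB, the value suggested above
setting = chunk_size_setting(suggested)
print(setting)
print(f"{env_name(OPTION)}={suggested}")
print("default is", DEFAULT_CHUNK_SIZE // suggested, "times larger")
```

The suggested value is a quarter of the default, which matches the guidance to use a lower value on less reliable networks.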
Sid Senthilnathan
04/21/2021, 1:14 PM
Guy Hardonag
04/21/2021, 1:21 PM
Sid Senthilnathan
04/21/2021, 6:53 PM
fs.s3a.multipart.size Hadoop configuration, which defaults to 100MB (the threshold at which I noticed our problems started). I bumped it up to 200MB and everything seems to be working now.
Guy Hardonag
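For reference, a hedged sketch of how the two S3A settings mentioned in this thread could be passed to a job. It assumes the job runs on Spark, where Hadoop keys are forwarded when prefixed with spark.hadoop. For Hive or plain Hadoop the same keys would go in the job or site configuration instead. It also assumes "200MB" means 200 MiB expressed in bytes. The helper names are made up for illustration.

```python
# Hadoop S3A settings mentioned in this thread.
S3A_SETTINGS = {
    "fs.s3a.connection.timeout": 600000,         # value suggested earlier in the thread
    "fs.s3a.multipart.size": 200 * 1024 * 1024,  # "200MB", assumed to mean 200 MiB in bytes
}


def to_spark_conf(hadoop_settings):
    """Prefix each Hadoop key with 'spark.hadoop.' so Spark passes it to Hadoop."""
    return {f"spark.hadoop.{key}": str(value) for key, value in hadoop_settings.items()}


def to_submit_args(spark_conf):
    """Render the settings as '--conf key=value' arguments for spark-submit."""
    args = []
    for key in sorted(spark_conf):
        args.extend(["--conf", f"{key}={spark_conf[key]}"])
    return args


spark_conf = to_spark_conf(S3A_SETTINGS)
print(" ".join(to_submit_args(spark_conf)))
```

Writing the size as a plain byte count avoids relying on unit suffixes, which not every Hadoop version may accept for this key.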
04/21/2021, 7:18 PM
blockstore.s3.streaming_chunk_size?
Sid Senthilnathan
04/21/2021, 7:50 PM
Itai Admi
04/21/2021, 8:14 PM
Sid Senthilnathan
04/21/2021, 8:17 PM