# help
f
Hi, I'm using the lakeFS filesystem (within Spark, io.lakefs:hadoop-lakefs-assembly:0.2.1, https://docs.lakefs.io/integrations/spark.html#lakefs-hadoop-filesystem) with presigned URLs and a 2-hour expiration time, and sometimes I get an error 500 from AWS (which, from what I've read, suggests S3 throttling; I checked and it's definitely not expired). java.io.IOException: Server returned HTTP response code: 500 for URL: https://xxxx.s3.eu-west-1.amazonaws.com/... Is this normal? Can we somehow configure the Spark lakeFS filesystem to retry those requests individually, without relying on Spark task retries? That way Spark wouldn't fail the whole task on a "known" issue (it's wasting task retries on that and sometimes it fails; it also takes around 20 seconds to happen).
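For reference, my Spark setup is roughly this (endpoint, keys and job name are placeholders, and I'm going from memory on the exact property names, so treat it as a sketch):
Copy code
# sketch of the spark-submit config I'm using (values are placeholders)
spark-submit \
  --packages io.lakefs:hadoop-lakefs-assembly:0.2.1 \
  --conf spark.hadoop.fs.lakefs.impl=io.lakefs.LakeFSFileSystem \
  --conf spark.hadoop.fs.lakefs.endpoint=https://<lakefs-host>/api/v1 \
  --conf spark.hadoop.fs.lakefs.access.key=<lakefs-access-key> \
  --conf spark.hadoop.fs.lakefs.secret.key=<lakefs-secret-key> \
  --conf spark.hadoop.fs.lakefs.access.mode=presigned \
  my-job.jar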
a
Hi @Florentino Sainz, sorry to hear you're running into issues. What version of lakeFS are you running on the server side? And can you share the configuration -- are you on K8s, what kind of access key or role are you using to authenticate to S3, etc.? This is a bit of a long shot, but we had an issue along with this one that may be relevant: if you are running lakeFS with an assumed role for S3 on K8s, and if you use almost only presigned URLs, then it might end up producing URLs that expire sooner than expected.
f
Looking into that. Just one additional piece of info: Spark retries sometimes fix the issue, btw (not sure if they reuse the same URL or request a new one from lakeFS). If I try the URL in my web browser "a few minutes later" I do get an expired token, so it could be the reason, but it's a 403, not an error 500, which is why I thought it was not related. We are using an EKS (K8s) deployment of the open-source version (enterprise version coming soon afaik), running lakeFS 1.1.0. lakeFS uses a ServiceAccount which maps to an AWS role with direct access to S3 (same account), no assume-role or anything.
a
OK, then you're safely past that bug.
You can get a presigned URL from lakeFS by using the lakectl CLI -- and if the expiration time that AWS encodes in the URL is not actually correct, that command tells you when the URL will actually expire. Could you try running that for me?
Copy code
lakectl fs stat --pre-sign <lakefs://repo/branch/path/to/object>
and then if it shows a "Physical Address Expires" field we can see when it really expires. (But make sure NOT to share the "Physical Address", of course; that is literally a presigned URL to access your data!)
f
If I use one of the URLs from the error it says: <X-Amz-Expires>7200</X-Amz-Expires> <-- this matches my 120m config, and <Expires>2023-11-08T10:11:00Z</Expires>
(gonna do what you asked)
a
So not an expiry. Too bad, I was hoping I'd already fixed that bug
f
hmm
Physical Address Expires: 2023-11-08 12:58:37 +0100 CET
I did what you said with fs stat --pre-sign
and I got only 15 minutes (?)
Copy code
blockstore:
  type: s3
  default_namespace_prefix: s3://{data_s3_buckets[0].bucket_name}/lakefs/
  s3:
    region: eu-west-1
    pre_signed_expiry: 120m
    disable_pre_signed_ui: false
that's my config (the URLs I got from Spark say 7200, though)
a
Yeah, your EKS is probably giving you 15-minute tokens. It's documented somewhere in the AWS docs; I can look it up later.
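If you want to double-check which role lakeFS is actually assuming, the IRSA annotation on its service account should show it; something like this (namespace and service account name are guesses for your setup):
Copy code
# show the IAM role mapped to the lakeFS service account via IRSA
kubectl -n lakefs get serviceaccount lakefs -o yaml | grep eks.amazonaws.com/role-arn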
f
Will explore that route; no worries, I'll check myself
will report back, thanks for the info though 🙂
a
You might consider trying to set:
Copy code
s3:
  pre_signed_expiry: 1h
  web_identity:
    session_duration: 1h
    session_expiry_window: 50m
to get a longer session expiry. I'm not sure you're in exactly the EKS mode those settings are intended for, but they shouldn't harm anything.
f
Yeah, I'm using web_identity, thanks for that. I already extended my role expiration to 12h (the max our security team allows). Will try to configure that and check
btw those options are not in https://docs.lakefs.io/reference/configuration.html 🙂, but I'll try anyway
a
I am not sure that you will be able to ask web_identity for >1h. If you're in the same account then it might work.
f
Yeah, I'm on the same account. Anyway, 1h should be enough too; 2h was arbitrary
a
Yeah, they're very niche and I was hoping not to make the configuration guide even more confusing.
f
oh ok. Btw, does an expiry window of 50m mean it will renew them / consider them outdated after 50 minutes, or after 10 minutes?
a
It will renew them after 10 minutes, so you always have at least ~50m left on your token.
f
kk perfect, will set it to this and test. ty 🙂
Copy code
pre_signed_expiry: 2h
web_identity:
  session_duration: 3h
  session_expiry_window: 2h
a
If it doesn't work, reduce everything to 1h or less. STS has some weird hardcoded behaviours.
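Also worth confirming, even though you said you already raised it: STS won't grant a web-identity session longer than the role's MaxSessionDuration, so something like this (role name is a placeholder) will tell you the real cap:
Copy code
# maximum session duration (seconds) STS can grant for the role lakeFS assumes
aws iam get-role --role-name <lakefs-role> --query 'Role.MaxSessionDuration' --output text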
f
ok thanks, will do that. Gonna try with 2h "just to test", but in prod it will probably be under 1h anyway; I think that should be enough and I don't want to risk the links living too long (even though all our devs who have access to the logs are kind of trusted)
Confirmed, that did the trick, it now jumped to 2 hours 🙂 let's see how it goes, thanks a lot!
Updating back: it did fix the issue on the long-running process (the one which processes big .gz files slowly). However I still see some ERROR 500s, and if I click the link from the error, I can download the file from my web browser. It's happening with Spark local (maybe that one doesn't have retries? not sure how Spark behaves locally) during integration testing:
Copy code
Server returned HTTP response code: 500 for URL: <https://blalbalblalbla>
    at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1902)
    at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1500)
    at sun.net.www.protocol.https.HttpsURLConnectionImpl.getInputStream(HttpsURLConnectionImpl.java:268)
    at io.lakefs.storage.HttpRangeInputStream.updateInputStream(HttpRangeInputStream.java:54)
    at io.lakefs.storage.HttpRangeInputStream.read(HttpRangeInputStream.java:100)
    at java.io.InputStream.read(InputStream.java:170)
    at java.io.DataInputStream.read(DataInputStream.java:149)
    at org.apache.parquet.io.DelegatingSeekableInputStream.readFully(DelegatingSeekableInputStream.java:102)
    at org.apache.parquet.io.DelegatingSeekableInputStream.readFullyHeapBuffer(DelegatingSeekableInputStream.java:127)
    at org.apache.parquet.io.DelegatingSeekableInputStream.readFully(DelegatingSeekableInputStream.java:91)
    at org.apache.parquet.hadoop.ParquetFileReader.readFooter(ParquetFileReader.java:576)
    at org.apache.parquet.hadoop.ParquetFileReader.<init>(ParquetFileReader.java:777)
    at org.apache.spark.sql.execution.datasources.parquet.SpecificParquetRecordReaderBase.initialize(SpecificParquetRecordReaderBase.java:102)
    at org.apache.spark.sql.execution.datasources.parquet.VectorizedParquetRecordReader.initialize(VectorizedParquetRecordReader.java:180)
    at org.apache.spark.sql.execution.datasources.parquet.ParquetFileFormat.$anonfun$buildReaderWithPartitionValues$2(ParquetFileFormat.scala:284)
a
That is odd, because lakeFS is not on the read data path. Do you know how long after the stage starts this happens?
f
it is here
Copy code
io.lakefs.storage.HttpRangeInputStream.updateInputStream(HttpRangeInputStream.java:54)
    at io.lakefs.storage.HttpRangeInputStream.read(HttpRangeInputStream.java:100)
isn't it?
^I'm using the Spark lakeFS filesystem
I don't have the Spark UI, but the whole process failed in less than 5 minutes (and the link is still valid)
It's happening during a Delta merge btw, just in case. In Spark local mode there's no task maxFailures; that's why we're very sensitive to this though (I think, just did some minor testing)
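(Side note: I think local mode accepts a second parameter for task maxFailures in the master string, so something like this should at least give me retries while testing; not sure it's the right fix though)
Copy code
# local[threads, maxFailures]: allow up to 4 attempts per task in local mode
spark-submit --master "local[*,4]" <rest of the job arguments>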
a
Yeah, but a 500 from S3 should be fairly rare. This is real S3, right? Not some MinIO container in a test environment...
f
yes, it's real S3
lakeFS internally uses weird (many) prefixes, so I don't know why the throttling happens either 😕 not sure if it's related to presigned URLs having different throttle rules
a
Yeah, we literally follow best practice for object naming. Also, I don't think you're hitting that key very often. It only ever spreads your keys out more. I would expect lakeFS to throttle you well before S3 does.
f
On my side the connection should be mostly direct: EKS -> VPC Gateway endpoint (i.e. the one which uses routes, not PrivateLink) -> S3. No API gateways in the middle or anything.