k

Kevin Vasko

01/30/2023, 7:06 PM
To run garbage collection, is the assumption that I need to have a spark cluster set up?
e

Elad Lachmi

01/30/2023, 7:13 PM
Hi @Kevin Vasko, yes, lakeFS assumes you have some means of submitting and running Spark jobs. See here for additional details on running GC jobs
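(For reference, the GC docs describe a spark-submit invocation roughly like the sketch below; the class name and arguments follow the linked documentation, the jar is the one mentioned later in this thread, and all values are placeholders rather than an actual command. The credential settings are covered further down.)

spark-submit --class io.treeverse.clients.GarbageCollector \
  --packages org.apache.hadoop:hadoop-aws:<hadoop-aws version> \
  -c spark.hadoop.lakefs.api.url=https://<LAKEFS_ENDPOINT>/api/v1 \
  lakefs-spark-client-312-hadoop3-assembly-0.6.0.jar \
  <REPOSITORY_NAME> <REGION>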
k

Kevin Vasko

01/30/2023, 9:12 PM
Thanks.
I am seeing “Error while looking for metadata directory in the path: s3a://bucket/projects/myproject/_lakefs/retention/gc/commits/run_id={id}/commits.csv”
Do I have to create that path?
i

Iddo Avneri

01/30/2023, 9:24 PM
Hi Kevin! Which user are you using to run the Spark job, and can you share its permissions?
k

Kevin Vasko

01/30/2023, 9:24 PM
i’m using my account which is an admin account
well it’s in the “Admins” group
It seems to only be a warning but the next line is AWSBadRequestException: getFileStatus on {path i mentioned above}
i

Iddo Avneri

01/30/2023, 10:30 PM
Hi Kevin, let me try to collect some data to assist here:
1. Which version of the lakeFS Spark client are you using?
2. Can you share your spark-submit command (without the values for the keys, of course)?
3. What version of lakeFS are you running?
k

Kevin Vasko

01/30/2023, 10:32 PM
1. 0.6.0 (I tried the 3.0.1 and the 3.1.2 client builds and got the same result)
2. I’ll send this tomorrow (not in front of a computer)
3. 0.80
i

Iddo Avneri

01/30/2023, 10:32 PM
Awesome. Thank you!
k

Kevin Vasko

01/30/2023, 10:34 PM
So I did see it was creating the commits file (I didn’t see it at first because it wasn’t showing in the lakeFS UI).
but it’s there in the AWS S3 console
how closely do I need to match the Spark version?
I was just running a standalone Spark cluster on my local machine and grabbed the latest Spark version
i

Iddo Avneri

01/30/2023, 10:38 PM
When you say “I did see it was creating the commits file”, do you mean the list of expired objects, under
_lakefs/retention/gc/addresses/mark_id=MARK_ID
k

Kevin Vasko

01/30/2023, 10:42 PM
yes
I saw it in the S3 console but wasn’t seeing it in the lakeFS UI, so I didn’t think it was making it
but it is
i

Iddo Avneri

01/30/2023, 10:53 PM
Got it. Thanks
e

Elad Lachmi

01/31/2023, 3:42 PM
Hi @Kevin Vasko, just to make sure: you're configuring the job with both your lakeFS and AWS S3 access key ID/secret access key pairs?
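(Concretely, the GC job takes two separate credential pairs as Spark configuration: the lakeFS API key pair and the S3 key pair. A sketch of the relevant conf flags per the GC docs, with placeholder values:)

-c spark.hadoop.lakefs.api.access_key=<LAKEFS_ACCESS_KEY_ID> \
-c spark.hadoop.lakefs.api.secret_key=<LAKEFS_SECRET_ACCESS_KEY> \
-c spark.hadoop.fs.s3a.access.key=<AWS_ACCESS_KEY_ID> \
-c spark.hadoop.fs.s3a.secret.key=<AWS_SECRET_ACCESS_KEY> \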
k

Kevin Vasko

01/31/2023, 3:51 PM
@Iddo Avneri @Elad Lachmi Yup! See attached error log. I also included the script I’m using to run the code. https://gist.github.com/vaskokj/621cdcc328f4bbbf4586e96a3968a16b
e

Elad Lachmi

01/31/2023, 3:52 PM
Great, I'll take a look
k

Kevin Vasko

01/31/2023, 3:53 PM
I am using a local spark cluster. I just spun up a 3.3.1 Standalone spark cluster on my local box
Not sure if I need to match the Spark version to the lakefs-spark-client-312-hadoop3-assembly-0.6.0.jar
Essentially I’m just trying to clean up all the crap that people have deleted.
e

Elad Lachmi

01/31/2023, 3:56 PM
From the looks of it, it's failing earlier. It's getting a 400 HTTP error from AWS. I'll need to dig into this a bit
Essentially I’m just trying to clean up all the crap that people have deleted.
Yep, that's exactly what it's for 🙂
k

Kevin Vasko

01/31/2023, 3:57 PM
@Elad Lachmi so what’s weird is the credentials are correct… because it creates the commits.csv file
e

Elad Lachmi

01/31/2023, 3:58 PM
I believe the csv file is created using lakeFS's role, while the cleanup is done with the AWS credentials, but I just want to make sure before we dig deeper
k

Kevin Vasko

01/31/2023, 4:00 PM
also, could this be an issue with not passing an endpoint URL?
When I passed an endpoint URL I had more issues
e

Elad Lachmi

01/31/2023, 4:12 PM
Still looking into it
k

Kevin Vasko

01/31/2023, 4:13 PM
no rush. i’ll be around. It’s probably a mistake on my end
e

Elad Lachmi

01/31/2023, 4:13 PM
Either way, I'd be happy to assist
k

Kevin Vasko

01/31/2023, 4:13 PM
Or something to do with my environment
e

Elad Lachmi

01/31/2023, 4:20 PM
Seems like there's an issue with STS credentials and S3A GetFileStatus. It's an old issue, but the circumstances seem too similar to be a coincidence. I'll read up on it a bit
k

Kevin Vasko

01/31/2023, 4:21 PM
Issue with lakeFS client or issue with something else?
The AWS account I have access to requires me to use a session token
e

Elad Lachmi

01/31/2023, 4:22 PM
Hadoop + STS credentials + S3A GetFileStatus
Yes, I understand
Can you try something real quick (if you haven't already)? Can you try setting AWS_REGION to your region in your terminal session and trying again?
k

Kevin Vasko

01/31/2023, 4:37 PM
can you clarify?
it’s set in the command
not sure how to set it in “terminal session”
as in “export” a variable?
e

Elad Lachmi

01/31/2023, 4:38 PM
I mean
export AWS_REGION=<your region>
in the terminal and running the job again
k

Kevin Vasko

01/31/2023, 4:40 PM
same error
e

Elad Lachmi

01/31/2023, 4:43 PM
So I think you'll need to enable SigV4 and specifically configure an S3 endpoint. Both the driver and all worker Spark nodes must run Java with
-Dcom.amazonaws.services.s3.enableV4
if they want SigV4 to work, and you'll probably need to configure an S3 endpoint with the Spark configuration property
spark.hadoop.fs.s3a.endpoint
k

Kevin Vasko

01/31/2023, 4:44 PM
ok let me see if i can figure that out
e

Elad Lachmi

01/31/2023, 4:45 PM
I'm not a Java expert by any stretch of the imagination, but I'll try to assist as best I can
Hopefully, we can work it out
I'll do some reading up meanwhile. Let me know how it goes
Just in case you're searching for how to use that parameter, here's an example
spark-submit --conf spark.driver.extraJavaOptions='-Dcom.amazonaws.services.s3.enableV4' \
    --conf spark.executor.extraJavaOptions='-Dcom.amazonaws.services.s3.enableV4' \
    ... (other spark options)
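(The endpoint property mentioned above would presumably be passed the same way; the endpoint value here is a generic placeholder, not the one from this environment.)

spark-submit --conf spark.hadoop.fs.s3a.endpoint=s3.<your-region>.amazonaws.com \
    ... (other spark options)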
k

Kevin Vasko

01/31/2023, 5:04 PM
Yup, I got it. Different error now; I’m pastebinning it
acts like it’s almost complaining about the structure of something
“the authorization header is malformed”
looks like a bug
e

Elad Lachmi

01/31/2023, 5:09 PM
I think I saw an issue with the AWS Java SDK re S3A + VPC endpoints. Let me try to find it
I think you're right. It's either a bug or a "feature" (a.k.a. it's unsupported)
k

Kevin Vasko

01/31/2023, 5:10 PM
but that’s for aws android sdk core
e

Elad Lachmi

01/31/2023, 5:11 PM
I'm guessing Java SDK and Android SDK have common ancestry
I see that in the command you ran you didn't enable sigV4
e

Elad Lachmi

01/31/2023, 5:13 PM
I think both are needed
k

Kevin Vasko

01/31/2023, 5:14 PM
But my error seems to be more similar to the first in a parsing issue
e

Elad Lachmi

01/31/2023, 5:16 PM
Yes, but I'm not sure of all the differences between SigV2 and SigV4. I think it's worth making sure we have the correct region picked up by the Java SDK, the correct endpoint, and that we're using SigV4 before we dig deeper. AFAIK, you'll need to configure this anyway, so might as well take care of it now and then move on. Otherwise we might hit on the right solution and not even know it
k

Kevin Vasko

01/31/2023, 5:19 PM
exact problem here:
“Authorization Header is Malformed” (400) exception when PrivateLink URL is used in “fs.s3a.endpoint”
When a PrivateLink URL is used instead of the standard s3a endpoint, it returns an “authorization header is malformed” exception. So, if we set fs.s3a.endpoint=bucket.vpce-<some_string>.s3.ca-central-1.vpce.amazonaws.com and make S3 calls we get:
com.amazonaws.services.s3.model.AmazonS3Exception: The authorization header is malformed; the region 'vpce' is wrong; expecting 'ca-central-1' (Service: Amazon S3; Status Code: 400; Error Code: AuthorizationHeaderMalformed; Request ID: req-id; S3 Extended Request ID: req-id-2)
Cause: Since endpoint parsing is done in a way that assumes the AWS S3 region is the 2nd component of the fs.s3a.endpoint URL delimited by “.”, in the case of a PrivateLink URL it can’t figure out the region and throws an authorization exception. Thus, to add support for using PrivateLink URLs we use fs.s3a.endpoint.region to set the region and bypass this parsing of fs.s3a.endpoint; in the case shown above, to make it work we’ll set the AWS S3 region to ca-central-1.
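(In configuration terms, the fix described in that quote boils down to setting both properties together; the values below are the ones from the quoted example, not from this environment.)

fs.s3a.endpoint=bucket.vpce-<some_string>.s3.ca-central-1.vpce.amazonaws.com
fs.s3a.endpoint.region=ca-central-1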
e

Elad Lachmi

01/31/2023, 5:21 PM
Yes, looks like it
So you want to try that?
k

Kevin Vasko

01/31/2023, 5:22 PM
yup, i’m doing that now
:lakefs: 1
it hasn’t crashed…yet
been running longer than i’ve seen it run before
not sure if it’s hung up lol
e

Elad Lachmi

01/31/2023, 5:23 PM
well, that's a good sign at least
k

Kevin Vasko

01/31/2023, 5:23 PM
no logging
so i’m not seeing anything
e

Elad Lachmi

01/31/2023, 5:23 PM
java + aws = always a great time
k

Kevin Vasko

01/31/2023, 5:24 PM
aws + anything imo has some god awful esoteric errors that don’t mean anything half the time
e

Elad Lachmi

01/31/2023, 5:24 PM
that's true
k

Kevin Vasko

01/31/2023, 5:25 PM
like I was getting a “could not start instance” error, no logs, no nothing, the machine just died on start up. Finally figured out that the system didn’t have permissions to access the key system in AWS
it couldn’t decrypt the disk drive or something…
it also doesn’t help that I don’t own this account (it’s work), so I don’t have global admin and I don’t know what’s “set up” behind the scenes
it still hasn’t crashed… but unsure on what it’s doing
e

Elad Lachmi

01/31/2023, 5:27 PM
yeah, I know the feeling, flying blind
It can take a while, depending on the number of objects/prefixes/refs
k

Kevin Vasko

01/31/2023, 5:28 PM
only like 2500 objects
just a test location
e

Elad Lachmi

01/31/2023, 5:28 PM
that's not too bad, but the runtime is much more strongly correlated with the number of refs
branches, commits, and objects (in lakeFS, not S3)
k

Kevin Vasko

01/31/2023, 5:29 PM
like 2 commits 1 branch, 2500 objects total
in this “testproject”
e

Elad Lachmi

01/31/2023, 5:30 PM
hmm... then it shouldn't take too long, but it still takes time. I wouldn't be too worried about it taking time
k

Kevin Vasko

01/31/2023, 5:32 PM
It should be doing it for only the single project right?
feature request…logging of some sort ;)
e

Elad Lachmi

01/31/2023, 5:34 PM
Yes, I think so
Noted 🙂
k

Kevin Vasko

01/31/2023, 5:35 PM
ah damn, didn’t work
e

Elad Lachmi

01/31/2023, 5:35 PM
😞
k

Kevin Vasko

01/31/2023, 5:35 PM
complaining it couldn’t find the endpoint
so i’m not sure what those instructions are telling me to do
i have to pass the endpoint
e

Elad Lachmi

01/31/2023, 5:37 PM
tbh I'm not sure; I'm not a Java expert, and the minimalism in the "solution" isn't appreciated in this case
Maybe try searching for a similar issue with a more detailed solution?
Meanwhile I'm working on other things. Please feel free to let me know if I can assist further or if you've made any interesting progress
k

Kevin Vasko

01/31/2023, 5:54 PM
yup! appreciate it! thanks!
:lakefs: 1
e

Elad Lachmi

01/31/2023, 5:55 PM
Sure, np
k

Kevin Vasko

01/31/2023, 8:33 PM
@Elad Lachmi ok, I got it to successfully run… I had a couple of problems:
1) The comment and troubleshooting issue here: https://hadoop.apache.org/docs/stable/hadoop-aws/tools/hadoop-aws/troubleshooting_s3a.html. That spark.hadoop.fs.s3a.endpoint.region option didn’t exist until hadoop-aws:3.3.2. See this: https://issues.apache.org/jira/plugins/servlet/mobile#issue/HADOOP-17705
2) The other issue I had was subtle, but you have to specify fs.s3a.endpoint=bucket.vpce-<some_string>.s3.ca-central-1.vpce.amazonaws.com
So now it successfully ran… however it didn’t clear anything out of this ref
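(Put together, the changes described above amount to roughly these spark-submit options; the VPC endpoint and region are placeholders and site-specific.)

spark-submit --packages org.apache.hadoop:hadoop-aws:3.3.2 \
    --conf spark.hadoop.fs.s3a.endpoint=bucket.vpce-<some_string>.s3.<region>.vpce.amazonaws.com \
    --conf spark.hadoop.fs.s3a.endpoint.region=<region> \
    ... (other spark options)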
e

Elad Lachmi

01/31/2023, 8:34 PM
ok I got it to successfully run…
First things first... 🥳
k

Kevin Vasko

01/31/2023, 8:36 PM
Should it not delete from s3 all of these object?
e

Elad Lachmi

01/31/2023, 8:37 PM
spark.hadoop.fs.s3a.endpoint.region option didn’t exist until hadoop-aws:3.3.2
I see... so it might have been related to the issue I saw in the issue tracker 🤔
k

Kevin Vasko

01/31/2023, 8:37 PM
there are 12k objects, 508MB worth
e

Elad Lachmi

01/31/2023, 8:37 PM
Did you change anything else in the command you ran since you sent it to me last?
(besides the S3 configuration, of course)
k

Kevin Vasko

01/31/2023, 8:38 PM
yeah, so from the lakeFS documentation I needed to change --packages to org.apache.hadoop:hadoop-aws:3.3.2
no
e

Elad Lachmi

01/31/2023, 8:38 PM
ok
checking...
k

Kevin Vasko

01/31/2023, 8:41 PM
e

Elad Lachmi

01/31/2023, 8:41 PM
So let's look at the simpler options first...
• Any object that is accessible from any branch’s HEAD.
• Objects stored outside the repository’s storage namespace. For example, objects imported using the lakeFS import UI are not collected.
• Uncommitted objects (see below).
These three categories of objects aren't candidates for GC, which is important to note.
The second thing I'd double-check is the GC rules policy:
1. That one exists
2. That it's configured in a sensible way (whatever sensible means in your context)
You can see a reference for configuring this either via lakectl or via the UI: https://docs.lakefs.io/howto/garbage-collection.html#configuring-gc-rules
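(For reference, the GC rules in that doc are a small JSON policy, roughly of this shape; the retention values below are made-up examples, so use whatever "sensible" means for your repository.)

{
  "default_retention_days": 21,
  "branches": [
    {"branch_id": "main", "retention_days": 28},
    {"branch_id": "dev", "retention_days": 7}
  ]
}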
k

Kevin Vasko

01/31/2023, 8:47 PM
so if I uploaded a bunch of files and they were never committed, they wouldn’t be GC’ed, it seems?
Also where is the “import UI”
nvm i’m an idiot
😅 1
e

Elad Lachmi

01/31/2023, 8:47 PM
You might also want to check the file in
_lakefs/retention/gc/addresses
If one doesn't exist or has very few objects, that could point us towards a policy/configuration issue. If it has all the objects but doesn't hard-delete them, then we'll look into that
k

Kevin Vasko

01/31/2023, 8:49 PM
yeah 99% of these files were never committed
e

Elad Lachmi

01/31/2023, 8:50 PM
I see... so that's a different type of GC. It's called uncommitted GC
k

Kevin Vasko

01/31/2023, 8:50 PM
yup reading that now
e

Elad Lachmi

01/31/2023, 8:50 PM
Same method of running and all, but a bit of a different purpose
k

Kevin Vasko

01/31/2023, 8:52 PM
yeah, makes sense
got to upgrade to later version of lakefs
e

Elad Lachmi

01/31/2023, 8:54 PM
From my experience, it's worth it. That's where the real ROI comes in, because people make copies of huge data sets over and over and many of the branches are just abandoned after a bit of experimentation, but the objects remain and pile up real quick
But in terms of the endpoint setup, it should be the same
k

Kevin Vasko

01/31/2023, 9:01 PM
if i ran the migrate with 0.80.1 do i need to upgrade to 0.80.2 and migrate again?
or can i just drop the new binary in place?
e

Elad Lachmi

01/31/2023, 9:03 PM
Migrations automatically trigger a minor version bump, so it's probably a drop in, but let me make sure
k

Kevin Vasko

01/31/2023, 9:03 PM
i’m going to move all the way up to the latest (0.91.0)
but unsure if i need to use 0.80.2 to migrate up
e

Elad Lachmi

01/31/2023, 9:05 PM
0.80.1 requires a migrate up. The next few do not
From what I can see, there aren't any migrations needed from 0.80.1 up to latest 0.91.0
k

Kevin Vasko

01/31/2023, 9:12 PM
ok cool!
e

Elad Lachmi

01/31/2023, 9:14 PM
If you get the chance, I'd love to hear how things went and of course, if you have any further questions, feel free to reach out again
k
e

Elad Lachmi

01/31/2023, 9:28 PM
🤦‍♂️🏻
k

Kevin Vasko

01/31/2023, 9:30 PM
unsure on if i’m missing something
e

Elad Lachmi

01/31/2023, 9:30 PM
I think you have a missing config param: fs.s3.impl or fs.s3a.impl
k

Kevin Vasko

01/31/2023, 9:30 PM
is that in the docs?
e

Elad Lachmi

01/31/2023, 9:31 PM
Oh, wait... maybe it's configured OK. I just noticed you were using an s3:// path. Can you try s3a:// instead, before configuring more stuff?
k

Kevin Vasko

01/31/2023, 9:33 PM
sorry i’m confused where should i pass that?
i’m doing the same thing as I did with the other
i copied and pasted, changed the class and the -c options accordingly for do_sweep
e

Elad Lachmi

01/31/2023, 9:39 PM
Now that I think of it, the support for only S3 for the uncommitted GC probably means we're using s3 and not s3a, so it might be correct that the s3:// protocol handler needs to be configured
k

Kevin Vasko

01/31/2023, 9:40 PM
hmmm
docs show s3a
hmmm, I’m still struggling with this. I’ve messed with every setting I can think of but I’m still getting the same error. Any thoughts on what to change?
e

Elad Lachmi

02/02/2023, 3:46 PM
Hi @Kevin Vasko, Can you please remind me what error message you're getting right now? We've been through a few 😅
k

Kevin Vasko

02/02/2023, 3:57 PM
I tried this:
“I think you have a missing config param: fs.s3.impl or fs.s3a.impl”
I also tried changing the parameters in the command line to s3 instead of s3a.
e

Elad Lachmi

02/02/2023, 4:01 PM
Let me check something. It might take a few minutes
k

Kevin Vasko

02/02/2023, 4:01 PM
yup! no worries
e

Elad Lachmi

02/02/2023, 4:24 PM
Can you try adding this to your spark command?
--conf spark.hadoop.fs.s3.impl=org.apache.hadoop.fs.s3a.S3AFileSystem
k

Kevin Vasko

02/02/2023, 4:59 PM
trying
no dice, but a different error message… getting the error message for you
e

Elad Lachmi

02/02/2023, 5:09 PM
That looks like the error we originally got while trying to run GC, I think
k

Kevin Vasko

02/02/2023, 5:09 PM
it does but i still have all the other parameters in it
is there a fs.s3 set of properties i need to set?
e

Elad Lachmi

02/02/2023, 5:11 PM
I think when you set it to use s3a it uses the same conf params, but I'm not sure
k

Kevin Vasko

02/02/2023, 5:15 PM
oops, typo in my key!
it’s working!
e

Elad Lachmi

02/02/2023, 5:15 PM
nice!
k

Kevin Vasko

02/02/2023, 5:16 PM
well the mark part is haha
e

Elad Lachmi

02/02/2023, 5:16 PM
I was thinking to myself "This should work... why isn't this working" 🙂
Another step in the right direction - I'll take it
k

Kevin Vasko

02/02/2023, 6:06 PM
well damn….
so close lol
Now I’m trying to run the sweep… it starts running and then blows up saying “The AWS Access Key Id you provided does not exist in our records”
e

Elad Lachmi

02/02/2023, 6:14 PM
So close! 😑
k

Kevin Vasko

02/02/2023, 6:15 PM
lol no joke
but the job starts running…
so it’s like it’s in the actual lakeFS code
e

Elad Lachmi

02/02/2023, 6:20 PM
It's still an AWS SDK error. The job uses the SDK as well
But it's in the right direction
k

Kevin Vasko

02/02/2023, 6:21 PM
yeah
e

Elad Lachmi

02/02/2023, 6:23 PM
I think what it's starting to do is divvy up the work, and as soon as tasks start going, it tries to access S3 and hits that error
Just to make sure - you used the mark ID from the latest mark run, right?
k

Kevin Vasko

02/02/2023, 6:37 PM
yup
I had to add ‘spark.hadoop.lakefs.gc.do_sweep=true’ because it errored saying “Nothing to do, must specify at least one of mark, sweep. Exiting”
e

Elad Lachmi

02/02/2023, 6:49 PM
Yes, you either run with do_sweep=false and do_mark=true first and then the other way around or you can mark and sweep in one go (I think 🤔)
k

Kevin Vasko

02/02/2023, 6:50 PM
Yeah, the docs only have one
Step 1 shows do_sweep=false and nothing else
step 3, for sweep, shows do_mark=false
e

Elad Lachmi

02/02/2023, 6:53 PM
Yeah, maybe adding a mark_id and setting do_mark to false'll do it
k

Kevin Vasko

02/02/2023, 6:53 PM
yeah, i did do that
it’s got me to this point haha
e

Elad Lachmi

02/02/2023, 7:05 PM
From the docs, sweep-only mode should be configured like this
spark.hadoop.lakefs.gc.do_mark=false
spark.hadoop.lakefs.gc.mark_id=<MARK_ID> # Replace <MARK_ID> with the identifier you used on a previous mark-only run
k

Kevin Vasko

02/02/2023, 7:06 PM
yup, and that errors with above issue
the docs are wrong, have to be
e

Elad Lachmi

02/02/2023, 7:20 PM
I'm looking through the code to see if we've missed something. I'll be a minute
Ok, I can confirm that both do_sweep and do_mark are handled in the code
so do_mark needs to be false, do_sweep needs to be true, and mark_id needs to be the ID generated in the latest mark run
If that's all good, one of the first lines in the log output should be
deleting marked addresses: <mark_id>
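(So, putting the flags from this thread together, a sweep-only run would carry roughly these settings, where <MARK_ID> is the identifier from the earlier mark-only run:)

spark.hadoop.lakefs.gc.do_mark=false
spark.hadoop.lakefs.gc.do_sweep=true
spark.hadoop.lakefs.gc.mark_id=<MARK_ID>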
k

Kevin Vasko

02/02/2023, 7:31 PM
yeah, that’s what I already did. Unfortunately where I’m at is the error regarding missing AWS Key that I linked
e

Elad Lachmi

02/02/2023, 7:32 PM
Yes, but I'm trying to work through the code to see where it's potentially getting stuck and what it's trying to do
k

Kevin Vasko

02/02/2023, 7:32 PM
that includes the new parameters with true/false for mark and sweep with the mark_id specified
👍🏻 1
So now i’m at the error of the AWS Key error
e

Elad Lachmi

02/02/2023, 7:33 PM
Yeah, that the key ID you provided doesn't exist
I'm trying to find where it's using the AWS SDK and what it's doing exactly
It's picking up all of the configuration params with the fs. or lakefs. prefix, which seems OK
k

Kevin Vasko

02/02/2023, 7:44 PM
hmmm, yeah that’s odd
maybe a typo?
I’m unsure how the mark option can work but not the sweep
e

Elad Lachmi

02/02/2023, 7:47 PM
Yes, that's very strange. That's why I'm looking through the code, to maybe pick up on something in the difference between how the sweep and mark work
k

Kevin Vasko

02/02/2023, 7:47 PM
makes sense
is this code public? if so i can look at it too
e

Elad Lachmi

02/02/2023, 7:51 PM
Yes, it is
btw: the input validation makes sure that the combination of existence/non-existence/values of the mark, sweep, and mark_id makes sense. So that's probably not the issue
k

Kevin Vasko

02/02/2023, 7:53 PM
yeah, it’s fully into the code at this point, it’s doing its maps and splits and stuff
it’s like once it gets to actually doing the deletes i bet it fails
e

Elad Lachmi

it’s like once it gets to actually doing the deletes i bet it fails
Yeah, that's for sure. The question is: where is it not getting the credentials, getting the wrong credentials, or not setting up the client correctly? (Or some other option, which I'm not even thinking about right now.)
I have a feeling it's not using the session token, but I'm not 100% sure yet
I think that might be the issue I'm not the scala or spark expert around these parts, so I'll pass it on to one of my colleagues and we'll take a look together
k

Kevin Vasko

02/02/2023, 8:12 PM
ahhh makes perfect sense
e

Elad Lachmi

02/02/2023, 8:12 PM
I guess that without the session token it can't look up the access key ID, so it's like it doesn't exist
It would seem that that's the error you'd get if you use an STS key pair without a session token. Which kind of makes sense, if you imagine how they've implemented it, I guess
When you are testing credentials in Boto3: The error you receive may say this,
ClientError: An error occurred (InvalidAccessKeyId) when calling the ListBuckets operation: The AWS Access Key Id you provided does not exist in our records.
but may mean you are missing an aws_session_token if you are using temporary credentials (in my case, role-based credentials).
Looks like that's going to need fixing
k

Kevin Vasko

02/02/2023, 8:19 PM
dang! ok
well glad it wasn’t me
e

Elad Lachmi

02/02/2023, 8:20 PM
this time 😂
k

Kevin Vasko

02/02/2023, 8:20 PM
the next obvious question, do you know how long it might take to fix? I’m sure it’s low priority
haha yup. well technically most of these issues have been errors with environment
e

Elad Lachmi

02/02/2023, 8:21 PM
I'm not sure that it is... I'll have to talk to someone from that team
I think we can compromise on blaming Java and/or AWS
k

Kevin Vasko

02/02/2023, 8:22 PM
lol yup would agree
is there a bug report system that this needs to be documented in?
e

Elad Lachmi

02/02/2023, 8:24 PM
That's a great idea, but first I'd like to verify that I'm not missing something, although I'm relatively sure. We work with GitHub issues for bug reports
k

Kevin Vasko

02/02/2023, 8:25 PM
got it
Not sure if there is a workaround
e

Elad Lachmi

02/02/2023, 8:26 PM
Well, there might be a couple, but they defeat the purpose of using STS, so I wouldn't recommend them
However, we might be able to push this through relatively quickly. I'd like to at least try (once I confirm that that is the issue)
k

Kevin Vasko

02/02/2023, 8:37 PM
seems like an easy fix hopefully
e

Elad Lachmi

02/02/2023, 8:39 PM
I'm not sure what the wider context might be. It might be a more complex issue than I'm thinking, or it might be a different issue. As soon as I have more information, I'll let you know
k

Kevin Vasko

02/02/2023, 8:40 PM
crossing my fingers that it’s simple :)
thanks!
e

Elad Lachmi

02/02/2023, 8:40 PM
same
sure, np
One quick thing I'd like to confirm: in the full log of the run, do you see a line starting with
Use access key ID
k

Kevin Vasko

02/02/2023, 8:43 PM
I do not
e

Elad Lachmi

02/02/2023, 8:44 PM
ok, thanks!
k

Kevin Vasko

02/02/2023, 8:44 PM
should I?
e

Elad Lachmi

02/02/2023, 8:45 PM
I'm not sure, but I was trying to differentiate between two possible paths
k

Kevin Vasko

02/02/2023, 8:46 PM
got it
e

Elad Lachmi

02/02/2023, 10:30 PM
@Kevin Vasko Quick update: it would seem that's indeed the case. Would you be able to open an issue in the repo? Send me a link to it and I'll add all the needed tags
k

Kevin Vasko

02/03/2023, 2:29 PM
done! Thanks!
👍🏻 1
e

Elad Lachmi

02/03/2023, 2:45 PM
Thank you 🙏🏻
k

Kevin Vasko

02/17/2023, 3:58 PM
@Elad Lachmi know of any way to get a timeline for this? weeks? months? potentially never? just trying to find a workaround
j

Jonathan Rosenberg

02/17/2023, 4:20 PM
Hi @Kevin Vasko, there is a related design proposal and an improvement that might be handy to support STS in the Spark metadata client (which GC uses). We’ll examine them in the beginning of next week and hopefully have an estimate on that.
k

Kevin Vasko

02/17/2023, 7:48 PM
Sounds good, thanks. I guess in the meantime, are there any recommendations on running the GC to maybe bypass STS?
a

Ariel Shaqed (Scolnicov)

02/17/2023, 8:06 PM
Hi @Kevin Vasko, Sorry you're having difficulty with this. Indeed, for various reasons we do not currently support STS during sweep. I'm afraid that the only current workaround is not to use STS. We do support straight access key and secret key, or if you're using Spark 3 on Hadoop 3 you might be able to get cross-account delegation. Alternatively, if it's a small installation you might be able to get by with directly listing and then deleting all the swept objects.
k

Kevin Vasko

02/17/2023, 8:06 PM
yeah, it’s a small instance, only 10k objects or so
might write a script to do that
😞 1
👍 1
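(A rough sketch of that kind of script, assuming the marked addresses under _lakefs/retention/gc/addresses/mark_id=<MARK_ID>/ have first been exported to a plain text file of object keys, one per line; the export step and the key prefix are assumptions to verify against the actual mark output.)

# delete each marked object under the repository's storage namespace
while read -r key; do
  aws s3 rm "s3://<bucket>/<storage-namespace-prefix>/${key}"
done < marked-keys.txt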
i

Iddo Avneri

02/17/2023, 8:25 PM
That’s a possibility. Let us know if you run into issues.