Damon C:
Hi there - I’m trying to spin up lakeFS on ECS with RDS IAM authentication as opposed to hard-coded credentials. The way IAM auth appears to work (per that link) is that a `generate-db-auth-token` API call is made to the RDS service and the returned value (valid for 15 minutes) is used as the password. Peeking at `ConnectDBPool` (and the corresponding pgx library), it seems like I might need to tweak that method to get the token in…curious if others have run into this before or have thoughts. Context: I’m trying to deploy a full stack using AWS CDK and didn’t want to create an environment variable for my ECS task definition with a hard-coded password, as it would show up in the CloudFormation template. 😅
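For reference, roughly the shape of what I think is needed (an untested sketch; the endpoint/user/region values are placeholders, and it assumes the AWS SDK for Go v2's RDS auth helper alongside pgx v4):

```go
package main

import (
	"context"
	"log"

	"github.com/aws/aws-sdk-go-v2/config"
	"github.com/aws/aws-sdk-go-v2/feature/rds/auth"
	"github.com/jackc/pgx/v4/pgxpool"
)

func main() {
	ctx := context.Background()

	// Placeholder values for illustration only.
	const (
		dbEndpoint = "mydb.xxxxxxxx.us-west-2.rds.amazonaws.com:5432" // host:port
		dbUser     = "lakefsadmin"
		region     = "us-west-2"
	)

	awsCfg, err := config.LoadDefaultConfig(ctx)
	if err != nil {
		log.Fatal(err)
	}

	// BuildAuthToken returns a presigned token, valid for 15 minutes,
	// that is used in place of a password.
	token, err := auth.BuildAuthToken(ctx, dbEndpoint, region, dbUser, awsCfg.Credentials)
	if err != nil {
		log.Fatal(err)
	}

	// Set the token as the password directly (rather than embedding it in
	// the connection string) so it doesn't need URL-escaping.
	poolCfg, err := pgxpool.ParseConfig("postgres://" + dbUser + "@" + dbEndpoint + "/postgres?sslmode=require")
	if err != nil {
		log.Fatal(err)
	}
	poolCfg.ConnConfig.Password = token

	pool, err := pgxpool.ConnectConfig(ctx, poolCfg)
	if err != nil {
		log.Fatal(err)
	}
	defer pool.Close()
}
```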
Yoni Augarten:
Hey @Damon C, welcome and thank you for the details. Today, the database credentials are taken from the lakeFS configuration (either a file or environment variables) once the server starts, and they can't be changed afterwards.
What I usually do in similar scenarios is to start a new container with the new credentials once they change. However, since in your case those are only valid for 15 minutes, it doesn't sound like a good idea.
A possible tweak to lakeFS code would be to have Viper (used by us to load the configurations) listen to configuration changes and replace the database connection. For this to work, you would have to configure lakeFS using a configuration file, since environment variables can't be changed. This solution may be simpler than generating the token from within lakeFS.
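For illustration, the watching part might look something like this (a rough sketch only; `swapDBPool` is a hypothetical hook, and I'm assuming the `database.connection_string` key from the lakeFS config file):

```go
package main

import (
	"log"

	"github.com/fsnotify/fsnotify"
	"github.com/spf13/viper"
)

func main() {
	// Must be a file: environment variables can't be watched for changes.
	viper.SetConfigFile("/etc/lakefs/config.yaml")
	if err := viper.ReadInConfig(); err != nil {
		log.Fatal(err)
	}

	viper.WatchConfig()
	viper.OnConfigChange(func(e fsnotify.Event) {
		newConn := viper.GetString("database.connection_string")
		swapDBPool(newConn)
	})

	select {} // block so the watcher stays alive (demo only)
}

// swapDBPool is hypothetical: it stands in for tearing down the old
// pgx pool and reconnecting with the refreshed credentials.
func swapDBPool(connString string) {
	log.Printf("config changed; would reconnect (%d-char connection string)", len(connString))
}
```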
I've opened an issue to reflect your request, so that the team can prioritize it. Of course, you're always welcome to work on this issue if that was your intention 🙂
Just one more thing (sorry for the barrage): we are very close to supporting DynamoDB as the underlying metadata store, so in the very near future you will be able to use lakeFS without RDS.
Damon C:
Thanks @Yoni Augarten! That's helpful. The credentials, while only good for 15 minutes, are just for the initial connection; I think the connection itself can then last for several hours. I'll poke around in the lakeFS code and see how it goes - for now, the “hard problem” is turning out to be all the RDS provisioning/IAM setup. 🙄 😀
Awesome to hear on the Dynamo side as well - definitely makes a lot of sense. :)
Yoni Augarten:
Thank you for clarifying, @Damon C, this makes sense. It seems like someone on stackoverflow solved this: https://stackoverflow.com/questions/66592497/golang-pgx-pool-dynamic-configuration
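If I'm reading that answer right, the trick is pgxpool's `BeforeConnect` hook, which runs before each new server connection - a natural place to mint a fresh token, so the 15-minute window only matters at connect time. Something like this sketch (the constructor and its parameters are hypothetical):

```go
package db

import (
	"context"

	"github.com/aws/aws-sdk-go-v2/aws"
	"github.com/aws/aws-sdk-go-v2/feature/rds/auth"
	"github.com/jackc/pgx/v4"
	"github.com/jackc/pgx/v4/pgxpool"
)

// NewIAMAuthPool is a hypothetical constructor: every new server
// connection gets a freshly minted IAM token as its password.
func NewIAMAuthPool(ctx context.Context, connString, endpoint, region, user string, creds aws.CredentialsProvider) (*pgxpool.Pool, error) {
	cfg, err := pgxpool.ParseConfig(connString)
	if err != nil {
		return nil, err
	}
	cfg.BeforeConnect = func(ctx context.Context, cc *pgx.ConnConfig) error {
		token, err := auth.BuildAuthToken(ctx, endpoint, region, user, creds)
		if err != nil {
			return err
		}
		cc.Password = token
		return nil
	}
	return pgxpool.ConnectConfig(ctx, cfg)
}
```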
Damon C:
Oh sweet! That looks perfect. Thanks for looking that up - I’ve got a custom lakeFS build locally, so it shouldn’t be too hard to wire up. I’ll comment on that issue/open a PR depending on how far I get.
An alternative for this specific case would be to update the configuration to take separate environment variables for the parts of the connection string and build the full connection string on config load. edit: Digging into this, I do see that `ParseConfig` currently recognizes the standard libpq environment variables, so that's quite possibly an option. Continuing down the IAM road for now, though.
Summing up where I was able to get to:
• Using the `PGPASSWORD` environment variable and the connection string together ends up working for me. (With ECS I can populate an environment “secret” from Secrets Manager, so I don’t have to expose the password.)
• I wasn’t able to get IAM auth to work at all (kept getting an obtuse “PAM authentication failed” error). I’m assuming it’s something on my side, but am giving up for now. 😄
Almost got it working (user error: I had a slash instead of a colon in my IAM RDS auth statement), but it looks like passing in no password and setting it in the lakeFS code causes other issues. 😕 Alas, I’ll comment on that ticket - thanks again!
```
time="2022-08-19T06:15:45Z" level=error msg="Failed to migrate" func="pkg/db.(*DatabaseMigrator).Migrate" file="build/pkg/db/migration.go:48" direction=up error="get migrate database driver: pq: empty password returned by client" host=lakefs-[REDACTED].us-west-2.elb.amazonaws.com log_audit=API method=POST path=/api/v1/setup_lakefs request_id=b91296d3-e0ea-4a6a-b418-80730e1c25cb service_name=rest_api
```
Yoni Augarten:
@Damon C, thanks for keeping us updated. I'm not sure I understand how using the separate `PGPASSWORD` variable helps - could you clarify?
Damon C:
@Yoni Augarten Yeah, for sure! A little context first: with ECS, you can set both environment variables and “secret” environment variables. The former are raw strings, while for the latter you provide the AWS ARN of the secret you want ECS to populate the environment variable with. I happened to notice that `pgconn.ParseConfig` supports merging environment variables into the connection string. So in my ECS config, I set `LAKEFS_DATABASE_CONNECTION_STRING` to `postgresql://lakefsadmin@{hostname}:{port}/postgres` (notice: no password) and then I set `PGPASSWORD` in my secrets to the ARN of my database password secret in Secrets Manager. lakeFS/pgx then magically merges the password into my connection string. While this lets me avoid hard-coding my password in my CloudFormation/CDK stack, it does not necessarily support rolling credentials…but I suppose one could figure out a way to restart the container (or similar) on a credential change. Or maybe existing connections would be fine even if the password changes. 🤔
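To make the merge behavior concrete, a minimal sketch (placeholder values; in ECS the `PGPASSWORD` value is injected from Secrets Manager rather than set in code):

```go
package main

import (
	"fmt"
	"log"
	"os"

	"github.com/jackc/pgconn"
)

func main() {
	// Placeholder only: in ECS this arrives via the secret environment variable.
	os.Setenv("PGPASSWORD", "s3cr3t")

	// No password in the string itself; ParseConfig merges PGPASSWORD in,
	// the same way libpq would.
	cfg, err := pgconn.ParseConfig("postgresql://lakefsadmin@db.example.com:5432/postgres")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(cfg.Password) // s3cr3t
}
```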