# help
k
Ok stupid question…. Say I have a bucket s3://lakefs/projects/. I have an ec2 instance in a VPC that is hosting the lakefs software. I gave my ec2 instance an IAM_ROLE to access s3://lakefs/projects, so my EC2 instance can do things with s3://lakefs/projects/. Now when my users interact with lakefs, are they interacting with the permissions that lakefs has? Or are they interacting with s3 outside of the context of lakefs, and lakefs is just acting as a “mediator” that points users to the s3 buckets?
g
Hey @Kevin Vasko, not a stupid question at all. lakeFS has its own authorization model, so basically users will need to be authorized by lakeFS to run lakeFS commands. If you read/write objects via lakeFS (using the API or the S3 gateway), lakeFS will use the IAM_ROLE you gave it.
k
@Guy Hardonag Ok so essentially I don't need to provide each user access to the s3 bucket?
The ec2 server I assume will “proxy” the requests to s3? Like what happens if a user downloads/uploads a file? I assume the file gets sent to lakefs server then uploaded/downloaded to s3?
g
It depends on your case. If you are using the lakeFS-specific Hadoop FileSystem, you will also need to give it access to the bucket. You can read more about it here: https://docs.lakefs.io/integrations/spark.html#access-lakefs-using-the-lakefs-specific-hadoop-filesystem
k
The reason I ask is because we have a VPC gateway to s3 along with a site to site VPN to the VPC. The only way for users to directly interact with s3 over the vpn is through a VPC s3 interface.
g
How does the user interact with lakeFS? How does lakeFS interact with s3?
k
My intent is to have users interact with lakefs over the VPN (they can navigate and view the UI). I assume they just use their tokens with the lakefs cli to interact with it.
y
Hey @Kevin Vasko, for simple use cases, your user doesn't need access to S3. Like you described, lakeFS will proxy the requests to S3.
These use cases include using the UI, the lakectl command, the AWS CLI when configured to interact with lakeFS, and more.
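For the AWS CLI case, a rough sketch of what "configured to interact with lakeFS" could look like: the CLI is pointed at the lakeFS server with `--endpoint-url`, the repository takes the place of the bucket, and the branch is the first part of the key. The endpoint `https://lakefs.example.com`, the repository `projects`, the branch `main`, and the profile name `lakefs` are all placeholders, not values from this thread. The profile is assumed to hold lakeFS credentials rather than AWS ones. The helper only builds the command and does not run it.

```python
import shlex


def lakefs_s3_uri(repository, ref, path=""):
    """Build an S3-gateway style URI.

    The repository stands in for the bucket, and the branch (or other ref)
    is the first component of the object key.
    """
    return f"s3://{repository}/{ref}/{path.lstrip('/')}"


def aws_cli_ls_command(endpoint_url, repository, ref, path="", profile="lakefs"):
    """Return the argument list for an `aws s3 ls` call aimed at lakeFS.

    `profile` is a hypothetical AWS CLI profile that stores a lakeFS
    access key and secret, so the user needs no IAM access to the bucket.
    """
    return [
        "aws",
        "--profile", profile,
        "--endpoint-url", endpoint_url,  # the lakeFS server, not AWS S3
        "s3", "ls",
        lakefs_s3_uri(repository, ref, path),
    ]


# Placeholder values for illustration only.
cmd = aws_cli_ls_command("https://lakefs.example.com", "projects", "main", "datasets/")
print(shlex.join(cmd))
```

Since the request goes to the lakeFS endpoint, only the lakeFS server has to reach S3 through the VPC gateway with its IAM role; the user just needs to reach lakeFS over the VPN.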