Edmondo Porcu

10/20/2022, 8:06 PM
Is there a plan to support S3 server side encryption ?

Ariel Shaqed (Scolnicov)

10/20/2022, 8:35 PM
Are you asking about encrypting data at rest? You get that from the cloud infrastructure: lakeFS itself relies on other services to store all of its data. For objects and committed data, I believe you get this simply by switching on SSE in the underlying S3 storage bucket. Similarly, if your data is encrypted in your KV store (currently PostgreSQL or DynamoDB), then your staging data will also be encrypted.

Edmondo Porcu

10/24/2022, 8:57 PM
The docs say that lakeFS doesn’t support that on S3

Ariel Shaqed (Scolnicov)

10/26/2022, 1:38 PM
Oh, I think I see, I misunderstood your question! lakeFS supports running on top of encrypted storage, and that was my answer. But you are asking about https://docs.lakefs.io/reference/s3.html, which says "no support for SSE" on the S3 gateway that lakeFS implements! Indeed, the docs are correct: the S3 gateway does not support SSE. Why? Firstly, if you configure lakeFS to run on top of a bucket with default SSE, then of course all of its data will be encrypted. We believe(d) that this is enough for almost all use cases. If you need per-object SSE to support different encryption keys for different objects, note that lakeFS will still need unconditional access to all of these encryption keys, or it will not be able to read those objects from storage. It would be great if you could open an issue asking us to support these APIs. We will probably have many specific questions. I can think of issues around configuring the role for running lakeFS, how to warn about mismatches between lakeFS IAM and AWS KMS IAM permissions, and desired changes to the lakeFS API itself. Thanks!

Edmondo Porcu

10/26/2022, 2:01 PM
Not sure I understood. Can we use lakeFS with our buckets that are encrypted today (with KMS)?

Iddo Avneri

10/26/2022, 6:12 PM
Yes 🙂 - lakeFS supports running on top of encrypted storage 🙂

Edmondo Porcu

10/26/2022, 6:13 PM
That is what I thought, but it is not what the doc seems to say

Iddo Avneri

10/26/2022, 6:18 PM
You can use lakeFS on top of your encrypted buckets. The doc is referring specifically to the S3 gateway. You are welcome to run lakeFS on top of a bucket with default SSE - it will work and all the data will be encrypted. Are you looking for more capabilities above that? (Per Ariel’s response)
You can test this yourself very quickly:
1. Prepare your encrypted storage.
2. Run lakeFS locally against your storage.

Ariel Shaqed (Scolnicov)

10/26/2022, 8:38 PM
Absolutely. That doc refers to lakeFS serving objects via the S3 gateway. I understand that that is not your question. lakeFS should just work on top of a bucket with default Server-Side Encryption configured. All configuration for this case will be on AWS, and lakeFS will just be another program that uses that S3 bucket. Hope this makes more sense!
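For anyone following along, here is a rough sketch of what "default SSE configured on AWS" could look like in Python. It is not lakeFS code: the helper names and the `alias/my-lakefs-key` key alias are made up for illustration, and it assumes boto3 is installed and AWS credentials are available for the one function that actually calls AWS.

```python
import json


def default_sse_config(kms_key_id=None):
    """Build the default-encryption payload for an S3 bucket.

    Without a key ID the rule requests SSE-S3 (AES256);
    with one it requests SSE-KMS using that key.
    """
    if kms_key_id is None:
        default = {"SSEAlgorithm": "AES256"}
    else:
        default = {"SSEAlgorithm": "aws:kms", "KMSMasterKeyID": kms_key_id}
    return {"Rules": [{"ApplyServerSideEncryptionByDefault": default}]}


def enable_default_sse(bucket, kms_key_id=None):
    """Apply the rule to `bucket`. Needs boto3 and AWS credentials."""
    import boto3  # imported lazily so the payload builder runs without it

    s3 = boto3.client("s3")
    s3.put_bucket_encryption(
        Bucket=bucket,
        ServerSideEncryptionConfiguration=default_sse_config(kms_key_id),
    )


if __name__ == "__main__":
    # Hypothetical key alias; this only prints the payload and never calls AWS.
    print(json.dumps(default_sse_config("alias/my-lakefs-key"), indent=2))
```

As noted above, with SSE-KMS the role that runs lakeFS still needs permission to use that key, otherwise it cannot read the objects back.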

Edmondo Porcu

10/26/2022, 8:43 PM
Is it possible to use LakeFS without having files served by the S3 Gateway?

Ariel Shaqed (Scolnicov)

10/27/2022, 3:53 AM
(replying with a copy to the channel, as this is a great question!)
Is it possible to use LakeFS without having files served by the S3 Gateway?
Yes indeed!
• The lakeFS API offers complete access to all lakeFS features.
• lakectl uses those features to provide a CLI.
• lakeFSFS, the lakeFS FileSystem, is a Hadoop FileSystem that offers access to lakeFS without going through s3a and the S3 gateway.
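To make the first option concrete, here is a minimal sketch of building a direct lakeFS API request in Python, bypassing the S3 gateway. The endpoint path and the use of HTTP basic auth with a lakeFS access key pair reflect my understanding of the lakeFS OpenAPI spec and should be checked against your server's version; the function name, host, repository, and path are placeholders.

```python
import base64
import urllib.parse
import urllib.request


def lakefs_object_request(endpoint, access_key, secret_key, repo, ref, path):
    """Build (but do not send) a GET request for one object via the lakeFS API.

    Assumed route: /api/v1/repositories/{repo}/refs/{ref}/objects?path=...
    """
    query = urllib.parse.urlencode({"path": path})
    url = (
        f"{endpoint.rstrip('/')}/api/v1/repositories/"
        f"{repo}/refs/{ref}/objects?{query}"
    )
    # Basic auth header built from the lakeFS access key ID and secret.
    token = base64.b64encode(f"{access_key}:{secret_key}".encode()).decode()
    return urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})


if __name__ == "__main__":
    req = lakefs_object_request(
        "http://localhost:8000", "key", "secret",
        "example-repo", "main", "data/file.parquet",
    )
    print(req.full_url)
    # To actually fetch the object: urllib.request.urlopen(req).read()
```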

Edmondo Porcu

10/27/2022, 3:54 AM
So you can configure Spark to use lakeFS via the Hadoop FileSystem implementation, and this would use encryption?

Ariel Shaqed (Scolnicov)

10/27/2022, 4:00 AM
Absolutely. If you can and are willing to share your requirements, here or in private, I may be able to provide more tailored information.
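As a starting point, here is a small sketch that assembles the Spark settings for the lakeFS Hadoop FileSystem. The property names are the ones I recall from the lakeFS Spark documentation, so please verify them against the version you deploy; the helper names, endpoint, and credentials are placeholders.

```python
def lakefs_spark_conf(endpoint, access_key, secret_key):
    """Return Spark properties that route lakefs:// URIs through lakeFSFS.

    Property names are assumed from the lakeFS Spark docs.
    """
    return {
        "spark.hadoop.fs.lakefs.impl": "io.lakefs.LakeFSFileSystem",
        "spark.hadoop.fs.lakefs.endpoint": endpoint,
        "spark.hadoop.fs.lakefs.access.key": access_key,
        "spark.hadoop.fs.lakefs.secret.key": secret_key,
    }


def as_submit_args(conf):
    """Render the properties as spark-submit `--conf key=value` arguments."""
    args = []
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    return args


if __name__ == "__main__":
    # Placeholder endpoint and credentials.
    conf = lakefs_spark_conf(
        "https://lakefs.example.com/api/v1", "<access-key>", "<secret-key>"
    )
    print(" ".join(as_submit_args(conf)))
```

With this setup, Spark reads and writes the objects directly in the underlying bucket, so the bucket's default SSE applies to them. The S3A connector will most likely still need its own AWS credentials for that direct access.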