hi, i have another question regarding a production...
# help
m
hi, i have another question regarding a production environment. is it possible to scale lakeFS, and is it necessary? if yes, are there some config examples for a Docker installation and some guidelines?
a
Hi! You can already scale lakeFS quite a bit today. As with any engineering question, there are many dimensions along which to scale. Do you have an estimate of your required scale? Regardless, you will be pleased to know we are currently working on moving all data plane operations out of lakeFS and directly onto the underlying storage; that will of course require using a dedicated client. Further out on our roadmap, we plan to remove Postgres, which can be a blocker for very (very...) large deployments.
https://docs.lakefs.io/quickstart/installing.html#using-docker-compose details setup using Docker Compose. I would just start with one beefy machine and see how that goes. Once you're ready to move to a larger deployment, see https://docs.lakefs.io/deploying/install.html#kubernetes-with-helm for a Helm chart. Please reach out with any difficulties you have, or if you need further help planning!
m
hi, thanks for your answer. what do you mean by a beefy machine? 😄 50 TB of data in the object store and roughly 50 users using it
a
50 TB is not a lot of data to manage 🙂 Can you estimate how many operations/s your users will generate at peak? If not, I'd start with, say, a c5.9xlarge (so you get 10 Gbps of dedicated network bandwidth) and see if I need to scale up (or down...) from there.
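If it helps to put rough numbers on that, here is a back-of-the-envelope sketch in Python. It assumes all data traffic flows through a single lakeFS machine with a 10 Gbps link, that the link is the only bottleneck, and that load is split evenly across users; the function names are made up for illustration and are not part of lakeFS.

```python
def per_user_throughput_mb_s(link_gbps: float, concurrent_users: int) -> float:
    """Evenly split a network link across users; returns MB/s per user (decimal units)."""
    link_mb_s = link_gbps * 1000 / 8  # gigabits/s -> megabytes/s
    return link_mb_s / concurrent_users


def hours_to_transfer(data_tb: float, link_gbps: float) -> float:
    """Hours needed to move data_tb terabytes over a fully saturated link."""
    link_tb_s = link_gbps / 8 / 1000  # gigabits/s -> terabytes/s
    return data_tb / link_tb_s / 3600


# Assumed worst case: all 50 users transferring at the same time.
print(per_user_throughput_mb_s(10, 50))

# Assumed extreme: reading the entire 50 TB once through the link.
print(round(hours_to_transfer(50, 10), 1))
```

Real traffic is bursty and metadata operations are tiny, so treat these as upper bounds for planning and measure actual peak operations/s before resizing.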
m
ok thanks that helps, so i know at least a starting point
a
Sure, glad to have helped!