
Adi Polak

12/11/2022, 3:00 PM
As we get closer to wrapping up 2022, I plan on creating a special community newsletter, and I'd very much appreciate your help with it! 🌟 If you can, please answer one or more of these questions:
:help_lakefs: Which lakeFS feature was the most significant for you? Either developing, using, or trying?
:lakefs: What made you get excited about the project? Do you have any specific use cases/requirements?
πŸ™‰ How did you learn about the project?
Your contribution is very much appreciated! πŸ’œ
πŸ‘€ 1
:jumping-lakefs: 2

Idan Novogroder

12/11/2022, 4:53 PM
The most significant feature I developed this year was the `lakectl doctor` command. I hope it was helpful for new users trying lakeFS for the first time. 🩺 The most significant feature we shipped this year (IMO) is the garbage collector for Azure! ♻️ As a developer, I really can't imagine my day-to-day work without a VCS like Git, and I really hope that 2023 will be the year when people find out that they can (and should) manage their data the same way they manage their code! 2️⃣0️⃣2️⃣3️⃣
πŸ₯‡ 3
:tree-branch: 1
πŸ₯Ό 1
🩺 1
:x-ray: 1
:lakefs: 1
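For anyone who hasn't tried it yet, a minimal invocation looks like this (a sketch, assuming lakectl is installed and already configured):

```shell
# Run lakectl's built-in diagnostics: it validates the local lakectl
# configuration and checks connectivity and credentials against the
# configured lakeFS server.
lakectl doctor
```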

Quentin Nambot

12/11/2022, 7:18 PM
:lakefs: I really like the idea of managing data with branches (with a main branch and some fix/feat branches). I'm working with some teams that manually duplicate data on S3 under different prefixes (like prod/ and preview/), and they have to manually delete/copy/move objects on S3 to promote preview into prod. I am planning to try lakeFS on their use case to ease their work!! πŸ™‰ I recently discovered lakeFS by listening to this talk you gave, @Adi Polak: https://www.databricks.com/dataaisummit/session/chaos-engineering-world-large-scale-complex-data-flow (I found it while looking for material about chaos engineering in the world of data)
πŸ‘πŸ» 1
πŸ™ 2
πŸ™Œ 3
☺️ 1
πŸ™ŒπŸ½ 1
πŸ‘ 2
πŸ‘πŸ½ 1

einat.orr

12/12/2022, 9:24 AM
My favorite feature is the "Source wins" merge strategy. It allows exposing a new version of a repository, replacing most of its data, in one atomic action, while the older version remains available in the last commit. As long as naming conventions are kept, the process is bulletproof from a quality perspective.
πŸ₯‡ 2
πŸ”„ 1
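As a sketch of what that looks like on the command line (repository and branch names are hypothetical), lakectl exposes the strategy via the `--strategy` flag on merge:

```shell
# Merge the ingest branch into main, resolving any conflicting
# objects in favor of the source branch ("source wins"), so the new
# data replaces the old in a single atomic commit
lakectl merge lakefs://my-repo/ingest lakefs://my-repo/main --strategy source-wins
```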

Vino

12/12/2022, 10:43 AM
My favorite feature is using lakeFS hooks to run automated data quality checks on the data. The fact that pre-merge (and similar) actions can run on the daily incremental load of data is super cool. This way, you can test newly ingested data without risking prod data and promote only high-quality data into prod. Hooks also let you integrate your existing test suites, like Great Expectations or Soda, to automate the quality-testing process.
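As a rough illustration, a pre-merge action calling a quality-check webhook might look like this (the action name, hook id, URL, and branch are all made up for the example); action files like this live under `_lakefs_actions/` in the repository:

```yaml
name: pre-merge data quality checks
on:
  pre-merge:
    branches:
      - main
hooks:
  - id: quality_checks
    type: webhook
    properties:
      # Hypothetical endpoint that runs e.g. Great Expectations
      # against the merge source and fails the merge on bad data
      url: http://example.com/run-quality-checks
      timeout: 2m
```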

Ariel Shaqed (Scolnicov)

12/12/2022, 8:46 PM
I'd have to go with KV support on the inside, and cloud on the outside. The switch from Postgres to a general key-value backend means simpler sizing, deployment, and ops, plus it adds many new deployment options, with the possibility of adding even more! But cloud makes sizing, deployment, and ops simplest of all! I may be spoiled, but I've completely stopped bringing up instances on AWS using ECS or any other strategy.
πŸ₯‡ 1
🌟 1