Ankit Srinivas

08/15/2022, 8:14 PM
Hey everyone! It's your Monday updates! :jumping-lakefs:
📖 Here is a question for you, what kind of use-cases do you have regarding lakeFS?
• One Spark job, Many Data Sources – How to Easily Use lakeFS with Spark
• How Karius used lakeFS to comply with FDA regulations for the disease diagnostic studies.
• Data Mesh: What is it and What Does it Mean for Data Engineers?
:virtual-meeting: How would you like to learn about developing Spark ETL pipelines?
• Develop Spark ETL pipelines with no risk against production data - Aug 25th
• State of Data Engineering meetup - Sept 9th
💻 Finally, we can't complete the update without some lakeFS technical releases:
• lakeFS v0.70.1 is released :sunglasses_lakefs:
:sunglasses_lakefs: 5
💡 2
:heart_lakefs: 4

Adi Polak

08/16/2022, 10:49 AM
I love this question -
what kind of use-cases do you have regarding lakeFS?
💭 Some thoughts: one of the problems of scalable infra is the lack of tools that enable best practices for every data practitioner. We see it through various movements, such as the new data stack, which is more practical, and the data mesh, which is more conceptual. Since lakeFS is an infra solution, it provides a lot of value for any data practitioner, and I believe it significantly changes the way we work with data today and addresses a pain point for many.
1️⃣ One of the use-cases is enabling a testing environment with production data, where engineers can run their whole data pipelines, however complex, and fully test them, including system testing and integration testing at the real scale and variety of the data. It's a game changer that enables faster CI (continuous integration) of software while building trust and confidence in the code we just developed. It shortens the software development lifecycle dramatically while significantly improving engineers' lives. 💜
Of course, there are more use-cases, and I would love to learn more from everyone here :heart_lakefs: