# help
u
Hi! I'm planning on deploying lakeFS on our GCP storage so that we can version-control our data in GBQ. How would I go about doing this, and will I have to modify any existing code, or make changes every time I switch branches? Thanks!
u
Hi @Stéphane Burwash, here you can find information on how to set up lakeFS on GCP: https://docs.lakefs.io/deploy/gcp.html. lakeFS uses Google Cloud Storage as the underlying storage and provides UI, API, and S3-compatible interfaces; GBQ and your code will query one of these interfaces. How does the code work with the data today? Using the GBQ SDK?
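(For reference, here's a minimal sketch of talking to lakeFS through its S3-compatible gateway from Python. The endpoint URL, repository name `my-repo`, branch names, and credentials are placeholders for illustration, not values from this thread.)

```python
# Read/write objects through lakeFS's S3-compatible gateway using boto3.
# lakeFS maps s3://<repository>/<branch>/<path>, so "switching branches"
# is just a different key prefix - no other code changes are needed.
import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="https://lakefs.example.com",  # your lakeFS server (assumption)
    aws_access_key_id="AKIA...",                # lakeFS access key, not an AWS key
    aws_secret_access_key="...",
)

# Write to the "main" branch of repository "my-repo" (hypothetical names).
s3.put_object(Bucket="my-repo", Key="main/raw/events.csv", Body=b"id,value\n1,42\n")

# Read the same object back.
obj = s3.get_object(Bucket="my-repo", Key="main/raw/events.csv")
print(obj["Body"].read().decode())
```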
u
(Sorry in advance if I don't answer all the questions, I'm new to the DE game.) Thanks @Barak Amar! Currently our main interaction with the data is through GBQ. It's connected to an ELT pipeline using Stitch or Meltano and loaded (with a loader) into our project. From there we mostly use the integrated GBQ SQL to query the data (via the UI). Does this make sense?
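(For context, a query run in the GBQ UI looks like this when issued from Python with the official google-cloud-bigquery client; the project, dataset, and table names below are hypothetical.)

```python
# Run a BigQuery SQL query from Python, equivalent to running it in the UI.
from google.cloud import bigquery

client = bigquery.Client(project="my-gcp-project")  # placeholder project id

query = """
    SELECT user_id, COUNT(*) AS events
    FROM `my-gcp-project.analytics.events`
    GROUP BY user_id
    LIMIT 10
"""

# query() submits the job; result() waits for completion and returns rows.
for row in client.query(query).result():
    print(row.user_id, row.events)
```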
u
Hope I got it right (and I may have it wrong): GBQ is a warehouse, loaded with Stitch, that your application queries. lakeFS is a layer above the object store that can be used by data processing/query/transformation tools; it provides versioning at the storage level, not inside the GBQ warehouse itself. There is an option to query external data from GBQ where the data source is object storage - I haven't explored this one yet - and in that case you would need to load the data into lakeFS for BQ to run federated queries over it. If there is a use case to manage data extracted from GBQ, lakeFS can be used to store the results, but you will probably need a different way (not GBQ) to read that data back from lakeFS.
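(A rough sketch of that last idea - storing query results extracted from GBQ in a lakeFS branch via the S3 gateway. Again, the endpoint, credentials, repo `my-repo`, branch `experiment`, and table names are all assumptions for illustration.)

```python
# Export a BigQuery result set and write it to a lakeFS branch as CSV.
import csv
import io

import boto3
from google.cloud import bigquery

bq = bigquery.Client(project="my-gcp-project")  # placeholder project id
rows = bq.query(
    "SELECT user_id, events FROM `my-gcp-project.analytics.daily`"
).result()

# Serialize the result set to CSV in memory.
buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow([field.name for field in rows.schema])  # header row
for row in rows:
    writer.writerow(row.values())

s3 = boto3.client(
    "s3",
    endpoint_url="https://lakefs.example.com",  # your lakeFS server (assumption)
    aws_access_key_id="AKIA...",                # lakeFS credentials
    aws_secret_access_key="...",
)

# Writing under the "experiment/" prefix targets the "experiment" branch;
# the same code writes to any branch by changing the prefix.
s3.put_object(
    Bucket="my-repo",
    Key="experiment/exports/daily.csv",
    Body=buf.getvalue().encode("utf-8"),
)
```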
u
Ok awesome, thank you so much! I'll look into what's best for our architecture and get back to you guys soon 😉