lakeFS is an open-source data version control system that transforms your object storage into Git-like repositories.

lakeFS

Is this a hint that lakeFS might be able to work with BigQuery out of the box? <https://hevodata.com/learn/federated-query-bigquery/#w>

Probably not out of the box :slightly_smiling_face: but perhaps this paves the way

Out of the box seems to me "unrealistic", although I just submitted my first PR a couple of days ago :slightly_smiling_face:

I imagine that to implement de-duplication, lakeFS must be adopting some form of content-addressable storage, while maintaining an index that allows going from "commit + file name" to the file's physical location on the object storage.

Maybe there's a way to expose this "book index" so that it can be consumed from BigQuery, Athena, Presto, etc.?
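
To make the idea concrete, here is a rough toy sketch of what I have in mind. It is only my guess at the general shape, not how lakeFS is actually implemented: the class `TinyVersionedStore`, its methods, and the `data/<sha256>` address layout are all hypothetical names made up for illustration. Blobs are stored under an address derived from their content hash, so identical files are stored once, and a per-commit index maps a logical path to that physical address. The index can then be dumped as JSON lines, a format that external query engines can generally read.

```python
import hashlib
import json


class TinyVersionedStore:
    """Toy content-addressable store (hypothetical, NOT the lakeFS design)."""

    def __init__(self):
        self.blobs = {}    # physical address -> bytes (stand-in for an object store)
        self.commits = {}  # commit id -> {logical path: physical address}

    def put_blob(self, data: bytes) -> str:
        # The address is derived from the content, so identical data
        # always lands on the same address and is stored only once.
        digest = hashlib.sha256(data).hexdigest()
        address = f"data/{digest}"  # assumed layout, purely illustrative
        self.blobs.setdefault(address, data)
        return address

    def commit(self, commit_id: str, files: dict) -> None:
        # Record the "commit + file name" -> physical location index.
        self.commits[commit_id] = {
            path: self.put_blob(data) for path, data in files.items()
        }

    def resolve(self, commit_id: str, path: str) -> str:
        return self.commits[commit_id][path]

    def export_manifest(self, commit_id: str) -> list:
        # One JSON object per line, sorted by logical path.
        return [
            json.dumps({"commit": commit_id, "path": path, "physical_address": addr})
            for path, addr in sorted(self.commits[commit_id].items())
        ]


store = TinyVersionedStore()
store.commit("c1", {"a.csv": b"1,2\n", "b.csv": b"3,4\n"})
store.commit("c2", {"a.csv": b"1,2\n", "b.csv": b"5,6\n"})  # a.csv unchanged

for line in store.export_manifest("c2"):
    print(line)
```

In this toy, the unchanged `a.csv` resolves to the same physical address in both commits, so only three blobs are stored for four logical files. The exported manifest is the kind of "book index" a query engine could use to find the right physical objects for a given commit.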