# dev
Rclone, anyone? The recursive download PR is great: it is an obviously useful, much-requested feature, and we totally need it. But it raises an interesting option: the de facto standard for copying inside and across object stores is Rclone. Can we add a lakeFS backend to it? We would get many cool features:
• mount
• sync
• Cross-object-store copying (this is surprisingly hard to do with standard tools, including `lakectl fs {up,down}load`, `aws s3 cp`, and others!)
• A well-known CLI for data operations
No promises, but would you find it useful? How often would you use it? And what do you use today instead?
👀 1
I think it would be great to configure Rclone to work with lakeFS as the metadata provider and use direct access for the data. It would be a layer above all the existing Rclone providers (at least the ones we support), which would continue to work and be configured the same way, with an additional lakeFS endpoint for the metadata.
Why do we need it? Isn’t the integration with Rclone good enough?
An Rclone backend would allow copying without the data having to go through lakeFS.
👍 1
Plus we could implement the optional copy operation and make lakeFS <--> S3 copies use the AWS API.
Plus there's an optional "recursive listing", which should speed things up when lakeFS is the source.
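To make the idea in this thread concrete, here is a rough sketch of "metadata from lakeFS, data straight from the store". It is not Rclone or lakeFS code: `Entry`, `MetadataCatalog`, `plan_copy`, and the `s3://` / `gs://` addresses are all made-up names for illustration, and the in-memory catalog only stands in for a real metadata endpoint.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """Hypothetical metadata record: a logical path and where its bytes live."""
    logical_path: str
    physical_address: str  # e.g. "s3://bucket/key" in the underlying store


class MetadataCatalog:
    """In-memory stand-in for a metadata endpoint (illustration only)."""

    def __init__(self, entries):
        self._entries = {e.logical_path: e for e in entries}

    def stat(self, logical_path):
        # Look up a single object's metadata by its logical path.
        return self._entries[logical_path]

    def list_recursive(self, prefix):
        # One flat listing call instead of walking directory by directory.
        matches = (
            e for e in self._entries.values()
            if e.logical_path.startswith(prefix)
        )
        return sorted(matches, key=lambda e: e.logical_path)


def scheme(address):
    # "s3://bucket/key" -> "s3"
    return address.split("://", 1)[0]


def plan_copy(catalog, logical_path, destination):
    """Decide how to copy one object without routing its bytes through lakeFS."""
    entry = catalog.stat(logical_path)
    if scheme(entry.physical_address) == scheme(destination):
        # Same kind of store on both sides: the store's own copy API can do it.
        return ("server_side_copy", entry.physical_address, destination)
    # Different stores: read directly from the source store, write to the target.
    return ("direct_stream", entry.physical_address, destination)
```

Calling `plan_copy(catalog, path, "s3://...")` for an object whose physical address is also on `s3://` picks the server-side copy path, and anything else falls back to streaming directly between the two stores. Either way only the metadata lookup touches the catalog.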