# dev
a
Rclone, anyone? The recursive download PR is great: it's an obviously useful and much-requested feature, and we totally need it. But it raises an interesting option. The de facto standard for copying inside and across stores is Rclone. Can we add a lakeFS backend to it? We would get many cool features:
• mount
• sync
• Cross-object-store copying (this is surprisingly hard to do with standard tools, including `lakectl fs {up,down}load`, `aws s3 cp` and others!)
• A well-known CLI for data operations
No promises, but would you find it useful? How often would you use it? And what do you use today instead?
👀 1
b
I think it would be great to configure Rclone to work with lakeFS as the metadata provider and use direct access for the data. It would be like a layer above all the existing Rclone providers (at least the ones we support), which would continue to work and be configured the same way, with an additional lakeFS endpoint for the metadata.
j
Why do we need it? Isn’t the integration with Rclone good enough?
a
An Rclone backend would allow copying without the data having to go through lakeFS.
👍 1
Plus we could implement the optional copy operation and make lakeFS <--> S3 copies use the AWS API.
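To make the "optional copy operation" idea concrete, here is a minimal sketch in Python. It is not Rclone's actual backend API (Rclone backends are written in Go); `MemoryStore`, `ServerSideCopyStore`, `server_side_copy` and `copy_object` are hypothetical names made up for illustration. The point it shows: when the destination offers a server-side copy that understands the source, no object bytes pass through the client; otherwise the copy falls back to streaming.

```python
from typing import Dict


class MemoryStore:
    """Hypothetical in-memory stand-in for an object store."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.objects: Dict[str, bytes] = {}
        self.bytes_streamed = 0  # bytes that passed through the client

    def read(self, key: str) -> bytes:
        data = self.objects[key]
        self.bytes_streamed += len(data)
        return data

    def write(self, key: str, data: bytes) -> None:
        self.bytes_streamed += len(data)
        self.objects[key] = data


class ServerSideCopyStore(MemoryStore):
    """Store that also offers the optional server-side copy operation."""

    def server_side_copy(self, src: MemoryStore, src_key: str, dst_key: str) -> bool:
        # Assumption: a server-side copy only works between compatible stores.
        if not isinstance(src, ServerSideCopyStore):
            return False
        # The "server" moves the bytes; the client never sees them.
        self.objects[dst_key] = src.objects[src_key]
        return True


def copy_object(src: MemoryStore, src_key: str, dst: MemoryStore, dst_key: str) -> str:
    """Prefer the optional server-side copy, fall back to streaming."""
    copier = getattr(dst, "server_side_copy", None)
    if copier is not None and copier(src, src_key, dst_key):
        return "server-side"
    dst.write(dst_key, src.read(src_key))
    return "streamed"


if __name__ == "__main__":
    a, b, c = ServerSideCopyStore("a"), ServerSideCopyStore("b"), MemoryStore("c")
    a.objects["k"] = b"data"
    print(copy_object(a, "k", b, "k2"), a.bytes_streamed)
    print(copy_object(a, "k", c, "k3"), a.bytes_streamed)
```

In this toy model the first copy leaves `bytes_streamed` untouched on both sides, which is the behaviour we would want for lakeFS <--> S3 copies.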
Plus there's an optional "recursive listing", which should speed things up when lakeFS is the source.
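A rough sketch of why a recursive listing can help, again in Python with made-up names (`FlatStore`, `RecursiveStore`, `list_dir`, `list_recursive`, `walk`); it is not Rclone's or lakeFS's real API, and it ignores pagination. The assumption is that the source keeps a flat key space, so one recursive call can replace one listing call per directory.

```python
from typing import Iterable, List, Tuple


class FlatStore:
    """Hypothetical store with a flat key space and per-directory listing only."""

    def __init__(self, keys: Iterable[str]) -> None:
        self.keys = sorted(keys)
        self.calls = 0  # number of listing requests made

    def list_dir(self, prefix: str) -> Tuple[List[str], List[str]]:
        """Return (objects, subdirectories) directly under prefix."""
        self.calls += 1
        objects, dirs = [], set()
        for key in self.keys:
            if not key.startswith(prefix):
                continue
            rest = key[len(prefix):]
            if "/" in rest:
                dirs.add(prefix + rest.split("/", 1)[0] + "/")
            else:
                objects.append(key)
        return objects, sorted(dirs)


class RecursiveStore(FlatStore):
    """Store that also offers the optional recursive listing."""

    def list_recursive(self, prefix: str) -> List[str]:
        self.calls += 1
        return [k for k in self.keys if k.startswith(prefix)]


def walk(store: FlatStore, prefix: str = "") -> List[str]:
    """List every key under prefix, using the recursive listing when offered."""
    list_recursive = getattr(store, "list_recursive", None)
    if list_recursive is not None:
        return list_recursive(prefix)
    found: List[str] = []
    pending = [prefix]
    while pending:  # one list_dir request per directory visited
        objects, dirs = store.list_dir(pending.pop())
        found.extend(objects)
        pending.extend(dirs)
    return sorted(found)


if __name__ == "__main__":
    keys = ["a/1", "a/b/2", "a/b/c/3", "d/4"]
    for store in (FlatStore(keys), RecursiveStore(keys)):
        print(type(store).__name__, walk(store), store.calls)
```

Both stores return the same keys; the difference is only in how many listing requests it takes to get them.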