Jubiiz Audet
07/12/2023, 2:03 PM
I'm trying to export my lakeFS repository lakefs-learning either to my local machine or to a GCS bucket called julien_learning.
I'm trying to use rclone v1.63.0, and have followed the tutorials at https://docs.lakefs.io/howto/export.html and https://docs.lakefs.io/howto/copying.html. I first tried the Docker command, which gave me this error:
2023/07/12 13:52:03 Failed to create file system for "gs://julien_learning/lakefs-backup/": didn't find section in config file
rclone copy failed
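For reference, the Docker command was the one from the export how-to, roughly like this (quoting from memory, so treat the exact image name and environment variable names as approximate; the repo and bucket names are mine):
docker run --rm \
  -e LAKEFS_ACCESS_KEY_ID=<my-access-key> \
  -e LAKEFS_SECRET_ACCESS_KEY=<my-secret-key> \
  -e LAKEFS_ENDPOINT=http://localhost:8000 \
  treeverse/lakefs-rclone-export:latest \
  lakefs-learning \
  gs://julien_learning/lakefs-backup/ \
  --branch=main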
Then I tried using rclone from my command line to have more control over the process. I'm on Windows 10 Pro, running rclone v1.63.0 under WSL (Ubuntu 20.04.6 LTS), and have lakeFS deployed on localhost to try it out. I made a remote for lakeFS following the interactive rclone config, but was never prompted to enter a no-check-bucket option. When trying rclone lsd lakefs:, I get this error:
2023/07/12 09:40:06 ERROR : : error listing: NoSuchBucket: The specified bucket does not exist
status code: 404, request id: , host id:
2023/07/12 09:40:06 DEBUG : 6 go routines active
2023/07/12 09:40:06 Failed to lsd with 2 errors: last error was: NoSuchBucket: The specified bucket does not exist
status code: 404, request id: , host id:
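For reference, the remote that the interactive config produced boils down to something like this (keys redacted; provider = AWS is what I ended up with when picking AWS in the interactive flow):
[lakefs]
type = s3
provider = AWS
env_auth = false
access_key_id = <redacted>
secret_access_key = <redacted>
endpoint = http://localhost:8000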
I have also tried adding the --s3-no-check-bucket flag, but this doesn't change anything.
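That attempt looked like this (-vv is just for the debug output above):
rclone lsd lakefs: --s3-no-check-bucket -vv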
To make sure it's not just me failing to use rclone, I made another remote for my GCS bucket (called lakefs-learning, bad name choice I know) and tried rclone lsd lakefs-learning:julien_learning, which correctly outputs the directories present.
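That remote is a plain GCS one; it amounts to roughly this (assuming env_auth, i.e. letting rclone pick up my gcloud credentials):
[lakefs-learning]
type = google cloud storage
env_auth = true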
The NoSuchBucket error makes me feel like rclone is checking whether the bucket exists when it shouldn't, because of a no-check-bucket flag that I don't have :((.
Any help is much appreciated 🤗
Update: when making the lakefs remote I was on rclone version 1.50.2. When retrying with v1.63.0, in the advanced options, I see:
Option no_check_bucket.
If set, don't attempt to check the bucket exists or create it.
This can be useful when trying to minimise the number of transactions
rclone does if you know the bucket exists already.
It can also be needed if the user you are using does not have bucket
creation permissions. Before v1.52.0 this would have passed silently
due to a bug.
Enter a boolean value (true or false). Press Enter for the default (false).
no_check_bucket> true
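Side note: if I'm reading the rclone docs right, the same option can also be set on an existing remote non-interactively:
rclone config update lakefs no_check_bucket true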
Which is already a step further; however, rclone lsd lakefs: still gives the same error.
Isan Rivkin
07/12/2023, 3:39 PM
I think the issue is the provider=AWS field. The docs show a config like:
[lakefs]
provider = AWS
In newer rclone versions this will not work anymore, because rclone will try to speak with AWS instead of lakeFS, so the field needs to be provider=Other.
I tested my local lakeFS server with GCS and ran rclone copy and other commands with the following config, and it worked:
[lakefs]
type = s3
provider = Other
env_auth = false
access_key_id = <lakefs-access-key-id>
secret_access_key = <lakefs-secret-key>
endpoint = http://localhost:8000
no_check_bucket = true
[gcs]
type = google cloud storage
env_auth = true
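With that config, a copy from a lakeFS branch to your bucket should look roughly like this (I'm assuming your branch is main; rclone treats the repository as the bucket and the branch as a prefix):
rclone copy lakefs:lakefs-learning/main gcs:julien_learning/lakefs-backup/ -P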
Can you please try that?
If that still doesn't work, please attach your rclone config, lakeFS server logs, and lakeFS config (and remember to omit sensitive information 🙂).
Thanks!
Jubiiz Audet
07/12/2023, 7:00 PM
rclone lsd and rclone ls work!
Thanks a ton for the help!
Isan Rivkin
07/12/2023, 8:43 PM