# help
p
Hi, I'm having some issues with lakectl local. I created a directory that I want to clone into from the remote. I have 3 branches created correctly.

datalakeadmin@prj-sm-etm-dev-datalake-eval:~/ETM/lakefs/project_folder/lakefs/clone_target_dir$ lakectl local clone lakefs://datalake/main/
directory '/home/datalakeadmin/ETM/lakefs/project_folder/lakefs/clone_target_dir' exists and is not empty

The issue is that when I run ls -l in my local working directory, it returns no objects. It seems the clone 'worked' yet retrieved no data files. Also, when I run lakectl local pull, it works but no actual files are retrieved from the remote. lakectl local list shows a valid link with the server:

lakectl local list:
+-----------+-------------------------+------------------------------------------------------------------+
| DIRECTORY | REMOTE URI              | SYNCED COMMIT                                                    |
+-----------+-------------------------+------------------------------------------------------------------+
| .         | lakefs://datalake/main/ | 4b846d749af8956bd1082f930d58258ff3128e4d0f00c0897b7794f4fafc7d35 |
+-----------+-------------------------+------------------------------------------------------------------+
n
@Patrick Glynn Sorry to hear you're having issues with lakectl local. Can you tell me what you see when you run the command:
lakectl local status ~/ETM/lakefs/project_folder/lakefs/clone_target_dir
p
I get this output:

lakectl local status ~/ETM/lakefs/project_folder/lakefs/clone_target_dir
diff 'local:///home/datalakeadmin/ETM/lakefs/project_folder/lakefs/clone_target_dir' <--> 'lakefs://datalake/4b846d749af8956bd1082f930d58258ff3128e4d0f00c0897b7794f4fafc7d35/'...
diff 'lakefs://datalake/4b846d749af8956bd1082f930d58258ff3128e4d0f00c0897b7794f4fafc7d35/' <--> 'lakefs://datalake/main/'...

No diff found.

I am not sure whether the config.yaml is not linked correctly to the server instance or whether I have misconfigured something minor.
When I run lakectl doctor, it says everything is correctly linked.
n
Do you by chance have more than one lakeFS server up in your environment? Could lakectl be configured to point at an endpoint where the repo and main branch are empty?
Can you verify that this commit
4b846d749af8956bd1082f930d58258ff3128e4d0f00c0897b7794f4fafc7d35
is the top commit of your main branch?
p
Do I use lakectl diff to check that?
n
You can do that via the web UI or via
lakectl log
command
p
I ran the command and I can confirm
4b846d749af8956bd1082f930d58258ff3128e4d0f00c0897b7794f4fafc7d35
is the top commit
n
Can you run
lakectl fs ls lakefs://datalake/main/
p
I ran the command, and it shows the objects! 🙂
I guess ls -l for some reason doesn't show the objects because they are somehow abstracted by lakeFS?
is that the case?
n
No, the objects should have been downloaded and you should have seen them.
What lakectl version are you currently using?
p
lakeFS version: 1.20.0
n
OK - I think I have a suspicion
can you run
ls ~/ETM/lakefs/project_folder/lakefs/clone_target_dir
p
It returns nothing: it just executes and shows a new prompt line.
n
That's very strange - can you tell me a little bit about your system
p
It's an Ubuntu Linux system with a command-line interface only. I am connecting to it over SSH with PuTTY.
n
Also can you run the command
ls -la
p
It returns objects:

total 12
drwxrwxr-x 2 datalakeadmin datalakeadmin 4096 Apr 16 07:36 .
drwxrwxr-x 5 datalakeadmin datalakeadmin 4096 Apr 17 08:00 ..
-rw-r--r-- 1 datalakeadmin datalakeadmin  124 Apr 17 08:25 .lakefs_ref.yaml
n
Is that your clone directory?
Let's do this - first delete the
.lakefs_ref.yaml
file
p
Yes, it's my clone directory.
n
Then I want you to run:
lakectl local clone lakefs://datalake/main/ test
Now try
ls -la test
Did it work?
p
it worked
could you explain what happened? just for personal knowledge and debugging it would be interesting for me to know
and what is the purpose of test after datalake/main? Is that to create another sub directory to point the clone output to?
thanks btw also 👍
n
I'm not exactly sure what happened. Perhaps you ran
lakectl local init
on that folder before you tried to clone? Basically, we do not allow a clone into a directory that is not empty. Clone is an operation that consists of
init
and then
pull
. The init command is the one that creates the ref file, which indicates to lakectl that this is a data directory. The extra parameter (
test
) is an optional argument telling the clone command which path to use. If it is not provided, clone uses the current working directory. If the provided path doesn't exist, we try to create it.
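To illustrate the "directory must be empty" rule described above, here is a minimal shell sketch. It assumes, as in this thread, that a hidden .lakefs_ref.yaml left behind by an earlier init is enough to make the directory count as non-empty even though plain ls shows nothing. The helper names dir_is_empty and clone_if_empty are hypothetical and not part of lakectl.

```shell
# Returns 0 if the directory does not exist yet or has no entries at all.
# `ls -A` also lists dotfiles such as .lakefs_ref.yaml, which plain `ls` hides.
dir_is_empty() {
  [ ! -d "$1" ] && return 0
  [ -z "$(ls -A "$1")" ]
}

# Hypothetical wrapper: only run the clone when the target is really empty,
# otherwise print what is in the way (hidden files included).
clone_if_empty() {
  uri="$1"
  dir="$2"
  if dir_is_empty "$dir"; then
    lakectl local clone "$uri" "$dir"
  else
    echo "refusing to clone: $dir is not empty:" >&2
    ls -A "$dir" >&2
    return 1
  fi
}
```

With these defined, something like clone_if_empty lakefs://datalake/main/ test would either run the clone or show the leftover files that block it.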
p
I think yes, I executed lakectl local init before.
However, is it normal that test is still empty? Shouldn't it contain the imported directory, or do I need to pull the contents?
n
I'm not sure I understand the question
There's still the question of why the status command did not show you the diff between the remote and the local directory. I will look into it further today.
OK, regarding why pull didn't work for you: pull only brings the changes from the remote into the local directory. In your example, after init, the files that did not exist in the local directory are not considered new data from the remote path but rather files that were deleted locally from the data folder. Therefore there is nothing to pull from the remote. The correct command to sync the local directory with the remote in that case is
lakectl local checkout
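For anyone following along, the recovery step above could be wrapped roughly like this. It is only a sketch: the helper names is_linked_dir and resync_dir are hypothetical, and it assumes the ref file is named .lakefs_ref.yaml as seen earlier in this thread. Since checkout syncs the local directory to the remote state, any local edits you want to keep should be committed or copied elsewhere first.

```shell
# Returns 0 if the directory carries the ref file created by `lakectl local init`.
is_linked_dir() {
  [ -f "$1/.lakefs_ref.yaml" ]
}

# Hypothetical helper: show the current status, then sync the local
# directory with the remote using checkout (pull alone only applies
# changes made on the remote since the synced commit).
resync_dir() {
  dir="$1"
  if ! is_linked_dir "$dir"; then
    echo "$dir is not linked; run lakectl local init or clone first" >&2
    return 1
  fi
  lakectl local status "$dir" && lakectl local checkout "$dir"
}
```

For the directory in this thread, that would be resync_dir ~/ETM/lakefs/project_folder/lakefs/clone_target_dir.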