# help
Yaphet:
Hi, quick question: is there a way in the lakeFS Python API to direct get_object to output a file into a specific directory, instead of the root temp dir configured on the client object?
Vino:
Hi @Yaphet Kebede! Thanks for reaching out. Yes, you can write the output of the Python API to any file and directory path you want. What do you mean by "root temp dir configured on the client object"? Would you mind sharing how you configured that?
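For example, a minimal sketch of writing the output wherever you like (assuming the lakefs_client package; the repository, branch, and local path are hypothetical placeholders):
```python
# Fetch the object from lakeFS, then write its bytes to any local path.
remote = client.objects.get_object(repository="my-repo", ref="main",
                                   path="some-path/some.txt")
with open("/any/local/dir/some.txt", "wb") as f:
    f.write(remote.read())
```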
Yaphet:
Hi @Vino, thanks for your quick response. Sure:
```python
import os
import logging

import yaml
import lakefs_client

logger = logging.getLogger(__name__)

def init_client(lakefs_conf_path):
    logger.debug('initializing')
    configuration = lakefs_client.Configuration()
    # Download objects into a 'data' dir next to this file.
    configuration.temp_folder_path = os.path.join(
        os.path.dirname(os.path.abspath(__file__)), 'data')
    with open(lakefs_conf_path) as stream:
        config_raw = yaml.load(stream, Loader=yaml.FullLoader)

    configuration.username = config_raw["credentials"]["access_key_id"]
    configuration.password = config_raw["credentials"]["secret_access_key"]
    configuration.host = config_raw["server"]["endpoint_url"]
    the_lake = LakeFsWrapper(configuration=configuration)  # user-defined wrapper
    return the_lake
```
Here's my client init. I set
```python
configuration.temp_folder_path = os.path.join(
    os.path.dirname(os.path.abspath(__file__)), 'data')
```
to the correct path, which works: I can see the files being downloaded there when I make a call like
```python
# `location` is the object's path on lakeFS
self._client.objects.get_object(repository=repository, ref=branch, path=location)
```
But what I wanted to do was: say I have `some-path/some.txt` on lakeFS, and I want to download it to a local `<temp_folder_path>/some-path/some.txt`. `client.objects.get_object` wouldn't let me; it just dumps `some.txt` directly under `<temp_folder_path>`, ignoring `some-path`.
Vino:
Right, this makes sense. Let me look into this and get back to you in a bit!
So, @Yaphet Kebede! lakeFS currently does not support bulk upload or download of objects. That is, `get_object` and `upload_object` work with a single file at a time. To achieve what you're looking for, you'd have to list the files under the lakeFS path, create the required subdirectories in your local storage with Python, and then `get_object` each file and write it into its corresponding directory. Let me know if that helps.
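For illustration, here's a minimal sketch of that loop, assuming the generated lakefs_client package; the `download_tree` helper and the repository/branch names in the usage comment are hypothetical:
```python
import os

def download_tree(client, repository, ref, prefix, dest_root):
    """Download every object under `prefix`, recreating the lakeFS
    directory layout beneath `dest_root`."""
    after = ""
    while True:
        # List one page of objects under the prefix.
        listing = client.objects.list_objects(
            repository=repository, ref=ref, prefix=prefix, after=after)
        for obj in listing.results:
            local_path = os.path.join(dest_root, obj.path)
            # Recreate the implied subdirectories (e.g. some-path/) locally.
            os.makedirs(os.path.dirname(local_path), exist_ok=True)
            remote = client.objects.get_object(
                repository=repository, ref=ref, path=obj.path)
            with open(local_path, "wb") as f:
                f.write(remote.read())
        if not listing.pagination.has_more:
            break
        after = listing.pagination.next_offset

# e.g. download_tree(the_lake._client, "my-repo", "main",
#                    "some-path/", configuration.temp_folder_path)
```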