# help
o
Hi all, we're using the high-level Python SDK to interact with lakeFS and have tagged a release. Is there documentation on downloading the objects in the tag in parallel using the SDK? I saw this can be done with `lakectl fs download`; is there a similar option with the SDK? Any suggestions on the best approach? Second question: is mounting a remote lakeFS repo to a local directory with Everest only available for Enterprise users? Thanks!
At the moment, this is what we have:
import os
from tqdm import tqdm

# Download data (assumes `objects`, `tag`, and `local_training_data_storage`
# are defined earlier in our script)
with tqdm(total=len(objects), desc="Downloading files", unit="file") as pbar:
    for obj in objects:
        print(f"Downloading {obj.path}...")

        # Define the local file path
        local_file_path = os.path.join(local_training_data_storage, obj.path)

        # Create the directory if it does not exist
        os.makedirs(os.path.dirname(local_file_path), exist_ok=True)

        try:
            with (
                open(local_file_path, "wb") as local_file,
                tag.object(obj.path).reader("rb") as r,
            ):
                local_file.write(r.read())
        except Exception as e:
            print(f"Failed to download {obj.path}: {e}")
        pbar.update(1)
a
Hi @Oscar Wong, welcome! I think you've pretty much nailed down how to perform multiple concurrent downloads. There is no magic that "our" solutions perform. You might also use `lakectl local`, which provides an experience somewhat closer to "Everest". Everest itself is a feature of lakeFS Enterprise.
o
Thank you.
n
@Oscar Wong You can also use multiprocessing to run the downloads concurrently.
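Something along these lines, as an untested sketch: it reuses the `objects`, `tag`, and `local_training_data_storage` variables from your snippet above, and it uses a thread pool rather than multiprocessing since the downloads are I/O-bound (assuming the lakeFS client is fine to share across threads; a ProcessPoolExecutor would be the alternative if not).

import os
from concurrent.futures import ThreadPoolExecutor, as_completed

from tqdm import tqdm

def download_one(obj):
    # Mirror the remote path under the local storage directory
    local_file_path = os.path.join(local_training_data_storage, obj.path)
    os.makedirs(os.path.dirname(local_file_path), exist_ok=True)

    # Stream the object from the tag into the local file
    with (
        open(local_file_path, "wb") as local_file,
        tag.object(obj.path).reader("rb") as r,
    ):
        local_file.write(r.read())
    return obj.path

# Submit all downloads to a pool of worker threads and update the
# progress bar as each one completes
with ThreadPoolExecutor(max_workers=8) as pool:
    futures = [pool.submit(download_one, obj) for obj in objects]
    with tqdm(total=len(futures), desc="Downloading files", unit="file") as pbar:
        for future in as_completed(futures):
            try:
                future.result()
            except Exception as e:
                print(f"Download failed: {e}")
            pbar.update(1)

Tune max_workers to your bandwidth and object sizes; for many small objects a larger pool usually helps.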
πŸ‘ 1
πŸ‘πŸΌ 1
gratitude thank you 1