
Cristian Caloian

10/24/2022, 9:16 AM
Hi! Looking at the documentation for the `latest` version, I see that `lakefs import` from an inventory file is not available anymore. I do see the functionality available in `v0.83`. Is this feature being dropped?
I think I found the answer 4323

Ariel Shaqed (Scolnicov)

10/24/2022, 9:39 AM
Indeed. We now have more ways to import, and we believe one of them will always be better than waiting 24h to get an inventory.

Cristian Caloian

10/24/2022, 9:46 AM
I agree with the arguments for removing `lakefs import`. Our use case was uploading data from an S3 bucket on a rolling basis (the data is dumped to S3 by a different service which we don’t control), and in that sense the inventory was convenient, especially as the number of items grows. Do you have a suggestion that would work well in this scenario?

Ariel Shaqed (Scolnicov)

10/24/2022, 9:49 AM
I can see why import would be appealing there. If the volume of data is not too great, Rclone can be a surprisingly efficient tool. But I'll ask around.

Itai Admi

10/24/2022, 10:04 AM
Hey @Cristian Caloian, I guess you're looking for a programmatic solution, i.e. doing that through the UI daily is not feasible, right? I think you can use the same calls that the UI import wizard performs: basically a bunch of paginated IngestRange calls, followed by CreateMetaRange, and then a Commit that points to that MetaRange (with the `source_metarange` parameter). These 3 are plumbing commands that you can use to construct a MetaRange, which is the snapshot of metadata held by a commit. We have an e2e example in our system test, written in Go.
Also - `lakectl ingest` is a simpler and somewhat more intuitive way to import, since the imported objects are simply left uncommitted in the branch of your liking. But it’s less scalable than the first option.
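For illustration, the three-step flow described above (paginated IngestRange, then CreateMetaRange, then a commit that points at the new MetaRange) might be sketched as follows. This is only a sketch under assumptions, not the lakeFS SDK: `ImportClient`, its method names and signatures, `RangePage`, and `InMemoryClient` are all hypothetical stand-ins, and a real client would need to map them onto the actual API calls.

```python
from dataclasses import dataclass
from typing import List, Optional, Protocol


@dataclass
class RangePage:
    """Result of one ingest call (hypothetical shape)."""
    range_id: str                       # ID of the range written for this page
    continuation_token: Optional[str]   # None once the source is exhausted


class ImportClient(Protocol):
    """Hypothetical wrapper around the three plumbing calls; not the real SDK."""

    def ingest_range(self, repo: str, source_uri: str, prepend: str,
                     continuation_token: Optional[str]) -> RangePage: ...

    def create_meta_range(self, repo: str, range_ids: List[str]) -> str: ...

    def commit(self, repo: str, branch: str, message: str,
               source_metarange: str) -> str: ...


def import_from_object_store(client: ImportClient, repo: str, branch: str,
                             source_uri: str, prepend: str = "") -> str:
    """Ingest every page, build one MetaRange, and commit it to the branch."""
    range_ids: List[str] = []
    token: Optional[str] = None
    while True:
        # Step 1: one paginated ingest call per page of source objects.
        page = client.ingest_range(repo, source_uri, prepend, token)
        range_ids.append(page.range_id)
        token = page.continuation_token
        if token is None:
            break
    # Step 2: combine all ranges into a single MetaRange.
    metarange_id = client.create_meta_range(repo, range_ids)
    # Step 3: create a commit that points at that MetaRange.
    return client.commit(repo, branch, f"Import from {source_uri}", metarange_id)


class InMemoryClient:
    """Fake client for trying the flow locally; records calls, no network."""

    def __init__(self, pages: int) -> None:
        self.pages = pages
        self.calls: list = []

    def ingest_range(self, repo, source_uri, prepend, continuation_token):
        index = 0 if continuation_token is None else int(continuation_token)
        self.calls.append(("ingest_range", continuation_token))
        next_token = str(index + 1) if index + 1 < self.pages else None
        return RangePage(range_id=f"range-{index}", continuation_token=next_token)

    def create_meta_range(self, repo, range_ids):
        self.calls.append(("create_meta_range", tuple(range_ids)))
        return "metarange-" + "-".join(r.split("-")[1] for r in range_ids)

    def commit(self, repo, branch, message, source_metarange):
        self.calls.append(("commit", source_metarange))
        return f"commit-of-{source_metarange}"
```

Calling `import_from_object_store(InMemoryClient(pages=3), "repo", "main", "s3://bucket/prefix/")` walks through three ingest calls, one MetaRange creation, and one commit. The repository, branch, and bucket names here are placeholders.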

Oz Katz

10/24/2022, 12:19 PM
I’ve created an issue to support the same import functionality available in the UI as part of `lakectl` - feel free to comment and discuss: https://github.com/treeverse/lakeFS/issues/4450

Cristian Caloian

10/24/2022, 1:54 PM
Thank you everyone for your input! I will look into the approaches you suggested, and I will follow 4450.
> I guess that you look for a programmatic solution, i.e. doing that through the UI daily is not feasible, right?
That is correct! Using an inventory was convenient in the sense that I could import only the new data added to the bucket since the last import, without listing all the items. At least that was my understanding.