Anybody know a good Python library for file transfer to/from lakeFS? I am looking for features like the ones in rclone:
• parallel transfer: a must, as we have a lot of little files
• checksums, to avoid re-downloading files that have not changed (useful for little files)
08/01/2023, 9:05 AM
Hey @HT, I'm not aware of any specific tool that can achieve that out of the box. However, since lakeFS exposes an S3-compatible API, any tool that will solve that for S3 can solve it for lakeFS. For example, Boto's transfer configurations allow you to adjust concurrency levels. That being said, I wasn't able to find a tool that does exactly what you wanted for S3, so it may be required to do some heavy lifting on your own.
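For what it's worth, here is a minimal sketch of what that heavy lifting could look like with boto3 pointed at the lakeFS S3 gateway. It is not an existing library: the helper names (`make_client`, `local_md5`, `fetch_if_changed`, `sync_down`), the worker count, and the endpoint and credentials you pass in are all hypothetical. It also assumes the ETag returned in the listing is the hex MD5 of the object content, which is worth verifying on your installation; S3-style multipart uploads carry a different ETag, so such objects would simply be downloaded again every time. Through the gateway, the bucket is the repository name and the key starts with the branch name.

```python
import hashlib
import os
from concurrent.futures import ThreadPoolExecutor


def make_client(endpoint_url, access_key, secret_key):
    """Build an S3 client aimed at a lakeFS endpoint (values are placeholders)."""
    import boto3  # imported lazily so the helpers below work without it

    return boto3.client(
        "s3",
        endpoint_url=endpoint_url,
        aws_access_key_id=access_key,
        aws_secret_access_key=secret_key,
    )


def local_md5(path):
    """Return the hex MD5 of a local file, or None if the file does not exist."""
    if not os.path.isfile(path):
        return None
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def fetch_if_changed(client, bucket, key, etag, dest_dir):
    """Download one object unless the local copy already matches the ETag."""
    dest = os.path.join(dest_dir, *key.split("/"))
    # Assumption: the ETag is the MD5 of the content (not true for multipart uploads).
    if local_md5(dest) == etag.strip('"'):
        return key, "skipped"
    os.makedirs(os.path.dirname(dest) or ".", exist_ok=True)
    client.download_file(bucket, key, dest)
    return key, "downloaded"


def sync_down(client, bucket, prefix, dest_dir, workers=16):
    """List everything under prefix and fetch changed objects in parallel."""
    items = []
    paginator = client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        items.extend((obj["Key"], obj["ETag"]) for obj in page.get("Contents", []))
    # One shared client is used by all worker threads.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(
            lambda item: fetch_if_changed(client, bucket, item[0], item[1], dest_dir),
            items,
        )
        return dict(results)
```

Usage would be something like `sync_down(make_client(endpoint, key, secret), "my-repo", "main/data/", "./local")`, where the repository, branch, and paths are made up. A thread pool is used because with many small files the time goes into per-request latency rather than bandwidth, so boto3's per-file multipart concurrency would not help much here.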
08/01/2023, 9:07 AM
So far, that's what I got from a lot of googling and ChatGPT over the last few months ....
Was trying here in case I missed the needle in the haystack 😉