# help
Logan Hallee:
Hello! Just had a great lakeFS demo. I'm very new to data versioning and am interested in integrating it into my workflow. However, I am super attached to Hugging Face datasets; it is how my entire lab passes data to each other. It seems possible to version the Hugging Face cache files with lakeFS, which would allow for added versioning details and rollback. Has anyone thought about this or gotten it to work?
Oz Katz:
Hey @Logan Hallee! Good news: Hugging Face datasets should be able to work natively with lakeFS! I've opened a PR to document exactly how to read/write datasets from lakeFS. However, this requires a small change to a Python library (`lakefs-spec`), and that change is still open. Once it's merged and released you'll be able to simply use `lakefs://` file URIs when saving/loading datasets 🤗🤗🤗 I'll update this thread once both are merged; I don't believe it will take long.
Logan Hallee:
Awesome @Oz Katz. This is a great feature! I'll try it out once merged
Oz Katz:
Aaaand it's live 🙂 Documentation is available in the lakeFS docs.
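For anyone landing on this thread later, here is a rough sketch of what saving and loading through `lakefs://` URIs might look like. It assumes `datasets` and `lakefs-spec` are both installed, that `lakefs-spec` makes the `lakefs://` scheme available to fsspec, and that lakeFS credentials are already configured in whatever way `lakefs-spec` expects. The helper names (`lakefs_uri`, `save_and_reload`) and the repository, branch, and path values are made up for illustration; check the lakeFS docs for the authoritative version.

```python
def lakefs_uri(repository: str, ref: str, path: str) -> str:
    """Build a URI of the form lakefs://<repository>/<ref>/<path>.

    Hypothetical helper, not part of lakeFS or lakefs-spec.
    """
    return f"lakefs://{repository}/{ref}/{path.lstrip('/')}"


def save_and_reload(dataset, uri: str):
    """Write a Hugging Face dataset to lakeFS and read it back.

    Assumes lakefs-spec is installed so that fsspec can resolve
    lakefs:// URIs, and that credentials are configured.
    """
    # Imported lazily so lakefs_uri() works without `datasets` installed.
    from datasets import load_from_disk

    dataset.save_to_disk(uri)    # writes the dataset files under the branch
    return load_from_disk(uri)   # reads them back from the same location
```

Usage would be something like `save_and_reload(my_dataset, lakefs_uri("example-repo", "main", "datasets/demo"))`, followed by a lakeFS commit on that branch so the lab can pin or roll back to that exact version.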