#lakefs-for-beginners
CC

11/16/2022, 6:11 PM
Hi. For some reason, the list_repository_runs(repository) API call (in Python ActionsApi) is returning: Invalid value for event_type (post-create-tag), must be one of ['pre_commit', 'pre_merge']. I want to see post-create-tag entries, and they show up in the GUI. Does the API block me from accessing them? I'm using LakeFS Client 0.83.3 and LakeFS 0.83.3. Bigger picture: I'm trying to use information saved from WebHooks to replicate a LakeFS instance. Is that a reasonable approach? Thanks, Chuck
Itai Admi

11/16/2022, 6:18 PM
Hey CC, thanks for reaching out. I think it’s a bug: this line in the API limits the event_type. Let me open an issue for it.
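For readers following along, here is a minimal sketch of the kind of enum check that produces the error above. It is not the lakeFS client's code: `ActionRunSketch`, `ALLOWED_EVENT_TYPES`, and `parse_runs` are hypothetical names, and the real client's exception type and internals may differ. It only illustrates why a single run with an unlisted event_type can make the whole list call fail on the client side, even though the server (and the GUI) have the data.

```python
# Illustrative only: mimics how a generated API model might validate enum values.
# All names here are hypothetical; this is not code from the lakeFS Python client.
ALLOWED_EVENT_TYPES = ["pre_commit", "pre_merge"]


class ActionRunSketch:
    def __init__(self, run_id, event_type, allowed=ALLOWED_EVENT_TYPES):
        # Reject any value that is missing from the allowed list.
        if event_type not in allowed:
            raise ValueError(
                f"Invalid value for event_type ({event_type}), "
                f"must be one of {allowed}"
            )
        self.run_id = run_id
        self.event_type = event_type


def parse_runs(raw_runs, allowed=ALLOWED_EVENT_TYPES):
    """Deserialize every run; one unknown event_type fails the whole call."""
    return [ActionRunSketch(r["run_id"], r["event_type"], allowed) for r in raw_runs]
```

Under this assumption, the fix amounts to extending the allowed list (and regenerating whatever depends on it) so that post-* events deserialize too.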
6:20 PM
Also - what do you mean by replicate? How do hooks help you achieve that?
6:23 PM
Follow the issue for resolution.
CC

11/16/2022, 6:32 PM
I want a slave lakefs instance to receive each change made to a master instance. For that purpose, it's very useful to know that, at time T, repo1 spun off branch1 from main, for commit ID c5f57237ae91609b7e20adb3ced4f1bd9cf6175d8391b9d39e2ca759259f6357. Otherwise, I have to figure out which branch spun off of which branch by something between complex logic and guesswork. If there are 3 branches, no problem. But if there are 300 branches, the combinatorics get nasty.
Itai Admi

11/16/2022, 6:36 PM
Got it, sounds like an interesting use case. What's the purpose of tracking the changes in lakeFS in a different lakeFS instance?
CC

11/16/2022, 7:53 PM
The idea is to have an instance in another region, so it's local to a distant region.
8:04 PM
Thanks for investigating and opening the ticket. Do you have any idea how long it will take to resolve? (I don't know if someone can change a file and it's done in short order, or if it won't be considered till a preset date that is months away.)
Itai Admi

11/16/2022, 8:45 PM
It’s not as simple as changing that file, I’m afraid. There are a bunch of generated client/server files that go along with it. Having said that, it should be pretty straightforward to fix. I assume that we can fix it and release lakeFS during next week. Does that work for you?
8:48 PM
Regarding relying on hooks to replicate the data: what happens if some hook fails (e.g. a transient network issue)? Are the 2 lakeFS instances out of sync? For how long? I guess the best way to replicate would be either to replicate the kv-store (postgres/dynamodb), or to use lakectl refs-dump … and lakectl refs-restore …, which do exactly that and work on the repository level.
CC

11/16/2022, 9:25 PM
The problem with replicating the kv-store is that the replicated kv-store uses the same object storage, so you're still accessing S3 in the original region. Based on discussion with Ariel and Oz, it should be possible to replicate programmatically, triggered by webhooks, but ignoring the webhook content (to avoid being derailed by a temporary web outage). Not sure what happens to the actions recorded in LakeFS itself if there is a web outage; I'm guessing that's not a problem. Regarding the other issue: Next week is certainly reasonable. It's easy to get used to pushing a button and having things happen right away, but that's not the way the real world works. If it did, none of us would have a job: Software doesn't write itself, and bugs don't fix themselves. 😃
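The "triggered by webhooks, but ignoring the webhook content" idea above can be sketched roughly as follows. This is an assumption-laden illustration, not a lakeFS feature: `plan_sync`, `on_webhook`, `fetch_source`, and `fetch_target` are hypothetical names, and the two fetch callables stand in for whatever code lists branch heads on each instance. It covers branch pointers only, not copying the underlying objects between regions.

```python
from typing import Callable, Dict, List, Tuple

Refs = Dict[str, str]  # branch name -> head commit ID


def plan_sync(source: Refs, target: Refs) -> List[Tuple[str, str, str]]:
    """Return (action, branch, commit_id) steps that make target match source."""
    steps = []
    for branch, commit in sorted(source.items()):
        if branch not in target:
            steps.append(("create", branch, commit))
        elif target[branch] != commit:
            steps.append(("update", branch, commit))
    # Branches that exist only on the target were deleted on the source.
    for branch in sorted(set(target) - set(source)):
        steps.append(("delete", branch, target[branch]))
    return steps


def on_webhook(_payload: dict,
               fetch_source: Callable[[], Refs],
               fetch_target: Callable[[], Refs]) -> List[Tuple[str, str, str]]:
    """Use the webhook purely as a trigger: ignore its payload, diff current state."""
    return plan_sync(fetch_source(), fetch_target())
```

Because each run compares full state rather than replaying one event, a webhook lost to a temporary outage would only delay the sync until the next trigger instead of leaving the instances permanently diverged.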
3:54 PM
FYI, not surprisingly, get_run(repo, run_id) has the same issue.
Itai Admi

11/17/2022, 3:56 PM
Thanks for the update. That makes sense: they both use the same API object, which needs updating.
10:58 AM
Hey @CC, lakeFS release v0.86.0 should fix your issue. Let us know if that’s not the case.