# help
u
Hi Team, I have two questions, and apologies in advance if I am getting something wrong.
1. When I import something from an S3 bucket and merge the changes to main, the destination contents get overwritten even when there are no conflicts (no matching prefixes), and my main branch gets completely wiped out. How do I prevent this from happening? I am expecting git-like behavior where only the changes from the source overwrite the destination.
2. Where does the metadata for lakeFS get stored? I am using the DynamoDB backend with S3 block storage, but I do not see any files getting stored in S3, and I don't remember configuring any S3 in my Helm config.
u
I might be getting something wrong, as it makes sense to have the data wiped out when the import branch is not branched out from main. I confused myself because import is not an operation in git 😞
u
Hi @Narendra Nath,
1. If you import to the same branch twice, the second import will override the first (the reason for that is to support the case of periodic imports). To import as an addition (and not remove the "removed" files), you will need to import on separate branches.
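For illustration, a minimal sketch of that separate-branch workflow with lakectl; the repository, branch, and bucket names are placeholders, and the exact import flags may differ between lakeFS versions (check `lakectl import --help` on your version):

```bash
# Create a dedicated branch for the import, based on main
lakectl branch create lakefs://example-repo/import-batch-1 \
  --source lakefs://example-repo/main

# Import the object-store prefix into that branch
# (flag names may vary across lakeFS versions)
lakectl import \
  --from s3://source-bucket/datasets/ \
  --to lakefs://example-repo/import-batch-1/datasets/

# Merge the imported changes back into main
lakectl merge lakefs://example-repo/import-batch-1 lakefs://example-repo/main
```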
u
thanks Guy, makes sense, I realized my mistake. Can you help me with the S3 configuration?
u
2. There are two types of metadata, depending on the state of the objects you are adding: committed and uncommitted.
• The committed metadata is saved in S3 under `/_lakefs`
• The uncommitted metadata is saved in DynamoDB
Hope this answers your question
u
would love to, how can I help?
u
which S3 bucket does lakeFS choose for saving the committed metadata?
u
For each repo, lakeFS will save the committed metadata under the storage namespace (provided when creating the repository)
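For example, a hedged sketch: with lakectl, the storage namespace is the second argument when you create a repository (the names here are placeholders):

```bash
# The storage namespace (s3://my-bucket/my-prefix) is set once, at
# repository creation time; committed metadata will then live under
# s3://my-bucket/my-prefix/_lakefs/
lakectl repo create lakefs://my-repo s3://my-bucket/my-prefix
```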
u
is that a setting in Helm? I did not choose anything for the storage namespace. This is my config:
```yaml
lakefsConfig: |
  database:
    type: dynamodb
    dynamodb:
      aws_region: us-west-2
  blockstore:
    type: s3
    s3:
      region: us-west-2
```
u
nope, you won't see it in Helm. In Helm you provide the configuration used by lakeFS; a lakeFS installation may have many repositories, each with a different storage namespace. If you would like to view your current repositories (and their storage namespaces), go to the repository view (by clicking on Repositories in the top menu)
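If you prefer the CLI, a minimal alternative (assuming lakectl is configured against your installation) lists each repository together with its storage namespace:

```bash
# Prints every repository with its default branch, creation time,
# and storage namespace
lakectl repo list
```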
u
@Narendra Nath Please note that import behavior will change in the next lakeFS release. It will not override the data in the destination branch, but will simply behave as a commit over the current branch state.
u
Hi NirO, sounds good. When can we expect the release?
u
I believe we will release a new version by EOW
u
that's awesome!! So far lakeFS has been great compared to other tools I have been experimenting with for large datasets
u
thank you Guy, I am able to find the _lakefs now. I was looking at the root of the S3 bucket, while my data had been imported from under a prefix.
u
Happy to help!