• Gilad Sever

    Gilad Sever

    1 year ago
    Hello, I'm also new to lakeFS and excited to learn 🤓
    Gilad Sever
    1 replies
    Copy to Clipboard
  • m

    mwikstrom

    1 year ago
    Hello, I have a question regarding the garbage collection. Sorry if it’s somewhere in the docs but I’ve read “everywhere” and I can’t seem to figure this out. I started to look into the garbage collection rules and created a config. I applied it to a repository and all was well. Then I wanted to see how it actually runs and then suddenly there is a
    spark-submit
    in the docs. 🙂 So my question is: Where do I run this? I can’t see any other references to spark (other than interacting with lakefs). Do I need to set up a spark cluster for this? The documentation doesn’t give that much info, it feels like you are supposed to just know it. But maybe I just missed the info somewhere? Thank you for any help/clarification.. 🙂
    m
    Oz Katz
    10 replies
    Copy to Clipboard
  • p

    Phi-Long Bui

    11 months ago
    Hello everyone! I am new to lakeFS and I have been trying to run a local deployment, using the Windows binary and connecting a local Postgresql database. lakeFS appears to start properly, but when I attempt to visit the setup page, the page does not load. The icon for lakeFS appears on the browser tab, but the page does not load any elements. I have included the output in my cmd as well as the network data when I try to visit the page. Any help would be greatly appreciated, thank you!
    p
    Barak Amar
    57 replies
    Copy to Clipboard
  • p

    Phi-Long Bui

    11 months ago
    Hello everyone, I was looking into the data modeling of LakeFS and I had a question about the "Ranges" in the Merkle tree. Is a range created and mapped to every object upload in lakeFS, such that there is a range for every object in lakeFS?
    p
    Barak Amar
    3 replies
    Copy to Clipboard
  • Iddo Avneri

    Iddo Avneri

    11 months ago
    Hi everyone, this might be a silly question 🙂 I have a main repository that I committed changes to and a branch that is out of sync on my local environment. I want to revert my testing-branch to main, per the last commit. I tried running with the name of the commit or the ID of it:
    lakectl branch revert <lakefs://example-repo/testing-branch> Commit1019
    lakectl branch revert <lakefs://example-repo/testing-branch> 9a05a70fd3c7763334ffe605ba3fe4865b5fc74c44a3a9a4d08c837b7a5a67b9
    But I’m getting: Request failed: [400 Bad Request] What is the value that is expected for <commit ref to revert>?
    Iddo Avneri
    Ariel Shaqed (Scolnicov)
    7 replies
    Copy to Clipboard
  • Vítor Lourenço

    Vítor Lourenço

    10 months ago
    Hi, guys. There is any way to download all files from a dir using ctl? e.g., to download a file :
    lakectl fs cat <lakefs://repo/branch/dir/file.txt> > file.txt
    something like
    lakectl fs cat <lakefs://repo/branch/dir/*>
    or
    lakectl fs cat --recursive <lakefs://repo/branch/dir>
    Vítor Lourenço
    1 replies
    Copy to Clipboard
  • m

    mwikstrom

    10 months ago
    Hello, I'm struggling with using the AWS cli to access a repository in LakeFS. I am quite certain that either I'm missing something or my setup/deployment is causing this. But I can't seem to figure it out and maybe you can help me. 😃 I have deployed LakeFS in a kubernetes cluster in GCP and configured the storage, created a repo and uploaded a file. So far so good. The part where I've complicated things a bit is the networking in kubernetes. I have registered two DNS names, lets call them
    <http://lakefs.example.com|lakefs.example.com>
    and
    <http://lakefs.portal.example.com|lakefs.portal.example.com>
    . If you access
    <http://lakefs.portal.example.com|lakefs.portal.example.com>
    it authenticates you with GCP and then forwards you to the portal. It also adds https to the connection. Works well. If you go to the
    <http://lakefs.example.com|lakefs.example.com>
    it drops the connection if the url doesn't contain "/api". Since all portal access should go to the other endpoint. It also adds https to the connection. I've configured the lakectl command and put the endpoint_url to
    <https://lakefs.example.com/api/v1>
    . And the command works. I can run
    lakectl fs ls <lakefs://testrepo/main/>
    and get the correct listing. Then I tried to use the aws cli. I created a profile called lakefs with the key and secret. And then I called it like this:
    aws --profile lakefs --endpoint-url <https://lakefs.example.com/> s3 ls <s3://testrepo/main/>
    And I get the following error:
    An error occurred (404) when calling the ListObjectsV2 operation: Not found
    I figured it was the wrong endpoint so I changed it to
    <https://lakefs.example.com/api/v1>
    And then I get this error:
    An error occurred (NoSuchBucket) when calling the ListObjectsV2 operation: The specified bucket does not exist
    So my questions are: • First of all, "should" this work? 😃 • Second, what should the endpoint_url be? ◦ When I look at the architecture diagram I guess the S3 requests goes to the S3 Gateway? ◦ And my setup right now is dropping those requests? Please let me know if you have any suggestions or need any info. 😃 //Mattias
    m
    Guy Hardonag
    +1
    10 replies
    Copy to Clipboard
  • Chaim Turkel

    Chaim Turkel

    10 months ago
    CREATE DATABASE IF NOT EXISTS dbt_chaim LOCATION '<s3a://dbt-chaim/main>';
    Chaim Turkel
    Guy Hardonag
    7 replies
    Copy to Clipboard
  • a

    Alex To

    10 months ago
    Hello all. I started experimenting lakefs and have not found a definitive answer for Athena integration use case. I would like to know if it' s possible to continue using Athena as is on top of lakefs locations. ie: CREATE TABLE, ADD PARTITION. Based on the documentation here, the first paragraph sounds like it's not possible. Yet it then goes on to describe how to update the metastore with the assumption that there is an existing table pointing to a lakefs location. How does one create the table in the first place? Any success stories with Athena? Any gotchas? Our shop uses Athena heavily but open to run our own PrestoDB clusters if it's justified.
    a
    Itai Admi
    4 replies
    Copy to Clipboard