lakefs-for-beginners
  • e

    esspee

    10/03/2022, 3:35 PM
    Hi all, I would like some advice on deploying lakeFS on GKE. I have deployed the Helm chart from the repo treeverse/charts, which created the lakeFS pod. Now that the pod is running, how do I integrate the service with the GCS buckets?
  • o

    Omkar Patil

    10/07/2022, 11:11 AM
    Hello everyone! I want to store a JSON file in a lakeFS repo and version it. I was able to use the lakeFS API with the Java client to insert the object into the repo. Now I want to version my file. How can I do that? Can anyone give me some guidance?
  • f

    Fizza Abid

    10/13/2022, 9:33 AM
    Hello, I have installed lakeFS on Kubernetes and connected a service account. How can I bypass the secret key and access key in the UI?
  • m

    MounirB

    10/19/2022, 12:57 PM
    Hi everyone, we are trying to integrate lakeFS with the SQL Server 2022 PolyBase feature. SQL Server 2022 now supports S3-compatible endpoints, so it should work with lakeFS. The catch is that TLS is required. Any idea how to make the lakeFS endpoint SSL-enabled? I could not find documentation for this on the lakeFS website. Thanks
  • o

    Omkar Patil

    10/20/2022, 7:41 AM
    Hello everyone! I am using the lakeFS Java client:
    ActionsApi apiInstance = new ActionsApi(defaultClient);
    String repository = "repository_example"; // String |
    String runId = "runId_example"; // String |
    but I don't understand what runId is.
  • o

    Omkar Patil

    10/20/2022, 7:49 AM
    HttpBasicAuth basic_auth = (HttpBasicAuth) defaultClient.getAuthentication("basic_auth");
        basic_auth.setUsername("YOUR USERNAME");
        basic_auth.setPassword("YOUR PASSWORD");
    Should I pass my access key and secret key to setUsername and setPassword here to log in to the lakeFS repo?
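For illustration only, here is a minimal Python sketch of how an HTTP Basic Authorization header is assembled from a username and password. It assumes, as the question suggests, that the access key ID plays the role of the username and the secret access key plays the role of the password. The function name and the sample credentials are hypothetical and not part of any lakeFS client.

```python
import base64


def basic_auth_header(access_key_id: str, secret_access_key: str) -> str:
    """Return the value of an HTTP Basic Authorization header.

    Assumption: the access key ID is used as the username and the
    secret access key as the password.
    """
    raw = f"{access_key_id}:{secret_access_key}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


# Hypothetical placeholder credentials, for demonstration only.
header = basic_auth_header("AKIAEXAMPLE", "secret-example")
```

The resulting string is what an HTTP client would send in the Authorization header; client libraries such as the Java one above normally build it for you from setUsername and setPassword.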
  • d

    dylan butler

    10/27/2022, 4:27 AM
    Hello, for a DynamoDB deployment, do you need to specify a particular hash key? I do not see anything about that in the documentation.
  • d

    David García

    11/04/2022, 2:56 PM
    Hi all! I'm comparing tools and would like to know what kinds of files I can track with lakeFS: CSV, Parquet, Excel...?
  • s

    Selva

    11/06/2022, 3:58 PM
    Hi all! First of all, thanks for your work; lakeFS is an interesting project. I have one question. I am trying to streamline my storage layer for machine learning and came across your repo. I am planning an Airflow setup where my users commit changes in Azure and a Databricks ML notebook is then triggered. I am worried about storage cost. I am looking for a solution where data is deleted from Azure storage every N days based on the branch, except for one folder that holds the scripts or tools to regenerate the data. For example, if my repo structure is as below,
    repo
    |_ config 
    |_ input data
    |_ ml output data
    I want "input data" and "ml output data" to be deleted on the 10th day after the commit, but "config" should live forever. Is it possible?
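The policy described above can be written down as a small Python sketch. This only models the rule itself (expire two prefixes after 10 days, keep config forever); it is not a lakeFS feature or API, and the names KEEP_FOREVER, EXPIRING, and should_delete are hypothetical.

```python
from datetime import datetime, timedelta

# Prefixes taken from the repo layout in the question.
KEEP_FOREVER = ("config/",)
EXPIRING = ("input data/", "ml output data/")
MAX_AGE = timedelta(days=10)


def should_delete(path: str, committed_at: datetime, now: datetime) -> bool:
    """Return True if the object at `path` is due for deletion."""
    if path.startswith(KEEP_FOREVER):
        return False  # config is never expired
    if path.startswith(EXPIRING):
        return now - committed_at >= MAX_AGE
    return False  # anything outside the listed prefixes is left alone
```

For example, an object under "input data/" committed 14 days ago would be flagged, while anything under "config/" never is, regardless of age.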
  • a

    Alok Anand

    11/07/2022, 4:29 AM
    Hi, I have set up my lakeFS cloud environment on S3. Earlier I was able to create a repository, but when I try now I get the error below: "failed to create repository: failed to access storage". I have attached a screenshot. Please let me know if I have missed something.
  • v

    Vino

    11/10/2022, 6:14 PM
    Hi, I'm trying to understand if lakeFS has "git add" like functionality that lets me choose the changes that go into a commit. That way I can discard some changes and commit only a portion of changes in data.
  • t

    Tamir Zheleznyak

    11/13/2022, 10:13 AM
    Hey, a few questions about things I might need in lakeFS:
    • Is there a way to read multiple files from a specific commit in one request?
    • Is there a way to commit a file by writing it directly to S3, or to download a specific file from S3 at a specific commit or the latest master?
    • Is there an API for a diff showing which lines changed?
  • t

    Tamir Zheleznyak

    11/14/2022, 9:26 AM
    Hey, is it possible to manage repositories in folders in the UI? I'm expecting a large number of repositories in my use case, and organizing them in folders would help me keep them in order 🙂
  • t

    Tamir Zheleznyak

    11/14/2022, 3:51 PM
    Hey,
    • What is the default expiration for the token returned by a login request? Can I modify it? Maybe it would be a good idea to add the expiration to the body of the login response.
    • Is there a way to alter the committer name of a commit in the body of this request:
    "POST", "repositories/%s/branches/%s/commits"
  • t

    Tamir Zheleznyak

    11/15/2022, 2:09 PM
    Hey everyone, I have a question regarding object retention; I'm not sure I understood the documentation. If I configure retention of x days on the main branch, will it delete only the objects that I deleted from the current branch once x days have passed, or will it also delete their old commits? I want commits older than x days to be deleted. Is that possible to achieve?
  • c

    CC

    11/16/2022, 6:11 PM
    Hi, for some reason the list_repository_runs(repository) API call (in the Python ActionsApi) is returning: Invalid value for event_type (post-create-tag), must be one of ['pre_commit', 'pre_merge']. I want to see post-create-tag entries, and they show up in the GUI. Does the API block me from accessing them? I'm using lakeFS client 0.83.3 and lakeFS 0.83.3. Bigger picture: I'm trying to use information saved from webhooks to replicate a lakeFS instance. Is that a reasonable approach? Thanks, Chuck
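The wording of that error resembles the enum check that OpenAPI-generated Python clients run when they deserialize a response. If that is what is happening here (an assumption, not something the message above confirms), the value would be rejected in the client rather than blocked by the server. A minimal sketch of such a check, with hypothetical names:

```python
# Allowed values copied from the error message in the question.
ALLOWED_EVENT_TYPES = ["pre_commit", "pre_merge"]


def validate_event_type(value: str) -> str:
    """Mimic a client-side enum check; this is not the real client code."""
    if value not in ALLOWED_EVENT_TYPES:
        raise ValueError(
            f"Invalid value for `event_type` ({value}), "
            f"must be one of {ALLOWED_EVENT_TYPES}"
        )
    return value
```

Under that assumption, a run with event type post-create-tag would fail this check even though the server returned it, which would be consistent with the entries being visible in the GUI.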
  • r

    Ronnie Ning

    11/16/2022, 6:28 PM
    Does anyone know how to set up config.yaml for AWS DynamoDB? I used the example from the lakeFS docs for deploying on AWS and got:
    ValidationException: The provided key element does not match the schema
  • t

    Tamir Zheleznyak

    11/17/2022, 1:44 PM
    Hey, some questions regarding deletion with POST /repositories/{repository}/branches/{branch}/objects/delete:
    • Is there a maximum number of files I can pass?
    • If I pass a wrong path, why are there no errors and I get 200 OK?
    • Can I pass a directory to recursively delete all files under it?
    • Why is it a POST request and not a DELETE request?
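To make the first question concrete, here is a hedged Python sketch that splits a long list of paths into batches and builds one request per batch for the endpoint above. It only constructs the requests and sends nothing. The JSON field name "paths" and the default batch size of 1000 are assumptions for illustration and should be checked against the API reference; build_delete_requests is a hypothetical helper.

```python
import json
from urllib.parse import quote


def build_delete_requests(repository, branch, paths, batch_size=1000):
    """Yield (method, url_path, json_body) tuples, one per batch of paths.

    Assumptions: the body is {"paths": [...]} and 1000 is a safe batch
    size. Neither is confirmed by the question above.
    """
    url = (
        f"/repositories/{quote(repository, safe='')}"
        f"/branches/{quote(branch, safe='')}/objects/delete"
    )
    for start in range(0, len(paths), batch_size):
        batch = paths[start:start + batch_size]
        yield "POST", url, json.dumps({"paths": batch})
```

Batching on the client side keeps each request bounded whatever the server-side limit turns out to be.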
  • c

    CC

    11/17/2022, 2:25 PM
    Hi, Does anyone know where the information displayed by http://localhost:8000/repositories/repo1/actions is stored? Specifically, is it stored in the Key Value Store or in the Object Store? Thanks, Chuck
  • m

    Marija Vella

    11/18/2022, 10:22 AM
    Hi, I'm using lakeFS on my local machine. I uploaded some data to my repo and now I would like to remove it. To upload the data I used:
    aws --endpoint-url=http://localhost:8000 --profile local s3 cp D:\lakefs_upload s3://example-repo/branch_full_data/ --recursive
    Then I ran this to remove it:
    aws --endpoint-url=http://localhost:8000 --profile local s3 rm s3://example-repo/branch_full_data/ --recursive
    I can see that the data is no longer in my repo and that the branch is now empty. However, the data seems to be stored somewhere else, as the storage space on my PC is not being recovered. Is the data stored somewhere else when it is uploaded? Thanks
  • r

    Ronnie Ning

    11/23/2022, 6:28 PM
    If we run lakeFS with local settings, it creates a folder called "lakefs" under the current user. Can we customize where the "lakefs" folder is located?
  • f

    Fizza Abid

    11/25/2022, 8:12 AM
    Hello, when I give an S3 path as the storage namespace, it says "Can only create repository with storage type: local". How should I specify the S3 path?
  • w

    Walter Johnson

    11/29/2022, 7:44 PM
    I have lakeFS instance running at https://lakefs.quecall.biz/ . I have created a repo but I can't figure out how to make files public. Can anyone help with that? It is my test instance so I have no issue providing credentials for it.
  • o

    Omkar Patil

    11/30/2022, 1:01 PM
    Hi everyone! I want to bootstrap my lakeFS instance programmatically in Java. I don't want to install lakeFS beforehand; when I run my application, it should bootstrap a lakeFS instance and I should be able to perform all operations. Is that possible in Java? Please advise. Thank you
  • r

    Ronnie Ning

    12/01/2022, 9:35 PM
    Is there any setting in config for sync? For example, if the repo is deleted in LakeFS, it's also removed from the storage.
  • a

    Alessandro Mangone

    12/05/2022, 12:30 PM
    Hello, I am trying to use the Java API in a Scala codebase, and I am getting an error that was already reported by another user: https://lakefs.slack.com/archives/C02CV7MUV4G/p1666253775100359?thread_ts=1666251707.039089&cid=C02CV7MUV4G In my case I am using SBT instead of Maven and don't have any other dependencies that use okhttp3. What could be causing this issue? I am using the SBT Assembly plugin to create an uber-jar. I don't know whether I need to be careful with some merge rule, but I don't see any dependency conflict.
  • s

    Selva

    12/06/2022, 5:20 AM
    Hi guys. I have two questions about lakeFS and thought you could help me.
    1. How can I use the lakeFS URL in MATLAB and C# to read the content?
    2. Is there a GUI to check out a file, edit it (in Notepad), and then check it in? We are used to Sourcetree and would like to know if lakeFS has something similar.
  • a

    Alessandro Mangone

    12/06/2022, 10:59 AM
    Final question (I think, sorry 😅). It's the first time that I have physical access to the data on S3, and I see that lakeFS is generating some binary files; I cannot directly see Delta files, branch paths, etc. If that's the case, does this mean that the only way to integrate with tools like Athena and Snowflake is by generating manifest files? Also, do the files referenced in the symlink files keep their original columnar compression format (e.g. ORC, Parquet)?
  • f

    Fizza Abid

    12/06/2022, 12:46 PM
    Hi, what do we pass here for domain_name?
    gateways:
      s3:
        domain_name: s3.lakefs.example.com