# help
  • 薛宇豪

    08/12/2025, 9:54 AM
    Hi, does lakeFS have a limit on the number of repositories? I'm asking because I noticed that the pgsql implementation is configured with 100 partitioned tables, and the data for every repository is stored in the same partitioned table. So I'm unsure whether having a large number of repositories would cause any additional issues or side effects. Also, what are the benefits of storing all data under the same table structure rather than using different tables? Would using different tables potentially reduce serialization overhead?
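    For intuition: as the question itself describes, repositories do not each get their own table; every key hashes into one of a fixed set of partitions, so the repository count is decoupled from the table count. A toy illustration of such a scheme in Python (the partition count of 100 comes from the question; the hashing details are purely illustrative, not lakeFS's actual implementation):
    ```
    import hashlib

    NUM_PARTITIONS = 100  # the fixed partition count observed in the question

    def partition_for(repo_key_prefix: str) -> int:
        """Map a repository's key prefix to one of the fixed partitions."""
        digest = hashlib.sha256(repo_key_prefix.encode()).digest()
        return int.from_bytes(digest[:4], "big") % NUM_PARTITIONS

    # Many repositories, always the same 100 partitions
    print(partition_for("repo-a"), partition_for("repo-b"))
    ```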
  • 薛宇豪

    08/13/2025, 9:16 AM
    Is there any way to restore a branch that was accidentally deleted? Manually querying the database is also acceptable. Or is there any way to prevent a branch from being deleted?
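    Since a lakeFS branch is just a pointer to a commit, a deleted branch can be recreated if its head commit ID is still known (e.g., from the UI, an audit log, or a teammate). A minimal sketch with the Python lakefs_sdk; the endpoint, credentials, and names below are illustrative:
    ```
    import lakefs_sdk
    from lakefs_sdk.client import LakeFSClient
    from lakefs_sdk.models import BranchCreation

    # Hypothetical connection details
    client = LakeFSClient(lakefs_sdk.Configuration(
        host="https://lakefs.example.com/api/v1",
        username="ACCESS_KEY_ID",
        password="SECRET_ACCESS_KEY",
    ))

    # Recreate the branch, pointing it at the last known head commit
    client.branches_api.create_branch(
        repository="example-repo",
        branch_creation=BranchCreation(name="restored-branch", source="<last-known-commit-id>"),
    )
    ```
    As for prevention, lakeFS hooks include pre-delete-branch events, so a hook that rejects deletion of protected branch names should work.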
  • Alan judi

    08/13/2025, 11:39 PM
    Hello guys, I have set up lakeFS Community on my k8s cluster. When I am in the dashboard, I get the following error. Upon inspecting my pod running lakeFS, I see the following:
    ```
    time="2025-08-13T22:53:36Z" level=error msg="failed to create user" func="pkg/auth.(*APIAuthService).CreateUser" file="build/pkg/auth/service.go:213" error="Post \"/auth/users\": unsupported protocol scheme \"\"" service=auth_api username=admin
    time="2025-08-13T22:53:36Z" level=error msg="API call returned status internal server error" func="pkg/api.(*Controller).handleAPIErrorCallback" file="build/pkg/api/controller.go:3033" error="create user - Post \"/auth/users\": unsupported protocol scheme \"\"" host=lakefs.*****.com method=POST operation_id=Setup path=/api/v1/setup_lakefs service=api_gateway
    time="2025-08-13T23:31:41Z" level=error msg="failed to create user" func="pkg/auth.(*APIAuthService).CreateUser" file="build/pkg/auth/service.go:213" error="Post \"/auth/users\": unsupported protocol scheme \"\"" service=auth_api username=admin
    time="2025-08-13T23:31:41Z" level=error msg="API call returned status internal server error" func="pkg/api.(*Controller).handleAPIErrorCallback" file="build/pkg/api/controller.go:3033" error="create user - Post \"/auth/users\": unsupported protocol scheme \"\"" host=lakefs.******.com method=POST operation_id=Setup path=/api/v1/setup_lakefs service=api_gateway
    ```
    Here are my helm chart values:
    ```
    # lakeFS server configuration
    lakefsConfig: |
      logging:
        level: "INFO"
      database:
        type: postgres
        postgres:
          connection_string: "postgres://****:****@****:5432/postgres?sslmode=disable"
      blockstore:
        type: s3
        s3:
          region: us-west-2
      auth:
        # Optional: map display names & default groups from ID token claims
        api:
          skip_health_check: true
          supports_invites: false
          endpoint: ""
        authentication_api:
          endpoint: ""
          external_principals_enabled: false
        ui_config:
           rbac: simplified
           login_url: /auth/login
           logout_url: /auth/logout
    ```
  • Jeffrey Ji

    08/17/2025, 1:28 AM
    hello folks, seems the career page doesn't work, I cannot submit my resume
  • 薛宇豪

    08/26/2025, 8:54 AM
    Hi, I want to mount the lakeFS frontend under an existing domain, such as https://test.domain.com/lakefs. This way, all requests to the backend API will also include /lakefs; I need to change the original /api/v1 to /lakefs/api/v1. I see that the current helm chart supports configuring `ingress.hosts.paths`. Is it possible to directly modify this configuration? However, I see the frontend JS has a hardcoded `export const API_ENDPOINT = '/api/v1'`: https://github.com/treeverse/lakeFS/blob/master/webui/src/lib/api/index.js#L1
  • Carlos Luque

    09/02/2025, 8:18 AM
    Hi! One question: does the OSS version only support one user?
  • Kungim

    09/03/2025, 7:28 AM
    Hello team! I am trying to build a C# client library using openapi-generator from /api/swagger.yml, but I noticed that the API is split into three files: /api/authentication.yml, /api/authorization.yml, and /api/swagger.yml. Do I need to combine them somehow to get the full API? Building with just /api/swagger.yml seems to be missing some API functionality. How do I build the full API? Looking forward to any response!
  • Jose Ignacio Gascon Conde

    09/03/2025, 8:12 AM
    Hi team, I'm having a persistent issue trying to deploy lakeFS to an EKS cluster using the Terraform `helm_release` resource, and I'm hoping someone might have some insight. Passing configuration via `values`: I've tried passing the configuration using both the `lakefsConfig` key and the `config` key (as shown on Artifact Hub). In both cases, `helm get values lakefs` confirms that Helm receives the correct values from Terraform. However, the resulting `ConfigMap` in the cluster is still the default one.
  • Mingke Wang

    09/03/2025, 3:01 PM
    Hi guys, I'm a student in ML and want to use lakeFS Mount to mount my dataset, since the dataset I have is about 3 TB. Is there any cheap option instead of buying the enterprise version?
  • Carlos Luque

    09/04/2025, 8:18 AM
    Hey everyone, just wanted to share some concerns about lakeFS (version 1.29.0):
    1. Does lakeFS remove the folder created in S3 when a repository is deleted?
       a. If not, why? lakeFS is a data versioning tool; if we are keeping data that was deliberately removed by the user, why keep it in S3?
    2. Removing a repository makes that name unusable (I suppose this follows from the concern above).
    3. When I upload the same object to lakeFS (without any change), it stores the object again, taking up storage space. For a small object this is not a big deal, but people normally upload a whole folder directly, not only the edited files.
    4. Does creating tags consume storage space?
  • Jiadong Bai

    03/16/2025, 9:29 PM
    Hi there, I am wondering if there is a native API to download a whole branch/commit as a zip file? I looked through the OpenAPI specification, but it seems there is no such API.
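    For anyone finding this later: with no native zip endpoint, one workaround is to page through the objects on a ref and build the zip client-side. A rough sketch with the Python lakefs_sdk (endpoint, credentials, repository, and ref are illustrative):
    ```
    import io
    import zipfile

    import lakefs_sdk
    from lakefs_sdk.client import LakeFSClient

    # Hypothetical connection details
    client = LakeFSClient(lakefs_sdk.Configuration(
        host="https://lakefs.example.com/api/v1",
        username="ACCESS_KEY_ID",
        password="SECRET_ACCESS_KEY",
    ))

    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        after = ""
        while True:  # page through every object on the ref
            page = client.objects_api.list_objects(
                repository="example-repo", ref="main", after=after)
            for obj in page.results:
                data = client.objects_api.get_object(
                    repository="example-repo", ref="main", path=obj.path)
                zf.writestr(obj.path, bytes(data))
            if not page.pagination.has_more:
                break
            after = page.pagination.next_offset

    with open("main.zip", "wb") as f:
        f.write(buf.getvalue())
    ```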
  • Ion

    09/16/2025, 12:45 PM
    I am seeing random failures: `SignatureDoesNotMatch` (the request signature we calculated does not match the signature you provided; check your key and signing method). Any ideas? I found an issue in the repo that also points to boto, but I am using obstore (object-store-rs).
  • Carlos Luque

    09/17/2025, 3:12 PM
    Hi! One question: are you going to introduce templates, or any way to include a template, in the Compare (pull request) view?
  • 薛宇豪

    09/18/2025, 6:33 AM
    Hey, I'm trying to build a customized lakeFS server. After modifying the code, running `make build-docker` doesn't seem to generate a Docker image with my local code. Is it still pulling the GitHub code for the build?
  • Carlos Luque

    09/22/2025, 10:36 AM
    Hey, is there any way to restrict access to the repos to specific users, either with a custom implementation of RBAC using your code or with an up-to-date version of lakeFS? That would be a nice feature to have 😉
  • HT

    09/25/2025, 10:35 AM
    What is a fast way to retrieve physicalAddress? Currently:
    ```
    import lakefs_sdk.client

    client = lakefs_sdk.client.LakeFSClient(lakefs_conf)

    res = []
    for object_path in paths:
        # One blocking stat call per path; physical_address is on the stat result
        response = client.objects_api.stat_object(repository=repo,
                                                  ref=commit,
                                                  path=object_path,
                                                  presign=presign)
        res.append(response.physical_address)
    ```
    Can `client` be used in multiprocessing?
    ```
    import concurrent.futures

    import lakefs_sdk.client

    def stat_object_path(args):
        client, repo, commit, presign, object_path = args
        response = client.objects_api.stat_object(
            repository=repo,
            ref=commit,
            path=object_path,
            presign=presign
        )
        return response.physical_address

    def get_physical_addresses(client, repo, commit, presign, paths):
        # Threads share the single client; each map task issues one stat call
        with concurrent.futures.ThreadPoolExecutor(max_workers=32) as executor:  # was "alot"; pick a real bound
            res = list(executor.map(
                stat_object_path,
                [(client, repo, commit, presign, p) for p in paths]
            ))
        return res

    client = lakefs_sdk.client.LakeFSClient(lakefs_conf)
    res = get_physical_addresses(client, repo, commit, presign, paths)
    ```
  • Amihay Gonen

    09/29/2025, 11:00 PM
    I'm trying to connect to the Iceberg REST catalog using DuckDB 1.4, following this guide: https://docs.lakefs.io/latest/integrations/iceberg/ (the DuckDB example). I got this error:
    ```
    D ATTACH 'lakefs' AS main_branch (
          TYPE iceberg,
          SECRET lakefs_credentials,
          -- notice the "/relative_to/.../" part:
          ENDPOINT 'https://.../relative_to/repo.main/api'
      );
    Invalid Input Error:
    CatalogConfig required property 'defaults' is missing
    ```
    This error is misleading (https://github.com/duckdb/duckdb-iceberg/issues/297#issuecomment-2973232577); it seems the problem is with the endpoint, but I can't understand what the issue is.
  • Manuele Nolli

    10/01/2025, 2:29 PM
    Hello everyone, I’m experiencing an issue with my AWS-hosted LakeFS. I successfully imported an S3 bucket into LakeFS, but whenever I try to view the file overview, download a file, or generate a presigned URL, I get the following error:
    AccessDenied arn:aws:sts::XXXX:assumed-role/YYYY/i-AAAA is not authorized to perform: s3:GetObject on resource: ...
    My bucket policy includes:
    ```
    {
        "Sid": "lakeFSObjects",
        "Effect": "Allow",
        "Principal": {
            "AWS": "arn:aws:iam::XXX:role/[ROLE_NAME]"
        },
        "Action": [
            "s3:GetObject",
            "s3:PutObject",
            "s3:AbortMultipartUpload",
            "s3:ListMultipartUploadParts"
        ],
        "Resource": "arn:aws:s3:::[BUCKET NAME]/*"
    },
    ```
    I’m not sure if it’s related, but I also cannot download the file using the LakeFS Python library. For reference, my bucket is located in eu-central-1. Does anyone have suggestions on how to resolve this issue? Thank you in advance!
  • John McCloud

    10/01/2025, 4:19 PM
    Hello there! I am playing around with the quickstart and was trying to figure out how to add object-level metadata to files uploaded from a local filesystem. I understand that I can add arbitrary key-value pairs as part of a commit, but what about object information? Is there any way to do this with lakefs? As an example, here's a file uploaded from my local filesystem and the "Object Information" as it exists within LakeFS. How do I add key-value pairs to this object? Thank you!
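    One avenue worth trying, sketched below: lakeFS exposes an S3-compatible gateway, so standard S3 user metadata headers can be sent at upload time (shown with boto3; the endpoint, credentials, and paths are illustrative, and it is an assumption that the gateway persists x-amz-meta-* headers as the object's user metadata):
    ```
    import boto3

    # Upload through the lakeFS S3 gateway with user metadata attached.
    s3 = boto3.client(
        "s3",
        endpoint_url="https://lakefs.example.com",  # lakeFS S3 gateway (illustrative)
        aws_access_key_id="LAKEFS_ACCESS_KEY_ID",
        aws_secret_access_key="LAKEFS_SECRET_ACCESS_KEY",
    )

    with open("file.csv", "rb") as f:
        s3.put_object(
            Bucket="example-repo",         # repository name
            Key="main/datasets/file.csv",  # branch/path inside the repo
            Body=f,
            Metadata={"source": "local-fs", "owner": "john"},  # sent as x-amz-meta-*
        )
    ```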
  • Amit Varde

    10/08/2025, 9:44 PM
    Hello, I am getting the following error. Is there a way I could run lakefs in debug mode?
    ```
    [ec2-user@ip-xxx-xxx-xxx-xxx ~]$ /opt/lakefs/latest/lakefs --config /etc/lakefs/poc-01-newton-config.yaml run
    INFO[0000]/home/runner/work/lakeFS/lakeFS/cmd/lakefs/cmd/root.go:130 github.com/treeverse/lakefs/cmd/lakefs/cmd.initConfig() Configuration file                            fields.file=/etc/lakefs/poc-01-newton-config.yaml file=/etc/lakefs/poc-01-newton-config.yaml phase=startup
    FATA[0000]/home/runner/work/lakeFS/lakeFS/cmd/lakefs/cmd/root.go:114 github.com/treeverse/lakefs/cmd/lakefs/cmd.LoadConfig() Load config                                   error="decoding failed due to the following error(s):\n\n'database' has invalid keys: dynamodb_table_name" phase=startup
    ```
  • Jonny Friedman

    10/10/2025, 7:19 PM
    Hey folks, is there official API documentation? There are a couple of different SDK docs floating around on the internet, and as I understand it some of them are deprecated. More specifically, I'm trying to access the commit history of a file via the Python SDK. I've found the `RefsApi.log_commits()` function, but there appears to be an internal issue with conflicting return types (the function promises a `CommitList` but actually returns `str`) that causes an unavoidable `500` when trying to invoke it. Poking through the source, I've managed to get a working call via a request to
    ```
    url = f"{config.lakefs_url}/api/v1/repositories/{repo_name}/refs/main/commits"
    ```
    but this learning process has taken me much longer than I would have liked. I first started using lakeFS a few weeks ago, so there might be meta-knowledge gaps over which docs to use or best-practice patterns. Open to all feedback, thanks!
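    For reference, a sketch of that raw REST call with `requests`; per the OpenAPI spec the commits endpoint also takes filtering parameters, so restricting the log to commits touching a given path may be possible (the `objects` parameter usage here is an assumption to verify against the spec, and all names are illustrative):
    ```
    import requests

    lakefs_url = "https://lakefs.example.com"  # illustrative
    repo_name = "example-repo"

    resp = requests.get(
        f"{lakefs_url}/api/v1/repositories/{repo_name}/refs/main/commits",
        params={"objects": ["datasets/file.csv"]},  # assumed path filter
        auth=("ACCESS_KEY_ID", "SECRET_ACCESS_KEY"),  # lakeFS basic auth
    )
    resp.raise_for_status()
    for commit in resp.json()["results"]:
        print(commit["id"], commit["message"])
    ```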
  • Ion

    10/13/2025, 4:19 PM
    Correct me if I am wrong, but if I `put` the exact same file into the object store on lakeFS, then the uncommitted changes would show empty, since there is no change? Is that correct? At least this is what I am seeing; I expected to see changes in the branch, but there weren't any.
  • Timothy Mak

    10/14/2025, 1:35 AM
    Hi, I'm using `lakectl local` to version control my data directory. Is there a `.gitignore`-type file I can use where I can exclude certain data in the directory?
  • Timothy Mak

    10/14/2025, 4:33 AM
    Also, it seems that if I, say, rename a particular folder and then try to sync with `lakectl local commit`, it will try to remove and upload everything again. Is there any way to tell lakeFS that I'm simply doing a rename?
  • Chris

    10/17/2025, 8:06 PM
    I've got a sticky wicket here. On macOS I used the FreeBSD tool `paste` to build a command. `paste`, unfortunately, did not implement the macOS policy regarding carriage returns, which apparently macOS was using on the file. As a result, `paste` injected `\r` characters (`0x0D`) that even `vim` was insensitive to (although `tr` was, so it does appear the file was valid, in some sense), and which bypassed Python sanitization to be deposited directly in the data lake unescaped, in raw form. When querying these files via the `lakectl` tool, they are not accessible. A `lakectl fs ls` chokes on the result, yielding a corrupted `common_prefix`:
    ```
    common_prefix                                                    1536.1350/
    /ommon_prefix                                                    1539.1980
    ```
    Here you can see the `\r`s are not handled. As a result I can neither rename nor access the files. Has anyone encountered this problem, and if you have, is there a solution?
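    One possible escape hatch, assuming the REST API treats paths as opaque strings (so a `\r` in a key is just another byte): list and delete the affected objects programmatically rather than through the CLI's table output. A sketch with the Python lakefs_sdk; repository, branch, and credentials are illustrative:
    ```
    import lakefs_sdk
    from lakefs_sdk.client import LakeFSClient

    # Hypothetical connection details
    client = LakeFSClient(lakefs_sdk.Configuration(
        host="https://lakefs.example.com/api/v1",
        username="ACCESS_KEY_ID",
        password="SECRET_ACCESS_KEY",
    ))

    after = ""
    while True:  # page through the branch, deleting keys that contain a carriage return
        page = client.objects_api.list_objects(
            repository="example-repo", ref="main", after=after)
        for obj in page.results:
            if "\r" in obj.path:
                client.objects_api.delete_object(
                    repository="example-repo", branch="main", path=obj.path)
        if not page.pagination.has_more:
            break
        after = page.pagination.next_offset
    ```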
  • Carlos Luque

    10/21/2025, 6:42 AM
    Hi, one question: is there any way to copy objects from one repo to another, different repo using lakefs_sdk (Python)? The `copy_object` method throws some errors when I tried it. The only solution I've found is downloading the objects to disk and then uploading them. Thanks!
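    If the errors come from `copy_object` being scoped to a single repository (an assumption worth verifying), the download/upload round-trip can at least stay in memory instead of touching disk. A sketch, reusing a configured lakefs_sdk client; all names are illustrative:
    ```
    # Assumes `client` is a configured lakefs_sdk LakeFSClient.
    # get_object returns the body as bytes; it is an assumption here that
    # upload_object's `content` parameter accepts raw bytes.
    data = client.objects_api.get_object(
        repository="src-repo", ref="main", path="datasets/file.parquet")
    client.objects_api.upload_object(
        repository="dst-repo", branch="main", path="datasets/file.parquet",
        content=bytes(data))
    ```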
  • Carlos Luque

    10/22/2025, 10:46 AM
    Hey! I am trying to run garbage collection using the command from the garbage collection tutorial (filled in with my information):
    ```
    spark-submit --class io.treeverse.gc.GarbageCollection \
        --packages org.apache.hadoop:hadoop-aws:2.7.7 \
        -c spark.hadoop.lakefs.api.url=https://lakefs.example.com:8000/api/v1 \
        -c spark.hadoop.lakefs.api.access_key=<LAKEFS_ACCESS_KEY> \
        -c spark.hadoop.lakefs.api.secret_key=<LAKEFS_SECRET_KEY> \
        -c spark.hadoop.fs.s3a.access.key=<S3_ACCESS_KEY> \
        -c spark.hadoop.fs.s3a.secret.key=<S3_SECRET_KEY> \
        http://treeverse-clients-us-east.s3-website-us-east-1.amazonaws.com/lakefs-spark-client/0.16.0/lakefs-spark-client-assembly-0.16.0.jar \
        example-repo us-east-1
    ```
    And I got this error, any ideas?
    ```
    Exception in thread "main" java.lang.NullPointerException
    	at io.treeverse.clients.ApiClient$$anon$7.get(ApiClient.scala:224)
    	at io.treeverse.clients.ApiClient$$anon$7.get(ApiClient.scala:220)
    	at dev.failsafe.Functions.lambda$toCtxSupplier$11(Functions.java:243)
    	at dev.failsafe.Functions.lambda$get$0(Functions.java:46)
    	at dev.failsafe.internal.RetryPolicyExecutor.lambda$apply$0(RetryPolicyExecutor.java:74)
    	at dev.failsafe.SyncExecutionImpl.executeSync(SyncExecutionImpl.java:193)
    	at dev.failsafe.FailsafeExecutor.call(FailsafeExecutor.java:376)
    	at dev.failsafe.FailsafeExecutor.get(FailsafeExecutor.java:112)
    	at io.treeverse.clients.RequestRetryWrapper.wrapWithRetry(ApiClient.scala:325)
    	at io.treeverse.clients.ApiClient.getBlockstoreType(ApiClient.scala:235)
    	at io.treeverse.gc.GarbageCollection$.run(GarbageCollection.scala:150)
    	at io.treeverse.gc.GarbageCollection$.main(GarbageCollection.scala:109)
    	at io.treeverse.gc.GarbageCollection.main(GarbageCollection.scala)
    	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    	at java.base/java.lang.reflect.Method.invoke(Method.java:566)
    	at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
    	at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:966)
    	at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:191)
    	at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:214)
    	at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
    	at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1054)
    	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1063)
    	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
    ```
    Spark version:
    ```
    Welcome to
          ____              __
         / __/__  ___ _____/ /__
        _\ \/ _ \/ _ `/ __/  '_/
       /___/ .__/\_,_/_/ /_/\_\   version 3.2.4
          /_/
    
    Using Scala version 2.12.15, OpenJDK 64-Bit Server VM, 11.0.29
    Branch HEAD
    Compiled by user centos on 2023-04-09T20:59:10Z
    Revision 0ae10ac18298d1792828f1d59b652ef17462d76e
    Url https://github.com/apache/spark
    Type --help for more information.
    ```
    lakeFS version 1.29.0. Thanks! ❤️
  • Rob Newman

    10/23/2025, 6:05 PM
    Hi team, we had a trial tenant (https://dynamic-lamprey-scehkw.us-east-1.lakefscloud.io/) that looks like it's been removed?
  • Rudra Prasad Dash

    10/29/2025, 6:36 AM
    Hi @all, I am trying to build a Golang wrapper over the APIs provided in the swagger docs, but when I try basic auth with the access_key_id and secret_key, it always says authentication fails. Btw, I just got introduced to lakeFS and it is amazing. Please help me out!
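    For debugging, it can help to take the generated wrapper out of the picture and exercise the same basic-auth scheme with a raw call (endpoint and credentials are illustrative; /api/v1/user should return the current user on success):
    ```
    import requests

    # lakeFS API auth is HTTP basic: access key ID as the username,
    # secret access key as the password.
    resp = requests.get(
        "https://lakefs.example.com/api/v1/user",  # illustrative endpoint
        auth=("ACCESS_KEY_ID", "SECRET_ACCESS_KEY"),
    )
    print(resp.status_code, resp.text)
    ```
    If this also returns 401, the credentials themselves are the problem rather than the wrapper.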
  • Γιάννης Μαντιός

    10/30/2025, 4:42 PM
    Hi everyone 👋 I am trying to build the project (I have contributed some years ago) but I am unable to get a successful build when running `make build`. The command reaches the following point:
    ```
    /usr/local/go/bin/go build -o lakefs -ldflags "-X github.com/treeverse/lakefs/pkg/version.Version=dev-c5680959f.with.local.changes" -v ./cmd/lakefs
    ```
    and then crashes with this error:
    ```
    pkg/api/lakefs.gen.go:23001:30: undefined: openapi3.Swagger
    pkg/api/lakefs.gen.go:23011:27: undefined: openapi3.NewSwaggerLoader
    make: *** [build-binaries] Error 1
    ```
    Checking that piece of code, apparently there is an error with that specific package symbol, which doesn't exist.