# help
  • u

    薛宇豪

    08/26/2025, 8:54 AM
Hi, I want to mount the lakeFS frontend under an existing domain, such as https://test.domain.com/lakefs, so that all requests to the backend API also include the /lakefs prefix, i.e. the original /api/v1 becomes /lakefs/api/v1. I see that the current Helm chart supports configuring `ingress.hosts.paths` — is it enough to modify that configuration directly? However, I see the frontend JS has a hardcoded `export const API_ENDPOINT = '/api/v1'`; https://github.com/treeverse/lakeFS/blob/master/webui/src/lib/api/index.js#L1
  • c

    Carlos Luque

    09/02/2025, 8:18 AM
Hi! One question: does the OSS version only support one user?
  • k

    Kungim

    09/03/2025, 7:28 AM
Hello team! I am trying to build a C# client for the API as a library, using OpenAPI Generator with /api/swagger.yml, but I noticed that the API is split into three files: /api/authentication.yml, /api/authorization.yml, and /api/swagger.yml. Do I need to combine them somehow to get the full API? Building from just /api/swagger.yml seems to be missing some API functionality. How do I build the full API? Looking forward to any response!
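One workaround people use for split OpenAPI specs is merging the `paths` and `components` sections into a single document before running the generator. A rough sketch, with plain dicts standing in for the parsed YAML files (the contents here are invented, not the real lakeFS specs):

```python
# Hypothetical sketch: shallow-merge several parsed OpenAPI documents so a
# single spec can be fed to a code generator. Dicts stand in for YAML files.

def merge_specs(base, *others):
    """Merge the 'paths' and 'components.schemas' sections of OpenAPI docs."""
    merged = dict(base)
    merged["paths"] = dict(base.get("paths", {}))
    merged["components"] = {
        "schemas": dict(base.get("components", {}).get("schemas", {}))
    }
    for spec in others:
        merged["paths"].update(spec.get("paths", {}))
        merged["components"]["schemas"].update(
            spec.get("components", {}).get("schemas", {}))
    return merged

# Invented fragments, for illustration only:
swagger = {"openapi": "3.0.0", "paths": {"/repositories": {}},
           "components": {"schemas": {"Repository": {}}}}
auth = {"paths": {"/auth/login": {}},
        "components": {"schemas": {"Token": {}}}}
full = merge_specs(swagger, auth)
```

A real merge would also need to resolve `$ref` collisions between the files, which this sketch ignores.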
  • j

    Jose Ignacio Gascon Conde

    09/03/2025, 8:12 AM
Hi team, I'm having a persistent issue trying to deploy lakeFS to an EKS cluster using the Terraform `helm_release` resource, and I'm hoping someone might have some insight. Passing configuration via `values`: I've tried passing the configuration using both the `lakefsConfig` key and the `config` key (as shown on Artifact Hub). In both cases, `helm get values lakefs` confirms that Helm receives the correct values from Terraform. However, the resulting `ConfigMap` in the cluster is still the default one.
  • m

    Mingke Wang

    09/03/2025, 3:01 PM
Hi guys, I'm an ML student and want to use lakeFS Mount to mount my dataset, since the dataset I have is about 3TB. Is there any cheap option instead of buying the enterprise version?
  • c

    Carlos Luque

    09/04/2025, 8:18 AM
Hey everyone, just wanted to share some concerns about lakeFS (version 1.29.0): 1. Does lakeFS remove the folder it created in S3 when a repository is deleted? a. If not, why? lakeFS is a data versioning tool; if we are keeping data that was deliberately removed by the user, why keep it in S3? 2. Removing a repository makes that name unusable afterwards (I suppose this follows from the concern above). 3. When I upload the same object to lakeFS (without any change), it stores the object again, taking up storage space. For a small object this is not a big deal, but people normally upload the whole folder directly, not only the edited files. 4. Does creating tags consume storage space?
  • j

    Jiadong Bai

    03/16/2025, 9:29 PM
Hi there, I am wondering if there is a native API to download a whole branch/commit as a zip file? I looked through the OpenAPI specification, but it seems there is no such API.
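Absent a server-side zip endpoint, the usual fallback is client-side: list the objects under a ref, fetch each one, and write them into a zip. A minimal sketch where the two callables are stand-ins for whatever listing/download calls your SDK provides (the in-memory dict below only makes the sketch runnable):

```python
# Client-side "branch to zip" sketch. list_objects and get_object are
# placeholders for real API/SDK calls; here a dict plays the object store.
import io
import zipfile

def branch_to_zip(list_objects, get_object, out_stream):
    with zipfile.ZipFile(out_stream, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in list_objects():
            zf.writestr(path, get_object(path))

# Stubbed usage with two fake objects:
fake_store = {"data/a.csv": b"1,2\n", "README.md": b"hello"}
buf = io.BytesIO()
branch_to_zip(lambda: sorted(fake_store), fake_store.__getitem__, buf)
```

For a 3TB-scale branch you would want to stream objects rather than hold them in memory, which `ZipFile.writestr` does not do; this only illustrates the shape of the workaround.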
  • i

    Ion

    09/16/2025, 12:45 PM
I am seeing random failures: `SignatureDoesNotMatch: the request signature we calculated does not match the signature you provided. Check your key and signing method.` Any ideas? I found an issue in the repo that also points to boto, but I am using obstore (object-store-rs).
  • c

    Carlos Luque

    09/17/2025, 3:12 PM
Hi! One question: are you going to introduce templates, or any way to include a template in the Compare (Pull Request) view?
  • u

    薛宇豪

    09/18/2025, 6:33 AM
Hey, I'm trying to build a customized lakeFS server. After modifying the code, running `make build-docker` doesn't seem to generate a Docker image with my local code. Is it still pulling the GitHub code for the build?
  • c

    Carlos Luque

    09/22/2025, 10:36 AM
Hey, is there any way to restrict access to repos to specific users, either with a custom implementation of RBAC using your code, or in the up-to-date version of lakeFS? That would be a nice feature to have 😉
  • h

    HT

    09/25/2025, 10:35 AM
What is a fast way to retrieve `physicalAddress`? Currently:
```
import lakefs_sdk

client = lakefs_sdk.client.LakeFSClient(lakefs_conf)

res = []
for object_path in paths:
    response = client.objects_api.stat_object(repository=repo,
                                              ref=commit,
                                              path=object_path,
                                              presign=presign)
    res.append(response.physical_address)
```
Can the client be used in multiprocessing?
```
import concurrent.futures

import lakefs_sdk

def stat_object_path(args):
    client, repo, commit, presign, object_path = args
    response = client.objects_api.stat_object(
        repository=repo,
        ref=commit,
        path=object_path,
        presign=presign
    )
    return response.physical_address

def get_physical_addresses(client, repo, commit, presign, paths):
    # max_workers must be a concrete integer ("alot" in the original)
    with concurrent.futures.ThreadPoolExecutor(max_workers=32) as executor:
        res = list(executor.map(
            stat_object_path,
            [(client, repo, commit, presign, p) for p in paths]
        ))
    return res

client = lakefs_sdk.client.LakeFSClient(lakefs_conf)
res = get_physical_addresses(client, repo, commit, presign, paths)
```
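The executor pattern in the question can be exercised in isolation with a stub in place of the SDK client (whether the real lakeFS client is thread-safe is a separate question for the SDK docs; this only shows that a single shared object used from a bounded `ThreadPoolExecutor` returns results in input order):

```python
# Thread-pool sketch with a stub client. StubClient and its stat_object
# method are invented stand-ins for the real lakefs_sdk client.
from concurrent.futures import ThreadPoolExecutor

class StubClient:
    def stat_object(self, path):
        # Mimics the shape of a stat response: one field we care about.
        return {"physical_address": f"s3://bucket/{path}"}

def get_physical_addresses(client, paths, max_workers=8):
    # executor.map preserves the order of the input iterable.
    with ThreadPoolExecutor(max_workers=max_workers) as ex:
        return list(ex.map(
            lambda p: client.stat_object(p)["physical_address"], paths))

addrs = get_physical_addresses(StubClient(), ["a", "b", "c"])
```

Note this uses threads, not processes; for `multiprocessing` proper, each worker process would generally need to construct its own client rather than share one.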
  • a

    Amihay Gonen

    09/29/2025, 11:00 PM
I'm trying to connect to the Iceberg REST catalog using DuckDB 1.4, following this guide: https://docs.lakefs.io/latest/integrations/iceberg/ (DuckDB example). I got this error:
```
D ATTACH 'lakefs' AS main_branch (
      TYPE iceberg,
      SECRET lakefs_credentials,
      -- notice the "/relative_to/.../" part:
      ENDPOINT 'https://.../relative_to/repo.main/api'
  );
Invalid Input Error:
CatalogConfig required property 'defaults' is missing
```
This error is misleading (https://github.com/duckdb/duckdb-iceberg/issues/297#issuecomment-2973232577); it seems the problem is with the endpoint, but I can't understand what the issue is.
  • m

    Manuele Nolli

    10/01/2025, 2:29 PM
Hello everyone, I’m experiencing an issue with my AWS-hosted lakeFS. I successfully imported an S3 bucket into lakeFS, but whenever I try to view the file overview, download a file, or generate a presigned URL, I get the following error:
`AccessDenied: arn:aws:sts::XXXX:assumed-role/YYYY/i-AAAA is not authorized to perform: s3:GetObject on resource: ...`
My bucket policy includes:
```
{
    "Sid": "lakeFSObjects",
    "Effect": "Allow",
    "Principal": {
        "AWS": "arn:aws:iam::XXX:role/[ROLE_NAME]"
    },
    "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:AbortMultipartUpload",
        "s3:ListMultipartUploadParts"
    ],
    "Resource": "arn:aws:s3:::[BUCKET NAME]/*"
},
```
I’m not sure if it’s related, but I also cannot download the file using the lakeFS Python library. For reference, my bucket is located in eu-central-1. Does anyone have suggestions on how to resolve this issue? Thank you in advance!
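When debugging an `AccessDenied` like the one above, a quick local lint of the policy document can rule out typos: does some `Allow` statement actually grant the failing action on the objects ARN? This is only a JSON check, not a substitute for full IAM evaluation (the role's identity policies, SCPs, and KMS key policies can still deny):

```python
# Minimal bucket-policy lint. The policy below is a fabricated example
# mirroring the statement in the message; account IDs and names are fake.
import json

policy = json.loads("""{
  "Version": "2012-10-17",
  "Statement": [{
    "Sid": "lakeFSObjects",
    "Effect": "Allow",
    "Principal": {"AWS": "arn:aws:iam::123456789012:role/lakefs"},
    "Action": ["s3:GetObject", "s3:PutObject",
               "s3:AbortMultipartUpload", "s3:ListMultipartUploadParts"],
    "Resource": "arn:aws:s3:::my-bucket/*"
  }]
}""")

def allows(policy, action, resource_prefix):
    """True if any Allow statement grants `action` on a matching resource."""
    for st in policy["Statement"]:
        acts = st["Action"] if isinstance(st["Action"], list) else [st["Action"]]
        if (st["Effect"] == "Allow" and action in acts
                and str(st["Resource"]).startswith(resource_prefix)):
            return True
    return False
```

In the reported error the denied principal is an *assumed role* (`arn:aws:sts::...:assumed-role/...`), so it is also worth checking that the policy's `Principal` names the underlying IAM role that the instance actually assumes.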
  • j

    John McCloud

    10/01/2025, 4:19 PM
Hello there! I am playing around with the quickstart and was trying to figure out how to add object-level metadata to files uploaded from a local filesystem. I understand that I can add arbitrary key-value pairs as part of a commit, but what about object information? Is there any way to do this with lakeFS? As an example, here's a file uploaded from my local filesystem and the "Object Information" as it exists within lakeFS. How do I add key-value pairs to this object? Thank you!
  • a

    Amit Varde

    10/08/2025, 9:44 PM
Hello, I am getting the following error. Is there a way I could run lakeFS in debug mode?
```
[ec2-user@ip-xxx-xxx-xxx-xxx ~]$ /opt/lakefs/latest/lakefs --config /etc/lakefs/poc-01-newton-config.yaml run
INFO[0000]/home/runner/work/lakeFS/lakeFS/cmd/lakefs/cmd/root.go:130 github.com/treeverse/lakefs/cmd/lakefs/cmd.initConfig() Configuration file  fields.file=/etc/lakefs/poc-01-newton-config.yaml file=/etc/lakefs/poc-01-newton-config.yaml phase=startup
FATA[0000]/home/runner/work/lakeFS/lakeFS/cmd/lakefs/cmd/root.go:114 github.com/treeverse/lakefs/cmd/lakefs/cmd.LoadConfig() Load config  error="decoding failed due to the following error(s):\n\n'database' has invalid keys: dynamodb_table_name" phase=startup
```
  • j

    Jonny Friedman

    10/10/2025, 7:19 PM
Hey folks, is there official API documentation? There are a couple of different SDK docs floating around on the internet, and as I understand it some of them are deprecated. More specifically, I'm trying to access the commit history of a file via the Python SDK. I've found the `RefsApi.log_commits()` function, but there appears to be an internal issue with conflicting return types (the function promises a `CommitList` but actually returns `str`) that causes an unavoidable `500` when trying to invoke it. Poking through the source, I've managed to get a working call via a request to
```
url = f"{config.lakefs_url}/api/v1/repositories/{repo_name}/refs/main/commits"
```
but this learning process has taken me much longer than I would have liked. I first started using lakeFS a few weeks ago, so there might be meta-knowledge gaps about which docs to use or best-practice patterns. Open to all feedback, thanks!
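The raw-request workaround above boils down to composing the v1 REST path by hand. A stdlib-only sketch of that composition, with URL-escaping of the repo and ref segments (the base URL and names here are placeholders, not a verified lakeFS deployment):

```python
# Compose the commits-listing path for a repo/ref pair. quote() with
# safe='' escapes any character that would break a path segment.
from urllib.parse import quote

def commits_url(base: str, repo: str, ref: str) -> str:
    return (f"{base}/api/v1/repositories/{quote(repo, safe='')}"
            f"/refs/{quote(ref, safe='')}/commits")

url = commits_url("https://lakefs.example.com", "my-repo", "main")
```

From there one would issue an authenticated GET and page through the results; the exact response shape is best confirmed against the published OpenAPI spec rather than guessed.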
  • i

    Ion

    10/13/2025, 4:19 PM
Correct me if I am wrong, but if I `put` the exact same file into the object store on lakeFS, the uncommitted changes would show empty since there is no change? Is that correct? At least this is what I am seeing: I expected to see changes in the branch, but there weren't any.
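The behavior described is consistent with change detection by content checksum: re-uploading identical bytes yields the same digest, so there is nothing new to stage. A toy illustration of that principle (SHA-256 here; the server's actual checksum algorithm may differ):

```python
# Identical bytes produce identical digests, so a checksum-based diff
# sees "no change" on re-upload of the same content.
import hashlib

def digest(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

original = digest(b"same payload")
reupload = digest(b"same payload")   # same bytes, same digest: no diff
edited   = digest(b"same payload!")  # one changed byte, different digest
```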
  • t

    Timothy Mak

    10/14/2025, 1:35 AM
Hi, I'm using `lakectl local` to version-control my data directory. Is there a `.gitignore`-type file I can use to exclude certain data in the directory?
  • t

    Timothy Mak

    10/14/2025, 4:33 AM
Also, it seems that if I, say, rename a particular folder and then try to sync with `lakectl local commit`, it will try to remove and re-upload everything. Is there any way to tell lakeFS that I'm simply doing a rename?
  • c

    Chris

    10/17/2025, 8:06 PM
I've got a sticky wicket here. On macOS I used the FreeBSD tool `paste` to build a command. `paste`, unfortunately, did not implement macOS policy regarding carriage returns, which the file apparently used. As a result, `paste` injected `\r` characters (`0x0D`) that even `vim` was insensitive to (although `tr` was, so the file does appear to be valid in some sense), and which bypassed Python sanitization to be deposited directly in the data lake unescaped, in raw form. When querying these files via the `lakectl` tool, they are not accessible. `lakectl fs ls` chokes on the result, yielding a corrupted `common_prefix`:
```
common_prefix                                                    1536.1350/
/ommon_prefix                                                    1539.1980
```
Here you can see the `\r`s are not handled. As a result I can neither rename nor access the files. Has anyone encountered this problem, and if so, is there a solution?
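One defensive measure for situations like the above is to sanitize keys before they ever reach the object store, and to render suspect keys with `repr()` so control characters become visible instead of mangling terminal output. This is purely an illustrative client-side guard, not a lakectl feature:

```python
# Strip ASCII control characters (including \r, 0x0D) from object keys,
# and show how repr() makes an embedded \r visible for debugging.
def sanitize_key(key: str) -> str:
    """Drop all ASCII control characters (code points below 0x20)."""
    return "".join(ch for ch in key if ord(ch) >= 0x20)

bad = "1536.1350\r/part"
visible = repr(bad)          # the \r shows up as an escape, not a cursor jump
clean = sanitize_key(bad)
```

Note this also drops tabs and newlines, which is usually what you want for object keys but is worth stating explicitly.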
  • c

    Carlos Luque

    10/21/2025, 6:42 AM
Hi, one question: is there any way to copy objects from one repo to a different repo using lakefs_sdk (Python)? The copy_objects method throws errors when I try it. The only solution I've found is downloading the objects to disk and then uploading them. Thanks!
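A middle ground between a working server-side copy and round-tripping through disk is to stream bytes in memory from a "get" callable into an "upload" callable. The two callables below are placeholders for whichever SDK methods work in your version; the dicts only make the sketch runnable:

```python
# Cross-repo copy sketch without touching disk. get_object/upload_object
# are stand-ins for real SDK calls (e.g. a download from the source repo
# and an upload to the destination repo).
def copy_across(get_object, upload_object, paths):
    copied = 0
    for p in paths:
        upload_object(p, get_object(p))  # bytes flow source -> destination
        copied += 1
    return copied

# Stubbed usage: dicts play the two repositories.
src = {"a.txt": b"alpha", "b.txt": b"beta"}
dst = {}
n = copy_across(src.__getitem__, dst.__setitem__, ["a.txt", "b.txt"])
```

For large objects you would want chunked streaming rather than whole-object bytes, but the wiring is the same.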
  • c

    Carlos Luque

    10/22/2025, 10:46 AM
Hey! I am trying to run garbage collection using the command from the garbage collection tutorial (filled in with my information):
```
spark-submit --class io.treeverse.gc.GarbageCollection \
    --packages org.apache.hadoop:hadoop-aws:2.7.7 \
    -c spark.hadoop.lakefs.api.url=https://lakefs.example.com:8000/api/v1 \
    -c spark.hadoop.lakefs.api.access_key=<LAKEFS_ACCESS_KEY> \
    -c spark.hadoop.lakefs.api.secret_key=<LAKEFS_SECRET_KEY> \
    -c spark.hadoop.fs.s3a.access.key=<S3_ACCESS_KEY> \
    -c spark.hadoop.fs.s3a.secret.key=<S3_SECRET_KEY> \
    http://treeverse-clients-us-east.s3-website-us-east-1.amazonaws.com/lakefs-spark-client/0.16.0/lakefs-spark-client-assembly-0.16.0.jar \
    example-repo us-east-1
```
And I got this error, any idea?
```
Exception in thread "main" java.lang.NullPointerException
	at io.treeverse.clients.ApiClient$$anon$7.get(ApiClient.scala:224)
	at io.treeverse.clients.ApiClient$$anon$7.get(ApiClient.scala:220)
	at dev.failsafe.Functions.lambda$toCtxSupplier$11(Functions.java:243)
	at dev.failsafe.Functions.lambda$get$0(Functions.java:46)
	at dev.failsafe.internal.RetryPolicyExecutor.lambda$apply$0(RetryPolicyExecutor.java:74)
	at dev.failsafe.SyncExecutionImpl.executeSync(SyncExecutionImpl.java:193)
	at dev.failsafe.FailsafeExecutor.call(FailsafeExecutor.java:376)
	at dev.failsafe.FailsafeExecutor.get(FailsafeExecutor.java:112)
	at io.treeverse.clients.RequestRetryWrapper.wrapWithRetry(ApiClient.scala:325)
	at io.treeverse.clients.ApiClient.getBlockstoreType(ApiClient.scala:235)
	at io.treeverse.gc.GarbageCollection$.run(GarbageCollection.scala:150)
	at io.treeverse.gc.GarbageCollection$.main(GarbageCollection.scala:109)
	at io.treeverse.gc.GarbageCollection.main(GarbageCollection.scala)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:566)
	at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
	at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:966)
	at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:191)
	at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:214)
	at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
	at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1054)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1063)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
```
Spark version:
```
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 3.2.4
      /_/

Using Scala version 2.12.15, OpenJDK 64-Bit Server VM, 11.0.29
Branch HEAD
Compiled by user centos on 2023-04-09T20:59:10Z
Revision 0ae10ac18298d1792828f1d59b652ef17462d76e
Url https://github.com/apache/spark
Type --help for more information.
```
lakeFS version 1.29.0. Thanks!
  • r

    Rob Newman

    10/23/2025, 6:05 PM
Hi team, we had a trial tenant (https://dynamic-lamprey-scehkw.us-east-1.lakefscloud.io/) that looks like it's been removed?
  • r

    Rudra Prasad Dash

    10/29/2025, 6:36 AM
Hi all, I am trying to build a Go wrapper over the APIs provided in the Swagger docs, but when I try basic auth with the access_key_id and secret_key, it always says authentication fails. By the way, I just got introduced to lakeFS and it is amazing. Please help me out.
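For basic-auth failures like the one above, a useful first check is the exact `Authorization` header being sent: HTTP Basic is `Basic ` followed by base64 of `key:secret`, with no extra whitespace or newlines. A minimal construction (the credentials below are obviously fake placeholders):

```python
# Build and round-trip an HTTP Basic Authorization header value.
import base64

def basic_auth_header(access_key_id: str, secret_key: str) -> str:
    token = base64.b64encode(f"{access_key_id}:{secret_key}".encode()).decode()
    return f"Basic {token}"

hdr = basic_auth_header("AKIAIOSFODNN7EXAMPLE", "wJalrXUtnFEMI")
```

If the header is well-formed and auth still fails, the mismatch is usually elsewhere: wrong endpoint path, an expired or mistyped key pair, or the server expecting a different auth scheme on that route.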
  • γ

    Γιάννης Μαντιός

    10/30/2025, 4:42 PM
Hi everyone 👋 I am trying to build the project (I contributed some years ago) but I am unable to get a successful build when running `make build`. The command reaches the following point:
```
/usr/local/go/bin/go build -o lakefs -ldflags "-X github.com/treeverse/lakefs/pkg/version.Version=dev-c5680959f.with.local.changes" -v ./cmd/lakefs
```
and then crashes with this error:
```
pkg/api/lakefs.gen.go:23001:30: undefined: openapi3.Swagger
pkg/api/lakefs.gen.go:23011:27: undefined: openapi3.NewSwaggerLoader
make: *** [build-binaries] Error 1
```
Checking the piece of code, apparently the generated file references identifiers that no longer exist in that package.
  • i

    Ion

    11/07/2025, 8:11 AM
Does `lakectl config` support multiple endpoints at once? It seems I can only configure one single endpoint in the config.
  • t

    Timothy Mak

    11/10/2025, 1:21 AM
Hi, I am just wondering if there are any plans in the near future to make `lakectl local` more like `git`? I would just like to say, I have so far really enjoyed using lakeFS, and it seems to be the answer to a long-standing pain point of mine. I mainly use it as I would use git: to version data through the `lakectl local` interface. However, compared to git, it's missing quite a lot of features, which makes it, while usable, somewhat finicky when I want to do things in certain ways. A few examples: 1. I would like to check out only a directory or file, rather than the entire repository, for a commit with `lakectl checkout`. 2. I would like to examine changes to certain files between different commits, similar to `git difftool`. Are there plans to include features such as these in the near future? Or am I not quite using lakeFS in the intended way?
  • t

    Timothy Mak

    11/10/2025, 1:25 AM
Another question. With reference to https://lakefs.slack.com/archives/C016726JLJW/p1760420043187809?thread_ts=1760416389.902419&cid=C016726JLJW above, there isn't a `lakectl mv` command that can handle renaming. I would just like to know: if I simply rename a folder, since lakeFS will remove and re-upload everything, does that mean I will also double the storage usage, given that lakeFS tracks changes to my directory?
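Whether a rename doubles storage hinges on whether the backend stores blobs by content: if the old and the new key both resolve to the same content hash, only metadata changes and the bytes are stored once. A toy content-addressed store illustrating that idea (this is the general deduplication principle, not lakeFS internals):

```python
# Toy content-addressed store: blobs are keyed by their hash, logical
# paths are just pointers. A "rename" adds a pointer, not a second blob.
import hashlib

class ToyStore:
    def __init__(self):
        self.blobs = {}   # content hash -> bytes (actual storage)
        self.keys = {}    # logical path  -> content hash (metadata)

    def put(self, key, data):
        h = hashlib.sha256(data).hexdigest()
        self.blobs.setdefault(h, data)  # stored once per unique content
        self.keys[key] = h

s = ToyStore()
s.put("old/name.bin", b"payload")
s.put("new/name.bin", b"payload")  # "rename": same bytes under a new key
```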
  • a

    Ali Risheh

    11/13/2025, 11:30 PM
Hi all, thanks for hosting this community. I have a question about the lakeFS API in Java (Scala). I am using the following simple call to generate a presigned URL:
```
statObject(repoName, commitHash, filePath).presign(true).execute().getPhysicalAddress
```
But it does not support a filename (Content-Disposition header). I went through the full io.lakefs library, but there is no option to add headers to the presigned URL; the only option is to use the S3 API against lakeFS directly, which is not standard. Is there any way lakeFS can add a filename to a presigned URL coming from MinIO?
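For context on the question above: S3-compatible stores generally honor a `response-content-disposition` request parameter, but for a presigned URL that parameter has to be included when the URL is signed; appending it to an already-signed URL will typically fail signature validation. The sketch below only shows how the parameter itself is encoded; whether the lakeFS/MinIO signing path can include it is exactly the open question:

```python
# Encode the response-content-disposition query parameter that S3-style
# stores use to set a download filename. Filename here is an example.
from urllib.parse import urlencode

def content_disposition_param(filename: str) -> str:
    value = f'attachment; filename="{filename}"'
    return urlencode({"response-content-disposition": value})

param = content_disposition_param("report.csv")
```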