dev
  • o

    Oz Katz

    11/11/2021, 8:35 PM
    Possibly interesting: GitHub's design system is open source and even available as a set of React components: https://primer.style/react/
    👍 1
    k
    2 replies · 2 participants
  • i

    Itai David

    11/14/2021, 6:17 AM
    Hello, I started working on https://github.com/treeverse/lakeFS/issues/1657, and came up with an initial design draft. You can find it here: https://github.com/treeverse/dev/pull/53 https://github.com/treeverse/lakeFS/pull/2711 I requested reviews by @Barak Amar (issue opener) and @Ariel Shaqed (Scolnicov) and @Yoni Augarten (already involved in the discussion), but anyone who is interested is welcome to the discussion. Thanks
    👍🏽 1
    👍 4
    b
    3 replies · 2 participants
  • y

    Yoni Augarten

    11/15/2021, 10:41 AM
    Today @Lior Itzhak, @Itai Admi and I were discussing the behavior of the lakeFS hadoop file system when copying into a non-existing directory. I've checked the behavior in both HDFS and S3A (without lakeFS). It turns out that
    hadoop fs -cp
    doesn't copy to a non-existing path (on either HDFS or S3A). What one should do is use
    -mkdir -p
    to create the path, and then perform the copy.
    👍 6
    a
    2 replies · 2 participants
  • y

    Yoni Augarten

    11/17/2021, 1:21 PM
    Why does the lakeFS hadoop filesystem depend on Hadoop 2.7.7 and not a newer version? @Tal Sofer @Itai Admi
    i
    t
    +2
    9 replies · 5 participants
  • u

    Uttam

    11/19/2021, 7:16 AM
    Hello everyone! I joined the community recently and I'm trying to set up the project on my machine (Win x64). I was going through the documentation and downloaded the latest version of make, which came with a .diff extension. When I googled for software to extract
    .diff
    files, I landed on something called Mercurial. Is this software decent to use, or should I use something else? I'm willing to contribute to
    webui
    is it mandatory to download
    go
    ?
    i
    3 replies · 2 participants
  • t

    Tal Sofer

    11/21/2021, 9:43 AM
    I used docker “Everything Bagel” to spin up a lakeFS setup connected to hive metastore, and ran
    docker compose --profile client run --rm hive-client
    to enable the hive-client. Now, I would like to create my first table in the metastore. How do I use the hive-client to do that? @Guy Hardonag you can probably point me in the right direction
    g
    6 replies · 2 participants
  • a

    Ariel Shaqed (Scolnicov)

    11/22/2021, 11:59 AM
    @Barak Amar is https://github.com/treeverse/lakeFS/issues/618 to run Nessie vs. a MinIO object store (using the S3 block adapter), or is it something else? I think the former, but I don't understand your last comment there.
    b
    2 replies · 2 participants
  • m

    mishraprafful

    11/22/2021, 1:49 PM
    Hey everyone, What do you guys think about this? https://github.com/treeverse/lakeFS/issues/2739
    :lakefs: 1
    👍🏼 1
    l
    2 replies · 2 participants
  • i

    Itai Admi

    11/24/2021, 1:53 PM
    @Lior Resisi @Ori Adijes I think this feature-request might also be helpful for your merge conflicts, WDYT?
    o
    d
    3 replies · 3 participants
  • o

    Ori Adijes

    11/25/2021, 12:45 PM
    @here Hello, quick question: I saw there is a recursive delete option for the lakectl fs rm command. 1. In which version was this support added? 2. Do we have an option to do this through the delete object REST API as well?
    l
    i
    +1
    4 replies · 4 participants
  • y

    Yoni Augarten

    11/30/2021, 10:53 AM
    Hey, I've been experimenting with the lakeFS metadata client - and managed to use Databricks to create a diff between two commits! I want to publish the notebook as an example to the project. However all examples in the project are Scala files (see
    examples/
    directory) and my notebook is exported in HTML. Where in the project can/should I put this HTML?
    👏 1
    a
    a
    5 replies · 3 participants
  • o

    Ori Adijes

    12/01/2021, 9:22 AM
    Hello all, Is there an option for bulk delete of object paths through the lakeFS API? At the moment we call the delete object API one object at a time (async) in order to delete many objects. I think that, performance-wise, it would be much better to do this parallelism server side instead of issuing many HTTP requests from the client side (we can have more than 1000 file deletion requests).
    b
    o
    +1
    24 replies · 4 participants
  • a

    Ariel Shaqed (Scolnicov)

    12/04/2021, 7:12 PM
    I want us to do a quick patch fix for part of https://github.com/treeverse/lakeFS/issues/2773 tomorrow (we have 2 users stuck on this). The quick fix is to remove username format validation entirely or almost entirely. I believe our code is robust against injection attacks; we can and will verify that usernames are never involved in constructing SQL queries, but only ever passed as parameters. Does anyone know of a good reason to validate username formats on usage? (We might validate a format during internal user registration, but that would be for business reasons...) Thanks!
    o
    b
    3 replies · 3 participants
  • a

    Ariel Shaqed (Scolnicov)

    12/10/2021, 7:36 PM
    No immediate action from us. But Spark users will probably be vulnerable, regardless of whether or not they use lakeFS clients (depending on how much 3rd parties can control what they log). Operators should set the relevant property to avoid loading executable code from user-supplied URLs. But this is a nasty bug, and one that was completely avoidable. https://www.theregister.com/2021/12/10/log4j_remote_code_execution_vuln_patch_issued/ I reckon once an actual official patch is available, we can bump versions in our Java clients.
    👍 3
    2 replies · 1 participant
  • a

    Ariel Shaqed (Scolnicov)

    12/11/2021, 8:34 AM
    I've just deleted branch
    to-be-deleted-main
    from the lakeFS repository, which used to be called "`main`". Nobody's ever used it, and it contained some filename case issues. The branch tip was
    8ff9846ab6e9aacdea56606ff9dde135c42e4f83
    . I am purposefully not generating a tag there. To be clear: the lakeFS trunk is named "`master`". WHY? This branch was an aborted attempt to rename
    master
    (the lakeFS repo was opened back when this horribly incorrect name was the default). This particular renaming was broken because of some uppercase/lowercase renamings. And those cause issues on a case-insensitive filesystem -- which macOS helpfully provides by default. So it used to be that after
    git checkout main
    on a macOS box you'd need to seek expert help to recover your local repo on that box... WHY does trunk remain
    master
    ?
    It is an incorrect name with horrible connotations (and not one used historically). However renaming trunk requires a simultaneous (not concurrent!) change: all developers must rename
    master
    ->
    main
    on the same commit, all actions must change to accept the new name, and other unknown dependencies will break. (I would appreciate pointers to articles from any open-source projects about good practices for doing and testing this!)
    👍 3
    🙏 2
    k
    1 reply · 2 participants
  • a

    Ariel Shaqed (Scolnicov)

    12/19/2021, 1:51 PM
    I'm getting
    java.lang.NoSuchMethodException: com.databricks.spark.metrics.FileSystemWithMetrics.getAmazonS3Client()
    when writing using HadoopFS (
    io.lakefs:hadoop-lakefs-assembly:0.1.4
    ). Has anyone seen this?
    stacktrace.txt.gz
    b
    5 replies · 2 participants
  • o

    Oz Katz

    12/21/2021, 11:13 AM
    could be useful for the lakeFS monorepo - supports running steps only if certain paths changed: https://github.com/dorny/paths-filter
    b
    a
    3 replies · 3 participants
  • t

    Tal Sofer

    12/23/2021, 7:39 AM
    Is there a way to view Excalidraw charts that are part of GitHub PRs?
    b
    a
    3 replies · 3 participants
  • i

    Itai Admi

    01/13/2022, 3:58 PM
    Is anyone familiar with speedb? They claim to have developed a 10X RocksDB-compatible replacement.
    o
    a
    4 replies · 3 participants
  • a

    Ariel Shaqed (Scolnicov)

    01/14/2022, 3:43 PM
    https://github.com/github/roadmap/issues/372 We should try some of these mermaid diagrams. 🧜🏼 (Says the guy whose name Disney stole for The Little Mermaid...)
    👍🏻 1
    👍 1
    b
    1 reply · 2 participants
  • a

    Ariel Shaqed (Scolnicov)

    01/17/2022, 7:01 AM
    TIL Kubernetes lets me use JSON Patch to edit objects. For instance, I can add a backend service to an ingress named "my-ingress" by saying
    kubectl patch ingress my-ingress --type json --patch-file=/tmp/patch.json
    where
    /tmp/patch.json
    is:
    [
      {
        "op": "add",
        "path": "/spec/rules/-",
        "value": {
          "host": "another.example.com",
          "http": {
            "paths": [
              {
                "backend": {
                  "service": {
                    "name": "another-svc",
                    "port": {
                      "number": 5678
                    }
                  }
                },
                "path": "/another",
                "pathType": "Prefix"
              }
            ]
          }
        }
      }
    ]
    A (more) readable introduction to JSON Patch is http://jsonpatch.com/; there's also https://datatracker.ietf.org/doc/html/rfc6902 of course.
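    To make the semantics of the "add" operation concrete, here is a minimal sketch in Go using only the standard library. It handles just the "add" op, including the "-" token that appends to an array, and it is not how kubectl implements patching; `ApplyAdd` and `addAt` are names made up for this illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
	"strings"
)

// addAt applies one RFC 6902 "add" below node, following the JSON Pointer tokens.
func addAt(node any, tokens []string, value any) (any, error) {
	if len(tokens) == 0 {
		return value, nil
	}
	// JSON Pointer escaping: "~1" stands for "/" and "~0" stands for "~".
	tok := strings.NewReplacer("~1", "/", "~0", "~").Replace(tokens[0])
	last := len(tokens) == 1
	switch n := node.(type) {
	case map[string]any:
		if last {
			n[tok] = value // adds the member, or replaces an existing one
			return n, nil
		}
		child, ok := n[tok]
		if !ok {
			return nil, fmt.Errorf("path element %q not found", tok)
		}
		updated, err := addAt(child, tokens[1:], value)
		if err != nil {
			return nil, err
		}
		n[tok] = updated
		return n, nil
	case []any:
		if last && tok == "-" {
			return append(n, value), nil // "-" means "after the last element"
		}
		i, err := strconv.Atoi(tok)
		if err != nil || i < 0 || i > len(n) || (!last && i == len(n)) {
			return nil, fmt.Errorf("bad array index %q", tok)
		}
		if last { // insert before index i, shifting later elements right
			n = append(n, nil)
			copy(n[i+1:], n[i:])
			n[i] = value
			return n, nil
		}
		updated, err := addAt(n[i], tokens[1:], value)
		if err != nil {
			return nil, err
		}
		n[i] = updated
		return n, nil
	default:
		return nil, fmt.Errorf("cannot descend into %T", node)
	}
}

// ApplyAdd applies a single "add" operation to a JSON document.
func ApplyAdd(doc []byte, path string, value any) ([]byte, error) {
	var root any
	if err := json.Unmarshal(doc, &root); err != nil {
		return nil, err
	}
	if !strings.HasPrefix(path, "/") {
		return nil, fmt.Errorf("unsupported path %q", path)
	}
	root, err := addAt(root, strings.Split(path, "/")[1:], value)
	if err != nil {
		return nil, err
	}
	return json.Marshal(root)
}

func main() {
	doc := []byte(`{"spec":{"rules":[{"host":"first.example.com"}]}}`)
	out, err := ApplyAdd(doc, "/spec/rules/-", map[string]any{"host": "another.example.com"})
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

    The recursion returns the (possibly reallocated) node at each level, which is what lets an append to a nested array propagate back up to its parent map.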
    👍 1
    🆒 3
    o
    2 replies · 2 participants
  • g

    Guy Hardonag

    01/19/2022, 8:31 AM
    Hey, we got a feature request from one of our users requesting to add the option to set the
    time
    of the commit. Git does support it and I believe we should too. Checking here to see if anyone has any objections.
    👍 2
    i
    a
    4 replies · 3 participants
  • a

    Ariel Shaqed (Scolnicov)

    01/21/2022, 7:34 AM
    @Guy Hardonag after we talked yesterday, I came up with this neat trick for transforming any iterator into a prefetching iterator! The basic trick is for the wrapping iterator to start a goroutine that reads values from the base iterator and writes them to a channel of size 10K (for instance). Now to advance and read a value, the wrapping iterator simply reads from the channel. If the goroutine gets too far ahead, it fills up the output channel... and blocks. So it cannot move too far, but if it goes to read from a file or the network it still lets the wrapping iterator continue for a bit. How to handle
    NextRange
    (and even
    SeekGE
    )? Basically, have another channel from the wrapping iterator to the base iterator. Every time the wrapping iterator needs to change location, it drops the old channel (@Barak Amar do we need to drain a channel, or can it just be garbage-collected while full?), creates a new channel in its stead, and sends a command
    <"NextRange", ptrToNewChannel>
    to the base iterator. The base iterator now
    select
    s on being able to write on its output or read from its input; if it gets a command on the input channel it can implement it. Not perfect, but a cheap way to prefetch.
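    A minimal sketch of the trick, under several assumptions: the `Iterator` interface, `sliceIter`, `Prefetcher` and `seekCmd` are all made-up names for illustration (the real lakeFS iterators look different), values are plain ints, and only `SeekGE` is wired through the command channel.

```go
package main

import (
	"fmt"
	"sort"
)

// Iterator is a hypothetical minimal interface, not the lakeFS one.
type Iterator interface {
	Next() (int, bool) // ok is false once the iterator is exhausted
	SeekGE(target int) // reposition at the first value >= target
}

// sliceIter is a toy base iterator over a sorted slice.
type sliceIter struct {
	vals []int
	pos  int
}

func (s *sliceIter) Next() (int, bool) {
	if s.pos >= len(s.vals) {
		return 0, false
	}
	s.pos++
	return s.vals[s.pos-1], true
}

func (s *sliceIter) SeekGE(target int) { s.pos = sort.SearchInts(s.vals, target) }

// seekCmd tells the prefetching goroutine to reposition and switch outputs.
type seekCmd struct {
	target int
	out    chan int
}

// Prefetcher wraps a base iterator and reads ahead of its consumer.
type Prefetcher struct {
	out  chan int     // prefetched values, read by the consumer
	cmds chan seekCmd // commands from the wrapper to the goroutine
	size int
}

func NewPrefetcher(base Iterator, size int) *Prefetcher {
	p := &Prefetcher{out: make(chan int, size), cmds: make(chan seekCmd), size: size}
	go p.run(base, p.out)
	return p
}

// run owns the base iterator. It blocks once out is full, so it can never
// get more than size values ahead of the consumer.
func (p *Prefetcher) run(base Iterator, out chan int) {
	for {
		v, ok := base.Next()
		if !ok {
			close(out) // signal end of data, then wait for a seek or Close
			cmd, open := <-p.cmds
			if !open {
				return
			}
			base.SeekGE(cmd.target)
			out = cmd.out
			continue
		}
		select {
		case out <- v:
		case cmd, open := <-p.cmds:
			if !open {
				return
			}
			base.SeekGE(cmd.target) // v is discarded; it belongs to the old position
			out = cmd.out
		}
	}
}

func (p *Prefetcher) Next() (int, bool) {
	v, ok := <-p.out
	return v, ok
}

// SeekGE drops the old channel without draining it (an unreachable channel
// is garbage-collected even if its buffer is full) and hands over a new one.
func (p *Prefetcher) SeekGE(target int) {
	p.out = make(chan int, p.size)
	p.cmds <- seekCmd{target: target, out: p.out}
}

// Close stops the goroutine; Next must not be called afterwards.
func (p *Prefetcher) Close() { close(p.cmds) }

func main() {
	p := NewPrefetcher(&sliceIter{vals: []int{1, 3, 5, 7, 9}}, 2)
	defer p.Close()
	fmt.Println(p.Next())
	p.SeekGE(6)
	fmt.Println(p.Next())
}
```

    Because only the goroutine touches the base iterator and only the consumer reassigns `p.out`, the sketch needs no locks; the cost is that values already prefetched before a seek are thrown away.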
    👀 1
    b
    6 replies · 2 participants
  • a

    Ariel Shaqed (Scolnicov)

    01/23/2022, 8:14 AM
    Reversim Summit 2021 conference videos are all out -- tech content in Hebrew (sorry all non-speakers...). https://www.youtube.com/playlist?list=PLqXy0aX6TzQryGoAdbyPevKocQxMJzg8_. Includes some content by familiar faces from lakeFS users, as well as my very own

    https://youtu.be/ZeE-JxMZDLk▾

    . 🙂
    👏 7
    :levitating-lakefs: 3
    🔥 1
    :lakefs: 7
    💥 7
    o
    2 replies · 2 participants
  • t

    Tal Sofer

    01/30/2022, 11:42 AM
    I need to use the lakeFS Golang SDK from my code; which Go package should I get? I only see github.com/dollarkillerx/lakefs-sdk, and it is not published by the lakeFS team (this is very cool!). Is there a reason why we don't publish this? cc @Barak Amar
    👍🏻 1
    b
    2 replies · 2 participants
  • m

    mishraprafful

    02/01/2022, 9:59 PM
    Hey, I just created a PR for a good first issue, but seems like I need approvals from maintainers before the CI checks run. Could you please approve the same. Thanks Ref: https://github.com/treeverse/lakeFS/pull/2900
    👀 1
    🙌 2
    🎉 2
    b
    a
    5 replies · 3 participants
  • m

    mishraprafful

    02/02/2022, 3:23 PM
    Hey, I'm just a bit curious: I get this message every time I create a PR.
    First-time contributors need a maintainer to approve running workflows. Learn more.
    But I am not a first-time contributor. Does every PR need to be approved by a maintainer before the workflows run? Ref: https://github.com/treeverse/lakeFS/pull/2902
    b
    3 replies · 2 participants
  • a

    Adi Polak

    02/07/2022, 8:56 AM
    hi devs, installing statik with
    go get github.com/rakyll/statik
    returned the following message:
    go get: installing executables with 'go get' in module mode is deprecated.
    	To adjust and download dependencies of the current module, use 'go get -d'.
    	To install using requirements of the current module, use 'go install'.
    	To install ignoring the current module, use 'go install' with a version,
    	like 'go install example.com/cmd@latest'.
    	For more information, see https://golang.org/doc/go-get-install-deprecation
    	or run 'go help get' or 'go help install'.
    which one should I use to install statik?
    go get -d
    or
    go install
    ?
    b
    6 replies · 2 participants
  • i

    Itai David

    02/07/2022, 2:40 PM
    Hi all. I'm attaching a technical discussion that we were (mistakenly) having in an internal thread. It concerns an issue regarding the way we handle Windows paths. TL;DR - we want to change backslashes (
    \
    ) in Windows' style paths to Unix's style forward slashes (
    /
    ) to match S3 behavior. Our main concern is the effect this might have on users who currently use Windows-style paths, and on their data. The original message follows; I'm attaching all responses as comments. Further comments are much appreciated. Thanks
    Hi. I would love to hear some opinions and some of your concerns regarding an issue I'm working on - BUG: Upload directory path not parsed correctly #2880
    TL;DR - we are not treating Windows' path delimiter - the backslash - as such.
    Upon uploading a file from Windows, the original path, e.g. 
    C:\some\path\to\file
     is kept and used as a key. Later, when the file is returned to the client(s), as a response for ListObjects, we do not parse the key as a path correctly. While 
    lakectl
     shows the key as a full path either way, the web UI fails to create the nice (and expected) tree display, as described in the bug.
    On @Tal Sofer's advice, I looked at S3 behaviour and found that AWS actually converts Windows-style paths to Unix style. Namely, all backslashes are turned into forward slashes. That's it for the problem description; now for possible solutions:
    Option 1 - change lakeFS behaviour to match S3: actively change backslashes in paths to forward slashes. I can think of 2 problems here:
    • Users who use Windows clients and have their data in place (let's say they do not use the web UI and are not bothered by the path parsing difference) will be affected, as new files will be 'named' differently.
    • Not sure this is a real issue, but Unix allows backslashes as part of a valid file name. If we change these to forward slashes, it will affect the file path in lakeFS.
    Option 2 - store the file path/name/key the same way we do today, but treat both backslash and forward slash as path delimiters. This will solve the web UI problem without affecting the way we keep data, so existing user data will not be affected. However, this handles the symptom rather than the problem, and I'm not sure where else this problem is going to pop up. Moreover, it does not align with our S3-like approach, so I'm not sure it is even an option.
    Any advice, opinions, or other considerations I failed to mention? All comments will be greatly appreciated 🙏
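    To make the two options concrete, here is a small Go sketch. `NormalizeKey` and `SplitKey` are hypothetical helper names for illustration only; they are not existing lakeFS functions, and the sketch leaves open where in the upload or listing path such a helper would be called.

```go
package main

import (
	"fmt"
	"strings"
)

// NormalizeKey sketches Option 1: rewrite every backslash in the key as a
// forward slash before the key is stored.
func NormalizeKey(key string) string {
	return strings.ReplaceAll(key, `\`, "/")
}

// SplitKey sketches Option 2: leave the stored key untouched and treat both
// backslash and forward slash as delimiters when building the tree view.
func SplitKey(key string) []string {
	return strings.FieldsFunc(key, func(r rune) bool {
		return r == '/' || r == '\\'
	})
}

func main() {
	key := `C:\some\path\to\file`
	fmt.Println(NormalizeKey(key))
	fmt.Println(SplitKey(key))
}
```

    Note that Option 1 is lossy: after normalization, a Unix file name that really contained a backslash can no longer be told apart from a nested path, which is the second concern listed above.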
    7 replies · 1 participant
  • b

    Barak Amar

    02/16/2022, 10:16 AM
    build with the new docker uses buildx as far as I know
    o
    4 replies · 2 participants
o

Or Tzabary

02/16/2022, 10:22 AM
you're right
the
buildx
is redundant
b

Barak Amar

02/16/2022, 10:24 AM
think we can specify the same with env var and inside the docker file
didn't test it on Eden's machine