Powered by Linen
dev
  • t

    Tal Sofer

    02/24/2022, 8:34 AM
    @Ariel Shaqed (Scolnicov) which hadoopfs method did you find that allows scalar property lookup?
    3 replies · 2 participants
  • m

    Marius Dieckmann

    02/25/2022, 1:34 PM
    hey, i am currently building a data management system, with a self-made data backend similar to yours. I am currently considering switching to lakefs as data backend and adding my own functionality on top. I do however have a question: Did you ever consider providing support for cockroachdb as an alternative to postgres?
    3 replies · 2 participants
  • i

    Itai David

    02/26/2022, 5:08 AM
    Hello all, New PR is up for review - https://github.com/treeverse/lakeFS/pull/2968 TL;DR - there is an efficiency issue when looking for a best common ancestor, as part of diff. The search algorithm was (possibly) going over the same commits again and again, causing the process to take much longer than needed. This PR diminishes this behaviour, while (hopefully) maintaining the same outcome. Please, feel free to review. All comments are welcome 🙂
    🙌 1
    💪🏻 1
    8 replies · 4 participants
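A minimal sketch of the idea described in the message above, not the code from the PR: tracking a visited set during the ancestor search means each commit is expanded at most once, however many paths lead to it. The `parents` mapping and both function names are hypothetical, and "closest to `right` by hop count" is a simplification of "best".

```python
from collections import deque


def ancestors(parents, start):
    """Return every commit reachable from start, including start itself."""
    seen = {start}
    queue = deque([start])
    while queue:
        commit = queue.popleft()
        for parent in parents.get(commit, ()):
            if parent not in seen:  # skip commits already queued or expanded
                seen.add(parent)
                queue.append(parent)
    return seen


def common_ancestor(parents, left, right):
    """Return a common ancestor closest to right by hop count, or None."""
    left_side = ancestors(parents, left)
    seen = {right}
    queue = deque([right])
    while queue:
        commit = queue.popleft()
        if commit in left_side:
            return commit
        for parent in parents.get(commit, ()):
            if parent not in seen:  # the visited set prevents repeated walks
                seen.add(parent)
                queue.append(parent)
    return None


# Hypothetical commit graph: each key maps to its list of parent commits.
graph = {"b": ["a"], "c": ["b"], "d": ["b"], "e": ["c", "d"], "f": ["d"]}
result = common_ancestor(graph, "e", "f")
```

Without the `seen` checks, a commit reachable through many merge paths would be revisited once per path, which is the kind of repeated work the message describes.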
  • a

    Adi Polak

    03/08/2022, 9:21 AM
    Heya Axolotls :levitating-lakefs: I am looking for a way to improve lakeFS documentation. One of the criteria is a reference tool. For example, when updating a specific API, it will alert us to a tutorial that uses this API and needs updating as well. Are you familiar with such a tool? Any suggestions? Also, which documentation needs improvement asap? 💡
    👌🏿 1
    👌 1
    5 replies · 4 participants
  • n

    Niro

    03/09/2022, 11:59 AM
    Hi guys, I've been looking at branch naming in the lakeFS repository and noticed that many branches are not created according to the naming conventions under the contributing doc:
    4. If you're adding new functionality, create a new branch named feature/<DESCRIPTIVE NAME>
    5. If you're fixing a bug, create a new branch named fix/<DESCRIPTIVE NAME>-<ISSUE NUMBER>
    Maybe it's good to open this topic to discussion and understand if the requirement is important and relevant and to strictly enforce it. Otherwise maybe we should remove this requirement altogether / define a new one that makes more sense?
    👍🏼 1
    👍 2
    12 replies · 3 participants
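If the convention quoted above were enforced, one way to check it is a pair of regular expressions. This is only a sketch: the allowed characters for `<DESCRIPTIVE NAME>` are an assumption (the contributing doc excerpt does not define them), and `follows_convention` is a hypothetical helper, not something in the lakeFS repository.

```python
import re

# Assumption: descriptive names use lowercase letters, digits, '.', '_' and '-'.
FEATURE_BRANCH = re.compile(r"^feature/[a-z0-9][a-z0-9._-]*$")
# A fix branch must end in '-<ISSUE NUMBER>'.
FIX_BRANCH = re.compile(r"^fix/[a-z0-9][a-z0-9._-]*-\d+$")


def follows_convention(branch):
    """Return True if the branch name matches either quoted pattern."""
    return bool(FEATURE_BRANCH.match(branch) or FIX_BRANCH.match(branch))


# Hypothetical branch names, for illustration only.
checks = {
    name: follows_convention(name)
    for name in ["feature/kv-store", "fix/diff-perf-1234", "fix/diff-perf", "my-branch"]
}
```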
  • b

    Barak Amar

    03/09/2022, 1:26 PM
    To enable posting proposals and ongoing work on them, I suggest writing them down under 'design/proposal/'. Work on a proposal and its ongoing changes will be done through PRs. A proposal can be closed and discarded, accepted and committed to the main branch, or rejected. Rejected proposals will be moved into 'design/proposal/rejected'. Let me know what you think of this proposal proposal, to improve the current PR-based solution.
    👍🏼 1
    👍 4
    4 replies · 2 participants
  • n

    Niro

    03/13/2022, 8:07 AM
    Question: What is the reason for the creation_date under the user API to be in Epoch time? Isn't it better to provide it to the user in human readable format?
    1 reply · 2 participants
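For context on the question above, converting an epoch value on the client side is short. A sketch, assuming the API value is in seconds (the message does not say whether it is seconds or milliseconds); `epoch_to_iso` is a hypothetical helper name.

```python
from datetime import datetime, timezone


def epoch_to_iso(seconds):
    """Format an epoch timestamp (assumed to be in seconds) as ISO 8601 in UTC."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc).isoformat()


readable = epoch_to_iso(0)  # "1970-01-01T00:00:00+00:00"
```

An epoch value is unambiguous about time zone, which is one common reason APIs return it and leave formatting to the client.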
  • t

    Tal Sofer

    03/15/2022, 3:47 PM
    A naming dilemma I’ll be more than happy to get your input on - We are about to open a new repo for RouterFileSystem and looking for an appealing and clear name for it. So far we have:
    • hadoop-router-fs
    • fs-link
    • router-fs
    What is your suggestion?? :help_lakefs::help_lakefs::help_lakefs::help_lakefs::help_lakefs::help_lakefs::help_lakefs::help_lakefs:
    🤔 4
    2 replies · 2 participants
  • b

    Barak Amar

    03/15/2022, 6:23 PM
    Go 1.18 just released - https://go.dev/doc/go1.18 🥳
    🎉 1
    🔥 2
    👀 2
    2 replies · 3 participants
  • a

    Ariel Shaqed (Scolnicov)

    03/16/2022, 2:47 PM
    GitHub actions & friends are... slow... https://www.githubstatus.com/incidents/fpk08rxnqjz2
    ☹️ 1
    1 reply · 2 participants
  • e

    Edmondo Porcu

    03/17/2022, 1:39 AM
    Hi. Are there priorities for the good-first-issue issues on Github?
    1 reply · 2 participants
  • p

    Paul Singman

    03/17/2022, 4:15 PM
    There is a community member in the awesome ML.Ops slack asking for more details on the dataset we used to test lakefs with the `lakectl abuse` command documented here. Can someone help me with answers to his questions (see screenshot)?
    2 replies · 2 participants
  • o

    Oz Katz

    03/18/2022, 10:20 AM
    Hey @Yusuf K! as @Itai Admi mentioned, we’re indeed talking to the Snowflake team about this use case (External table support for lakeFS). We'd love to work with you to make sure we're building something that meets your needs!
    1 reply · 2 participants
  • e

    Edmondo Porcu

    03/19/2022, 7:21 PM
    Does anyone have a view on this? https://github.com/treeverse/lakeFS/issues/2364 I am happy to help, but the current error handling style is not very clear to me
    1 reply · 2 participants
  • a

    Ariel Shaqed (Scolnicov)

    03/20/2022, 7:21 AM
    Hi @Barak Amar and VE-team, Really excited to see lakeFS on KV is now a proposal! Did you incorporate my "lock freedom" suggestions on the PR into the proposal? These are intended to ensure that at all times at least one commit makes progress. These will be really important in order to support streaming workflows. For instance, for almost any use of Kafka with lakeFS, we will have periodic commits which may contain many objects. As soon as there is a hiccup we will have concurrent commits -- all of these after the first large commit will also be large! Without lock freedom, concurrent commits get in one another's way, and progress might not occur. As more commits pile up, the probability of any one of them succeeding drops rapidly, and the system will have unbounded delays and might effectively livelock.
    2 replies · 2 participants
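One common way to get the progress guarantee described above is an optimistic compare-and-set loop on the branch head: when several commits race, the first conditional write wins and the rest retry against the new head, so some commit always makes progress. The sketch below is an in-memory illustration only, not the design in the proposal. `BranchStore`, `compare_and_set` and `commit` are hypothetical names, and the mutex merely stands in for the atomic conditional write a KV store would provide.

```python
import threading


class BranchStore:
    """In-memory stand-in for a KV store holding one branch head."""

    def __init__(self, head):
        self._head = head
        # Only simulates the atomicity of the store's conditional write.
        self._mutex = threading.Lock()

    def get(self):
        with self._mutex:
            return self._head

    def compare_and_set(self, expected, new):
        """Set the head to new only if it still equals expected."""
        with self._mutex:
            if self._head != expected:
                return False
            self._head = new
            return True


def commit(store, make_commit, max_attempts=100):
    """Retry until the conditional write succeeds; return (head, attempts)."""
    for attempt in range(1, max_attempts + 1):
        base = store.get()
        candidate = make_commit(base)  # build the commit without holding a lock
        if store.compare_and_set(base, candidate):
            return candidate, attempt
    raise RuntimeError("too many concurrent commits")


store = BranchStore("c0")
stale_base = store.get()
first_wins = store.compare_and_set(stale_base, "c1")   # first writer succeeds
second_loses = store.compare_and_set(stale_base, "c2")  # stale writer must retry
retried = commit(store, lambda base: base + "+x")
```

A failed attempt here never blocks another writer; it only costs the loser a retry, which matters when commits are large.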
  • a

    Ariel Shaqed (Scolnicov)

    03/20/2022, 7:38 AM
    Are you writing your own Graveler code, or interested in doing so? To be very clear: If you are a lakeFS user, or if you program inside lakeFS, then you do not care if we do the following. It is only important if you have a copy of any `.proto` file from github.com/treeverse/lakeFS somewhere in your project. Graveler serializes its actual commit and range values using protocol buffers. One constantly annoying factor with protocol buffers is that there is no way to publish them. Instead, most projects just copy them over from their source project. A project called Buf is trying to create a protocol buffer repository (think "NPM for protobufs"); it is now in beta. Would you be interested in consuming protocol buffers from there? Would you trust this project? Thanks!
    2 replies · 2 participants
  • i

    Itai Admi

    03/21/2022, 3:38 PM
    Any idea why many checks are skipped for forked PRs? For example this PR. Is it the secrets issue? I thought that’s resolved..
    1 reply · 2 participants
  • a

    Ariel Shaqed (Scolnicov)

    03/22/2022, 3:56 PM
    GitHub is taking another nap. https://www.githubstatus.com/incidents/83lq7ftk19r5
    😕 1
    🛌 1
    🛌🏼 1
    2 replies · 3 participants
  • a

    Ariel Shaqed (Scolnicov)

    03/24/2022, 11:07 AM
    Hi lakeFS developers! This PR changes the name of the "test spark metadata client" action. So right now:
    • Any PRs on the lakeFS repo that branched out prior to 10bced0f on lakeFS can be green and merged without administrative force;
    • Any PRs on the lakeFS repo that branched out at or after 10bced0f4 will not be green and need administrative force to merge to master.
    In 3 hours' time I plan to change the name of the required test to the new names. So starting at around 14:00 UTC today (16:00 IST) these states will be reversed. After this time, your old PRs will not be green until you rebase them onto a more recent version of lakeFS (or merge, I guess, for people who don't rebase their development branches...). I will be happy to help if you run into any difficulties!
    👍 3
    👍🏻 1
    1 reply · 1 participant
  • e

    Edmondo Porcu

    03/26/2022, 11:35 PM
    Is there already a setup to execute tests against a mock infrastructure within the CI/CD pipeline?
    10 replies · 3 participants
  • e

    Edmondo Porcu

    03/29/2022, 3:31 AM
    Does anyone understand why tests are not run automatically on a new commit on a pull request? It says "Waiting reporting of status check" or something similar on each check. It would be handy if one could just click and jump to the workflow execution instead.
    20 replies · 2 participants
  • e

    einat.orr

    03/30/2022, 3:58 PM
    Is this a hint that lakeFS might be able to work with BigQuery out of the box? https://hevodata.com/learn/federated-query-bigquery/#w
    👀 1
    3 replies · 3 participants
  • e

    Edmondo Porcu

    03/30/2022, 8:22 PM
    Has someone developed pre-commit hooks for lakeFS? I am doing a couple of self-inflicted push-ups because I committed and pushed without linting locally 😞
    1 reply · 1 participant
  • t

    Tal Sofer

    04/06/2022, 7:04 AM
    Hi, I created my own lakeFS policy and am trying to attach it to a lakeFS user. However, the UI only allows me to attach pre-configured policies. Am I missing something, or is this a bug? I’m able to attach the policy with `lakectl auth users policies attach --id my_user --policy myPolicy`
    16 replies · 2 participants
  • a

    Ariel Shaqed (Scolnicov)

    04/08/2022, 8:33 AM
    @Jonathan Rosenberg can we pull https://github.com/treeverse/lakeFS/pull/3159 soon so that I can rebase and test https://github.com/treeverse/lakeFS/pull/3183?
    1 reply · 2 participants
  • a

    Ariel Shaqed (Scolnicov)

    04/08/2022, 12:10 PM
    I'm having difficulty compiling `spark/client`; I get (during update):
    [error] file:///home/ariels/.m2/repository/com/fasterxml/jackson/core/jackson-annotations/2.9.0/jackson-annotations-2.9.0.jar: not found: /home/ariels/.m2/repository/com/fasterxml/jackson/core/jackson-annotations/2.9.0/jackson-annotations-2.9.0.jar
    [error] file:///home/ariels/.m2/repository/com/fasterxml/jackson/core/jackson-annotations/2.10.3/jackson-annotations-2.10.3.jar: not found: /home/ariels/.m2/repository/com/fasterxml/jackson/core/jackson-annotations/2.10.3/jackson-annotations-2.10.3.jar
    [error] file:///home/ariels/.m2/repository/com/fasterxml/jackson/core/jackson-databind/2.9.5/jackson-databind-2.9.5.jar: not found: /home/ariels/.m2/repository/com/fasterxml/jackson/core/jackson-databind/2.9.5/jackson-databind-2.9.5.jar
    [error] (lakefs-spark-client-247 / update) lmcoursier.internal.shaded.coursier.error.FetchError$DownloadingArtifacts: Error fetching artifacts:
    [error] file:///home/ariels/.m2/repository/com/fasterxml/jackson/core/jackson-annotations/2.9.0/jackson-annotations-2.9.0.jar: not found: /home/ariels/.m2/repository/com/fasterxml/jackson/core/jackson-annotations/2.9.0/jackson-annotations-2.9.0.jar
    [error] file:///home/ariels/.m2/repository/com/fasterxml/jackson/core/jackson-annotations/2.10.3/jackson-annotations-2.10.3.jar: not found: /home/ariels/.m2/repository/com/fasterxml/jackson/core/jackson-annotations/2.10.3/jackson-annotations-2.10.3.jar
    [error] file:///home/ariels/.m2/repository/com/fasterxml/jackson/core/jackson-databind/2.9.5/jackson-databind-2.9.5.jar: not found: /home/ariels/.m2/repository/com/fasterxml/jackson/core/jackson-databind/2.9.5/jackson-databind-2.9.5.jar
    Does `sbt compile` in `client/spark` work for anyone? Does it still work after `sbt clean`?
    👀 1
    4 replies · 2 participants
  • a

    Ariel Shaqed (Scolnicov)

    04/08/2022, 2:47 PM
    Another retention question: the file `_lakefs/retention/gc/rules/config.json` is not a JSON file. What format is it? protobuf?? Here's a hexdump of one that I have right now:
    00000000  08 05 12 08 0a 04 6d 61  69 6e 10 06              |......main..|
    4 replies · 2 participants
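The twelve bytes in the hexdump above do parse cleanly as protocol buffer wire format, which is consistent with the protobuf guess. The sketch below is a minimal hand-rolled decoder for the two wire types that appear (varint and length-delimited). It reports only field numbers and raw values; what the fields mean is not stated in the thread and is not assumed here. `read_varint` and `decode_fields` are hypothetical helper names.

```python
def read_varint(buf, pos):
    """Decode a base-128 varint starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:  # high bit clear marks the last byte
            return result, pos
        shift += 7


def decode_fields(buf):
    """Return a list of (field_number, wire_type, value) tuples."""
    fields = []
    pos = 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        number, wire_type = key >> 3, key & 0x07
        if wire_type == 0:  # varint
            value, pos = read_varint(buf, pos)
        elif wire_type == 2:  # length-delimited: string, bytes or nested message
            length, pos = read_varint(buf, pos)
            value = buf[pos:pos + length]
            pos += length
        else:
            raise ValueError(f"wire type {wire_type} not handled in this sketch")
        fields.append((number, wire_type, value))
    return fields


data = bytes.fromhex("08 05 12 08 0a 04 6d 61 69 6e 10 06")
top_level = decode_fields(data)
nested = decode_fields(top_level[1][2])  # treat field 2 as a nested message
```

Read this way, the top level holds field 1 with the integer 5 and field 2 with an 8-byte payload; that payload in turn holds field 1 with the bytes `main` and field 2 with the integer 6.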
  • a

    Ariel Shaqed (Scolnicov)

    04/09/2022, 4:35 PM
    ❯ ~/dev/lakeFS/lakectl fs stat lakefs://ariels-gc-test/main~1/a/moo
    Invalid 'path': not a valid path uri
    Error executing command.
    Did something change in the refs parser, or am I too tired?
    2 replies · 2 participants
  • y

    Yoni Augarten

    04/12/2022, 3:58 PM
    Hey, EMR question - asked it on Stack Overflow as well: In an EMR cluster, I run multiple Spark steps. Steps may or may not have the same name. I want to monitor the number of failed steps, grouped by the step name. EMR triggers EventBridge events for a step status change, but I want numbers: the goal is to trigger an alarm if more than (say) 5 steps with the same name failed within (say) the last hour. Was hoping to get a CloudWatch metric counting failed steps, with a dimension of the step name. Can I achieve that?
    2 replies · 2 participants
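One possible approach to the question above, offered as a sketch rather than the answer given in the thread: route the EventBridge step events to a small function that turns each failed step into a custom metric datum carrying the step name as a dimension. The metric name `FailedSteps`, the dimension name `StepName` and the function name are hypothetical; the event fields used (`detail-type`, `detail.state`, `detail.name`) are assumed to follow the EMR step status change event format.

```python
def failed_step_datum(event):
    """Return a CloudWatch metric datum for a failed EMR step, else None."""
    if event.get("detail-type") != "EMR Step Status Change":
        return None
    detail = event.get("detail", {})
    if detail.get("state") != "FAILED":
        return None
    return {
        "MetricName": "FailedSteps",  # hypothetical metric name
        "Dimensions": [
            # hypothetical dimension name; the value is the step's name
            {"Name": "StepName", "Value": detail.get("name", "unknown")},
        ],
        "Value": 1,
        "Unit": "Count",
    }


# Hypothetical event, trimmed to the fields this sketch reads.
sample_event = {
    "detail-type": "EMR Step Status Change",
    "detail": {"name": "CustomJAR", "state": "FAILED"},
}
datum = failed_step_datum(sample_event)
```

The returned dict has the shape that the CloudWatch `PutMetricData` call accepts in its `MetricData` list. An alarm on the sum of that metric over a one-hour period with a threshold of 5 would then cover one step name. Since alarms are defined per metric and dimension combination, each step name would likely need its own alarm.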
  • y

    Yoni Augarten

    04/17/2022, 11:12 AM
    I'm about to release a new lakeFS version! Are there any objections? (waiting for @Tal Sofer to merge #3223)
    🤘 1
    1 reply · 2 participants
t

Tal Sofer

04/17/2022, 11:22 AM
done
⭐ 1