#help

Giuseppe Barbieri

02/07/2024, 11:32 AM
I'm using the all-in-one example available here. However, when Docker is down, or after a reboot, the repos created in lakeFS are not persistent, while the data in the MinIO buckets is still there; so if I try to re-create a lakeFS repo using the same bucket I get an error:
```
failed to create repository: found lakeFS objects in the storage namespace(s3://example) key(_lakefs/dummy): storage namespace already in use
```

Niro

02/07/2024, 12:21 PM
The lakeFS target in the Docker Compose file uses the local database type for the metadata store. If you want persistence you need to configure a persistent volume for that target.
If you believe there should be a different default behavior, please feel free to open a GitHub issue.

Giuseppe Barbieri

02/07/2024, 1:40 PM
ok, so shall I just destroy and re-create MinIO buckets on the fly?

Niro

02/07/2024, 1:55 PM
Do you want your setup to be persistent or not?

Giuseppe Barbieri

02/07/2024, 1:55 PM
let's say, I think this is overkill for my case, but if it's gonna be quick (and dirty, I don't care) then I can do it
to give you some background, I'm gonna need to play around with webhooks and so on locally, before moving everything into production

Niro

02/07/2024, 2:09 PM
I understand - in that case, it's better if your environment is persistent. I would recommend creating a volume for the lakeFS metadata and setting `LAKEFS_DATABASE_LOCAL_PATH` to that volume.
You can also reuse the `data` volume for that matter.
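A minimal sketch of what that could look like in the Compose file from lakeFS-samples. The service name, volume name, and mount path here are assumptions; check them against the actual docker-compose.yml:

```yaml
# Hypothetical excerpt -- names must match the actual compose file
services:
  lakefs:
    environment:
      # local (embedded KV) metadata store, persisted on a named volume
      - LAKEFS_DATABASE_TYPE=local
      - LAKEFS_DATABASE_LOCAL_PATH=/home/lakefs/metadata
    volumes:
      - lakefs-metadata:/home/lakefs/metadata

volumes:
  lakefs-metadata:
```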

Giuseppe Barbieri

02/07/2024, 2:10 PM
> In that case I would recommend creating a volume for the lakeFS metadata and setting the `LAKEFS_DATABASE_LOCAL_PATH` to that volume
do you mean in MinIO?

Niro

02/07/2024, 2:10 PM
No - in the Docker Compose file

Giuseppe Barbieri

02/07/2024, 2:11 PM
ok, so `data` is already there?

Niro

02/07/2024, 2:11 PM
yes - it is currently used by the MinIO container

Giuseppe Barbieri

02/07/2024, 2:11 PM
I just type `s3://data` or something like that?

Niro

02/07/2024, 2:13 PM
Are you using the lakeFS-samples repo because you are using the examples and data there for your POC, or just as a quick way to set up a lakeFS instance?

Giuseppe Barbieri

02/07/2024, 2:13 PM
the latter

Niro

02/07/2024, 2:14 PM
In that case I would recommend going with the quickstart manual
It will provide you with a persistent lakeFS environment using local database and block adapter
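For reference, a minimal `config.yaml` for a fully local, persistent (non-quickstart) setup might look like the sketch below. The paths are illustrative, and a real config also needs an `auth.encrypt.secret_key`:

```yaml
# Illustrative local configuration -- adjust paths to taste
database:
  type: local
  local:
    path: "~/lakefs/metadata"   # embedded KV metadata store

blockstore:
  type: local
  local:
    path: "~/lakefs/data"       # object data on the local filesystem
```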

Giuseppe Barbieri

02/13/2024, 10:48 AM
Hello again Niro šŸ™‚ so, I'm trying to follow your advice, I:
ā€¢ downloaded the latest lakeFS binary for Linux x64
ā€¢ installed MinIO
ā€¢ installed and configured Postgres
ā€¢ added a bucket (`bucket0`) and a user (`username`/`password`) in MinIO
ā€¢ created `config.yaml` in the extracted lakeFS binary folder
and now I'm running
```
elect@5800x:~/Documents/lakeFS_1.10.0_Linux_x86_64$ ./lakefs run --quickstart

WARNING!

Using quickstart parameters configuration. This is suitable only for testing! It is NOT SUPPORTED for production.

FATAL: quickstart mode can only run with local settings
```
if I omit `--quickstart`, this is my output and config file

Niro

02/13/2024, 10:51 AM
Hi @Giuseppe Barbieri, quickstart mode does not require MinIO or Postgres. It relies on your local filesystem for both the data and the metadata store. You don't need to provide a configuration file; simply run `lakefs run --quickstart`.
Do you require Postgres and MinIO for your POC?

Giuseppe Barbieri

02/13/2024, 10:52 AM
thinking about it, actually no, for the moment..

Niro

02/13/2024, 10:53 AM
Great, so it should be just as simple
šŸ‘ 1
g

Giuseppe Barbieri

02/13/2024, 11:16 AM
just tried to upload, via the web interface, `_lakefs_actions/dataset.yml`:
```yaml
name: Dataset
description: This webhook ensures that a DATASET.json is present
on:
  pre-commit:
    branches:
      - "*"
hooks:
  - id: dataset_validator
    type: webhook
    description: Validate DATASET.json
    properties:
      url: "http://0.0.0.0:8080/webhooks/format"
```
but the commit fails:
```
pre-commit hook aborted, run id '5dkkpg2pdakjvr75sb7g': 1 error occurred: * hook run id '0000_0000' failed on action 'Dataset' hook 'dataset_validator': webhook request failed (status code: 405)
```
my simple server still listens, nonetheless:
```
2024-02-13 121731.650 [eventLoopGroupProxy-4-1] TRACE io.ktor.routing.Routing - Trace for [webhooks, format]
/, segment:0 -> SUCCESS @ /
/webhooks, segment:1 -> SUCCESS @ /webhooks
/webhooks/format, segment:2 -> SUCCESS @ /webhooks/format
/webhooks/format/(method:GET), segment:2 -> FAILURE "Selector didn't match" @ /webhooks/format/(method:GET)
Matched routes:
  No results
Route resolve result:
  FAILURE "No matched subtrees found" @ /
```
this is the output from visiting http://0.0.0.0:8080/webhooks/format manually via browser instead:
```
2024-02-13 121836.135 [eventLoopGroupProxy-4-2] TRACE io.ktor.routing.Routing - Trace for [webhooks, format]
/, segment:0 -> SUCCESS @ /
/webhooks, segment:1 -> SUCCESS @ /webhooks
/webhooks/format, segment:2 -> SUCCESS @ /webhooks/format
/webhooks/format/(method:GET), segment:2 -> SUCCESS @ /webhooks/format/(method:GET)
Matched routes:
  "" -> "webhooks" -> "format" -> "(method:GET)"
Route resolve result:
  SUCCESS @ /webhooks/format/(method:GET)
```

Niro

02/13/2024, 12:39 PM
@Giuseppe Barbieri seems like you're making progress, however your webhook server is not configured correctly:
The HTTP 405 status code indicates that the server received your request, but the resource you are requesting doesn't support the request method. This can happen if you're using an incorrect method or the server is configured to disallow that method.

Giuseppe Barbieri

02/13/2024, 1:51 PM
indeed, I was answering on `GET`; I'm answering on `POST` now and it seems fine, commits get through

Niro

02/13/2024, 1:52 PM
Great to hear, hope you have smooth sailing from here
šŸ‘ 1
g

Giuseppe Barbieri

02/13/2024, 1:57 PM
thanks, can I use the lakeFS API to connect to a lakeFS instance running in quickstart mode?

Ariel Shaqed (Scolnicov)

02/13/2024, 2:11 PM
Hi Giuseppe, I'm not sure what the flow / architecture is -- where is the server, where is the client, and what kind of client is it?
If the question is how to connect to a locally-running lakefs server in quickstart mode, then it will be accessible on loopback on whatever port you're using. lakectl will happily connect there, on localhost:8000 or 127.0.0.1:8000 (":8000" should be whatever port lakefs is listening on). As long as it's local traffic, it should just work.
šŸ‘ 1
g

Giuseppe Barbieri

02/13/2024, 2:21 PM
Hey Ariel, at the moment it's just this (Ktor):
```kotlin
fun main() {
    embeddedServer(Netty, port = 8080, host = "0.0.0.0", module = Application::module)
        .start(wait = true)
}

fun Application.module() {
    configureRouting()
}

fun Application.configureRouting() {
    routing {
        post("/webhooks/format") {
            call.respondText("ciao")
        }
    }
}
```
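Once the `post("/webhooks/format")` handler above works, it can do real validation instead of always answering "ciao". A minimal sketch of the decision logic the `dataset_validator` hook could use -- the helper name and the rule are hypothetical; what matters is that lakeFS treats any non-2xx response as a hook failure that aborts the commit:

```kotlin
// Hypothetical helper: succeed (HTTP 200) only when a DATASET.json is
// present among the paths being validated; returning any non-2xx status
// from the webhook makes lakeFS abort the pre-commit.
fun hookStatus(paths: List<String>): Int =
    if (paths.any { it == "DATASET.json" || it.endsWith("/DATASET.json") }) 200 else 400
```

The handler would then call something like `call.respond(HttpStatusCode.fromValue(hookStatus(paths)))` after extracting the relevant paths from the webhook request body.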

Ariel Shaqed (Scolnicov)

02/13/2024, 2:34 PM
Could you provide more context for which lakeFS is connecting to which lakeFS, in this question:
> thanks, can I use the LakeFS to connect to a LakeFS instance running in quickstart mode?
I understand that you have a lakeFS running locally in quickstart mode. I understand that you have successfully connected it to your hooks server. What is the second "lakeFS"? Are you asking whether and how the hooks server can access lakeFS?

Giuseppe Barbieri

02/13/2024, 2:38 PM
sorry Ariel, I forgot `API` in the original question
> thanks, can I use the lakeFS API to connect to a lakeFS instance running in quickstart mode? (edited)

Ariel Shaqed (Scolnicov)

02/13/2024, 2:58 PM
Oh right, sure. It's a full lakeFS server. You should be able to reach it on localhost on the port it listens on. If it's 8000, your code should connect to something like http://localhost:8000/api/v1.
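To illustrate, a small sketch in the same Kotlin as above. The helper is hypothetical; it just assembles the base URL a client would target, assuming the default port 8000 (authenticated endpoints additionally need the access key/secret that quickstart prints at startup, sent as basic auth):

```kotlin
import java.net.URI

// Hypothetical helper: base URI of a locally running lakeFS API.
// /api/v1 is the versioned prefix the lakeFS OpenAPI endpoints live under.
fun lakefsApiBase(host: String = "localhost", port: Int = 8000): URI =
    URI.create("http://$host:$port/api/v1")

// With the server running, an HTTP client (e.g. java.net.http.HttpClient)
// would issue requests against lakefsApiBase() plus the endpoint path.
```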
šŸ‘ 1
2 Views