Does anybody know how to skip or mimic the /app/wa...
# help
u
Does anybody know how to skip or mimic the /app/wait-for function in the entrypoint of the docker file? I have a situation where the container keeps exiting because it receives an error at this point.
u
Hi @Walter Johnson, can you please provide more information about your situation? The wait-for function waits until postgres is up. It runs this script. You could skip it by editing the docker-compose file and removing "/app/wait-for", "postgres:5432", "--", but you will probably experience some other error.
u
I am using Terraform, and my postgres instance is set up beforehand, so I am certain it is there before I need lakeFS.
u
I have attempted to remove that and just use the /app/start function, but it doesn't like that at all. So I am wondering if there is a way to mimic that output in place of the /app/wait-for postgres:5432 -- command.
u
It is blowing up right here: {wait-for} /bin/sh /app/wait-for localhost:5432 -- /app/lakefs run. It starts for a second and then that function times out. I have confirmed that pg is running and can be connected to on port 5432 with user lakefs and password lakefs. Any ideas?
u
How did you confirm that?
u
I have another container running that I am able to connect to my pg instance with.
u
From that container's CLI I was able to log in.
u
can you please send here the connection string you are using to connect?
u
what is Opostgres ?
u
Please dont let that be it....😄
u
😅
u
It is. The wait-for just waits for port 5432 on domain postgres
u
because lakefs will later try to connect to it
u
The wait-for fails because there is no open port on that domain
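
For readers who want to see the idea in code, here is a minimal sketch of what a wait-for style helper does: it probes HOST:PORT once per second until the probe succeeds or a timeout expires. This is not the actual /app/wait-for script; the function names, the default timeout, and the PROBE_CMD override are assumptions made for illustration only.

```shell
#!/bin/sh
# Hypothetical sketch of a wait-for style helper; NOT the real /app/wait-for.

# probe HOST PORT: succeed if a TCP connection can be opened.
# PROBE_CMD is an assumption of this sketch so the check can be swapped out.
probe() {
  ${PROBE_CMD:-nc -z} "$1" "$2" >/dev/null 2>&1
}

# wait_for HOST:PORT [TIMEOUT_SECONDS]
# Returns 0 as soon as the probe succeeds, 1 once the timeout is reached.
wait_for() {
  host=${1%:*}    # text before the last colon
  port=${1##*:}   # text after the last colon
  timeout=${2:-15}
  elapsed=0
  while [ "$elapsed" -lt "$timeout" ]; do
    if probe "$host" "$port"; then
      return 0
    fi
    sleep 1
    elapsed=$((elapsed + 1))
  done
  echo "timed out waiting for $host:$port" >&2
  return 1
}
```

With this sketch, `wait_for postgres:5432 15 && /app/lakefs run` would start lakeFS only once the port answers. Setting PROBE_CMD=true makes the probe always succeed, which mimics a positive response, but lakeFS itself would still have to reach the database afterwards.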
u
My container still shut down.
u
If you'd like to use 0postgres, you should edit the run command and the properties as well:
entrypoint: ["/app/wait-for", "postgres:5432", "--", "/app/lakefs", "run"] -> entrypoint: ["/app/wait-for", "0postgres:5432", "--", "/app/lakefs", "run"]

- LAKEFS_DATABASE_CONNECTION_STRING=postgres://lakefs:lakefs@postgres/postgres?sslmode=disable -> - LAKEFS_DATABASE_CONNECTION_STRING=postgres://lakefs:lakefs@0postgres/call-center?sslmode=disable
u
nc -z postgres 5432 returns nc: getaddrinfo: Try again
u
The container starts, and I can go in there and run the same function that wait-for is running, and that is the result.
u
nc -z localhost 5432 just returns to the next line with no output
u
Sorry, but I am not following. Which container?
u
The lakeFS container.
u
The operation times out.
u
Can you please share your docker-compose file ?
u
From the lakefs container you could run wait-for directly
u
/app/wait-for postgres:5432
u
I am using Terraform, not docker compose, though it is very similar. It just pulls the latest image from treeverse/lakefs and runs docker run on that. Here is the relevant code.
resource "docker_container" "lakeFS" {
  image = docker_image.lakeFS.name
  name  = "LakeFS"
  env = [
    "LAKEFS_AUTH_ENCRYPT_SECRET_KEY=secret",
    "LAKEFS_DATABASE_TYPE:postgres",
    "LAKEFS_DATABASE_POSTGRES_CONNECTION_STRING=postgres://lakefs:lakefs@postgres:5432/call-center?sslmode=disable",
    "LAKEFS_BLOCKSTORE_TYPE=local",
    "LAKEFS_BLOCKSTORE_LOCAL_PATH=/home/lakefs",
    "LAKEFS_GATEWAYS_S3_DOMAIN_NAME=s3.local.lakefs.io:8000",
    "LAKEFS_LOGGING_LEVEL=INFO",
    "LAKEFS_COMMITTED_LOCAL_CACHE_DIR=/home/lakefs/.local_tier",
  ]
  ports {
    internal = 8000
    external = 8000
  }
  hostname = "localhost"

  logs     = true
  must_run = true
  attach   = false

  entrypoint = ["/app/wait-for", "postgres:5432", "--", "/app/lakefs", "run"]
}
u
wait-for times out.
u
try
/app/wait-for 0postgres:5432
u
I switched that to postgres. It was a type.
u
typo
u
Oh ok
u
If it times out it means that there is a connectivity problem
u
I assume there is a problem with the domain name
u
I have to assume lakeFS can't make any outside connections, because I have verified that I can remotely connect to my pg instance.
u
How could I test that?
u
That's a good question. The lakeFS image is quite limited.
u
That is why I am wondering if there is any way to mimic a positive response from /app/wait-for in the entrypoint
u
You could run nc postgres 5432
u
If it returns, it didn't connect
u
So could I just return a 1 or something at that point, I tried miserably to attempt something like that but I am not the best bash coder.
u
I would start by running a different image (instead of lakeFS) and trying to connect with that image
u
Or is that the case that is already working for you?
u
You could use nc with the -v flag for more information
u
~ $ nc -zv postgres 5431
nc: connect to postgres port 5431 (tcp) failed: Connection refused

~ $ nc -zv postgres 5432
Connection to postgres 5432 port [tcp/postgresql] succeeded!
~ $
u
nc: getaddrinfo: Try again
u
Seems like it's failing to get the address
u
I think you should try running the container with a different image that has a postgres client and stronger debugging capabilities
u
~ $ nc -zv postgres 5432
nc: getaddrinfo: Try again
~ $ nc -zv localhost 5432
nc: connect to localhost port 5432 (tcp) failed: Connection refused
nc: connect to localhost port 5432 (tcp) failed: Connection refused
nc: connect to localhost port 5432 (tcp) failed: Address not available
~ $ nc -zv 0.0.0.0 5432
nc: connect to 0.0.0.0 port 5432 (tcp) failed: Connection refused
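
The nc results seen in this thread point at different layers: a getaddrinfo error means the name never resolved, while "Connection refused" means the address was reached but nothing is listening on that port. The sketch below is a hypothetical helper (classify_nc_output is not part of lakeFS or nc) that sorts a line of nc output into those cases, matching only on message fragments that appear in this thread plus a generic "timed out" pattern.

```shell
#!/bin/sh
# Hypothetical helper: map one line of `nc -zv` output to a failure class.
# The patterns are based on the messages quoted in this thread; other nc
# builds may word their errors differently.
classify_nc_output() {
  case "$1" in
    *getaddrinfo*)           echo "dns" ;;      # host name did not resolve
    *"Connection refused"*)  echo "refused" ;;  # host reached, port closed
    *"timed out"*)           echo "timeout" ;;  # no answer at all
    *succeeded*)             echo "ok" ;;       # port is open
    *)                       echo "unknown" ;;
  esac
}
```

For example, `classify_nc_output "$(nc -zv postgres 5432 2>&1)"` would report dns for the failure shown here, which suggests fixing name resolution before looking at ports or credentials.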
u
My problem is right there in front of me.
u
Is there no way to skip /app/wait-for?
u
You could skip it
u
but then lakeFS will fail to connect
u
wait-for is just there to protect you from running before postgres is available from the lakeFS machine.
u
IIUC you already skipped it and got a different error
u
What if I already have Postgres running and I am certain of it?
u
I understand, I am also certain of that. But lakeFS doesn’t manage to connect to it for some reason
u
When the instance starts up, is it impossible to skip or mimic the outcome of wait-for?
u
It is because that nc function fails. What does lakefs actually use to connect to the pg instance?
u
psql?
u
OK, I might have gotten you wrong at the start. I thought you had already tried without wait-for and it failed. Try changing
entrypoint = ["/app/wait-for", "postgres:5432", "--", "/app/lakefs", "run"]
to
entrypoint = ["/app/lakefs", "run"]
u
I did but will try it again just for my own sanity.
u
What happened when you did it?
u
Because that is the way to skip wait-for
u
By the way, what machine are you running your docker on?
u
The container exits immediately and I can't even SSH into it for a few seconds.
u
I am running on Windows in WSL2 Ubuntu
u
But all my containers are running on windows
u
So lakeFS probably fails when trying to connect to the database (nothing to do with wait-for now)
u
So this is basically some port issue. I can start lakeFS fine with docker compose.
u
Port or domain resolving
u
Can /app/lakefs run be more verbose?
u
Are you seeing any errors?
u
or logs
u
Don’t think the verbosity is the problem.
u
I am not seeing errors. I start the instance, look at the nc function in there, and I need to get my hostname to resolve to postgres. That is what will make lakeFS happy, and I want to make lakeFS happy. 😁 Time to go further down the rabbit hole.
u
If you can provide here the configurations you are using to run both lakeFS and postgres I will try to run it on my end and see if I can reproduce it.
u
Otherwise we can set a meeting tomorrow and try to debug it on your end.
u
All of your help was super enlightening. I think I will track down the issue shortly.
u
I will definitely reconvene with you and let you know my progress. Are you familiar with Terraform?
u
OK, I am a bit familiar with Terraform. Waiting to hear about your progress. Good luck
u
I am off to bed a sad little man. I did not solve my issue, but I did get one more piece of valuable knowledge. I was able to get into the lakeFS server and run nc -zv {IP-OF-MY-PG-INSTANCE} 5432, and it successfully connected. So I put that in the wait-for function, and the container exits with errors, which makes me believe there is a problem with the connection string, but I can't seem to locate it yet. Good night.
"LAKEFS_DATABASE_POSTGRES_CONNECTION_STRING=postgresql://lakefs:lakefs@172.17.0.3:5432/postgres?sslmode=disable"
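
A connection by IP that works while the name postgres does not resolve is consistent with both containers sitting on Docker's default bridge network, where containers cannot look each other up by name; user-defined bridge networks do provide that lookup. Whether this was the cause here is not confirmed in the thread. The sketch below shows one way to test the idea by hand. The network name lakefs-net and the database container name postgres are assumptions; LakeFS is the container name from the Terraform snippet. The run wrapper with DRY_RUN is only there so the commands can be previewed without touching Docker.

```shell
#!/bin/sh
# Sketch only: NETWORK and PG_CONTAINER are assumed names, not from the thread.
NETWORK=${NETWORK:-lakefs-net}
PG_CONTAINER=${PG_CONTAINER:-postgres}
LAKEFS_CONTAINER=${LAKEFS_CONTAINER:-LakeFS}

# run CMD...: execute the command, or only print it when DRY_RUN is set.
run() {
  if [ -n "${DRY_RUN:-}" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Attach both containers to one user-defined bridge network so that
# Docker's embedded DNS can resolve them by container name.
share_network() {
  run docker network create "$NETWORK"
  run docker network connect "$NETWORK" "$PG_CONTAINER"
  run docker network connect "$NETWORK" "$LAKEFS_CONTAINER"
}
```

Running `DRY_RUN=1 share_network` only prints the three docker commands. In a Terraform-managed setup the network attachment would more naturally live in the Terraform configuration itself, so treat this as a quick manual experiment rather than a permanent fix.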
u
@Guy Hardonag @Walter Johnson Should the variable name be LAKEFS_DATABASE_CONNECTION_STRING instead of LAKEFS_DATABASE_POSTGRES_CONNECTION_STRING (no POSTGRES in the name) unless using version 0.80?
u
That's right. For new installations with version >=v0.80.0, the env var containing POSTGRES should be used
u
I have gotten into the container for a few seconds by making /app/wait-for hang. While in there I ran /app/lakefs run and I got this response ->
Failed to open KV store                       error="unknown driver: "
u
Hi @Walter Johnson, nice progress. Try adding LAKEFS_DATABASE_TYPE=postgres above LAKEFS_DATABASE_POSTGRES_CONNECTION_STRING
u
One step closer to nirvana
FATAL  [2022-09-15T19:49:12Z]cmd/run.go:123 cmd/lakefs/cmd.glob..func8 Failed to open KV store                       error="connect failed: failed to connect to `host=/tmp user=lakefs database=`: dial error (dial unix /tmp/.s.PGSQL.5432: connect: no such file or directory)"
u
SUCCESS!
u
It was a combination of typos and configuration errors.
u
You get three or four of those put together and it's a nightmare.
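
Typos of this kind can be caught mechanically before the container starts. As one example, the Terraform env list shared earlier contains the entry "LAKEFS_DATABASE_TYPE:postgres", written with a colon instead of an equals sign, which may be related to the "unknown driver" error. The sketch below is a hypothetical pre-flight check (valid_env_entry and check_entries are illustrative names, not lakeFS tooling) that flags any entry not shaped like NAME=value.

```shell
#!/bin/sh
# Hypothetical pre-flight check for container env entries.

# valid_env_entry ENTRY: succeed only if ENTRY looks like NAME=value,
# where NAME contains only letters, digits and underscores.
valid_env_entry() {
  case "$1" in
    [A-Za-z_]*=*)
      name=${1%%=*}   # everything before the first '='
      case "$name" in
        *[!A-Za-z0-9_]*) return 1 ;;
        *) return 0 ;;
      esac
      ;;
    *) return 1 ;;
  esac
}

# check_entries ENTRY...: report every suspicious entry on stderr and
# return non-zero if at least one was found.
check_entries() {
  rc=0
  for entry in "$@"; do
    if ! valid_env_entry "$entry"; then
      echo "suspicious env entry: $entry" >&2
      rc=1
    fi
  done
  return "$rc"
}
```

Calling `check_entries "LAKEFS_DATABASE_TYPE:postgres" "LAKEFS_BLOCKSTORE_TYPE=local"` flags the first entry and accepts the second. It checks shape only, so a wrong variable name or a wrong host inside the connection string would still get through.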
u
But throughout it all I gained a little better understanding of lakeFS
u
Happy to hear 😄 😄 Hope it goes smoothly from now on. We are here if you need any help!
u
This inspired me to create my first npm package https://www.npmjs.com/package/lakefs
u
Cool 😎 Will check it out. Thanks!