# help
e
Hey all! Recently spun up lakeFS for the first time using a Helm chart, with a self-hosted Postgres DB and Ceph blockstore, all local. Really cool so far, very excited to play around with it some more. One quick question: when spinning up the pod for the first time, I noticed it would halt after connecting to the DB. After adding an HTTP/HTTPS proxy to allow it to communicate with the internet, it spun up fine. Are there some behind-the-scenes calls being made, and if so, what exactly are they?
y
Hey @Erin Aho, lakeFS should be able to work without an internet connection. Let me check that and update here shortly.
@Erin Aho I was able to spin lakeFS up with no internet access. Can you share the command you are using to run lakeFS, the configuration files (or environment variables) and the output printed by the command?
o
@Yoni Augarten @Erin Aho I believe this might be related to the use of the S3 protocol outside of AWS. If I remember correctly, lakeFS will try to figure out if the S3 endpoint is actually AWS S3 by calling an AWS API. That being said, I'd expect that even with S3-compatible storage and no internet access, it shouldn't halt, but rather give up after 30 seconds (or some other timeout?)
e
> If I remember correctly, lakeFS will try to figure out if the S3 endpoint is actually AWS S3 by calling an AWS API.
When using the S3 block store, it definitely does try to connect to the AWS S3 API; I can see those requests timing out in the logs. However, on my end, the hanging issue still occurs with a local block store and no proxy (I tested that to see if it was the culprit). I'm recreating the initial issue now so I can share the logs/config.
y
Thank you!
e
Debugging details: using the lakeFS Helm chart, version 1.0.3. Values file:
lakefs:
  replicaCount: 1
  secrets:
    authEncryptSecretKey: <injected secret>
    databaseConnectionString: <injected secret>
  lakefsConfig: |
    database:
      type: "postgres"
    blockstore:
      type: local
Logs before hang:
time="2023-11-23T14:06:37Z" level=info msg="Configuration file" func=github.com/treeverse/lakefs/cmd/lakefs/cmd.initConfig file="/build/cmd/lakefs/cmd/root.go:109" fields.file=/etc/lakefs/config.yaml file="/build/cmd/lakefs/cmd/root.go:109" phase=startup
time="2023-11-23T14:06:37Z" level=info msg="Config loaded" func=cmd/lakefs/cmd.initConfig file="cmd/root.go:151" fields.file=/etc/lakefs/config.yaml file="cmd/root.go:151" phase=startup
time="2023-11-23T14:06:37Z" level=info msg=Config func=cmd/lakefs/cmd.initConfig file=...
time="2023-11-23T14:06:37Z" level=info msg="lakeFS run" func=cmd/lakefs/cmd.glob..func8 file="cmd/run.go:91" version=1.2.0
time="2023-11-23T14:06:37Z" level=info msg="KV valid" func=pkg/kv.ValidateSchemaVersion file="build/pkg/kv/migration.go:68" version=4
time="2023-11-23T14:06:37Z" level=info msg="initialized Auth service" func=pkg/auth.NewAuthService file="build/pkg/auth/service.go:188" service=auth_service
time="2023-11-23T14:06:37Z" level=info msg="initialize blockstore adapter" func=pkg/block/factory.BuildBlockAdapter file="build/pkg/block/factory/build.go:32" type=local
time="2023-11-23T14:06:37Z" level=info msg="initialized blockstore adapter" func=pkg/block/factory.buildLocalAdapter file="build/pkg/block/factory/build.go:79" path=/home/lakefs/lakefs/data/block type=local
Works fine upon adding:
extraEnvVars:
    - name: HTTP_PROXY
      value: <proxy>
    - name: HTTPS_PROXY
      value: <proxy>
    - name: NO_PROXY
      value: <no proxy list>
If there aren't any behind-the-scenes networking calls being made, my guess is it's related to the connection to Postgres. Will do some more digging on my end.
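For anyone wanting to test the Postgres theory from inside the cluster, here is a minimal Python sketch of a plain TCP reachability check. It is not anything lakeFS does internally, and `can_connect` is just a helper name made up for this example. It only shows whether the host and port from the connection string can be reached directly within a timeout, with no proxy involved.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

Calling `can_connect(<postgres host>, 5432)` from a pod in the same namespace (5432 being the default Postgres port, adjust if yours differs) should return True if the database is reachable without the proxy. A False result means either the name did not resolve or the port was unreachable.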
y
If internet access is needed to connect to your postgres instance, then of course lakeFS requires internet access. Is that the case?
e
Shouldn't be, the connection string is for a svc.cluster.local address inside the cluster.
In any case, issue is probably config on our side, mostly just wanted to check there weren't any unexpected calls being made, thanks!
o
@Erin Aho another possibility is DNS resolution, i.e. even if svc.cluster.local resolves to a local IP, the first DNS server used to resolve it is a public one? (just a wild guess, of course)
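A rough way to check that guess is to time how long the name takes to resolve from inside a pod. The Python sketch below uses the system resolver, and `time_resolution` is a hypothetical helper name for this example. One caveat: lakeFS is a Go binary and may resolve names through a different code path, so treat the result as indicative only.

```python
import socket
import time


def time_resolution(hostname: str) -> tuple[list[str], float]:
    """Resolve `hostname` with the system resolver; return (addresses, seconds taken)."""
    start = time.monotonic()
    try:
        infos = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        # Resolution failed; report no addresses but still return the elapsed time.
        infos = []
    elapsed = time.monotonic() - start
    # Each result is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    addresses = sorted({info[4][0] for info in infos})
    return addresses, elapsed
```

If resolving the Postgres service name takes several seconds, or only succeeds once the proxy variables are set, that would point toward the resolver trying an unreachable upstream server before the cluster DNS.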
e
Could well be something along those lines, time to dig out the old DNSUtils and do some debugging 🙂
o
As the ancient Haiku goes: