Hi, I am trying to setup the lakefs server on GKE ...
# help
k
Hi, I am trying to set up the lakeFS server on GKE with Terraform, but I'm running into an issue: it is not able to find the index.yaml file. For reference, this is the relevant HCL script (I have removed the GKE and Cloud SQL modules). I am sure I have specified either the chart or the repository argument wrongly, but I'm not sure what the right value is.
provider "helm" {
  kubernetes {
    host                   = google_container_cluster.primary.endpoint
    token                  = data.google_client_config.current.access_token
    client_certificate     = base64decode(google_container_cluster.primary.master_auth[0].client_certificate)
    client_key             = base64decode(google_container_cluster.primary.master_auth[0].client_key)
    cluster_ca_certificate = base64decode(google_container_cluster.primary.master_auth[0].cluster_ca_certificate)
  }
}

resource "helm_release" "my-lakefs" {
  name       = "my-lakefs"
  chart      = "charts/lakefs"
  repository = "https://github.com/treeverse/lakeFS"
}
a
Hi @Kartikey Mullick, Sorry to hear you're running into this problem. The repository argument seems suspicious: the chart lives in https://github.com/treeverse/charts, and is not part of the lakeFS repository itself. We'll get back to you on this, of course. In the meantime could you provide the full messages from Terraform and/or Helm? (Don't forget to scrub confidential data!)
k
Hi @Ariel Shaqed (Scolnicov), sure, here are the complete error logs from Terraform.
1. When using repository = "https://github.com/treeverse/lakeFS":
╷
│ Error: could not download chart: looks like "https://github.com/treeverse/lakeFS" is not a valid chart repository or cannot be reached: failed to fetch https://github.com/treeverse/lakeFS/index.yaml : 404 Not Found
│
│   with helm_release.my-lakefs,
│   on main.tf line 149, in resource "helm_release" "my-lakefs":
│  149: resource "helm_release" "my-lakefs" {
│
2. When using repository = "https://github.com/treeverse/charts":
╷
│ Error: could not download chart: looks like "https://github.com/treeverse/charts" is not a valid chart repository or cannot be reached: failed to fetch https://github.com/treeverse/charts/index.yaml : 404 Not Found
│
│   with helm_release.my-lakefs,
│   on main.tf line 149, in resource "helm_release" "my-lakefs":
│  149: resource "helm_release" "my-lakefs" {
│
╵
In both cases, I have set chart = "charts/lakefs".
a
I did some digging around, but please bear in mind that I am not an expert on Helm. We publish the chart on artifacthub. The repo is named https://charts.lakefs.io/, and the chart there is called "lakefs". I hope this will be enough to get you unstuck 🙂
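For reference, in Terraform that would presumably translate to something like the sketch below. It only combines the repo and chart names above with the release name from your script; treat it as a starting point rather than a verified configuration.

```hcl
resource "helm_release" "my-lakefs" {
  # Release name copied from the script earlier in this thread.
  name = "my-lakefs"

  # A Helm repository URL must serve an index.yaml at its root.
  # A GitHub source repository page does not, hence the 404 above.
  repository = "https://charts.lakefs.io"

  # Chart name as listed in that repository (no "charts/" prefix).
  chart = "lakefs"
}
```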
k
Thanks a lot, that worked!
a
Glad to hear, thanks for your patience!
k
Hi @Ariel Shaqed (Scolnicov), I am getting additional errors when trying to pass my GCP-specific arguments to the image, as specified in the deploy with GKE docs. Here is my script:
resource "helm_release" "my-lakefs" {
  name       = "my-lakefs"
  chart      = "lakefs"
  repository = "https://charts.lakefs.io"

  set {
    name  = "secrets.databaseConnectionString"
    value = "postgresql+psycopg2://${var.db_user}:${var.db_password}@/${var.db_name}?host=/cloudsql/${var.project_id}:${var.location}:${var.db_instance}"
  }
  set {
    name  = "secrets.authEncryptSecretKey"
    value = "uuid_code(not shown)"
  }
  set {
    name  = "lakefsConfig.blockstore.type"
    value = "gs"
  }
  set {
    name  = "lakefsConfig.gs.credentials_file"
    value = "/secrets/lakefs-service-account.json"
  }
  
  depends_on = [ 
    google_container_cluster.primary
   ]
}
Here are the error logs from terraform:
helm_release.my-lakefs: Modifying... [id=my-lakefs]
╷
│ Error: template: lakefs/templates/deployment.yaml:93:12: executing "lakefs/templates/deployment.yaml" at <include "lakefs.s3proxyContainer" .>: error calling include: template: lakefs/templates/_proxy_container.tpl:3:37: executing "lakefs.s3proxyContainer" at <fromYaml>: wrong type for value; expected string; got map[string]interface {}
│
│   with helm_release.my-lakefs,
│   on main.tf line 149, in resource "helm_release" "my-lakefs":
│  149: resource "helm_release" "my-lakefs" {
│
╵
I apologize if I am missing something obvious; I am not an expert in deployment.
a
Hi @Kartikey Mullick, I'll dig around, maybe consult one of our Helm experts tomorrow and respond.
Hi @Kartikey Mullick! lakefsConfig needs to be a string value. You can see this in the example: writing lakefsConfig: | starts a YAML string. Can you perhaps try:
resource "helm_release" "my-lakefs" {
  name       = "my-lakefs"
  chart      = "lakefs"
  repository = "https://charts.lakefs.io"

  set {
    name  = "secrets.databaseConnectionString"
    value = "postgresql+psycopg2://${var.db_user}:${var.db_password}@/${var.db_name}?host=/cloudsql/${var.project_id}:${var.location}:${var.db_instance}"
  }
  set {
    name  = "secrets.authEncryptSecretKey"
    value = "uuid_code(not shown)"
  }
  set {
    name  = "lakefsConfig"
    value = "blockstore:\n  type: gs\n  gs:\n    credentials_file: /secrets/lakefs-service-account.json"
  }
  
  depends_on = [ 
    google_container_cluster.primary
   ]
}
This is just me typing some YAML and HCL, so I apologise for any typing errors. Also, please note that I've gone ahead and moved credentials_file to be under blockstore.gs, whereas in your case it was under gs.
If you have a more complex configuration and don't mind some HCL, you might prefer to create a lakefsConfig object and then pass it like this:
{
...
  set {
    name  = "lakefsConfig"
    value = yamlencode(some.object.lakefsConfig)
  }
...
}
This uses yamlencode to create a good string value from an object.
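To make that a bit more concrete, here is a minimal sketch of the yamlencode approach. The local name lakefs_config is made up for illustration, and the values simply mirror the settings from your script; the set block syntax assumes the same Helm provider version used elsewhere in this thread.

```hcl
# Hypothetical local holding the lakeFS configuration as an HCL object.
# The name "lakefs_config" is illustrative only.
locals {
  lakefs_config = {
    blockstore = {
      type = "gs"
      gs = {
        # Path copied from the script above; adjust to wherever the key is mounted.
        credentials_file = "/secrets/lakefs-service-account.json"
      }
    }
  }
}

resource "helm_release" "my-lakefs" {
  name       = "my-lakefs"
  chart      = "lakefs"
  repository = "https://charts.lakefs.io"

  set {
    # The chart expects lakefsConfig as one YAML string, not a nested map,
    # so the object is serialized before it is passed in.
    name  = "lakefsConfig"
    value = yamlencode(local.lakefs_config)
  }
}
```

One caveat I'm fairly sure of: Helm's set syntax treats commas and braces in values specially, so if the encoded YAML ever contains them, passing the configuration through the values argument of helm_release may be the safer route.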
k
Hi @Ariel Shaqed (Scolnicov), thanks a lot for the support!
• I tried setting lakefsConfig as a string like you mentioned and ran into the error in the first screenshot.
• Then I destroyed the entire infra and recreated it (thinking it was an issue with updating the Helm chart in place) and got the same error, as shown in the second screenshot. I have also run the helm command to check the error, and this is the output:
helm history my-lakefs
REVISION        UPDATED                         STATUS  CHART           APP VERSION     DESCRIPTION
1               Tue Aug 15 12:25:14 2023        failed  lakefs-0.9.16   0.106.0         Release "my-lakefs" failed: context deadline exceeded
• I also increased the timeout to 20 minutes and had the same error. In case it is needed, I can also attach the entire main.tf file I use (no confidential data).
a
Hi @Kartikey Mullick, have you tried doing what the second screenshot suggests? In it, Terraform suggests that you will need to use the helm command to investigate further.
k
Yes, there seems to be an issue connecting the Postgres DB in Cloud SQL to the k8s cluster in GKE. I am looking into it, thank you!