Thread
#lakefs-for-beginners

    Farman Pirzada

    1 month ago
    howdy folks, I'm working on deploying lakeFS to our GCP platform using Google Cloud Run. It won't start, so I ran the image locally via Docker to check what the issue was:
    PORT=8080 && docker run -p 9090:${PORT} -e PORT=${PORT} 4239d2f8608b
    This is the error message I am seeing:
    time="2022-08-19T04:04:47Z" level=info msg="Config loaded" func=cmd/lakefs/cmd.initConfig file="cmd/root.go:103" fields.file=/home/lakefs/.lakefs.yaml file="cmd/root.go:103" phase=startup
    time="2022-08-19T04:04:47Z" level=fatal msg="Invalid config" func=cmd/lakefs/cmd.initConfig file="cmd/root.go:108" error="bad configuration: missing required keys: [auth.encrypt.secret_key blockstore.type]" fields.file=/home/lakefs/.lakefs.yaml file="cmd/root.go:108" phase=startup
    I understand what I'm missing:
    auth.encrypt.secret_key 
    blockstore.type
    However, how do I set these values? This is my Dockerfile:
    FROM treeverse/lakefs:latest
    I'm using Cloud Run, and my values.yaml has the values from the example:
    logging:
      format: json
      level: WARN
      output: "-"
    auth:
      encrypt:
        secret_key: "10a718b3f285d89c36e9864494cdd1507f3bc85b342df24736ea81f9a1134bcc"
    blockstore:
      type: gs
    If I run this as suggested in the docs as lakefs-config.yaml, I won't be able to proceed because I don't have a local SQL connection to create. I'm newish to Docker, so I appreciate your patience with me here; I can try to provide as much as possible, but I'm not sure how to get the .lakefs.yaml file to read the values I have in my values.yaml - if I'm reading the error message correctly.

    Damon C

    1 month ago
    Hi Farman, I haven’t used Cloud Run before, but my guess is that you either need to copy your values.yaml into your container:
    COPY values.yaml /home/lakefs/.lakefs.yaml
    or define your config as environment variables in GCP. As shown in the lakeFS docs, you need:
    LAKEFS_DATABASE_CONNECTION_STRING="[DATABASE_CONNECTION_STRING]"
    LAKEFS_AUTH_ENCRYPT_SECRET_KEY="[ENCRYPTION_SECRET_KEY]"
    LAKEFS_BLOCKSTORE_TYPE="gs"
    You will need to create a PostgreSQL instance in GCP as well.
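    Something like this should let you test the same env-var approach locally before going back to Cloud Run (an untested sketch on my part - every value below is a placeholder, and lakeFS listens on port 8000 by default unless you override the listen address):

```shell
# Hypothetical local test: pass the required config as LAKEFS_* env vars
# instead of a .lakefs.yaml file. All values here are placeholders.
docker run --rm -p 8000:8000 \
  -e LAKEFS_DATABASE_CONNECTION_STRING="postgres://user:pass@your-db-host:5432/postgres" \
  -e LAKEFS_AUTH_ENCRYPT_SECRET_KEY="some-random-secret" \
  -e LAKEFS_BLOCKSTORE_TYPE="gs" \
  treeverse/lakefs:latest run
```

    If that starts cleanly, the same three variables set in the Cloud Run service config should behave the same way.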

    Farman Pirzada

    1 month ago
    thanks @Damon C - for Postgres, I'm using Spanner
    COPY values.yaml /home/lakefs/.lakefs.yaml
    I'll try this!
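    (also, while iterating, I might just mount the file instead of rebuilding the image each time - an untested sketch, the paths and port are my guesses:)

```shell
# Hypothetical local test: mount the config file as .lakefs.yaml instead
# of baking it into the image with COPY. Paths and port are placeholders.
docker run --rm -p 8000:8000 \
  -v "$PWD/values.yaml:/home/lakefs/.lakefs.yaml" \
  treeverse/lakefs:latest run
```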

    Damon C

    1 month ago
    Good luck! Also, it'll be interesting to see if Spanner will work. Looks like it needs a PGAdapter(?).
    Applications can connect to a PostgreSQL interface-enabled Spanner database using native Spanner clients or PGAdapter

    Farman Pirzada

    1 month ago
    @Damon C I appreciate your help, and I know this may be an elementary question, but how do I get these values?
    LAKEFS_DATABASE_CONNECTION_STRING="[DATABASE_CONNECTION_STRING]"
    LAKEFS_AUTH_ENCRYPT_SECRET_KEY="[ENCRYPTION_SECRET_KEY]"
    Are they something I need to generate from somewhere with LakeFS or GCP - or is it a randomly generated string I can put in?

    Damon C

    1 month ago
    LAKEFS_DATABASE_CONNECTION_STRING
    is the connection string for your Postgres database. With Spanner, I’m not sure where that comes from, because it looks like you also have to run PGAdapter somewhere… With a regular Postgres installation, the lakeFS docs have a section on setting it up here: https://docs.lakefs.io/deploy/gcp.html#creating-the-database-on-gcp-sql
    LAKEFS_AUTH_ENCRYPT_SECRET_KEY
    will be a randomly generated string. So a UUID or something similar.
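    For example (just a sketch - the exact length and hex format here are my choice, not a lakeFS requirement; any hard-to-guess string should do):

```shell
# Generate 32 random bytes, hex-encoded, for LAKEFS_AUTH_ENCRYPT_SECRET_KEY.
SECRET_KEY=$(openssl rand -hex 32)
echo "${#SECRET_KEY}"  # prints 64 (hex characters)
```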

    Farman Pirzada

    1 month ago
    aha! that's what you meant, got it

    Damon C

    1 month ago
    (And a disclaimer, I haven’t used GCP much and only just started looking at lakefs recently…but also spent the past couple of days messing around with database authentication. 😄)

    Farman Pirzada

    1 month ago
    GCP sure is interesting, but what makes it more interesting is that within my company we're using Terraform behind the scenes, so some of the things you're mentioning will not play nicely with our internal provisioning tools

    Damon C

    1 month ago
    Ah yep, that can def make things tricky - I was running into something similar on the AWS side, e.g. I couldn’t build the connection string without exposing the password in the infrastructure template.

    Farman Pirzada

    1 month ago
    yeah!
    also, this is weird, but is this right?
    gcs_storage
    ---
    logging:
      format: json
      level: WARN
      output: "-"
    
    database:
      connection_string: "postgres://user:pass@lakefs.rds.amazonaws.com:5432/postgres"
    
    auth:
      encrypt:
        secret_key: "10a718b3f285d89c36e9864494cdd1507f3bc85b342df24736ea81f9a1134bcc"
    
    blockstore:
      type: gs
      gs:
        credentials_file: /secrets/lakefs-service-account.json
    this is the example on lakeFS's page, and the connection string says AWS
    I may actually check out #help for that because that looks like misinformation

    Damon C

    1 month ago
    Ah, probably just copy/paste heh. (I mean, in theory you could connect to an RDS DB on AWS from GCP…but looks like every example has that same connection_string).

    Farman Pirzada

    1 month ago
    haha right? I wonder where in the code I can see this

    Damon C

    1 month ago
    You’ll just need to replace
    lakefs.rds.amazonaws.com
    with whatever your postgres hostname is.
    I think the examples on that page are just to highlight the
    blockstore
    sections and the rest of the yaml is just placeholders/generic.

    Farman Pirzada

    1 month ago
    gotcha ok
    ya, I'm gonna see what the connection string is, because I imagine this is a common thing you need to do with GCP and Postgres
    oh wait, no, that's for .NET

    Damon C

    1 month ago
    I don’t think you’ll be able to use Spanner without some additional work. First, it needs to be set up as a PostgreSQL-dialect database, and second, it looks like you need to run this PGAdapter somewhere. And then the endpoint would be the host/port of wherever the PGAdapter is running. You may need some help from your infra(?) team to get that spun up, or I think the other option (if possible) is spinning up an “actual” Postgres instance ala https://cloud.google.com/sql/docs/postgres/connect-instance-cloud-shell#create-instance
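    From what I can tell from Google's PGAdapter docs, the shape of it is roughly this (an untested sketch on my part - the image path and flags are from their docs, and the project/instance/database names are placeholders, so verify before relying on it):

```shell
# Run Google's PGAdapter so Postgres-wire-protocol clients can reach Spanner.
# Project and instance IDs below are placeholders.
docker run -d -p 5432:5432 \
  gcr.io/cloud-spanner-pg-adapter/pgadapter \
  -p my-project -i my-instance
# lakeFS would then point at the adapter, e.g.:
# LAKEFS_DATABASE_CONNECTION_STRING="postgres://localhost:5432/my-database"
```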

    Farman Pirzada

    1 month ago
    ya you're right, thanks for pointing that out patiently - i know you mentioned it before 😅 - just wanted to see if it was possible to use Spanner directly

    Damon C

    1 month ago
    🤗 Hehe no prob - I’m not entirely sure myself (says the AWS guy lol), so definitely good to verify.

    Farman Pirzada

    1 month ago
    ya, I wanted to reduce any type of spending if I needed to as well, but... I'm at a pretty big company, so at least it's my oyster... for now
    I created a SQL instance for my project and am patiently waiting for it to complete