  • Nuno Carvalho dos Santos

    2 months ago
    Forgive me if this question has already been answered: is there any way to install lakeFS in distributed mode?
  • Nuno Carvalho dos Santos

    2 months ago
    Let me ask it another way: is it strictly necessary to use one of these (AWS S3, Azure Blob Storage, or Google Cloud Storage) to achieve redundancy?
    Ariel Shaqed (Scolnicov)
    4 replies
  • Kevin Vasko

    1 month ago
    OK, stupid question: say I have a bucket s3://lakefs/projects/. I have an EC2 instance in a VPC that hosts the lakeFS software, and I gave that EC2 instance an IAM role to access s3://lakefs/projects, so the instance can operate on s3://lakefs/projects/. Now, when my users interact with lakeFS, are they acting with the permissions that lakeFS has? Or are they interacting with S3 outside the context of lakeFS, with lakeFS just acting as a "mediator" that points users at the S3 buckets?
    Guy Hardonag
    9 replies
  • Vino

    1 month ago
    Hi, I'm running lakeFS using Everything Bagel. lakeFS and the Jupyter notebook are up and running, but when I run
    import lakefs_client
    I get this error:
    ModuleNotFoundError: No module named 'lakefs_client'
    What am I missing?
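    This error usually means the package is not installed into the interpreter the notebook kernel runs, which is easy to check from a cell. A small diagnostic sketch (nothing lakeFS-specific is assumed beyond the package name):

    ```python
    # Diagnostic sketch: confirm which Python the notebook kernel runs and
    # whether lakefs_client is importable from it. If it is missing, installing
    # against this same interpreter is the usual fix, e.g. in a notebook cell:
    #   !{sys.executable} -m pip install lakefs_client
    import importlib.util
    import sys

    print(sys.executable)  # the kernel's interpreter; pip must target this one
    spec = importlib.util.find_spec("lakefs_client")
    print("lakefs_client importable:", spec is not None)
    ```

    A plain `pip install` in a terminal can land in a different Python than the one backing the kernel, which produces exactly this ModuleNotFoundError.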
    Or Tzabary
    7 replies
  • carlos osuna

    1 month ago
    Hello, I'm having some trouble getting virtual hosting for S3 Gateway working with my installation through Helm.

    I'm trying to access it like this:

    aws s3 --endpoint-url https://bucket.s3.my.path.net ls

    It returns the same as the base path:

    aws s3 --endpoint-url https://s3.my.path.net ls

    in that it returns the list of available buckets instead of the contents of the bucket. Is there something wrong with my configuration here? Is the
    domain_name
    field supposed to be something else? Traffic is being sent to the service, but I can't seem to identify why the bucket is not getting resolved. Below is my
    values.yaml
    for the Helm installation:
    lakefsConfig: |
      blockstore:
        type: s3
      gateways:
        s3:
          domain_name: s3.my.path.net
    ingress:
      enabled: true
      ingressClassName: "nginx"
      tls:
      - hosts:
        - my.path.net
        - s3.my.path.net
        - "*.s3.my.path.net"
        secretName: <secret tls>
      hosts:
        - host: my.path.net
          paths: ["/"]
        - host: s3.my.path.net
          paths: ["/"]
        - host: "*.s3.my.path.net"
          paths: ["/"]

    extraEnvVarsSecret: <secret env>
    Thank you!
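    For context on why the wildcard host matters: with virtual-host-style addressing, the bucket name travels in the Host header as a prefix of the configured gateway domain, so if the Host that actually reaches lakeFS equals the bare domain, the request looks path-style and a bucket listing is the expected response. A rough illustration of that resolution logic (my own approximation, not lakeFS source; the domain names are placeholders):

    ```python
    from typing import Optional

    # Illustrative sketch of virtual-host-style bucket resolution: the bucket
    # is whatever prefixes the configured S3 gateway domain in the Host header.
    # This is an approximation for explanation, not lakeFS's implementation.
    def bucket_from_host(host: str, domain_name: str) -> Optional[str]:
        """Return the bucket implied by the Host header, or None for path-style."""
        host = host.split(":")[0]          # ignore any port component
        domain = domain_name.split(":")[0]
        if host == domain:
            return None                    # bare domain: path-style / ListBuckets
        suffix = "." + domain
        if host.endswith(suffix):
            return host[: -len(suffix)]    # "bucket.s3.my.path.net" -> "bucket"
        return None                        # unrelated host

    print(bucket_from_host("bucket.s3.my.path.net", "s3.my.path.net"))  # bucket
    print(bucket_from_host("s3.my.path.net", "s3.my.path.net"))         # None
    ```

    One implication: if the ingress or any proxy in front of lakeFS rewrites the Host header to the bare domain, every request will look path-style regardless of the URL the client used, which matches the symptom described above.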
    Eden Ohana
    12 replies
  • Mike Logaciuk

    1 month ago
    Hi! I've built a sample docker-compose file for future development, with lakeFS and MinIO as an S3 stand-in. However, I can't get it to work. I've done the configuration as described in the docs; the MinIO instance is visible and works (I can even connect to it as S3 in PyCharm), but lakeFS, after a minute of trying to start, times out in Docker. It doesn't return any log whatsoever:
    version: '3.9'
    # Settings and configurations that are common for all containers
    x-minio-common: &minio-common
      image: 'minio/minio:latest'
      command: server --console-address ":9001" http://lakefs-minio{1...4}/data{1...2}
      expose:
        - "9000"
        - "9001"
      environment:
        MINIO_ROOT_USER: ${AWS_ACCESS_KEY_ID}
        MINIO_ROOT_PASSWORD: ${AWS_SECRET_ACCESS_KEY}
      networks:
        - lakefs-network
      healthcheck:
        test: [ "CMD", "curl", "-f", "http://localhost:9000/minio/health/live" ]
        interval: 30s
        timeout: 20s
        retries: 3

    # starts 4 docker containers running minio server instances.
    # using nginx reverse proxy, load balancing, you can access
    # it through port 9000.
    services:
      lakefs-minio1:
        <<: *minio-common
        hostname: lakefs-minio1
        container_name: lakefs-minio1
        networks:
          - lakefs-network
        volumes:
          - data1-1:/data1
          - data1-2:/data2
      lakefs-minio2:
        <<: *minio-common
        hostname: lakefs-minio2
        container_name: lakefs-minio2
        networks:
          - lakefs-network
        volumes:
          - data2-1:/data1
          - data2-2:/data2
      lakefs-minio3:
        <<: *minio-common
        hostname: lakefs-minio3
        container_name: lakefs-minio3
        networks:
          - lakefs-network
        volumes:
          - data3-1:/data1
          - data3-2:/data2
      lakefs-minio4:
        <<: *minio-common
        hostname: lakefs-minio4
        container_name: lakefs-minio4
        networks:
          - lakefs-network
        volumes:
          - data4-1:/data1
          - data4-2:/data2
      lakefs-minio-setup:
        image: minio/mc
        container_name: lakefs-minio-setup
        environment:
          - MC_HOST_lakefs=http://${AWS_ACCESS_KEY_ID}:${AWS_SECRET_ACCESS_KEY}@lakefs-minio1:9000
        depends_on:
          - lakefs-minio1
          - lakefs-minio2
          - lakefs-minio3
          - lakefs-minio4
        command: [ "mb", "lakefs/example" ]
        networks:
          - lakefs-network
      lakefs-nginx:
        image: nginx:1.19.2-alpine
        hostname: lakefs-nginx
        container_name: lakefs-nginx
        networks:
          - lakefs-network
        volumes:
          - ./nginx.conf:/etc/nginx/nginx.conf:ro
        ports:
          - "9000:9000"
          - "9001:9001"
        depends_on:
          - lakefs-minio1
          - lakefs-minio2
          - lakefs-minio3
          - lakefs-minio4
      lakefs-postgres:
        image: "postgres:11"
        hostname: lakefs-postgres
        container_name: lakefs-postgres
        environment:
          POSTGRES_USER: ${POSTGRES_USER}
          POSTGRES_PASSWORD: ${POSTGRES_PASSWORD}
          POSTGRES_DB: ${POSTGRES_DB}
        networks:
          - lakefs-network
      lakefs-setup:
        image: treeverse/lakefs:latest
        container_name: lakefs-setup
        depends_on:
          - lakefs-postgres
          - lakefs-minio-setup
        environment:
          - LAKEFS_AUTH_ENCRYPT_SECRET_KEY=${LAKEFS_AUTH_ENCRYPT_SECRET_KEY}
          - LAKEFS_DATABASE_CONNECTION_STRING=postgres://${POSTGRES_USER}:${POSTGRES_PASSWORD}@lakefs-postgres/postgres?sslmode=disable
          - LAKECTL_CREDENTIALS_ACCESS_KEY_ID=${LAKECTL_CREDENTIALS_ACCESS_KEY_ID}
          - LAKECTL_CREDENTIALS_SECRET_ACCESS_KEY=${LAKECTL_CREDENTIALS_SECRET_ACCESS_KEY}
          - LAKECTL_SERVER_ENDPOINT_URL=http://lakefs:8000
          - LAKEFS_BLOCKSTORE_TYPE=s3
        entrypoint: [ "/app/wait-for", "postgres:5432", "--", "sh", "-c",
          "lakefs setup --user-name docker --access-key-id ${LAKECTL_CREDENTIALS_ACCESS_KEY_ID} --secret-access-key ${LAKECTL_CREDENTIALS_SECRET_ACCESS_KEY} && lakectl repo create lakefs://example s3://example" ]
        networks:
          - lakefs-network
      lakefs:
        image: "treeverse/lakefs:latest"
        container_name: lakefs
        ports:
          - "8001:8000"
        depends_on:
          - "lakefs-postgres"
        environment:
          - LAKEFS_AUTH_ENCRYPT_SECRET_KEY=${LAKEFS_AUTH_ENCRYPT_SECRET_KEY}
          - LAKEFS_DATABASE_CONNECTION_STRING=postgres://${POSTGRES_USER}:${POSTGRES_PASSWORD}@lakefs-postgres/postgres?sslmode=disable
          - LAKEFS_BLOCKSTORE_TYPE=s3
          - LAKEFS_BLOCKSTORE_LOCAL_PATH=${LAKEFS_BLOCKSTORE_LOCAL_PATH:-/home/lakefs}
          - LAKEFS_GATEWAYS_S3_DOMAIN_NAME=${LAKEFS_GATEWAYS_S3_DOMAIN_NAME:-s3.local.lakefs.io:8000}
          - LAKEFS_BLOCKSTORE_S3_CREDENTIALS_ACCESS_KEY_ID=${AWS_ACCESS_KEY_ID}
          - LAKEFS_BLOCKSTORE_S3_CREDENTIALS_ACCESS_SECRET_KEY=${AWS_SECRET_ACCESS_KEY}
          - LAKEFS_LOGGING_LEVEL=DEBUG
          - LAKEFS_STATS_ENABLED
          - LAKEFS_BLOCKSTORE_S3_ENDPOINT=http://localhost:9000
          - LAKEFS_BLOCKSTORE_S3_FORCE_PATH_STYLE=true
          - LAKEFS_COMMITTED_LOCAL_CACHE_DIR=${LAKEFS_COMMITTED_LOCAL_CACHE_DIR:-/home/lakefs/.local_tier}
        entrypoint: [ "/app/wait-for", "postgres:5432", "--", "/app/lakefs", "run" ]

    volumes:
      data1-1:
        name: lakefs-minio1-storage-1
      data1-2:
        name: lakefs-minio1-storage-2
      data2-1:
        name: lakefs-minio2-storage-1
      data2-2:
        name: lakefs-minio2-storage-2
      data3-1:
        name: lakefs-minio3-storage-1
      data3-2:
        name: lakefs-minio3-storage-2
      data4-1:
        name: lakefs-minio4-storage-1
      data4-2:
        name: lakefs-minio4-storage-2

    networks:
      lakefs-network:
        driver: bridge
        name: lakefs-network
    Jonathan Rosenberg
    6 replies
  • Damon C

    1 month ago
    Hi there - I'm trying to spin up lakeFS on ECS with RDS IAM authentication instead of hard-coded credentials. The way IAM auth appears to work (per that link) is that a
    generate-db-auth-token
    API call is made to the RDS service, and the returned value (valid for 15 minutes) is used as the password. Peeking at
    ConnectDBPool
    (and the corresponding pgx library), it seems like I might need to tweak that method to fetch the token. Curious whether others have run into this before or have thoughts. Context: I'm trying to deploy a full stack using AWS CDK and didn't want to create an environment variable for my ECS task definition with a hard-coded password, as it would show up in the CloudFormation template. 😅
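    For reference, the value generate-db-auth-token returns is a SigV4-presigned request string for the rds-db service, computed entirely client-side, so it can in principle be produced wherever the connection string is built. A rough, self-contained sketch of that shape (my own approximation with placeholder host, user, and keys; for the real signing details use the AWS SDK, e.g. boto3's generate_db_auth_token):

    ```python
    # Approximate sketch of what generate-db-auth-token produces: a SigV4
    # presigned GET for service "rds-db", used verbatim as the DB password.
    # Host/user/keys below are made-up placeholders, and the exact signing
    # details should be taken from the AWS SDK rather than this sketch.
    import datetime
    import hashlib
    import hmac
    import urllib.parse

    def _sign(key: bytes, msg: str) -> bytes:
        return hmac.new(key, msg.encode(), hashlib.sha256).digest()

    def generate_db_auth_token(host, port, user, region, access_key, secret_key):
        service = "rds-db"
        now = datetime.datetime.utcnow()
        amz_date = now.strftime("%Y%m%dT%H%M%SZ")
        datestamp = now.strftime("%Y%m%d")
        scope = f"{datestamp}/{region}/{service}/aws4_request"
        params = {
            "Action": "connect",
            "DBUser": user,
            "X-Amz-Algorithm": "AWS4-HMAC-SHA256",
            "X-Amz-Credential": f"{access_key}/{scope}",
            "X-Amz-Date": amz_date,
            "X-Amz-Expires": "900",  # the 15-minute validity window
            "X-Amz-SignedHeaders": "host",
        }
        qs = urllib.parse.urlencode(sorted(params.items()),
                                    quote_via=urllib.parse.quote)
        canonical_request = "\n".join([
            "GET", "/", qs,
            f"host:{host}:{port}\n",          # canonical headers
            "host",                           # signed headers
            hashlib.sha256(b"").hexdigest(),  # empty payload hash
        ])
        string_to_sign = "\n".join([
            "AWS4-HMAC-SHA256", amz_date, scope,
            hashlib.sha256(canonical_request.encode()).hexdigest(),
        ])
        key = _sign(_sign(_sign(_sign(b"AWS4" + secret_key.encode(),
                                      datestamp), region), service),
                    "aws4_request")
        signature = hmac.new(key, string_to_sign.encode(),
                             hashlib.sha256).hexdigest()
        # no scheme prefix: the whole string is the "password"
        return f"{host}:{port}/?{qs}&X-Amz-Signature={signature}"

    token = generate_db_auth_token("mydb.example.us-east-1.rds.amazonaws.com",
                                   5432, "lakefs", "us-east-1",
                                   "AKIAEXAMPLE", "example-secret")
    ```

    Because each token expires after 15 minutes, whatever opens new pooled connections does need a way to re-generate it, which lines up with the ConnectDBPool observation above.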
    Yoni Augarten
    14 replies
  • Clay Buxton

    1 month ago
    Is there a way to make objects in lakeFS publicly available via HTTP?
    Eden Ohana
    3 replies
  • Farman Pirzada

    1 month ago
    Is there a way to disable lakeFS authentication? I am deploying to a dev environment and need to bypass authentication so that we can use our own (generating a GCP token and adding it via ModHeader).
    Eden Ohana
    2 replies
  • Anandkarthick Krishnakumar

    3 weeks ago
    Hello - I'm setting up lakeFS, but I don't see any reference to creating the metastore table using lakectl. If you're using it for the first time, how would you create the table before enabling
    symlink
    ? Thanks!
    Itai Admi
    15 replies