Hi team, I am trying to configure lakefs with azure ...
# help
a
Hi team, I am trying to configure lakeFS with Azure Blob Storage using a storage access key:
sudo docker run \
  --name lakefs \
  -p 80:8000 \
  -e LAKEFS_DATABASE_TYPE="postgres" \
  -e LAKEFS_DATABASE_POSTGRES_CONNECTION_STRING=$LAKEFS_DATABASE_POSTGRES_CONNECTION_STRING \
  -e LAKEFS_AUTH_ENCRYPT_SECRET_KEY=$LAKEFS_AUTH_ENCRYPT_SECRET_KEY \
  -e LAKEFS_BLOCKSTORE_TYPE="azure" \
  -e LAKEFS_BLOCKSTORE_AZURE_STORAGE_ACCOUNT=$LAKEFS_BLOCKSTORE_AZURE_STORAGE_ACCOUNT \
  -e LAKEFS_BLOCKSTORE_AZURE_STORAGE_ACCESS_KEY=$LAKEFS_BLOCKSTORE_AZURE_STORAGE_ACCESS_KEY \
and I get the following error
time="2023-03-11T22:41:44Z" level=error msg="failed to get azure blob from container &{%!s(*generated.ContainerClient=&{https://lakefs11673f4e.blob.core.windows.net/mds {[0x1400120 0xc008045c00 0xc0080b01c0 0xc008077770 0xc00804ff38 0x13ffda0 0x13ffaa0 {0xc000159350}]}}) %!s(*exported.SharedKeyCredential=<nil>)} key &{%!s(*generated.BlobClient=&{https://lakefs11673f4e.blob.core.windows.net/mds/dummy {[0x1400120 0xc008045c00 0xc0080b01c0 0xc008077770 0xc00804ff38 0x13ffda0 0x13ffaa0 {0xc000159350}]}}) %!s(*generated.BlockBlobClient=&{https://lakefs11673f4e.blob.core.windows.net/mds/dummy {[0x1400120 0xc008045c00 0xc0080b01c0 0xc008077770 0xc00804ff38 0x13ffda0 0x13ffaa0 {0xc000159350}]}}) %!s(*exported.SharedKeyCredential=<nil>)}" func="pkg/logging.(*logrusEntryWrapper).Errorf" file="build/pkg/logging/logger.go:262" error="DefaultAzureCredential: failed to acquire a token.\nAttempted credentials:\n\tEnvironmentCredential: missing environment variable AZURE_TENANT_ID\n\tManagedIdentityCredential: no default identity is assigned to this resource\n\tAzureCLICredential: Azure CLI not found on path" host=68.219.229.172 method=POST operation_id=CreateRepository path=/api/v1/repositories request_id=94bca5e0-f1e3-49e0-ab44-20313230bbdf service_name=rest_api user=alex
time="2023-03-11T22:41:44Z" level=warning msg="Could not access storage namespace" func="pkg/api.(*Controller).CreateRepository" file="build/pkg/api/controller.go:1406" error="DefaultAzureCredential: failed to acquire a token.\nAttempted credentials:\n\tEnvironmentCredential: missing environment variable AZURE_TENANT_ID\n\tManagedIdentityCredential: no default identity is assigned to this resource\n\tAzureCLICredential: Azure CLI not found on path" reason=unknown service=api_gateway storage_namespace="https://lakefs11673f4e.blob.core.windows.net/mds"
Reading the above, it looks like lakefs tries to use my storage access key, then fails and tries to fall back on Service principal credentials. I verified that shared key works using python. What should I do?
g
Hi @Alexander Reinthal, it seems there is an issue with the credentials. Can you please validate that the storage account you provided in the environment variable is the same as the one in the created repository, and that you can access it with the provided secret key?
a
import os
import uuid
from azure.storage.blob import BlobServiceClient

# Create a local directory to hold blob data
local_path = "./data"
os.makedirs(local_path, exist_ok=True)  # does not fail if ./data already exists

# Create a file in the local data directory to upload and download
local_file_name = str(uuid.uuid4()) + ".txt"
upload_file_path = os.path.join(local_path, local_file_name)

# Write text to the file
with open(upload_file_path, mode="w") as file:
    file.write("Hello, World!")
# Connect with the shared key; BlobServiceClient takes the full account URL as its first argument
blob_service_client = BlobServiceClient(os.environ["LAKEFS_BLOCKSTORE_AZURE_STORAGE_ACCOUNT"], credential=os.environ["LAKEFS_BLOCKSTORE_AZURE_STORAGE_ACCESS_KEY"])
# Create a blob client using the local file name as the name for the blob
blob_client = blob_service_client.get_blob_client(container='mds', blob=local_file_name)

print("\nUploading to Azure Storage as blob:\n\t" + local_file_name)

# Upload the created file
with open(file=upload_file_path, mode="rb") as data:
    blob_client.upload_blob(data)
I did verify as best I could.
I’ll triple check ¯\_(ツ)_/¯
g
Looks like a good verification :) Are you running the Python code inside the Docker container? If not, you can also try to exec into the container and check that the environment variables are as expected. I will try to reproduce your case and get back to you tomorrow.
a
I didn't execute it in the docker environment cuz it doesn't have python, but I checked, and the environment variables are present inside docker as well:
sudo docker exec -it lakefs sh
~ $ env
LAKEFS_BLOCKSTORE_TYPE=azure
LAKEFS_DATABASE_TYPE=postgres
HOSTNAME=22de2fd23000
SHLVL=1
HOME=/home/lakefs
LAKEFS_DATABASE_POSTGRES_CONNECTION_STRING=postgres://madonna:<REDACTED>@siversky-server.postgres.database.azure.com/postgres?sslmode=require
TERM=xterm
PATH=/app:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
LAKEFS_AUTH_ENCRYPT_SECRET_KEY=<REDACTED>
LAKEFS_BLOCKSTORE_AZURE_STORAGE_ACCOUNT=https://lakefs11673f4e.blob.core.windows.net/
PWD=/home/lakefs
LAKEFS_BLOCKSTORE_AZURE_STORAGE_ACCESS_KEY=<REDACTED>
g
I understand, thanks! I will check it out and get back to you
g
Hi @Alexander Reinthal,
LAKEFS_BLOCKSTORE_AZURE_STORAGE_ACCOUNT
should get only the name of the storage account and not the full URL. Please try using
LAKEFS_BLOCKSTORE_AZURE_STORAGE_ACCOUNT=lakefs11673f4e
Sorry I missed that at first 😞
a
omg that worked! I was really close to solving it on my own. I was trying
LAKEFS_BLOCKSTORE_AZURE_STORAGE_ACCOUNT=lakefs11673f4e.blob.core.windows.net
but got the same error! With only lakefs11673f4e it works! Thank you so much! made my day!
g
Happy it worked 🤗 !!! Enjoy, if you need anything else we’re here
I will also go over the documentation and see if we can make it clearer.
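For anyone who lands here with the same error: the fix above comes down to passing the bare account name, not a URL or hostname. Below is a minimal sketch of a helper that reduces any of the three forms tried in this thread to the bare name before exporting it. The function name `storage_account_name` and the constant `BLOB_SUFFIX` are made up for illustration and are not part of lakeFS; the sketch also assumes the public-cloud endpoint suffix `.blob.core.windows.net` that appears in the logs above.

```python
from urllib.parse import urlparse

# Assumption: the public-cloud blob endpoint suffix seen in this thread.
BLOB_SUFFIX = ".blob.core.windows.net"


def storage_account_name(value: str) -> str:
    """Return the bare storage account name from a name, hostname, or URL."""
    value = value.strip()
    # urlparse only fills in the host when a scheme is present, so add one if missing.
    parsed = urlparse(value if "://" in value else "https://" + value)
    host = parsed.hostname or ""
    # Drop the endpoint suffix if present; a bare name passes through unchanged.
    if host.endswith(BLOB_SUFFIX):
        host = host[: -len(BLOB_SUFFIX)]
    return host


if __name__ == "__main__":
    for candidate in (
        "https://lakefs11673f4e.blob.core.windows.net/",
        "lakefs11673f4e.blob.core.windows.net",
        "lakefs11673f4e",
    ):
        print(candidate, "->", storage_account_name(candidate))
```

The Python check earlier in the thread passed with the URL-shaped value because `BlobServiceClient` expects the account URL, so the same variable cannot be reused as-is for both tools.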