• Yusuf Khan

    9 months ago
    Hey, sorry, still running into issues. I'm not able to connect to the actual Azure Blob Storage. This is the console error:
    time="2021-12-06T19:42:13-08:00" level=warning msg="Could not access storage namespace" func="github.com/treeverse/lakefs/pkg/api.(*Controller).CreateRepository" file="/lakeFS/pkg/api/controller.go:1149" error="write error: -> github.com/Azure/azure-storage-blob-go/azblob.newStorageError, /home/runner/go/pkg/mod/github.com/!azure/azure-storage-blob-go@v0.14.0/azblob/zc_storage_error.go:42\n===== RESPONSE ERROR (ServiceCode=InvalidAuthenticationInfo) =====\nDescription=Authentication information is not given in the correct format. Check the value of Authorization header.\nRequestId:99b65e84-801e-006f-801c-eb4b9a000000\nTime:2021-12-07T03:42:13.7541419Z, Details: \n  Code: InvalidAuthenticationInfo\n  PUT https://newintelligentsys02sa.blob.core.windows.net/dummy?blockid=6xDVl8JQTja6vlELhJ8MMAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA%3D%3D&comp=block&timeout=601\n  Authorization: REDACTED\n  Content-Length: [70]\n  User-Agent: [Azure-Storage/0.14 (go1.16.2; Windows_NT)]\n  X-Ms-Client-Request-Id: [37f72e80-b2e9-4508-67d0-5242348b5790]\n  X-Ms-Date: [Tue, 07 Dec 2021 03:42:13 GMT]\n  X-Ms-Version: [2020-04-08]\n  --------------------------------------------------------------------------------\n  RESPONSE Status: 400 Authentication information is not given in the correct format. Check the value of Authorization header.\n  Content-Length: [297]\n  Content-Type: [application/xml]\n  Date: [Tue, 07 Dec 2021 03:42:13 GMT]\n  Server: [Microsoft-HTTPAPI/2.0]\n  X-Ms-Error-Code: [InvalidAuthenticationInfo]\n  X-Ms-Request-Id: [99b65e84-801e-006f-801c-eb4b9a000000]\n\n\n" service=api_gateway storage_namespace=https://newintelligentsys02sa.blob.core.windows.net
    I'm not sure why it's saying the authentication isn't in the correct format. I'm following the format provided here: https://docs.lakefs.io/reference/configuration.html#example-azure-blob-storage. At the end of the console log you can see the format of my storage account name.
    Yusuf Khan
    Yoni Augarten
    15 replies
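A quick way to rule out malformed values before digging into the Authorization header is to check the two inputs locally. The sketch below is only a sanity check under stated assumptions: it relies on Azure's naming rule for storage accounts (3 to 24 lowercase letters and digits) and on access keys being base64 text. The function names are hypothetical, and passing these checks does not explain or fix the `InvalidAuthenticationInfo` response.

```python
import base64
import binascii
import re
from urllib.parse import urlparse

# Azure storage account names: 3-24 characters, lowercase letters and digits only.
ACCOUNT_NAME_RE = re.compile(r"[a-z0-9]{3,24}")


def account_from_endpoint(endpoint):
    """Extract <account> from a https://<account>.blob.core.windows.net URL."""
    host = urlparse(endpoint).hostname or ""
    return host.split(".")[0]


def check_azure_credentials(account, access_key):
    """Return a list of problems; an empty list means both values look well formed."""
    problems = []
    if not ACCOUNT_NAME_RE.fullmatch(account):
        problems.append("storage account name must be 3-24 lowercase letters or digits")
    if access_key != access_key.strip():
        problems.append("access key has leading or trailing whitespace")
    try:
        # validate=True rejects characters outside the base64 alphabet.
        base64.b64decode(access_key.strip(), validate=True)
    except (binascii.Error, ValueError):
        problems.append("access key is not valid base64")
    return problems
```

Stray whitespace or quoting picked up while copying a key into a config file is the kind of thing this would catch. Anything beyond that needs the actual configuration to diagnose.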
  • Yusuf Khan

    9 months ago
    Hey 🙂 I'm going to be showing lakeFS to a couple of people today. One question I had was how can I show its rollback ability? For example:
    1. I ingest (track) a blob storage container that has data.
    2. I make a commit (on the main branch).
    3. I add new data to the blob storage container.
    4. I ingest again.
    5. I commit again.
    Now I want to revert back to the commit from step 2, which I'm doing with lakectl branch revert <repo> <commit id>. It seems to reset the state of the lakeFS tracking, but my blob storage still has that new data from step 3.
    Yusuf Khan
    Yoni Augarten
    33 replies
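One way to picture the behavior described in the five steps is a toy model in which the repository only records which keys it tracks, while the container holds the objects themselves. The sketch below is an illustration, not lakeFS code: `ToyStore`, `ToyRepo`, and `reset_to` are hypothetical names, and `reset_to` models "go back to an earlier listing" rather than the exact semantics of `lakectl branch revert`.

```python
class ToyStore:
    """Stands in for the blob container: a flat set of object keys."""

    def __init__(self):
        self.objects = set()


class ToyRepo:
    """Stands in for the tracking layer: commits are snapshots of tracked keys."""

    def __init__(self, store):
        self.store = store
        self.commits = []

    def ingest_and_commit(self):
        # Record which keys exist right now; no object data is copied.
        self.commits.append(frozenset(self.store.objects))
        return len(self.commits) - 1

    def reset_to(self, commit_id):
        # Only the tracked listing changes; the store is never touched.
        self.commits.append(self.commits[commit_id])

    def head(self):
        return self.commits[-1]


store = ToyStore()
store.objects.add("data/a.csv")
repo = ToyRepo(store)
first = repo.ingest_and_commit()   # steps 1-2
store.objects.add("data/b.csv")    # step 3
repo.ingest_and_commit()           # steps 4-5
repo.reset_to(first)               # roll the tracked state back
```

After the last line the repository head lists only `data/a.csv`, while the store still holds both objects, which matches what the message reports.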
  • y

    Yusuf K

    8 months ago
    Hey 🙂 I'm trying to work on two example workflows.
    One additive case: the lakeFS repository main branch exists, a blob container under a specific directory is ingested, a new branch is created, new data is ingested into the underlying storage, ingest is run again, and the new branch is merged to main.
    One subtractive case: the main branch exists and ingests the underlying blob storage, a new branch is created, some files are removed from the underlying storage, the new branch ingests, and the new branch is merged to main.
    Neither scenario is working as I expected. I suspect it's simply a workflow/user error. I have a document outlining the steps I took, but I'd prefer not to share it on a public channel. If there's a dev who can DM me, I can share it with them.
    y
    1 replies
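For reference when comparing expected and actual results, the two cases can be written down as a generic three-way merge over key listings. This is a toy sketch and not lakeFS's merge implementation: the function name and the dictionaries of key to object id are assumptions made for illustration.

```python
def three_way_merge(base, source, dest):
    """Merge source into dest relative to their common ancestor base.

    Each argument is a dict mapping an object key to an object id.
    """
    merged = dict(dest)
    for key in set(base) | set(source) | set(dest):
        b, s, d = base.get(key), source.get(key), dest.get(key)
        if s == b:
            continue  # source did not touch this key; keep dest's state
        if d != b and d != s:
            raise ValueError(f"conflict on {key}")
        if s is None:
            merged.pop(key, None)  # source deleted the key
        else:
            merged[key] = s        # source added or changed the key
    return merged


# Additive case: the new branch gains a key that main lacks.
additive = three_way_merge({"a": 1}, {"a": 1, "b": 2}, {"a": 1})

# Subtractive case: the new branch drops a key that main still has.
subtractive = three_way_merge({"a": 1, "b": 2}, {"a": 1}, {"a": 1, "b": 2})
```

In this model a change reaches main only if it is visible in the source branch's listing relative to the common ancestor.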
  • y

    Yusuf K

    8 months ago
    Silly question, sorry: once I've created a lakeFS repo and ingested some data into a branch, how does a different user read data from that branch? Using the Python or Spark client? I'm looking through the GitHub repo, and it seems like it would be this: https://github.com/treeverse/lakeFS/blob/master/clients/python/lakefs_client/api/objects_api.py#:~:text=def-,get_object,-( Does anyone have any community examples of being an end user of a lakeFS repository?
    y
    Tal Sofer
    4 replies
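A minimal sketch of the read path through the Python client, built around the `get_object` call linked in the message. It assumes the `lakefs_client` package is installed and that the reader has their own access key pair. `make_client` and `read_object` are hypothetical helper names, and the host, repository, and path values are placeholders.

```python
def make_client(host, access_key_id, secret_access_key):
    """Build a client for one user; each reader uses their own credentials."""
    # Imported lazily so the helper below can be read without the package installed.
    import lakefs_client
    from lakefs_client.client import LakeFSClient

    configuration = lakefs_client.Configuration()
    configuration.host = host
    configuration.username = access_key_id
    configuration.password = secret_access_key
    return LakeFSClient(configuration)


def read_object(client, repository, ref, path):
    """Fetch one object from a branch (or any other ref) of a repository."""
    return client.objects.get_object(repository=repository, ref=ref, path=path)


# Example call, with placeholder values:
# client = make_client("http://localhost:8000", "<access key id>", "<secret access key>")
# obj = read_object(client, "example-repo", "main", "path/to/file.csv")
```

Passing a commit ID or tag as `ref` instead of a branch name reads from that fixed point rather than from the moving branch head.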
  • v

    Verun Rahimtoola

    8 months ago
    question on logging for
    lakectl dbt create-branch-schema
    v
    Lynn Rozen
    53 replies
  • Chaim Turkel

    7 months ago
    What size DB do we need if we host it on AWS?
    Chaim Turkel
    Lynn Rozen
    2 replies
  • Louis Cheung

    7 months ago
    Hi, I used AWS CTAS to extract valid data. Is it possible to integrate lakeFS with an AWS CTAS table?
    Louis Cheung
    Barak Amar
    16 replies
  • d

    donald

    7 months ago
    How can I create a folder-like structure under the main branch to manage the data? For example, lakefs://myproject/main/trainingset, lakefs://myproject/main/validationset, lakefs://myproject/main/testset
    d
    Yoni Augarten
    11 replies
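Object stores usually emulate folders with key prefixes: writing an object whose path starts with `trainingset/` is what makes `trainingset/` appear as a folder. The sketch below shows that idea in plain Python, assuming lakeFS paths behave like ordinary object keys. `list_prefix` is a hypothetical helper, not a lakeFS API.

```python
def list_prefix(keys, prefix="", delimiter="/"):
    """List the immediate children under prefix, as a folder-style listing would."""
    children = set()
    for key in keys:
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if not rest:
            continue  # the key is the prefix itself
        head, sep, _ = rest.partition(delimiter)
        # 'trainingset/' for a common prefix, 'a.csv' for an object at this level.
        children.add(head + sep)
    return sorted(children)


keys = [
    "trainingset/a.csv",
    "trainingset/b.csv",
    "validationset/a.csv",
    "testset/a.csv",
]
top_level = list_prefix(keys)
training = list_prefix(keys, "trainingset/")
```

Here `top_level` lists the three "folders" and `training` lists the two files under `trainingset/`, even though no folder was ever created explicitly.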
  • a

    Ashwath

    7 months ago
    Hello, I have started using the multi-container setup. How can I use a Jupyter notebook with Spark to read an S3 bucket from MinIO?
    a
    Yoni Augarten
    3 replies
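A minimal sketch of pointing Spark's S3A connector at an S3-compatible endpoint such as MinIO from a notebook. It assumes `pyspark` is installed and the `hadoop-aws` connector is on Spark's classpath. The endpoint, credentials, and bucket name are placeholders, and `s3a_options` and `build_session` are hypothetical helper names.

```python
def s3a_options(endpoint, access_key, secret_key):
    """Spark settings that route s3a:// paths to a custom S3-compatible endpoint."""
    return {
        "spark.hadoop.fs.s3a.endpoint": endpoint,
        "spark.hadoop.fs.s3a.access.key": access_key,
        "spark.hadoop.fs.s3a.secret.key": secret_key,
        # Address buckets as <endpoint>/<bucket> instead of <bucket>.<endpoint>.
        "spark.hadoop.fs.s3a.path.style.access": "true",
    }


def build_session(app_name, options):
    """Create (or reuse) a SparkSession with the given options applied."""
    # Imported lazily so s3a_options can be used without pyspark installed.
    from pyspark.sql import SparkSession

    builder = SparkSession.builder.appName(app_name)
    for key, value in options.items():
        builder = builder.config(key, value)
    return builder.getOrCreate()


# Example use in a notebook cell, with placeholder values:
# spark = build_session("minio-read", s3a_options("http://minio:9000", "<key>", "<secret>"))
# df = spark.read.parquet("s3a://example-bucket/path/")
```

The hostname in the endpoint has to be reachable from wherever the notebook kernel runs, which in a multi-container setup is often the container's service name rather than `localhost`.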
  • v

    Verun Rahimtoola

    7 months ago
    quick question about writing to tagged data in lakefs
    v
    Ariel Shaqed (Scolnicov)
    6 replies