Hi, is this expected behaviour? (using python CDK ...
# help
f
Hi, is this expected behaviour? (Using the old API of the Python SDK; I can't migrate to the new one because of an outdated pydantic dependency in the new library.) I'm calling import_status until I see "completed=True", but after a short while (under a second or so, I guess once it completes) it just returns Bad Gateway. Using lakeFS 1.5.0.
{'id': 'cmuarlehdgvo09jaee8g'}
{'completed': False,
 'ingested_objects': 0,
 'update_time': datetime.datetime(2024, 2, 2, 8, 54, 45, 107239, tzinfo=tzutc())}
{'completed': False,
 'ingested_objects': 2,
 'update_time': datetime.datetime(2024, 2, 2, 8, 54, 46, 123730, tzinfo=tzutc())}
[ERROR][2024-02-02 09:54:47,192][lakefs_helper.py:379] Exception when calling ImportApi->import_status: (502)
Reason: Bad Gateway
HTTP response headers: HTTPHeaderDict({'Server': 'awselb/2.0', 'Date': 'Fri, 02 Feb 2024 08:54:47 GMT', 'Content-Type': 'text/html', 'Content-Length': '122', 'Connection': 'keep-alive'})
HTTP response body: <html>
<head><title>502 Bad Gateway</title></head>
<body>
<center><h1>502 Bad Gateway</h1></center>
</body>
</html>
my code
import logging
import time

import lakefs_client

try:
    # Start the import from the object store.
    api_response: ImportCreationResponse = lakefs.import_api.import_start(
        repository, branch, import_creation)
    print(api_response)
    finished = False
    total_ingested = 0
    # Poll the import status until lakeFS reports completion.
    while not finished:
        time.sleep(0.2)
        status_response: ImportStatusResp = lakefs.import_api.import_status(
            repository, branch, api_response["id"])
        print(status_response)
        finished = status_response["completed"]
        total_ingested = status_response["ingested_objects"]

    assert not error_if_no_files or total_ingested > 0, (
        f"No files found when processing CopyOp file {origin} to {destination}")
except lakefs_client.ApiException as e:
    logging.error("Exception when calling ImportApi->import_start/status: %s", e)
The DynamoDB table, which is our backend, shows no throttling, weird 😕 It's copying only a few files, nothing big tbh.
But it seems it took down our whole lakeFS; it recovers eventually (this could be EKS restarting jobs, I'll check the logs).
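For reference, a minimal sketch of a more defensive polling loop, assuming the status call is wrapped in a zero-argument callable. The names `wait_for_import`, `ImportTimeout`, `fetch_status`, and `max_errors` are hypothetical and not part of the lakeFS SDK. It tolerates a few consecutive transient failures (such as a 502 from the load balancer while the server restarts) and gives up after a deadline. It would not have prevented the server-side crash; it only keeps the caller from looping forever or failing on the first error.

```python
import time


class ImportTimeout(Exception):
    """Raised when the import does not complete before the deadline."""


def wait_for_import(fetch_status, interval=1.0, timeout=300.0, max_errors=3,
                    sleep=time.sleep, clock=time.monotonic):
    """Poll fetch_status() until it returns a status with completed=True.

    fetch_status is a hypothetical zero-argument callable, e.g. a lambda
    wrapping import_status(repository, branch, import_id).
    """
    deadline = clock() + timeout
    consecutive_errors = 0
    while True:
        status = None
        try:
            status = fetch_status()
            consecutive_errors = 0
        except Exception:
            # Assumption: in real code, narrow this to the SDK's ApiException.
            consecutive_errors += 1
            if consecutive_errors > max_errors:
                raise
        if status is not None and status["completed"]:
            return status
        if clock() >= deadline:
            raise ImportTimeout(f"import not completed after {timeout} seconds")
        sleep(interval)
```

With the old API this could be called as `wait_for_import(lambda: lakefs.import_api.import_status(repository, branch, api_response["id"]))`.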
i
Yeah, I was just about to suggest if lakeFS was having restarts during this time
f
I think it's something with the import API: it's the first time we've tested it, and lakeFS was rock-solid stable before. Checking logs anyway.
i
There could be some bug in that API causing panics / restarts / etc.
f
yup
"log": "panic: assignment to entry in nil map",
"log": "github.com/treeverse/lakefs/pkg/graveler/ref.(*Manager).BranchUpdate(0xc0005ec840, {0x5e6b050, 0xc000b13180}, 0xc007036198, {0xc0078a254b, 0x3f}, 0xc00701c9c0)",
"log": "github.com/treeverse/lakefs/pkg/graveler.(*Graveler).Import.func1(0xc00072af00)",
Should I open a ticket? I can get the whole trace (I was messing with CloudWatch filtering xP, didn't manage to get it perfect, but it's readable).
i
Thank you, that would be very helpful 🙏
f
OK, the issue is related to commit=CommitCreation(message=f"CopyOp commit", metadata={}, allow_empty=False, force=False).
The metadata is empty; that's why you get that nil map 🙂
👍 1
😱 1
If I remove metadata altogether, it works fine.
👍 1
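A minimal sketch of that workaround, assuming the commit arguments are assembled in one place: leave the `metadata` key out entirely when there is nothing to put in it, instead of passing `{}`. The helper name `commit_creation_kwargs` is hypothetical; only the `message`, `metadata`, `allow_empty`, and `force` keywords come from the snippet above.

```python
def commit_creation_kwargs(message, metadata=None, allow_empty=False, force=False):
    """Build keyword arguments for CommitCreation, omitting empty metadata."""
    kwargs = {"message": message, "allow_empty": allow_empty, "force": force}
    # Only pass metadata when it actually has entries; an empty dict was
    # what triggered the nil-map panic on lakeFS 1.5.0 in this thread.
    if metadata:
        kwargs["metadata"] = dict(metadata)
    return kwargs


# Hypothetical usage with the old client:
#   commit = CommitCreation(**commit_creation_kwargs("CopyOp commit", {}))
```

On lakeFS v1.10.0 and later this guard should no longer be needed, according to the replies further down.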
i
f
Thanks! Will update our version when a fix is available (no hurry, all imports in our system happen within a library, so we should be "fine").
e
👍🏻
o
@Elad Lachmi @Florentino Sainz I believe the latest v1.10.0 release fixes this
e
Yes, this issue should be fixed in lakeFS v1.10.0
f
Yup, it also fixes the issue with import branch permissions! Thanks, already installed it in preprod.
🙌 1
🙌🏻 1