# help
k
Hi, I would like to upload a pretty big file (~5.5 GB) via `lakectl fs upload`. The issue is that under the hood it probably uses a single-part upload, because I received an error:
```
Error executing command: request failed: [500 Internal Server Error] s3 error: <?xml version="1.0" encoding="UTF-8"?>
<Error><Code>EntityTooLarge</Code><Message>Your proposed upload exceeds the maximum allowed size</Message><ProposedSize>5667614938</ProposedSize><MaxSizeAllowed>5368709120</MaxSizeAllowed><RequestId>BCJHP7ZY8N0V025N</RequestId>
```
Is there a way to force this to use multipart upload to S3? Or am I missing something and should be looking in another place? Btw, I cannot use the aws s3 client, as we don't have proper DNS settings on AWS in place yet…
y
Hey @Konrad, thank you for your question. I'm checking some options for you
@Konrad, how are you reaching lakeFS? Do you have a DNS for it? Is it an IP address? Or is it a local installation?
k
It's an EC2 instance in AWS. I am using the instance URL with port 8000 to connect.
The upload from my machine to the lakeFS instance itself is going fine, as I can see the whole file in the /tmp folder on the EC2 instance.
y
You're running lakectl upload from your machine, right?
k
yup
y
Can you please share the output of `lakectl --version`?
k
```
➜ lakectl --version
lakectl version 0.46.0
```
y
Thanks. I would like to try using a new feature where the upload is done directly from the machine running lakectl. Could you try adding `--direct` to the upload command?
k
This time, after ~20s, I received:
```
Error executing command: upload to backing store: EntityTooLarge: Your proposed upload exceeds the maximum allowed size
	status code: 400, request id: STCH3KZ1PWFVS4X5
```
This is the exact command I've used:
```
➜ lakectl fs upload --direct lakefs://topic-classification/master/nkjp+wiki.txt -s nkjp+wiki.txt
```
y
Thank you for trying. I will open an issue for it and share it with you shortly. I can offer you a workaround for now
k
Thanks! And yes, the workaround would be great 🙂
y
What you can do is the following:
1. In the lakeFS configuration, set the property `gateways.s3.domain_name` to `s3.workaround.com` (the value itself doesn't really matter). Restart lakeFS with this new configuration. (A sketch of this config change follows these steps.)
2. On your local machine, edit your /etc/hosts file and add the following entries (note I've added your repo name in the second one):
```
<your ec2 instance ip> s3.workaround.com
<your ec2 instance ip> topic-classification.s3.workaround.com
```
3. Use the aws s3 client like so:
```
aws s3 cp --endpoint-url http://s3.workaround.com:8000 nkjp+wiki.txt s3://topic-classification/master/nkjp+wiki.txt
```
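For reference, a minimal sketch of what the step-1 change could look like in a YAML-based lakeFS configuration file; the file location and the surrounding keys depend on your installation, so treat it as an illustration of the single property being set, not a complete config:

```yaml
# Sketch of the lakeFS config change from step 1 (illustrative only).
# Only the S3 gateway domain is shown; keep the rest of your existing config as-is.
gateways:
  s3:
    # Any hostname works here; it just has to match the entries added to
    # /etc/hosts in step 2 so the aws CLI can resolve it to the EC2 instance.
    domain_name: s3.workaround.com
```

The reason this helps: the `aws s3 cp` in step 3 goes through the lakeFS S3 gateway, and the AWS CLI performs multipart upload automatically for large files, which avoids the single-PUT object size limit behind the original EntityTooLarge error.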
k
thanks a lot, I’ll try in a minute
y
Let me know 🙂
Here is the new issue: https://github.com/treeverse/lakeFS/issues/2280. Feel free to comment there.
k
It worked! 🥳 Thanks a lot!
y
Glad to help! Thanks for the update