# help
r
Hello folks, I am able to ingest data using lakectl as
lakectl ingest --from <s3://bucket-name/template> --to <lakefs://repo1/main/>
and it works. When I try to do it via the Java API, it gives
Internal Server Error
Following is the sample code:
StageRangeCreation stageRangeCreation = new StageRangeCreation();
stageRangeCreation.setFromSourceURI("<s3://bucket-name/template>");
stageRangeCreation.setAfter("main/");
stageRangeCreation.setPrepend("main/");
importApi.ingestRange("repo1", stageRangeCreation);
Is the `ingestRange` API the correct one to use for ingesting? If yes, then what am I missing here?
i
Hi @Raman Kharche Let me take a look at this and get back to you
The `ingestRange` API is not the same as the `lakectl ingest` command. While `lakectl ingest` is a complete ingest flow, performed by the `lakectl` client, `ingestRange` is a plumbing API that is meant to be used as part of an import flow (specifically, UI import) and has to be followed by additional API calls (`CreateMetaRange`).
Nevertheless, that does not explain the error response. Can you share some more details on the error itself? Anything from the server logs?
r
io.lakefs.clients.api.ApiException: Internal Server Error
    at io.lakefs.clients.api.ApiClient.handleResponse(ApiClient.java:1029)
    at io.lakefs.clients.api.ApiClient.execute(ApiClient.java:942)
    at io.lakefs.clients.api.ImportApi.ingestRangeWithHttpInfo(ImportApi.java:307)
    at io.lakefs.clients.api.ImportApi.ingestRange(ImportApi.java:283)
    at com.data.lakefs.service.impl.LakeFsService.createRepository(LakeFsService.java:110)
    at com.data.lakefs.service.impl.LakeFsService$$FastClassBySpringCGLIB$$92559d9.invoke(<generated>)
So what would be the Java API equivalent of `lakectl ingest`?
i
I believe that would be `ingestRange` followed by `createMetaRange`, where the range(s) ingested by `ingestRange` are given as parameters to `createMetaRange`.
Here is a link to an example of the flow, from our tests: https://github.com/treeverse/lakeFS/blob/bb848637cb7604d46f64d6ed6c9a47c4f567aed6/esti/import_test.go#L39-L84
Unfortunately, we do not have a similar example of the Java API usage, but the flow should be the same.
r
Okay, so after `ingestRange` I will have to call `createMetaRange`, but somehow `ingestRange` is breaking for me.
i
I see. Do you have access to the `lakefs` server logs?
r
Yeah, lakeFS is running on my local machine as
./lakefs --config lake-config.yaml run
---
database:
  connection_string: "postgres://postgres@localhost:5432/lakefs1?sslmode=disable"
blockstore:
  type: "s3"
  s3:
    region: "us-east-1"
    credentials_file: /credentials
    profile: temporary
auth:
  encrypt:
    secret_key: "a random string that should be kept secret"
logging:
  level: DEBUG
Above are the configs 👆 (lakefs-config.yaml)
i
What is the error from the `lakefs` server? I believe it should appear on the same terminal.
r
there is no log in lakefs server for this error
i
Looking at the code from the initial message, you are setting
stageRangeCreation.setAfter("main/");
If I'm not wrong, this means you are looking to ingest objects following the `"main/"` key in the `from`. Is that configuration correct? I'm asking, as it doesn't seem to match the `lakectl ingest` command you specified above.
r
I'm confused about setAfter and setPrepend. So main is the name of my branch and repo1 is the name of the repo.
i
Not exactly. The `ingestRange` and `createMetaRange` calls do not define a branch. After these you will have to perform a `Commit` API call, providing the `branch` and the `metarange` created by the `createMetaRange` API.
So, basically, the `after` should be the starting point in your `from` location, and the `prepend` is a prefix you would like to add to the ingested objects, in the `target` repo.
In the code specified above, you are adding a `"main/"` prefix, which is not related to the branch `main`.
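The order of the three calls described above can be sketched roughly as follows. This is only a self-contained model: `ImportSteps`, `RecordingSteps`, and the returned IDs are hypothetical stand-ins made up for illustration, not the lakeFS Java client, and a single range is staged for brevity even though a real import may produce several.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for the three plumbing calls named in the
// conversation. This is NOT the lakeFS Java client API.
interface ImportSteps {
    String ingestRange(String repo, String fromSourceUri, String after, String prepend);
    String createMetaRange(String repo, List<String> rangeIds);
    String commit(String repo, String branch, String metaRangeId, String message);
}

// Fake implementation that only records the order of calls.
class RecordingSteps implements ImportSteps {
    final List<String> calls = new ArrayList<>();

    public String ingestRange(String repo, String fromSourceUri, String after, String prepend) {
        calls.add("ingestRange");
        return "range-1"; // made-up range ID
    }

    public String createMetaRange(String repo, List<String> rangeIds) {
        calls.add("createMetaRange" + rangeIds);
        return "metarange-1"; // made-up metarange ID
    }

    public String commit(String repo, String branch, String metaRangeId, String message) {
        calls.add("commit:" + branch + ":" + metaRangeId);
        return "commit-1"; // made-up commit ID
    }
}

class ImportFlowSketch {
    static String importAll(ImportSteps api, String repo, String branch, String fromUri) {
        // Step 1: stage a range. No branch is involved at this point.
        String rangeId = api.ingestRange(repo, fromUri, "", "");
        // Step 2: build a metarange from the staged range(s). Still no branch.
        String metaRangeId = api.createMetaRange(repo, Collections.singletonList(rangeId));
        // Step 3: only the commit names the branch, together with the metarange.
        return api.commit(repo, branch, metaRangeId, "Import from " + fromUri);
    }

    public static void main(String[] args) {
        RecordingSteps fake = new RecordingSteps();
        String commitId = importAll(fake, "repo1", "main", "s3://bucket-name/template");
        System.out.println(fake.calls + " -> " + commitId);
    }
}
```

Note that the branch name appears only in the last step, which is why `"main/"` in `after` or `prepend` has no connection to the branch.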
r
My `from` is `<s3://bucket-name/template>`. In that case `after` will be `template` and the `prepend` can be empty. Right?
i
`after` should be a key, in the `from` location, from which you would like to ingest. In case you would like to ingest the entire `from` location, use `""` (this will align with the `lakectl ingest` command you specified).
`prepend` can be left empty too (will also align with your `lakectl ingest`). It is a prefix you would like to add to the ingested objects in the target. E.g., if your `from` contains `obj1, obj2...`, using `prepend = "somepref"` will create `somepref/obj1, somepref/obj2...`.
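To make the `after` and `prepend` semantics from the message above concrete, here is a small self-contained model. It is not lakeFS code: `AfterPrependSketch` and `targetKeys` are hypothetical names, and two things are assumptions: that `after` is exclusive (only keys sorting strictly after it are taken), and that a `/` is inserted between a non-empty `prepend` and the key, as in the `somepref/obj1` example. The actual server behavior may differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

// Illustrative model only; this is NOT how lakeFS implements ingestion.
class AfterPrependSketch {

    // Returns the target keys for the source keys that sort strictly after `after`.
    // Assumption: a "/" joins a non-empty prepend and the key.
    static List<String> targetKeys(List<String> sourceKeys, String after, String prepend) {
        List<String> result = new ArrayList<>();
        // TreeSet iterates in lexicographic order, like an object-store listing.
        for (String key : new TreeSet<>(sourceKeys)) {
            if (key.compareTo(after) > 0) {
                result.add(prepend.isEmpty() ? key : prepend + "/" + key);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> source = Arrays.asList("obj1", "obj2", "obj3");
        System.out.println(targetKeys(source, "", ""));         // [obj1, obj2, obj3]
        System.out.println(targetKeys(source, "", "somepref")); // [somepref/obj1, somepref/obj2, somepref/obj3]
        System.out.println(targetKeys(source, "obj1", ""));     // [obj2, obj3]
    }
}
```

With both values empty, every key under the source is taken unchanged, which corresponds to the plain `lakectl ingest` case discussed here.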
r
🤔 But it is still not working with the following change:
StageRangeCreation stageRangeCreation = new StageRangeCreation();
stageRangeCreation.setFromSourceURI("<s3://bucket-name/template>");
stageRangeCreation.setAfter("");
stageRangeCreation.setPrepend("");
importApi.ingestRange("repo1", stageRangeCreation);
In the logs of the lakeFS server, it is not logging the `ingest_range` action.
i
Does it log anything else?
r
It logs the `create_repo` action, because before `ingestRange` I make a call to create a repo. And then this:
DEBUG [2022-05-25T232655+0530]lakeFS/pkg/httputil/logging.go78 pkg/httputil.DebugLoggingMiddleware.func1.1 HTTP call ended host="localhost:8000" method=POST path="/api/v1/repositories?bare=false" request_id=e6b055fd-b114-4f52-8345-c8c1f58532d7 sent_bytes=118 service_name=rest_api status_code=201 took=5.1797475s
That's it.
i
Let me dig into the code and try to figure this one out...
i
Hey @Raman Kharche - may I ask what you are trying to achieve? The reason I'm asking is that `IngestRange` is a plumbing command, which must be executed in sequence with more plumbing commands, which is hard to nail just right. There might be an easier path if you explain more about the use case.
r
The use case is: I want to ingest data from S3 into lakeFS. Via lakectl it works, but I am trying to figure out how to do it via the Java API.
i
Hi @Raman Kharche - from the server logs, could you tell me what version of lakefs is being used?
It should be printed as part of the start sequence
r
Version 0.64.0
i
Could you please try upgrading to version 0.65.0? I believe that is the version that introduced the new `ingestRange` API.
r
sure
i
Thank you for the cooperation 🙂 If/when you need anything else, I will be here.
r
Now in the logs I got the `ingest-range` event.
DEBUG [2022-05-26T073831+0530]lakeFS/pkg/api/controller.go3424 pkg/api.(*Controller).LogAction performing API action action=ingest_range host="localhost:8000" message_type=action method=POST path=/api/v1/repositories/marvel19/branches/ranges request_id=c2a40b98-d5ff-4116-9fa7-28191ca9a775 service=api_gateway service_name=rest_api
ERROR [2022-05-26T073832+0530]lakeFS/pkg/logging/logger.go250 pkg/logging.(*logrusEntryWrapper).Errorf Aborting write to range: %!w(*fmt.wrapError=&{sstable file close: pebble: keys must be added in order: #0,DEL, #0,SET 0xc009b7d038})
DEBUG [2022-05-26T073832+0530]lakeFS/pkg/httputil/logging.go78 pkg/httputil.DebugLoggingMiddleware.func1.1 HTTP call ended host="localhost:8000" method=POST path=/api/v1/repositories/marvel19/branches/ranges request_id=c2a40b98-d5ff-4116-9fa7-28191ca9a775 sent_bytes=140 service_name=rest_api status_code=500 took=896.6992ms
i
Well, that's progress 🙂 Let me take a look at this specific error...
r
Actually, from the UI it works (the Import button).
I will check what the problem is there with the change. I'm grateful to you and wanted to thank you for your help. Thanks very much.
i
Thank you for your patience 🙏