Giuseppe Barbieri
01/18/2024, 5:51 PM
Giuseppe Barbieri
01/18/2024, 5:52 PM
唐治喜
01/22/2024, 1:40 PM
mohamed islam
01/22/2024, 4:36 PM
Giuseppe Barbieri
01/24/2024, 11:25 AM
Lior Resisi
01/28/2024, 3:08 PM
Bill Li
01/30/2024, 9:11 PM
Selva
02/01/2024, 2:33 PM
Yaphet Kebede
02/01/2024, 7:10 PM
time="2024-02-01T18:02:35Z" level=error msg="could not update metadata" func="pkg/gateway/operations.(*PathOperation).finishUpload" file="build/pkg/gateway/operations/operation_utils.go:51" error="postgres get: context canceled" host=**** matched_host=false method=POST operation_id=post_object path=wikidata-all.hdt physical_address=data/gi9jfn6d5a9ko68osg30/cmtsm96d5a9ko68osg3g ref=main repository=wikidata request_id=ddd64e7a-93e2-4a0f-aba1-36a84986da3e service_name=s3_gateway upload_id=cc4834cf2223468ebd5c0e6e97cacf11 user=admin
I get the above error. I updated the Postgres max connection lifetime to 3 hours:
database:
  type: postgres
  postgres:
    connection_max_lifetime: 3h
but I am not sure how to make sure the context is still intact.
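(For context, connection_max_lifetime is one of several connection-pool settings under database.postgres. A fuller section might look like the sketch below; the extra keys and values are assumptions taken from the lakeFS configuration reference and shown only to illustrate the layout, with the connection string as a placeholder.)
database:
  type: postgres
  postgres:
    connection_string: "postgres://user:pass@host:5432/lakefs"   # placeholder
    connection_max_lifetime: 3h
    max_open_connections: 25     # assumed value, tune to your database
    max_idle_connections: 25     # assumed value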
Florentino Sainz
02/02/2024, 8:57 AM
{'id': 'cmuarlehdgvo09jaee8g'}
{'completed': False,
'ingested_objects': 0,
'update_time': datetime.datetime(2024, 2, 2, 8, 54, 45, 107239, tzinfo=tzutc())}
{'completed': False,
'ingested_objects': 2,
'update_time': datetime.datetime(2024, 2, 2, 8, 54, 46, 123730, tzinfo=tzutc())}
[ERROR][2024-02-02 09:54:47,192][lakefs_helper.py:379] Exception when calling ImportApi->import_status: (502)
Reason: Bad Gateway
HTTP response headers: HTTPHeaderDict({'Server': 'awselb/2.0', 'Date': 'Fri, 02 Feb 2024 08:54:47 GMT', 'Content-Type': 'text/html', 'Content-Length': '122', 'Connection': 'keep-alive'})
HTTP response body: <html>
<head><title>502 Bad Gateway</title></head>
<body>
<center><h1>502 Bad Gateway</h1></center>
</body>
</html>
My code:
try:
    # import data from object store
    api_response: ImportCreationResponse = lakefs.import_api.import_start(repository, branch, import_creation)
    print(api_response)
    finished = False
    total_ingested = 0
    while not finished:
        time.sleep(0.2)
        status_response: ImportStatusResp = lakefs.import_api.import_status(repository, branch, api_response["id"])
        print(status_response)
        finished = status_response["completed"]
        total_ingested = status_response["ingested_objects"]
    assert not error_if_no_files or total_ingested > 0, f"No files found when processing CopyOp file {origin} to {destination}"  # noqa: E501
except lakefs_client.ApiException as e:
    logging.error("Exception when calling ImportApi->import_start/status: %s\n" % e)
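(The 502 here comes from the load balancer rather than lakeFS itself, and the generated client surfaces it as an ApiException carrying the HTTP status. One option, sketched below under the assumption that ApiException exposes the status as .status, is to treat transient 5xx responses from the status poll as retryable. import_status_with_retry is a hypothetical helper name; lakefs, repository, and branch come from the snippet above.)
import time
import logging
import lakefs_client

def import_status_with_retry(lakefs, repository, branch, import_id, retries=5):
    # Sketch: retry the status poll on transient 5xx responses (e.g. a 502 from an ELB).
    for attempt in range(retries):
        try:
            return lakefs.import_api.import_status(repository, branch, import_id)
        except lakefs_client.ApiException as e:
            # Re-raise client errors and give up after the last attempt.
            if e.status is None or e.status < 500 or attempt == retries - 1:
                raise
            logging.warning("import_status returned %s, retrying", e.status)
            time.sleep(2 ** attempt)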
Bhawana Prasain
02/03/2024, 2:11 PM
Giuseppe Barbieri
02/06/2024, 9:07 AM
pre-commit, that is, the File Format and the Basic File Schema Validator. Why would I want to run those two on pre-merge instead?
Yiannis Zachariadis
02/06/2024, 2:01 PM
Failed to resolve 'minio' ([Errno -2] Name or service not known)
This is what my code currently looks like. Is there something obvious I'm missing?
for object in repo.branch("main").objects():
    source = get_filename(object.path)
    file_size = repo.branch("main").object(object.path).stat().size_bytes
    with repo.branch("main").object(object.path).reader(
        mode="r", pre_sign=True
    ) as fd:
        while fd.tell() < file_size:
            print(fd.read(10))
            fd.seek(10, os.SEEK_CUR)
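(One variation worth trying, as a sketch only: with pre_sign=True the client downloads directly from the backing store, so the 'minio' hostname has to resolve from wherever this code runs; reading through the lakeFS server avoids that. pre_sign=False is an assumption about the reader() parameters, and repo comes from the snippet above.)
# Sketch: stream the object through the lakeFS server instead of a pre-signed URL,
# so the backing store's hostname (e.g. 'minio') never needs to resolve locally.
for obj in repo.branch("main").objects():
    with repo.branch("main").object(obj.path).reader(mode="r", pre_sign=False) as fd:
        print(fd.read(10))  # first 10 characters, just to confirm the read works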
Oliver Haney
02/06/2024, 5:00 PM
胡有
02/07/2024, 1:41 AM
Giuseppe Barbieri
02/07/2024, 11:32 AM
failed to create repository: found lakeFS objects in the storage namespace(s3://example) key(_lakefs/dummy): storage namespace already in use
Giuseppe Barbieri
02/08/2024, 9:55 AM
lakefs://repo0/main/_lakefs_actions/dataset.yml, however, when I try to commit my changes through the web interface, I get:
> pre-commit hook aborted, run id '5dnve4sesbvc72vvdd50': 1 error occurred: * hook run id '0000_0000' failed on action 'Dataset' hook 'dataset_validator': Post "http://0.0.0.0:8080/webhooks/format": dial tcp 0.0.0.0:8080 connect: connection refused
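(For reference: 0.0.0.0 is a bind address, not a host the lakeFS server can connect to, which matches the connection refused above. A minimal dataset.yml sketch using the action and hook names from the error; the URL host is a placeholder that must be reachable from the lakeFS server, and the branches filter is an assumption.)
name: Dataset
on:
  pre-commit:
    branches:
      - main
hooks:
  - id: dataset_validator
    type: webhook
    properties:
      url: "http://<reachable-host>:8080/webhooks/format"   # placeholder host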
Bill Li
02/09/2024, 10:04 PM
Giuseppe Barbieri
02/14/2024, 11:23 AM
Expected URL scheme 'http' or 'https' but no scheme was found for /api/v...
Tongguo Pang
02/14/2024, 3:42 PM
Constantine
02/14/2024, 4:16 PM
conf.set("spark.sql.catalog.spark_catalog", "org.apache.iceberg.spark.SparkCatalog")
conf.set("spark.sql.catalog.spark_catalog.io-impl", "org.apache.iceberg.aws.s3.S3FileIO")
conf.set("spark.sql.catalog.spark_catalog.glue.endpoint", "AWS_VPC_ENDPOINT_FOR_GLUE")
conf.set("spark.sql.catalog.spark_catalog.s3.endpoint", "AWS_VPC_ENDPOINT_FOR_S3")
conf.set("spark.sql.catalog.spark_catalog.catalog-impl", "org.apache.iceberg.aws.glue.GlueCatalog")
conf.set("spark.sql.catalog.spark_catalog.lock.table", "myGlueLockTable")
and I noticed that lakeFS has their own jars for this; does lakeFS still let you use custom / different VPCs in AWS for both Glue and S3:
.config("spark.sql.catalog.lakefs", "org.apache.iceberg.spark.SparkCatalog") \
.config("spark.sql.catalog.lakefs.catalog-impl", "io.lakefs.iceberg.LakeFSCatalog") \
.config("spark.sql.catalog.lakefs.warehouse", f"lakefs://{repo_name}") \
.config("spark.sql.catalog.lakefs.uri", lakefsEndPoint)
Waqas Zubairy
02/16/2024, 12:30 PM
Ion
02/17/2024, 1:22 PM
Florentino Sainz
02/20/2024, 10:49 AM
Ion
02/21/2024, 2:36 PM
Ion
02/21/2024, 7:21 PM
Bill Li
02/22/2024, 3:35 PM
Ion
02/22/2024, 7:36 PM
outgoing-webhook
02/23/2024, 9:57 PM
Dhammika
03/12/2024, 8:11 PM