# help
a
Hi, I have an S3 bucket with a Delta Lake table, lakeFS pointing to it, and a table configured in the Databricks Hive metastore called loan_by_state.
I have access to the table and was able to read it, but when I try to insert data into it, it fails. Code snippet (Python, Databricks notebook):
import time
i = 1
while i <= 6:
  # Execute Insert statement
  insert_sql = "INSERT INTO loan_by_state VALUES ('IA', 450)"
  spark.sql(insert_sql)
  print('loan_by_state: inserted new row of data, loop: [%s]' % i)
    
  # Loop through
  i = i + 1
  time.sleep(5)
Exception:
Caused by: java.io.IOException: get object metadata using underlying s3 client
	at io.lakefs.MetadataClient.getObjectMetadata(MetadataClient.java:69)
	at io.lakefs.LakeFSFileSystem.linkPhysicalAddress(LakeFSFileSystem.java:659)
	at io.lakefs.LinkOnCloseOutputStream.close(LinkOnCloseOutputStream.java:66)
	at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72)
	at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:106)
	at org.apache.parquet.hadoop.util.HadoopPositionOutputStream.close(HadoopPositionOutputStream.java:64)
	at org.apache.parquet.hadoop.ParquetFileWriter.end(ParquetFileWriter.java:751)
	at org.apache.parquet.hadoop.InternalParquetRecordWriter.close(InternalParquetRecordWriter.java:135)
	at org.apache.parquet.hadoop.ParquetRecordWriter.close(ParquetRecordWriter.java:165)
	at org.apache.spark.sql.execution.datasources.parquet.ParquetOutputWriter.close(ParquetOutputWriter.scala:42)
	at org.apache.spark.sql.execution.datasources.FileFormatDataWriter.releaseResources(FileFormatDataWriter.scala:58)
	at org.apache.spark.sql.execution.datasources.FileFormatDataWriter.commit(FileFormatDataWriter.scala:75)
	at org.apache.spark.sql.execution.datasources.FileFormatWriter$.$anonfun$executeTask$2(FileFormatWriter.scala:290)
	at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1615)
	at org.apache.spark.sql.execution.datasources.FileFormatWriter$.executeTask(FileFormatWriter.scala:296)
	... 10 more
Caused by: java.lang.NoSuchMethodException: com.databricks.spark.metrics.FileSystemWithMetrics.getAmazonS3Client()
	at java.lang.Class.getDeclaredMethod(Class.java:2130)
	at io.lakefs.MetadataClient.getObjectMetadata(MetadataClient.java:61)
What did I miss?
e
Hi Adi, I will look into it and update here asap
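In the meantime, one way to narrow it down might be to take the metastore table out of the picture and append a row straight to the Delta location on lakeFS, e.g. (repo/branch are placeholders and the column names are only a guess from the table name):
# hypothetical isolation step: write directly to the lakefs:// path, bypassing the metastore table
df = spark.createDataFrame([("IA", 450)], ["addr_state", "count"])
df.write.format("delta").mode("append").save("lakefs://<repo>/<branch>/tables/loan_by_state/")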
a
I isolated the scenario, and it seems like the failure is caused by the insert command:
%sql
INSERT INTO delta.`lakefs://bi-reports/adi1/tables/loan_by_state/` VALUES ('IA', 450)
Exception:
Caused by: java.io.IOException: get object metadata using underlying s3 client
	at io.lakefs.MetadataClient.getObjectMetadata(MetadataClient.java:69)
	at io.lakefs.LakeFSFileSystem.linkPhysicalAddress(LakeFSFileSystem.java:659)
	at io.lakefs.LinkOnCloseOutputStream.close(LinkOnCloseOutputStream.java:66)
	at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72)
	at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:106)
Updated the hadoop-lakefs library to 0.1.6 and now it works!
🙌 2
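For anyone who lands on this thread later, roughly our cluster setup: the updated hadoop-lakefs jar installed as a cluster library (I believe the Maven coordinate is io.lakefs:hadoop-lakefs-assembly:0.1.6), with the lakeFS filesystem configured per the lakeFS docs; the endpoint and credentials below are placeholders:
# lakeFS filesystem configuration, normally set once in the cluster's Spark config
hconf = spark.sparkContext._jsc.hadoopConfiguration()
hconf.set("fs.lakefs.impl", "io.lakefs.LakeFSFileSystem")
hconf.set("fs.lakefs.endpoint", "https://<your-lakefs-host>/api/v1")
hconf.set("fs.lakefs.access.key", "<lakefs-access-key-id>")
hconf.set("fs.lakefs.secret.key", "<lakefs-secret-access-key>")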