Hi, I am getting this error when trying to read a ...
# help
t
Hi, I am getting this error when trying to read a CSV file from a lakeFS repo:
23/02/06 12:13:28 WARN FileStreamSink: Assume no metadata directory. Error while looking for metadata directory in the path: <lakefs://ragoldstandard/main/bronze_layer/sample.csv>.
java.io.IOException: statObject
	at io.lakefs.LakeFSFileSystem.getFileStatus(LakeFSFileSystem.java:779)
	at io.lakefs.LakeFSFileSystem.getFileStatus(LakeFSFileSystem.java:46)
	at org.apache.hadoop.fs.FileSystem.isDirectory(FileSystem.java:1777)
	at org.apache.spark.sql.execution.streaming.FileStreamSink$.hasMetadata(FileStreamSink.scala:54)
	at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:370)
	at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:274)
	at org.apache.spark.sql.DataFrameReader.$anonfun$load$3(DataFrameReader.scala:245)
	at scala.Option.getOrElse(Option.scala:189)
	at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:245)
	at org.apache.spark.sql.DataFrameReader.csv(DataFrameReader.scala:571)
	at jdk.internal.reflect.GeneratedMethodAccessor13.invoke(Unknown Source)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:566)
	at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
	at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
	at py4j.Gateway.invoke(Gateway.java:282)
	at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
	at py4j.commands.CallCommand.execute(CallCommand.java:79)
	at py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:182)
	at py4j.ClientServerConnection.run(ClientServerConnection.java:106)
	at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: io.lakefs.hadoop.shade.api.ApiException: Content type "text/html; charset=utf-8" is not supported for type: class io.lakefs.hadoop.shade.api.model.ObjectStats
	at io.lakefs.hadoop.shade.api.ApiClient.deserialize(ApiClient.java:822)
	at io.lakefs.hadoop.shade.api.ApiClient.handleResponse(ApiClient.java:1020)
	at io.lakefs.hadoop.shade.api.ApiClient.execute(ApiClient.java:944)
	at io.lakefs.hadoop.shade.api.ObjectsApi.statObjectWithHttpInfo(ObjectsApi.java:1478)
	at io.lakefs.hadoop.shade.api.ObjectsApi.statObject(ObjectsApi.java:1451)
	at io.lakefs.LakeFSFileSystem.getFileStatus(LakeFSFileSystem.java:775)
Can someone please help?
i
Hi Temilola 👋🏽, I guess you are using the Java client? Do you mind sharing which client version you use and which line causes this error?
Your spark configurations might be helpful too 🙂
a
The auto-generated client doesn't like non-JSON responses... and these are surprisingly common on the web. This is one of my favourite mistakes to make! Can you check your Spark configuration as @Idan Novogroder suggests? The endpoint is particularly noteworthy: as in the example in the docs, it has to end in /api/v1.
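For anyone landing here later, a rough sketch of the endpoint check described above, in Python since the trace comes from PySpark. The helper names (`lakefs_spark_conf`, `build_session`), the example host, and the placeholder credentials are made up for illustration. The `spark.hadoop.fs.lakefs.*` keys are the ones the lakeFS Hadoop FileSystem reads, but double-check them against the docs for your client version.

```python
def lakefs_spark_conf(endpoint, access_key, secret_key):
    """Build Spark settings for the lakeFS Hadoop FileSystem.

    Hypothetical helper: fails early if the endpoint does not end in
    /api/v1, instead of letting the client hit a non-API (HTML) page.
    """
    normalized = endpoint.rstrip("/")
    if not normalized.endswith("/api/v1"):
        raise ValueError(
            f"lakeFS endpoint must end in /api/v1, got: {endpoint!r}"
        )
    return {
        "spark.hadoop.fs.lakefs.impl": "io.lakefs.LakeFSFileSystem",
        "spark.hadoop.fs.lakefs.endpoint": normalized,
        "spark.hadoop.fs.lakefs.access.key": access_key,
        "spark.hadoop.fs.lakefs.secret.key": secret_key,
    }


def build_session(conf):
    """Apply the settings to a SparkSession (requires pyspark)."""
    # Imported lazily so the check above can be used without Spark installed.
    from pyspark.sql import SparkSession

    builder = SparkSession.builder.appName("lakefs-read")
    for key, value in conf.items():
        builder = builder.config(key, value)
    return builder.getOrCreate()


# Placeholder values, assumptions for illustration only.
conf = lakefs_spark_conf(
    "https://lakefs.example.com/api/v1", "<access-key>", "<secret-key>"
)
```

With a session from `build_session(conf)`, the read from the original question would be `spark.read.csv("lakefs://ragoldstandard/main/bronze_layer/sample.csv")`. The lakeFS Hadoop client JAR still has to be on the Spark classpath, and the underlying object store needs its own settings, which this sketch leaves out.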
t
Yeah, thanks. I checked my Spark configuration and some JAR dependencies. This solved the issue.
a
Glad to hear that! I opened https://github.com/treeverse/lakeFS/issues/5213 to improve the debugging experience users have here. Please consider commenting there as well; support messages on an issue always help prioritize it! 🙂