#dev

Vaibhav Kumar

05/19/2023, 7:06 PM
I am working on an issue around this client, so I have put some debugging into the code. I then ran mvn package and created my jar under the target path. Now, following this link, I am creating a Scala session to read a file from lakeFS using the newly built jar. Spark-shell:
spark-shell --conf spark.hadoop.fs.s3a.access.key=minioadmin \
  --conf spark.hadoop.fs.s3a.secret.key=minioadmin \
  --conf spark.hadoop.fs.s3a.endpoint=http://127.0.0.1:9090 \
  --conf spark.hadoop.fs.lakefs.impl=io.lakefs.LakeFSFileSystem \
  --conf spark.hadoop.fs.lakefs.access.key=AKIAIOSFODNN7EXAMPLE \
  --conf spark.hadoop.fs.lakefs.secret.key='wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY' \
  --conf spark.hadoop.fs.lakefs.endpoint=http://localhost:8000/api/v1 \
  —jars /Users/simar/lakeFS/clients/hadoopfs/target/hadoop-lakefs-0.1.0.jar \
  io.lakefs.LakeFSFileSystem
While reading it I am getting the below error:
scala> val df = spark.read.parquet("lakefs://example/main/sample1.json")
23/05/20 00:24:42 WARN FileStreamSink: Assume no metadata directory. Error while looking for metadata directory in the path: lakefs://example/main/sample1.json.
java.lang.RuntimeException: java.lang.ClassNotFoundException: Class io.lakefs.LakeFSFileSystem not found
Does anyone know what could be causing this?

Iddo Avneri

05/19/2023, 7:35 PM
Hi, we will take a look. In the meantime, a few questions can help us: 1. What is the object store you are running against? 2. Can you try hadoop-lakefs-assembly:0.1.14?
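For reference, pulling the assembly straight from Maven Central would look roughly like this; a sketch, assuming the io.lakefs:hadoop-lakefs-assembly coordinates and reusing the credentials and endpoint from the command above (the fs.s3a.* settings would carry over unchanged):

spark-shell --packages io.lakefs:hadoop-lakefs-assembly:0.1.14 \
  --conf spark.hadoop.fs.lakefs.impl=io.lakefs.LakeFSFileSystem \
  --conf spark.hadoop.fs.lakefs.access.key=AKIAIOSFODNN7EXAMPLE \
  --conf spark.hadoop.fs.lakefs.secret.key='wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY' \
  --conf spark.hadoop.fs.lakefs.endpoint=http://localhost:8000/api/v1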

Barak Amar

05/19/2023, 7:52 PM
The command you ran in the shell reads Parquet format, but the file path looks like JSON.

Vaibhav Kumar

05/20/2023, 6:31 AM
@Iddo Avneri 1. I am using MinIO as the object store. 2. While trying hadoop-lakefs-assembly:0.1.14 directly via --packages instead of --jars in spark-shell, I am getting an error around the object store:
23/05/20 11:58:02 WARN FileSystem: Failed to initialize fileystem lakefs://example/main/sample1.json: java.io.IOException: Failed to get lakeFS blockstore type
java.io.IOException: Failed to get lakeFS blockstore type
@Barak Amar My bad, there was another issue: there was one less - in --jars. After correcting it, I am now getting a different error:
scala> val df = spark.read.json("lakefs://example/main/sample1.json")
java.lang.NoClassDefFoundError: io/lakefs/clients/api/ApiException
 at java.base/java.lang.Class.forName0(Native Method)
 at java.base/java.lang.Class.forName(Class.java:496)
 at java.base/java.lang.Class.forName(Class.java:475)
 at org.apache.hadoop.conf.Configuration.getClassByNameOrNull(Configuration.java:2625)
 at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2590)
 at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2686)
 at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:3431)
 at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3466)
 at org.apache.hadoop.fs.FileSystem.access$300(FileSystem.java:174)
 at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3574)
 at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3521)
 at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:540)
 at org.apache.hadoop.fs.Path.getFileSystem(Path.java:365)
 at org.apache.spark.sql.execution.streaming.FileStreamSink$.hasMetadata(FileStreamSink.scala:53)
 at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:370)
 at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:228)
 at org.apache.spark.sql.DataFrameReader.$anonfun$load$2(DataFrameReader.scala:210)
 at scala.Option.getOrElse(Option.scala:189)
 at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:210)
 at org.apache.spark.sql.DataFrameReader.json(DataFrameReader.scala:361)
 at org.apache.spark.sql.DataFrameReader.json(DataFrameReader.scala:340)
 ... 43 elided
Caused by: java.lang.ClassNotFoundException: io.lakefs.clients.api.ApiException

Ariel Shaqed (Scolnicov)

05/20/2023, 7:44 AM
Hi @Vaibhav Kumar, I think you're trying to load the plain jar rather than the assembled package. The plain jar doesn't bring in any dependencies, and here I think you're missing the lakeFS API client. But don't try to fix that manually, you'll probably just run into more issues. Instead we have an assembly, sometimes known as an Überjar. This is a jar that includes the dependencies in it. If you build using "mvn package" you should end up with another jar that includes "assembly" in its name. Can you try to use that jar, and let us know how that goes?
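For reference, switching to the assembly changes only the jar passed to spark-shell; a sketch, assuming the assembly jar lands next to the plain one under target/:

spark-shell --conf spark.hadoop.fs.lakefs.impl=io.lakefs.LakeFSFileSystem \
  --conf spark.hadoop.fs.lakefs.access.key=AKIAIOSFODNN7EXAMPLE \
  --conf spark.hadoop.fs.lakefs.secret.key='wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY' \
  --conf spark.hadoop.fs.lakefs.endpoint=http://localhost:8000/api/v1 \
  --jars /Users/simar/lakeFS/clients/hadoopfs/target/hadoop-lakefs-assembly-0.1.0.jar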

Vaibhav Kumar

05/20/2023, 7:55 AM
Yeah, I think the same, because ApiException is part of the Java API client, but I only built the Hadoop client. Under which pom.xml shall I now run mvn package? Earlier I was using mvn package under the Hadoop client. @Ariel Shaqed (Scolnicov)

Ariel Shaqed (Scolnicov)

05/20/2023, 8:47 AM
You're using the correct pom.xml, I reckon. But perhaps not the right assembly jar?

Vaibhav Kumar

05/20/2023, 10:42 AM
Upon running mvn package under the hadoopfs client, I can only see hadoop-lakefs-0.1.0.jar. Please refer to the screenshot. I can't see any assembly jar.

Ariel Shaqed (Scolnicov)

05/20/2023, 10:47 AM
Sorry, it's been a while since I worked with this code. Can you try "mvn -Passembly package", please? If that fails, I'll have to get back to you tomorrow with a full fix that I've tested.

Vaibhav Kumar

05/20/2023, 10:59 AM
@Ariel Shaqed (Scolnicov) I tried as you said; it ran, but with errors:
Results :

Tests in error: 
  testExists_NotExistsNoPrefix(io.lakefs.LakeFSFileSystemPresignedModeTest): Unable to execute HTTP request: Read timed out

Tests run: 105, Failures: 0, Errors: 1, Skipped: 0

[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  08:08 min
[INFO] Finished at: 2023-05-20T16:26:36+05:30
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-surefire-plugin:2.12.4:test (default-test) on project hadoop-lakefs-assembly: There are test failures.
[ERROR] 
[ERROR] Please refer to /Users/simar/lakeFS/clients/hadoopfs/target/surefire-reports for the individual test results.
[ERROR] -> [Help 1]

Ariel Shaqed (Scolnicov)

05/20/2023, 12:50 PM
Can you try -DskipTests to see if that helps? (We won't pull without passing tests, of course, but this should be enough to unblock you; at worst, our beefier CI pipeline should be able to help those tests pass.)
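Combining the two suggestions, the build under the hadoopfs client would be (a sketch; -Passembly activates the assembly profile and -DskipTests is the standard Maven flag to skip the test phase):

cd clients/hadoopfs
mvn -Passembly -DskipTests package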

Vaibhav Kumar

05/20/2023, 1:17 PM
Thanks @Ariel Shaqed (Scolnicov), it worked. Now I am getting the same error as when I was directly using --packages io.lakefs:hadoop-lakefs-assembly:0.1.14. I am not sure why I am getting the blockstore error, considering I have already passed those params through my spark-shell. Spark shell command:
spark-shell --conf spark.hadoop.fs.s3a.access.key=minioadmin --conf spark.hadoop.fs.s3a.secret.key=minioadmin --conf spark.hadoop.fs.s3a.endpoint=http://127.0.0.1:9090 --conf spark.hadoop.fs.lakefs.impl=io.lakefs.LakeFSFileSystem --conf spark.hadoop.fs.lakefs.access.key=AKIAIOSFODNN7EXAMPLE --conf spark.hadoop.fs.lakefs.secret.key='wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY' --conf spark.hadoop.fs.lakefs.endpoint=http://localhost:8000/api/v1 --jars /Users/simar/lakeFS/clients/hadoopfs/target/hadoop-lakefs-assembly-0.1.0.jar
--class io.lakefs.LakeFSFileSystem
Error while reading:
scala> val df = spark.read.json("lakefs://example/main/sample1.json")
23/05/20 18:42:18 WARN FileSystem: Failed to initialize fileystem lakefs://example/main/sample1.json: java.io.IOException: Failed to get lakeFS blockstore type
23/05/20 18:42:18 WARN FileStreamSink: Assume no metadata directory. Error while looking for metadata directory in the path: lakefs://example/main/sample1.json.
java.io.IOException: Failed to get lakeFS blockstore type
	at io.lakefs.LakeFSFileSystem.initializeWithClientFactory(LakeFSFileSystem.java:134)
	at io.lakefs.LakeFSFileSystem.initialize(LakeFSFileSystem.java:110)
	at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3469)
	at org.apache.hadoop.fs.FileSystem.access$300(FileSystem.java:174)
	at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3574)
	at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3521)
	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:540)
	at org.apache.hadoop.fs.Path.getFileSystem(Path.java:365)
	at org.apache.spark.sql.execution.streaming.FileStreamSink$.hasMetadata(FileStreamSink.scala:53)
	at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:370)
	at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:228)
	at org.apache.spark.sql.DataFrameReader.$anonfun$load$2(DataFrameReader.scala:210)
	at scala.Option.getOrElse(Option.scala:189)
	at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:210)
	at org.apache.spark.sql.DataFrameReader.json(DataFrameReader.scala:361)
	at org.apache.spark.sql.DataFrameReader.json(DataFrameReader.scala:340)
	at $line14.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw.<init>(<console>:22)
	at $line14.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw.<init>(<console>:26)
	at $line14.$read$$iw$$iw$$iw$$iw$$iw$$iw.<init>(<console>:28)
	at $line14.$read$$iw$$iw$$iw$$iw$$iw.<init>(<console>:30)
	at $line14.$read$$iw$$iw$$iw$$iw.<init>(<console>:32)
	at $line14.$read$$iw$$iw$$iw.<init>(<console>:34)
	at $line14.$read$$iw$$iw.<init>(<console>:36)
	at $line14.$read$$iw.<init>(<console>:38)
	at $line14.$read.<init>(<console>:40)
	at $line14.$read$.<init>(<console>:44)
	at $line14.$read$.<clinit>(<console>)
	at $line14.$eval$.$print$lzycompute(<console>:7)
	at $line14.$eval$.$print(<console>:6)
	at $line14.$eval.$print(<console>)
	at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:104)
	at java.base/java.lang.reflect.Method.invoke(Method.java:578)
	at scala.tools.nsc.interpreter.IMain$ReadEvalPrint.call(IMain.scala:747)
	at scala.tools.nsc.interpreter.IMain$Request.loadAndRun(IMain.scala:1020)
	at scala.tools.nsc.interpreter.IMain.$anonfun$interpret$1(IMain.scala:568)
	at scala.reflect.internal.util.ScalaClassLoader.asContext(ScalaClassLoader.scala:36)
	at scala.reflect.internal.util.ScalaClassLoader.asContext$(ScalaClassLoader.scala:116)
	at scala.reflect.internal.util.AbstractFileClassLoader.asContext(AbstractFileClassLoader.scala:41)
	at scala.tools.nsc.interpreter.IMain.loadAndRunReq$1(IMain.scala:567)
	at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:594)
	at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:564)
	at scala.tools.nsc.interpreter.ILoop.interpretStartingWith(ILoop.scala:865)
	at scala.tools.nsc.interpreter.ILoop.command(ILoop.scala:733)
	at scala.tools.nsc.interpreter.ILoop.processLine(ILoop.scala:435)
	at scala.tools.nsc.interpreter.ILoop.loop(ILoop.scala:456)
	at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:239)
	at org.apache.spark.repl.Main$.doMain(Main.scala:78)
	at org.apache.spark.repl.Main$.main(Main.scala:58)
	at org.apache.spark.repl.Main.main(Main.scala)
	at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:104)
	at java.base/java.lang.reflect.Method.invoke(Method.java:578)
	at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
	at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:958)
	at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
	at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
	at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
	at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1046)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1055)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: io.lakefs.hadoop.shade.api.ApiException: Unauthorized
	at io.lakefs.hadoop.shade.api.ApiClient.handleResponse(ApiClient.java:1031)
	at io.lakefs.hadoop.shade.api.ApiClient.execute(ApiClient.java:944)
	at io.lakefs.hadoop.shade.api.ConfigApi.getStorageConfigWithHttpInfo(ConfigApi.java:466)
	at io.lakefs.hadoop.shade.api.ConfigApi.getStorageConfig(ConfigApi.java:447)
	at io.lakefs.LakeFSFileSystem.initializeWithClientFactory(LakeFSFileSystem.java:130)
	... 58 more
23/05/20 18:42:18 WARN FileSystem: Failed to initialize fileystem lakefs://example/main/sample1.json: java.io.IOException: Failed to get lakeFS blockstore type
java.io.IOException: Failed to get lakeFS blockstore type
 at io.lakefs.LakeFSFileSystem.initializeWithClientFactory(LakeFSFileSystem.java:134)
 at io.lakefs.LakeFSFileSystem.initialize(LakeFSFileSystem.java:110)
 at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3469)
 at org.apache.hadoop.fs.FileSystem.access$300(FileSystem.java:174)
 at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3574)
 at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3521)
 at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:540)
 at org.apache.hadoop.fs.Path.getFileSystem(Path.java:365)
 at org.apache.spark.sql.execution.datasources.DataSource$.$anonfun$checkAndGlobPathIfNecessary$1(DataSource.scala:752)
 at scala.collection.immutable.List.map(List.scala:293)
 at org.apache.spark.sql.execution.datasources.DataSource$.checkAndGlobPathIfNecessary(DataSource.scala:750)
 at org.apache.spark.sql.execution.datasources.DataSource.checkAndGlobPathIfNecessary(DataSource.scala:579)
 at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:408)
 at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:228)
 at org.apache.spark.sql.DataFrameReader.$anonfun$load$2(DataFrameReader.scala:210)
 at scala.Option.getOrElse(Option.scala:189)
 at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:210)
 at org.apache.spark.sql.DataFrameReader.json(DataFrameReader.scala:361)
 at org.apache.spark.sql.DataFrameReader.json(DataFrameReader.scala:340)
 ... 43 elided
Caused by: io.lakefs.hadoop.shade.api.ApiException: Unauthorized
 at io.lakefs.hadoop.shade.api.ApiClient.handleResponse(ApiClient.java:1031)
 at io.lakefs.hadoop.shade.api.ApiClient.execute(ApiClient.java:944)
 at io.lakefs.hadoop.shade.api.ConfigApi.getStorageConfigWithHttpInfo(ConfigApi.java:466)
 at io.lakefs.hadoop.shade.api.ConfigApi.getStorageConfig(ConfigApi.java:447)
 at io.lakefs.LakeFSFileSystem.initializeWithClientFactory(LakeFSFileSystem.java:130)

Ariel Shaqed (Scolnicov)

05/20/2023, 2:22 PM
This might be https://github.com/treeverse/lakeFS/issues/5863, which was a permissions error on lakeFS. Try pulling a newer version, maybe recreate your users, or make sure to run with a user in group Admin.

Vaibhav Kumar

05/20/2023, 5:39 PM
I have pulled the latest code. Now my assembly jar is built from the latest code, and the docker compose is also the latest. Still, my error is the same, around the blockstore.

Ariel Shaqed (Scolnicov)

05/22/2023, 5:42 AM
Can you please try to check the permissions associated with the access key that you use? Also, if you could paste the lakeFS server logs from the time of the run, that could help: docker logs lakefs.
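For reference, the failing call is the storage-config lookup the filesystem makes on initialization (ConfigApi.getStorageConfig in the traces above), so it can be replayed outside Spark to isolate the credentials; a sketch, assuming lakeFS accepts basic auth with the access key pair and serves this endpoint under /api/v1:

curl -i -u AKIAIOSFODNN7EXAMPLE:wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY \
  http://localhost:8000/api/v1/config/storage

A 401 here would reproduce the Unauthorized error independently of the Spark client.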

Vaibhav Kumar

05/22/2023, 5:58 AM
Docker logs
I have used the access keys as mentioned in the docker compose, but I can check the permissions for the access key if you let me know how to check them. Meanwhile, pasting the docker logs above. @Ariel Shaqed (Scolnicov)

Jonathan Rosenberg

05/22/2023, 8:42 AM
@Vaibhav Kumar, you are currently using a lakeFS server which needs to be patched (or rebuilt) using the latest code. That means you need to compile it, run docker build, and then set it as the docker image that docker compose uses to run lakeFS. In the meantime, you can instead use the following image as the lakeFS server:
treeverse/experimental-lakefs:v0.100.0-23-g8b29-rc-rbac3
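The rebuild route would look roughly like this; a sketch, assuming the Dockerfile at the root of the lakeFS repo builds self-contained (if it expects a pre-built binary, the repo's make targets would need to run first) and lakefs:dev as an arbitrary local tag:

git clone https://github.com/treeverse/lakeFS && cd lakeFS
docker build -t lakefs:dev .
# then point docker compose at the local image:
#   image: lakefs:dev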

Vaibhav Kumar

05/22/2023, 8:59 AM
OK, I will use treeverse/experimental-lakefs:v0.100.0-23-g8b29-rc-rbac3. When you said latest image, I referred to image: treeverse/lakefs:latest as the latest. I believe this would also pull the latest lakeFS? Correct me if I am wrong. @Jonathan Rosenberg

Jonathan Rosenberg

05/22/2023, 9:23 AM
No, it's unrelated. The lakeFS docker image is the server, while lakeFSFS is a jar client that you need to use explicitly. Set:
image: treeverse/experimental-lakefs:v0.100.0-23-g8b29-rc-rbac3

Vaibhav Kumar

05/22/2023, 10:26 AM
I have now used the above-mentioned image in my docker compose. After doing docker compose up, I am still getting the same blockstore error when I try using the client with MinIO storage.
Can you please try once from your end as well? The same setup, with the below docker compose:
version: "3.5"
services:

  lakefs:
    image: treeverse/experimental-lakefs:v0.100.0-23-g8b29-rc-rbac3
    container_name: lakefs
    depends_on:
      - minio-setup
    ports:
      - "8000:8000"
    environment:
      - LAKEFS_DATABASE_TYPE=local
      - LAKEFS_BLOCKSTORE_TYPE=s3
      - LAKEFS_BLOCKSTORE_S3_FORCE_PATH_STYLE=true
      - LAKEFS_BLOCKSTORE_S3_ENDPOINT=http://minio:9000
      - LAKEFS_BLOCKSTORE_S3_CREDENTIALS_ACCESS_KEY_ID=minioadmin
      - LAKEFS_BLOCKSTORE_S3_CREDENTIALS_SECRET_ACCESS_KEY=minioadmin
      - LAKEFS_AUTH_ENCRYPT_SECRET_KEY=some random secret string
      - LAKEFS_STATS_ENABLED
      - LAKEFS_LOGGING_LEVEL
      - LAKECTL_CREDENTIALS_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE
      - LAKECTL_CREDENTIALS_SECRET_ACCESS_KEY=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
      - LAKECTL_SERVER_ENDPOINT_URL=http://localhost:8000
    entrypoint: ["/bin/sh", "-c"]
    command:
        - |
          lakefs setup --local-settings --user-name docker --access-key-id AKIAIOSFODNN7EXAMPLE --secret-access-key wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY || true
          lakefs run --local-settings &
          wait-for -t 60 lakefs:8000 -- lakectl repo create lakefs://example s3://example || true
          wait

  minio-setup:
    image: minio/mc
    container_name: minio-setup
    environment:
        - MC_HOST_lakefs=http://minioadmin:minioadmin@minio:9000
    depends_on:
      - minio
    command: ["mb", "lakefs/example"]

  minio:
    image: minio/minio
    container_name: minio
    ports:
      - "9000:9000"
      - "9001:9001"
    entrypoint: ["minio", "server", "/data", "--console-address", ":9001"]



networks:
  default:
    name: bagel
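Before pointing spark-shell at this stack, it may help to confirm both services came up; a sketch, assuming lakeFS's healthcheck endpoint under /api/v1 and MinIO's standard liveness probe on the ports mapped above:

docker compose up -d
curl -i http://localhost:8000/api/v1/healthcheck   # lakeFS; expect a 2xx
curl -i http://localhost:9000/minio/health/live    # MinIO liveness probe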

Jonathan Rosenberg

05/22/2023, 10:30 AM
which error?

Vaibhav Kumar

05/22/2023, 12:26 PM
@Jonathan Rosenberg, the below error when I run:
scala> val df = spark.read.json("lakefs://example/main/sample1.json")
23/05/22 15:52:12 WARN FileSystem: Failed to initialize fileystem lakefs://example/main/sample1.json: java.io.IOException: Failed to get lakeFS blockstore type
java.io.IOException: Failed to get lakeFS blockstore type
 at io.lakefs.LakeFSFileSystem.initializeWithClientFactory(LakeFSFileSystem.java:136)
 at io.lakefs.LakeFSFileSystem.initialize(LakeFSFileSystem.java:112)
 at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3469)
 at org.apache.hadoop.fs.FileSystem.access$300(FileSystem.java:174)
 at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3574)
 at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3521)
 at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:540)
 at org.apache.hadoop.fs.Path.getFileSystem(Path.java:365)
 at org.apache.spark.sql.execution.datasources.DataSource$.$anonfun$checkAndGlobPathIfNecessary$1(DataSource.scala:752)
 at scala.collection.immutable.List.map(List.scala:293)
 at org.apache.spark.sql.execution.datasources.DataSource$.checkAndGlobPathIfNecessary(DataSource.scala:750)
 at org.apache.spark.sql.execution.datasources.DataSource.checkAndGlobPathIfNecessary(DataSource.scala:579)
 at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:408)
 at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:228)
 at org.apache.spark.sql.DataFrameReader.$anonfun$load$2(DataFrameReader.scala:210)
 at scala.Option.getOrElse(Option.scala:189)
 at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:210)
 at org.apache.spark.sql.DataFrameReader.json(DataFrameReader.scala:361)
 at org.apache.spark.sql.DataFrameReader.json(DataFrameReader.scala:340)
 ... 43 elided
Caused by: io.lakefs.hadoop.shade.api.ApiException: Unauthorized
 at io.lakefs.hadoop.shade.api.ApiClient.handleResponse(ApiClient.java:1031)
 at io.lakefs.hadoop.shade.api.ApiClient.execute(ApiClient.java:944)
 at io.lakefs.hadoop.shade.api.ConfigApi.getStorageConfigWithHttpInfo(ConfigApi.java:466)
 at io.lakefs.hadoop.shade.api.ConfigApi.getStorageConfig(ConfigApi.java:447)
 at io.lakefs.LakeFSFileSystem.initializeWithClientFactory(LakeFSFileSystem.java:132)
 ... 61 more
Can someone try to reproduce this with MinIO? I see a way to solve issue 2801, but my setup is still not working as expected.

Ariel Shaqed (Scolnicov)

05/23/2023, 8:48 AM
Hi @Vaibhav Kumar, I suggest using the "Everything Bagel" Docker Compose file. The blogs our Paul Singman published about it should get you up and running; then just change it to use your locally-built versions of lakeFS and lakeFSFS. But it's a complex environment with many moving parts. I think it might be worthwhile to go through some simpler examples of Docker Compose first, and practice changing files and executables inside containers, or how to build them.

Vaibhav Kumar

05/23/2023, 6:18 PM
@Ariel Shaqed (Scolnicov) I was already using the docker compose mentioned in this blog. The only thing is that my Spark client is running outside the container, since running everything in Docker slows down my system. I hope that can't be the issue?
Can someone connect with me sometime to resolve this? I am sure any of you who tried this would have faced something similar, because I have performed all the steps mentioned above.