# dev
y
Why does the lakeFS hadoop filesystem depend on Hadoop 2.7.7 and not a newer version? @Tal Sofer @Itai Admi
i
I have no idea...
t
@Ariel Shaqed (Scolnicov) @Barak Amar can you help with answering this question?
b
Check the Spark release and the package type associated with it.
At the time, Spark 3.0.x and 3.1.x were built against Hadoop 2.7, with an optional Hadoop 3.2 build available for Spark 3.1.x. We went with the lower Hadoop version so the filesystem would work with both Spark versions available at the time.
@Ariel Shaqed (Scolnicov) correct me if I got it wrong
a
@Barak Amar is correct (as usual). We officially support Spark 2.4.7 and 3.0.1, which we test regularly. The common Hadoop version for these is 2.7.7. Spark appears to have poor plugin support, and Java / Scala have limited cross-building capabilities, so supporting newer Spark versions requires some effort: they require compiling different code. If you need support for a newer version, ideally tell us the Spark and Hadoop versions you are running, so we can direct effort in the right direction.
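For readers following along: the version pin described above could be sketched in an sbt build roughly like this. This is an illustrative assumption about the shape of such a build definition, not lakeFS's actual build file; the module list and Spark version are examples.

```scala
// build.sbt (illustrative sketch, not the actual lakeFS build)
// Pin Hadoop to 2.7.7, the common version across Spark 2.4.7 and 3.0.1.
// "provided" scope means the Spark runtime supplies these jars at run time,
// so the filesystem jar stays compatible with whichever cluster loads it.
libraryDependencies ++= Seq(
  "org.apache.hadoop" %  "hadoop-common" % "2.7.7" % "provided",
  "org.apache.spark"  %% "spark-sql"     % "3.0.1" % "provided"
)
```

Compiling against the lowest common Hadoop API surface is what lets one artifact serve both Spark 2.4.x and 3.0.x clusters, at the cost of not using newer Hadoop 3.x APIs.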
t
Are you ok with documenting this helpful answer here https://github.com/treeverse/lakeFS/pull/2721?