# dev
o
FYI - Apache Spark (i.e. the OSS, not the Databricks distribution) added a RocksDB StateStore in Spark 3.2.0 (recently released). This means new Spark versions now add a dependency on RocksDB - https://issues.apache.org/jira/browse/SPARK-34198. The version seems to be pinned to 6.20.3. Since we're currently bundling our own sstable parser, not sure if/how this affects us. Perhaps @Tal Sofer or @Ariel Shaqed (Scolnicov) can shed some light?
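For reference, the new state store is opt-in - roughly something like this on the Spark side (the config key and provider class are the ones from SPARK-34198; everything else is just an illustrative sketch):
```scala
import org.apache.spark.sql.SparkSession

// Sketch: the RocksDB state store in Spark 3.2 is enabled per session via config.
// Any stateful streaming query running with this setting loads the rocksdbjni
// that Spark bundles (pinned to 6.20.3 in Spark 3.2.0).
val spark = SparkSession.builder()
  .appName("rocksdb-statestore-sketch") // illustrative app name
  .config("spark.sql.streaming.stateStore.providerClass",
    "org.apache.spark.sql.execution.streaming.state.RocksDBStateStoreProvider")
  .getOrCreate()

// Stateful operations (e.g. streaming aggregations) then keep their state in
// RocksDB instead of the default HDFS-backed in-memory store.
```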
a
Due to our problems running on Databricks, we no longer depend on rocksdbjni in any way. I think we're clear here.
👌 1
o
Thanks @Ariel Shaqed (Scolnicov) - out of curiosity, any idea if 6.20.3 would be compatible with our metadata format? If so, perhaps it paves the way to dropping our custom parser in the future...
a
I should however add that Spark 3.1 and 3.2 are untested and may require separately packaged clients to load correctly. So far this has not been a priority for any users (hint hint...).
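To make "separately packaged clients" concrete, here's a rough sbt sketch of building the same client once per Spark line, with Spark marked Provided so the cluster supplies it. The names and versions here are assumptions for illustration, not our actual build:
```scala
// build.sbt sketch (illustrative): pick the Spark line at build time and publish
// a separately named artifact for it, so users load the one matching their cluster.
val sparkVersion = sys.props.getOrElse("spark.version", "3.0.1")

lazy val client = (project in file("client"))
  .settings(
    name := s"metadata-client-spark-${sparkVersion.replace(".", "")}",
    scalaVersion := "2.12.12",
    // Spark itself is Provided: the runtime (EMR, Databricks, ...) supplies it.
    libraryDependencies += "org.apache.spark" %% "spark-sql" % sparkVersion % Provided
  )
```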
🙂 1
👍 1
a
IIRC 6.20 would be new enough. Still need to wait for them to drop all earlier versions before we can go back to rocksdbjni and drop my hack. Note we don't want to have more than a single version: that's just looking for trouble when users take the wrong one... And that is transitive through anything our users write...
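To make the "wrong one" failure mode concrete, a sketch of a hypothetical user's build: Spark 3.2 already brings rocksdbjni 6.20.3 transitively, so if our client shipped its own copy at a different version, the user would get whichever one dependency resolution happens to pick. The client coordinates below are made up for illustration:
```scala
// Hypothetical user's build.sbt: two sources of rocksdbjni on one classpath.
libraryDependencies ++= Seq(
  // Spark 3.2.0 pulls in org.rocksdb:rocksdbjni:6.20.3 transitively (SPARK-34198).
  "org.apache.spark" %% "spark-sql" % "3.2.0" % Provided,
  // An illustrative client that also depends on rocksdbjni at some other version;
  // excluding it here keeps only Spark's copy, so exactly one version wins.
  ("example" %% "metadata-client" % "0.1.0")
    .exclude("org.rocksdb", "rocksdbjni")
)
```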