Oz Katz

04/30/2023, 9:50 AM
Not sure if this has any benefit (or implications) for the lakeFS metadata client, but as of Spark 3.4, which was recently released, there's an official Protobuf data source.

Ariel Shaqed (Scolnicov)

04/30/2023, 11:38 AM
Ugh: not beneficial. IIRC, Graveler RocksDB tables do not have protobuf values. Instead they use some varint encoding and hold a buffer of bytes in the value. That buffer ends up being decoded as a protobuf, but I don't think it is exposed as an actual Spark column before we finish decoding. Still, it's a great opportunity to see whether our shading rules are good enough to separate "their" protobuf from "our" protobuf. We shall see once we start supporting Spark 3.4.
🙏 1
😄 1
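
For readers unfamiliar with the "varint encoding plus a buffer of bytes" layout mentioned above, here is a minimal Python sketch of that kind of framing. It is an illustration only, not Graveler's actual on-disk format: it assumes an unsigned LEB128 varint (the flavor protobuf itself uses) as a length prefix, and the helper names `read_uvarint`, `write_uvarint`, and `read_length_prefixed` are hypothetical. The point is that the value stays an opaque byte buffer until this framing is peeled off, which is why a column-level protobuf reader would not see a protobuf message directly.

```python
from typing import Tuple


def write_uvarint(value: int) -> bytes:
    """Encode a non-negative int as an unsigned LEB128 varint (assumed flavor)."""
    if value < 0:
        raise ValueError("unsigned varint cannot encode a negative value")
    out = bytearray()
    while value >= 0x80:
        # Low 7 bits of the value, with the continuation bit set.
        out.append((value & 0x7F) | 0x80)
        value >>= 7
    out.append(value)  # Final byte has the continuation bit clear.
    return bytes(out)


def read_uvarint(buf: bytes, pos: int = 0) -> Tuple[int, int]:
    """Decode an unsigned varint at pos; return (value, position after it)."""
    result = 0
    shift = 0
    while True:
        if pos >= len(buf):
            raise ValueError("truncated varint")
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7
        if shift > 63:
            raise ValueError("varint too long for 64 bits")


def read_length_prefixed(buf: bytes, pos: int = 0) -> Tuple[bytes, int]:
    """Read a varint length, then that many raw bytes (hypothetical framing)."""
    length, pos = read_uvarint(buf, pos)
    end = pos + length
    if end > len(buf):
        raise ValueError("truncated payload")
    # The payload is still opaque bytes here; parsing it as a protobuf
    # message would be a separate, later step.
    return buf[pos:end], end


if __name__ == "__main__":
    framed = write_uvarint(3) + b"abc" + b"rest"
    payload, next_pos = read_length_prefixed(framed)
    print(payload, next_pos)
```

Only after a step like `read_length_prefixed` would the resulting bytes be handed to a protobuf parser, so a decode step of this sort would have to run before anything like the Spark 3.4 Protobuf data source could apply.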