Adi Polak

10/04/2022, 10:56 AM
🥇 Some people say that old is gold. Here's an interesting reference architecture from Oracle for big data processing. They broke the system down into parts and argued that, for a business to stay competitive, investing in a Foundation Data Layer and an Access and Performance Layer is a must:
Foundation Data Layer: abstracts the data away from the business process through the use of a business process-neutral canonical data model. This gives the data longevity, so that changes in source systems or the interpretation placed on the data by current business processes does not necessitate model or data changes.
Access and Performance Layer: allows for multiple interpretations (e.g. past, present and future) over the same data, and simplifies navigation of the data for different user communities and tools. Objects in this layer can be rapidly added and changed as they are derived from the Foundation Data Layer.
The Foundation Data Layer and Access and Performance Layers offer two additional levels of abstraction that further reduce the impact of schema changes in the data platform while still presenting a single version of the truth to consumers.
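To make the split concrete, here is a minimal sketch of the two layers using Python's built-in sqlite3. It assumes a toy canonical table with validity dates as the foundation layer, and two views over it as the access layer. The table, view, and column names are hypothetical and are not taken from the Oracle paper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Foundation layer (hypothetical): a process-neutral canonical table that
# keeps history, so no single business interpretation is baked in.
conn.execute("""
    CREATE TABLE foundation_customer_address (
        customer_id INTEGER,
        city        TEXT,
        valid_from  TEXT,
        valid_to    TEXT  -- NULL marks the row that is currently valid
    )
""")
conn.executemany(
    "INSERT INTO foundation_customer_address VALUES (?, ?, ?, ?)",
    [
        (1, "Leeds", "2020-01-01", "2022-01-01"),
        (1, "York", "2022-01-01", None),
        (2, "Bath", "2021-06-01", None),
    ],
)

# Access layer (hypothetical): views derived from the foundation table.
# Each view is one interpretation of the same underlying data.
conn.execute("""
    CREATE VIEW access_customer_current AS
    SELECT customer_id, city
    FROM foundation_customer_address
    WHERE valid_to IS NULL
""")
conn.execute("""
    CREATE VIEW access_customer_original AS
    SELECT f.customer_id, f.city
    FROM foundation_customer_address AS f
    WHERE f.valid_from = (
        SELECT MIN(g.valid_from)
        FROM foundation_customer_address AS g
        WHERE g.customer_id = f.customer_id
    )
""")

current = conn.execute(
    "SELECT customer_id, city FROM access_customer_current ORDER BY customer_id"
).fetchall()
original = conn.execute(
    "SELECT customer_id, city FROM access_customer_original ORDER BY customer_id"
).fetchall()

print(current)
print(original)
```

Both views read the same foundation rows, so adding or dropping a view changes nothing in the foundation table. That is the "rapidly added and changed" property described above.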
It seems that with the shift to data lakes and object stores, these two layers got lost, and there is a need to adopt better tools for them, or for the tech under the hood that enables them, which is a data versioning engine. A penny for your thoughts...

Robin Moffatt

10/10/2022, 3:39 PM
I’m not sure I quite follow your argument. That’s not to say a data versioning engine isn’t also needed, but per the parallel I drew to the medallion architecture, I think the two layers are distinct and required. It could be that with performance improvements you’d end up without the access & performance layer, but you still need to model the data. Maybe that gets pushed out of the data [lake/lakehouse/warehouse] and into a logical layer somewhere that’s just resolved physically at runtime. (That may have been more like 0.02p than a penny 🙂)