
Adi Polak

08/04/2022, 7:21 AM
Interesting article that discusses the data lifecycle management challenges in BioPharma. Here are some quotes:
"Recent years have seen the rise of “483s,” FDA regulatory warning letters, with data integrity violations accounting for most of the notices. In 2019, almost half (47%) of all warning letters issued by FDA concerned data integrity. By the end of 2021, that number had increased to 65% (1)."
Larger organizations are also seeing a resurge in investments in building centralized data repositories, such as data lakes, to help drive their digitization initiatives. A core business objective of many of these data lakes is to break down data silos to create centralized repository of data for end users that is easily accessible, coherent, and complete. However, automating the integration of such varied systems, whilst ensuring data integrity and regulatory compliance, remains a significant industry challenge (2).
Another one:
A less appreciated and more nuanced data integrity issue is data contextualization. Even if an operator can extract data from a specific system (e.g., chromatography), the data may be of little use without combining it with data stored in other systems, such as the experimental conditions under which the sample was generated.
The world of Pharma has been working with spreadsheets (and spreadsheet-like tools) for many, many years; it's interesting to see the author speak of embracing new technology and building a data lake!