LIMS, ELN, and KMS – Databases to manage scientific data


Since the late 1960s, managing data has become an essential component of drug discovery and development. Advances in technology have made it possible to collect more data more easily and, today, scientists are submerged by a data deluge.

Compiling results manually and writing reports take a substantial amount of time and, in a business context, these non-productive activities are not valued. Moreover, they are error-prone and subject to bias.

Over the years, all large and mid-sized pharmaceutical companies have put data management systems in place as part of their quality assurance. They expect the same from their partners, and a review of data management procedures is included in due diligence. Even the smallest research groups therefore need adequate tools to manage all their information and transform it into knowledge.

LIMS – Laboratory Information Management System

The purpose of a LIMS is to track and capture all the data created by a process. LIMS are necessary for working under Good Laboratory Practice (GLP) because they enforce the standard operating procedure (SOP) and can detect and document deviations from it.

Tied to instruments and processes, LIMS streamline tasks, ensuring that the SOP is applied, and track data at each step, from end to end. They also compile reports for specific tests, samples or studies. As a result they reduce the time from request to result, maximizing the throughput of data production and improving the quality of results.

However, because they are linked to a specific process, they create data silos and do not enable seamless data integration. Each LIMS is configured to follow a single process, so LIMS are used almost exclusively in regulated environments where processes seldom change, in contrast to drug discovery, where conditions and processes are regularly updated.
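The SOP enforcement described above can be illustrated with a minimal sketch. It assumes an SOP can be reduced to an ordered list of steps with acceptable ranges; the step names, limits and the `find_deviations` helper are hypothetical and do not come from any particular LIMS product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SopStep:
    """One step of a hypothetical SOP with its acceptable range."""
    name: str
    min_value: float
    max_value: float
    unit: str


@dataclass(frozen=True)
class Measurement:
    """A value captured for one step of a run."""
    step: str
    value: float


def find_deviations(sop, measurements):
    """Return readable descriptions of every deviation from the SOP."""
    deviations = []
    expected = [s.name for s in sop]
    recorded = [m.step for m in measurements]
    # Steps must be recorded in the order the SOP prescribes.
    if recorded != expected:
        deviations.append(f"step order {recorded} differs from SOP {expected}")
    limits = {s.name: s for s in sop}
    for m in measurements:
        step = limits.get(m.step)
        if step is None:
            deviations.append(f"{m.step}: step not defined in SOP")
        elif not (step.min_value <= m.value <= step.max_value):
            deviations.append(
                f"{m.step}: {m.value} {step.unit} outside "
                f"[{step.min_value}, {step.max_value}]"
            )
    return deviations


# Hypothetical SOP and run, for illustration only.
SOP = [
    SopStep("incubation_temperature", 36.5, 37.5, "degC"),
    SopStep("incubation_time", 55.0, 65.0, "min"),
]
run = [
    Measurement("incubation_temperature", 37.1),
    Measurement("incubation_time", 72.0),
]

deviations = find_deviations(SOP, run)
for deviation in deviations:
    print(deviation)
```

Because the checks are written against one fixed list of steps, the sketch also shows why such a system is tied to a single process: a change in the procedure means reconfiguring the SOP definition itself.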

ELN – Electronic Lab Notebook

ELN are designed to replace and extend paper notebooks. They are used in many sectors, from preclinical research to food, beverage and cosmetics companies. ELN enhance productivity by storing all the information, data and intellectual property created by scientists in a central, searchable location. Experimental documentation used to support a patent is protected, preserved and shared in a highly secure environment with timestamps, version control and record authentication.

The reporting and searching capabilities of ELN vary widely and depend largely on how structured the data in the system is. On the one hand, “free-text” ELN are versatile and can store any kind of result, just like a Word or PDF document. These free-text results are accessible through the experiment identifier or by full-text search, but the data is not structured. It is therefore hard to aggregate results across experiments to produce structure-activity tables. On the other hand, specialized ELN (chemistry, pharmacology…) are more structured, and it is possible to search by chemical reaction or to apply predefined calculation templates, for instance to process in vivo raw data into results. But they in turn create data silos (one data store for chemistry, another for pharmacology).
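To make the difference concrete, here is a minimal sketch of why structured records aggregate easily while free text does not. The record fields, identifiers and values are hypothetical, and the arithmetic mean is used only to keep the example short.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical structured ELN records: every result carries the same fields.
records = [
    {"experiment": "EXP-001", "compound": "CPD-1", "assay": "kinase_A_IC50_nM", "value": 12.0},
    {"experiment": "EXP-002", "compound": "CPD-1", "assay": "kinase_A_IC50_nM", "value": 18.0},
    {"experiment": "EXP-002", "compound": "CPD-2", "assay": "kinase_A_IC50_nM", "value": 250.0},
    {"experiment": "EXP-003", "compound": "CPD-1", "assay": "solubility_uM", "value": 85.0},
]


def activity_table(records):
    """Aggregate results across experiments into one row per compound."""
    grouped = defaultdict(list)
    for record in records:
        grouped[(record["compound"], record["assay"])].append(record["value"])
    table = defaultdict(dict)
    for (compound, assay), values in grouped.items():
        # Replicates from different experiments are averaged (a simplification).
        table[compound][assay] = mean(values)
    return {compound: dict(assays) for compound, assays in table.items()}


table = activity_table(records)
for compound, assays in sorted(table.items()):
    print(compound, assays)
```

With a free-text entry such as "CPD-1 gave an IC50 of about 12 nM on kinase A", the same table could only be built by parsing prose, which is why full-text search is usually the only way to retrieve such results.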

In practice, the main use of ELN in drug discovery is to replace paper notebooks, facilitate exchange of information and be compliant with IP regulations.

KMS – Knowledge Management System

KMS are designed to collect and integrate (i.e., to create links between) all the data produced by disparate instruments or techniques in drug discovery and development, and to present all or part of the project information on a single screen. They are a one-stop shop for accessing all the information.

While the first KMS were built in the 1980s to record screening data for small molecules only, they now seamlessly support multiple types of entities (small molecules, proteins, antibodies, cells, vaccines, natural products…) and all the information generated in discovery and development: ADME, QC characterization, pharmacokinetics, pharmacodynamics, PCD studies, and clinical reports.

As the data is strongly structured, it becomes possible to display and group data at multiple levels: compound level (all the screening data, ADME properties, QC, characterization, PK and PD results and reports for a single compound), project level (all the compounds, their associated results and statistics, such as a structure-activity table), study level, or assay level. Data generated in the lab is also usually augmented with data from the public domain, for instance to link gene product results through their genes (e.g., to access all transcriptomics results for a single gene).
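The multi-level views described above can be sketched as follows, assuming every result is stored with its project, study, assay and entity. The `Result` fields, the identifiers and the `GENE_OF` annotation table are hypothetical and stand in for whatever schema and public annotation source a real KMS would use.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Result:
    """One structured result, tagged with every level it belongs to."""
    project: str
    study: str
    assay: str
    entity: str  # compound, transcript, protein, antibody...
    value: float


# Hypothetical results, for illustration only.
RESULTS = [
    Result("PRJ-A", "STU-1", "screening", "CPD-1", 0.82),
    Result("PRJ-A", "STU-1", "screening", "CPD-2", 0.40),
    Result("PRJ-A", "STU-2", "pk_half_life_h", "CPD-1", 3.5),
    Result("PRJ-B", "STU-3", "transcriptomics", "TX-1", 2.1),
    Result("PRJ-B", "STU-3", "transcriptomics", "TX-2", 0.7),
]

# Hypothetical public-domain annotation linking gene products to genes.
GENE_OF = {"TX-1": "GENE-X", "TX-2": "GENE-X"}


def group_by(results, level):
    """Group results by one level: 'entity', 'project', 'study' or 'assay'."""
    groups = defaultdict(list)
    for result in results:
        groups[getattr(result, level)].append(result)
    return dict(groups)


def results_for_gene(results, gene, gene_of):
    """Collect the results of every gene product annotated to the given gene."""
    return [r for r in results if gene_of.get(r.entity) == gene]


by_compound = group_by(RESULTS, "entity")
by_project = group_by(RESULTS, "project")
gene_x = results_for_gene(RESULTS, "GENE-X", GENE_OF)

print(len(by_compound["CPD-1"]), "results for CPD-1")
print(len(by_project["PRJ-A"]), "results in PRJ-A")
print([r.entity for r in gene_x])
```

The same list of records serves every view; only the grouping key changes, which is what a strongly structured data model makes possible.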

The key advantage of KMS in drug discovery is to enable secure and structured storage of heterogeneous data and easy aggregation and retrieval of all project data, from screening to clinical.

Conclusion

We have described the three main classes of data management solutions. LIMS track instrument data and are used mainly in regulated environments (GLP studies). ELN are a publication tool in which scientists annotate the data from several LIMS and process it into results. KMS are an integration tool that gathers all the primary results and reports them in the context of a project.

The choice of the right tool therefore depends on the researcher’s objectives: compliance with GLP (LIMS), preservation of IP (ELN), or project data tracking and sharing (KMS).