30 Years of Data for Life
Building rich databases before Big Data
For more than three decades, Eurofins has been considered a leading provider of analytical services globally. The Group performs hundreds of millions of tests each year to establish the safety, identity, composition, authenticity, origin, traceability, and purity of biological substances and products. Its extensive databanks offer years’ worth of information about pharmaceuticals and food and their properties.
Many of the tests that Eurofins performs around the world rely on one of the company’s extensive proprietary databases, not simply as a reference against which to compare results, but very often to obtain the results themselves.
Eurofins’ DNA fingerprint database, for example, contains unique identifying characteristics (the “fingerprints”) of foodstuffs and enables proof of authenticity that had not previously been available. Not only does this prevent unscrupulous suppliers gaining an unfair market advantage, but it also prevents them endangering the health of the consumer.
Eurofins developed its proprietary DNA fingerprint databases for several specific analytical requests (e.g. basmati / fragrant rice authenticity or determination of different pine nut species), using proprietary or published DNA fingerprint methods and reference samples which were made available by authorities or via Eurofins’ laboratory network and its customers. The databases allow identification of pure and mixed samples, but also of the presence of as-yet unknown or unapproved species or varieties. The testing methods meet the need for traceability – a major requirement of EU food legislation – and can be adapted very fast to the needs of the market, for example if new species or varieties are approved by authorities and reference material is available.
Another hugely exciting step-change in proprietary databases came with the launch in 1999 of Eurofins’ BioPrint™ database to improve and optimise the selection of drug candidates. The database comprises a large and homogeneous set of experimental data, generated in-house and covering more than 2,400 compounds, including marketed drugs, compounds which failed in clinical trials, and reference compounds.
Hundreds of pieces of information are stored for each compound, with the BioPrint™ database covering in vitro assays as well as in vivo characteristics such as drug reactions, pharmacokinetics and therapeutic indications. On average there are 400 records for each compound, meaning roughly 1 million records are stored. High-quality and extensive datasets, combined with modelling and mining tools, place new drug candidates in the context of well-understood drugs. This allows scientists to anticipate adverse drug reactions and supports lead compound characterisation and prioritisation (a lead is a possible drug candidate that may still have a suboptimal structure and characteristics).
A further development is a cloud-based proprietary database from Eurofins QTA that provides a major advancement in the way customers perform their infrared (IR) analysis. IR analysis measures which wavelengths of light a material absorbs, how strongly it absorbs them, and how much light passes through it. Individual wavelengths interact with different compounds in different ways, which allows chemists to use the data to map a material’s physical and chemical properties, quantitatively and/or qualitatively.
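The relationship between transmitted light and absorption can be sketched in a few lines of Python. This is only an illustration, not the QTA method: it uses the standard definition of absorbance, A = -log10(I / I0), and the wavelengths and intensity values are invented for the example.

```python
from math import log10


def absorbance(incident, transmitted):
    """Absorbance from incident and transmitted light intensity (A = -log10(I / I0))."""
    if incident <= 0 or transmitted <= 0:
        raise ValueError("intensities must be positive")
    return -log10(transmitted / incident)


# Hypothetical readings: wavelength in nm -> (incident intensity, transmitted intensity)
spectrum = {
    1200: (100.0, 50.0),
    1450: (100.0, 10.0),
    1700: (100.0, 80.0),
}

# Absorbance at each wavelength; a higher value means more light was absorbed
absorbances = {wl: absorbance(i0, i) for wl, (i0, i) in spectrum.items()}

# Wavelength at which the hypothetical sample absorbs most strongly
strongest = max(absorbances, key=absorbances.get)
```

In this made-up spectrum the sample absorbs most strongly at 1450 nm, where only a tenth of the incident light passes through.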
The QTA® methods provide a simple interface for non-skilled users, thereby overcoming the challenges of traditional infrared analyses which have to be developed and maintained by experienced spectroscopists or chemometricians. QTA provides clients with a uniform analysis method that can be utilised by any non-technical personnel across multiple locations, reducing the human error that can occur when analysing with traditional chemistry methodology.
Instruments at the testing site scan a sample and send the light spectra of the sample to the secure server via the internet. The analysis and data interpretation are completed within minutes on the server, and the results are returned in real time. Samples commonly come from sectors such as food products, chemical manufacturing, oils, agriculture, aquaculture, and dairy. Eurofins QTA pushes the boundaries of where near-infrared (NIR) analysis can be used, introducing this analytical solution into new applications and emerging industries using benchtop (i.e. in the laboratory) or in-line (i.e. on the production line) systems.
The science behind
Information about the origin of a food product is often encoded in its chemical composition, and rapid developments in science and technology over the last few decades now allow this information to be analysed and interpreted. Two of the most important techniques within DNA fingerprinting are DNA fragment length analysis and microsatellite or short tandem repeat (STR) analysis. DNA fragment length analysis considers changes in the length of a specific DNA sequence to indicate the presence or absence of a genetic marker or the presence of a specific variant of a genetic marker. Eurofins used this technique to successfully detect a species of poor-tasting pine nuts which had triggered 39 biotoxin notifications in the EU Rapid Alert System for Food and Feed (RASFF). STR analysis is used to compare specific areas on DNA from two or more samples. This technique, again based on Eurofins’ extensive proprietary database of specific DNA fingerprints, was used by the company to prove the authenticity of Basmati rice when the market was flooded with cheap imitations.
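As a rough illustration of how an STR comparison against a reference database might work, the Python sketch below scores a sample by the share of loci whose alleles (repeat counts) match each reference profile. It is not Eurofins’ method: the locus names, variety labels, repeat counts and the simple match-fraction score are all hypothetical.

```python
def matching_loci(sample, reference):
    """Count loci present in both profiles whose allele pairs are identical."""
    shared = set(sample) & set(reference)
    matches = sum(
        1 for locus in shared
        if sorted(sample[locus]) == sorted(reference[locus])
    )
    return matches, len(shared)


def best_match(sample, references):
    """Return the reference with the highest fraction of matching loci, plus all scores."""
    scores = {}
    for name, profile in references.items():
        matches, compared = matching_loci(sample, profile)
        scores[name] = matches / compared if compared else 0.0
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical reference database: variety -> locus -> pair of repeat counts
references = {
    "variety_X": {"locus1": (12, 12), "locus2": (8, 8), "locus3": (15, 15)},
    "variety_Y": {"locus1": (14, 14), "locus2": (8, 8), "locus3": (17, 17)},
}

# Hypothetical sample profile to be authenticated
sample = {"locus1": (12, 12), "locus2": (8, 8), "locus3": (15, 15)}

best, scores = best_match(sample, references)
```

Here the sample matches variety_X at every locus and variety_Y at only one of three, so variety_X is reported as the best match. A real workflow would also need to handle mixed samples and profiles that match no reference well.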
The BioPrint™ project was started with the hypothesis that the in vitro pharmacological profiles of new drug candidates generated in Eurofins’ laboratory could act as a fingerprint, capturing information on the in vivo activity of the compound. The company found that hierarchical clustering of the drug and reference compounds based on their in vitro pharmacological profiles grouped many of them by their therapeutic areas or biological actions; for example, antidepressants clustered with other antidepressants, and antifungals with other antifungal drugs. Using this “fingerprint”, choices about a drug candidate’s potential therapeutic use and adverse reactions can be made in the context of all the drug and reference compounds present in the database, by performing simple profile similarity analysis to identify “neighbour” compounds.
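The idea of finding “neighbour” compounds by profile similarity can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the BioPrint™ implementation: the source does not say which similarity measure is used, so cosine similarity is assumed here, and the compound names and assay values are invented.

```python
from math import sqrt


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length assay profiles."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest_neighbours(query, reference_profiles, k=2):
    """Return the k reference compounds whose profiles are most similar to the query."""
    scored = [
        (name, cosine_similarity(query, profile))
        for name, profile in reference_profiles.items()
    ]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


# Hypothetical reference data: compound -> results in four made-up in vitro assays
reference_profiles = {
    "antidepressant_A": [90, 80, 5, 10],
    "antidepressant_B": [85, 75, 10, 5],
    "antifungal_A": [5, 10, 95, 80],
    "antifungal_B": [10, 5, 90, 85],
}

# Hypothetical new drug candidate profiled in the same four assays
candidate = [88, 70, 8, 12]

neighbours = nearest_neighbours(candidate, reference_profiles, k=2)
```

With these invented values, the candidate’s two nearest neighbours are both antidepressants. That is the kind of signal that could suggest a therapeutic area or flag adverse reactions shared with known drugs.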
The backbone of the QTA® solution is the comprehensive approach taken in model and application development, technical consultation, and full service support. QTA’s unique data treatment methodologies are applied to calibration models for spectroscopic qualitative analysis applications. The proprietary database and algorithms are dynamically maintained for superior accuracy and precision, with primary data generated using industry-standard methods and stored on a highly secure central server.