R&D Services | Bioinformatics and Biostatistics

Computer analyses of study results are often required to check the distribution of the data obtained. These statistical analyses help establish how reliable the results are.

Bioinformatics and biostatistics services make it possible to extract the maximum amount of information from data, images, and studies carried out within the framework of a research project, a preclinical project, or a clinical project.

Computational modeling provides a better representation of biological functions when applied to this area. Bioinformatics is interdisciplinary in nature, allowing biologists to understand more complex systems, and computer scientists to develop software tools for understanding biological data.

Bioinformatics combines biology, computer science, information engineering, mathematics, and statistics.

Labtoo developed a tailor-made service to support Labs, Biotechs, Medtech, and Pharma companies in their procurement of samples.

Labtoo can be commissioned, on a fully customized basis, to set up any type of bioinformatics or biostatistics study.

Bioinformatics and Genomics

Historically, bioinformatics emerged with the understanding that biology used sequences at different levels: nucleic acid sequences for DNA and RNA, amino acid sequences for proteins.

Understanding, analyzing and comparing sequences are part of the fundamentals of biology and require the development of computer tools.

The development of sequencing technologies in recent years (NGS technologies, for example) has produced a very large volume of data. Bioinformatics supports genomics in the following areas:

Sequence assembly

Sequencing techniques produce short sequences, called reads, which must then be assembled. The shotgun sequencing technique, for example, generates fragments of 35 to 900 nucleotides. Aligning these fragments to a known genome, such as the human genome, requires significant computing resources, although advances in computer science are speeding the process up. "Gaps" in the resulting genome are common and require more focused work in a second step.

In the case of unknown genomes (de novo sequencing), assembly may be more complex, and some regions may prove very difficult to sequence.
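As an illustration of the assembly idea, here is a minimal sketch of a greedy overlap merge, assuming error-free reads from a single strand. The function names (`overlap`, `greedy_assemble`) and the toy reads are hypothetical, and production assemblers rely on far more elaborate methods.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` equal to a prefix of `b`.

    Returns 0 when the best overlap is shorter than `min_len`.
    """
    max_len = min(len(a), len(b))
    for size in range(max_len, min_len - 1, -1):
        if a.endswith(b[:size]):
            return size
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the two sequences that share the largest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_size, best_i, best_j = 0, -1, -1
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    size = overlap(a, b, min_len)
                    if size > best_size:
                        best_size, best_i, best_j = size, i, j
        if best_size == 0:
            # No overlap left: several contigs here means gaps remain.
            return contigs
        merged = contigs[best_i] + contigs[best_j][best_size:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)


if __name__ == "__main__":
    # Toy reads invented for this example.
    print(greedy_assemble(["ATGGCGT", "GCGTACC", "TACCTTA"]))
    print(greedy_assemble(["AAAACCCC", "GGGGTTTT"]))
```

When no overlap can be found between two groups of reads, the function returns several contigs, which mirrors the "gaps" mentioned above.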

Genome annotation

Annotation is the process of labeling the features of a DNA sequence: exons (coding sequences) and introns, regulatory sequences, methylation profiles, etc.
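One very basic annotation step is locating candidate coding regions. The sketch below scans the forward strand for open reading frames (an ATG start codon followed in-frame by a stop codon). It is a simplification under stated assumptions: it ignores the reverse strand, introns, and regulatory signals, and the name `find_orfs` and the example sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna: str, min_codons: int = 3) -> list[tuple[int, int, str]]:
    """Return (start, end, sequence) for forward-strand open reading frames.

    Coordinates are 0-based and end-exclusive; the stop codon is included.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for pos in range(frame, len(dna) - 2, 3):
            codon = dna[pos:pos + 3]
            if start is None and codon == "ATG":
                start = pos  # first start codon opens the frame
            elif start is not None and codon in STOP_CODONS:
                end = pos + 3
                if (end - start) // 3 >= min_codons:
                    orfs.append((start, end, dna[start:end]))
                start = None  # look for the next start codon
    return sorted(orfs)


if __name__ == "__main__":
    # Invented sequence with a single short ORF.
    for start, end, seq in find_orfs("CCATGAAATTTTAGGG"):
        print(start, end, seq)
```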

Evolutionary biology

Sequence analysis can reveal links between species; this is the domain of evolutionary biology. The phenomena studied are typically gene duplications, horizontal transfers, and large-scale genome comparisons, which make it possible to consolidate or challenge the taxonomic or physiological methods used so far to classify species.

Bioinformatics tools will enable the construction of model populations to predict the evolution of the system over the long term.

Comparative genomics

Sequence comparison starts with the comparison of two gene sequences from two different organisms.

The differences observed, from point mutations affecting a single nucleotide to changes in chromosomal segments such as duplications, transfers, and inversions, help explain the complexity of evolution.
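The simplest form of such a comparison can be sketched as follows, assuming the two sequences have already been aligned to the same length. The function names and the two example sequences are hypothetical, and the sketch does not detect larger rearrangements such as duplications or inversions.

```python
def point_differences(seq_a: str, seq_b: str) -> list[tuple[int, str, str]]:
    """List (position, base_in_a, base_in_b) wherever two aligned sequences differ.

    Positions are 1-based, as is conventional for sequence coordinates.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (pos, a, b)
        for pos, (a, b) in enumerate(zip(seq_a, seq_b), start=1)
        if a != b
    ]


def percent_identity(seq_a: str, seq_b: str) -> float:
    """Share of positions, in percent, at which both sequences agree."""
    if not seq_a:
        raise ValueError("sequences must not be empty")
    diffs = point_differences(seq_a, seq_b)
    return 100.0 * (len(seq_a) - len(diffs)) / len(seq_a)


if __name__ == "__main__":
    # Invented gene fragments from two hypothetical organisms.
    gene_1 = "ATGGCCTTAA"
    gene_2 = "ATGACCTTGA"
    print(point_differences(gene_1, gene_2))
    print(percent_identity(gene_1, gene_2))
```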

Mutation analysis

In the case of certain diseases such as cancers, the genomes of the affected cells are very widely modified: rearrangements, point mutations, etc.

Bioinformatics will allow two types of comparative analyses based on sequencing data: between cancer cells and normal cells of an organism, and between cancer cells of an organism and cancer cells of other organisms. This type of study makes it possible to classify and list changes in the genomes of cancer patients in order to ultimately save time in terms of diagnosis and propose the best treatments.
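These two comparisons can be sketched with plain set operations, assuming variants have already been called and are represented as (chromosome, position, reference, alternate) tuples. The names `somatic_variants` and `recurrent_variants`, and all the variants shown, are invented for illustration.

```python
from collections import Counter

# (chromosome, position, reference base, alternate base)
Variant = tuple[str, int, str, str]


def somatic_variants(tumor: set[Variant], normal: set[Variant]) -> set[Variant]:
    """Variants seen in the cancer cells but not in the normal cells of one patient."""
    return tumor - normal


def recurrent_variants(
    per_patient: dict[str, set[Variant]], min_patients: int = 2
) -> dict[Variant, int]:
    """Variants found in the cancer cells of at least `min_patients` patients."""
    counts = Counter(v for variants in per_patient.values() for v in variants)
    return {v: n for v, n in counts.items() if n >= min_patients}


if __name__ == "__main__":
    # Invented data for two hypothetical patients.
    normal_p1 = {("chr1", 100, "A", "G")}
    tumor_p1 = {("chr1", 100, "A", "G"), ("chr7", 500, "C", "T")}
    tumor_p2 = {("chr7", 500, "C", "T"), ("chr9", 42, "G", "A")}

    somatic_p1 = somatic_variants(tumor_p1, normal_p1)
    print(somatic_p1)
    print(recurrent_variants({"p1": somatic_p1, "p2": tumor_p2}))
```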

For more information on the tools available, the Open Bioinformatics Foundation lists tools such as Biopython, BioJS, and BioPerl.

Artificial Intelligence at the service of Bioinformatics

Developments in artificial intelligence in recent years, including machine learning and deep learning, have been applied in the field of bioinformatics, particularly in the prediction of protein structure.

A protein is a sequence of amino acids structured as follows:

Primary structure: a sequence of amino acids
Secondary structure: local folding into alpha helices and beta sheets
Tertiary structure: three-dimensional folding through covalent or non-covalent bonds
Quaternary structure: assembly into a protein complex

With the rise of AI, bioinformatics tools allow us to go much further in the study and prediction of protein structures.

Classifying proteins into new superfamilies

AI tools analyze primary protein sequences and extract their key features (typically those that are essential to the structure or highly conserved). This leads to the prediction of pseudo-proteins, which serve as references for classifying unknown proteins into superfamilies.

Generating models of protein structures

One powerful machine learning tool, invented in 2014, is the Generative Adversarial Network (GAN). A GAN generates new data that resemble the original data. This is particularly relevant for generating models of tertiary protein structures that are "similar" to, or consistent with, reference models. One paper used GANs to generate structures that are checked for consistency, with the outcome of that check fed back into the generator. This makes it possible to propose robust structure solutions, especially in cases where part of a protein's structure is missing or corrupted.

AI studies can also be applied to modeling antibody-antigen interaction domains, in order to reduce the development steps carried out in animals or by phage display.

Types of providers

Freelancers

Many researchers or physicians may conduct bioinformatics and biostatistics studies on an ad hoc basis or over the duration of research or clinical projects.

Specialist companies and CROs

Some firms specialize in clinical studies and offer biostatistical and bioinformatics analysis.
The companies that conduct clinical trials (contract research organizations, or CROs) generally have the in-house capacity to process the data collected in order to compile regulatory dossiers.

The Importance of Big Data in Health

Clinical Data

During a clinical trial, different types of data are collected, transformed into analyzable data sets to answer specific research questions, and used to generate publications and reports for different audiences. Biostatistics is used to collect, analyze, and interpret the results.
Investigators are assisted by a biostatistician in the following steps:

Definition of the hypothesis
Choice of statistical tests and determination of their power
Sample size
Definition of risk and influence factors
Understanding correlation and regression
Explanation of the phenomena of multiplicity
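Two of the steps above, sample size and multiplicity, lend themselves to a short worked example. The sketch below uses the standard normal-approximation formula for comparing two group means, n = 2 * ((z_alpha + z_beta) * sigma / delta)^2 per group, and the Bonferroni correction for multiple tests. The function names and the numeric inputs are assumptions chosen for illustration; a real trial design would be tailored by the biostatistician.

```python
import math
from statistics import NormalDist


def sample_size_per_group(
    delta: float, sigma: float, alpha: float = 0.05, power: float = 0.80
) -> int:
    """Patients needed per group to detect a mean difference `delta`.

    Two-sided test, equal group sizes, common standard deviation `sigma`,
    normal approximation (a t-based calculation gives slightly more).
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    n = 2 * ((z_alpha + z_beta) * sigma / delta) ** 2
    return math.ceil(n)


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold that keeps the overall error rate at `alpha`."""
    return alpha / n_tests


if __name__ == "__main__":
    # Hypothetical inputs: detect a 5-unit difference with a standard deviation of 10.
    print(sample_size_per_group(delta=5, sigma=10))
    # Ten endpoints tested at an overall 5% level.
    print(bonferroni_threshold(0.05, 10))
```

Halving the difference to be detected roughly quadruples the required sample size, which is why the hypothesis and the expected effect are defined before the trial starts.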

Real-world patient data

Real-world data (RWD) is data from healthy and treated patients from a variety of sources, typically generated directly from the patient.

The processing of this data will generate "real-world evidence" (RWE), useful for the following areas:

The possible applications of these data are economically and socially critical, which calls for specialized tools and skilled statisticians.

Bioinformatics & Biostatistics Services for Drug Development

Select your Bioinformatics service for genomics, proteomics, and imaging

Proteomics data analysis

Image analysis & modeling

NGS sequencing data analysis

In silico models applied to organisms

Cytometry results analysis

In silico drug screening

Multi-omics single-cell data analysis

In silico prediction of toxicity

Select your Biostatistics and Clinical Data Analysis Service

Biological data analysis

Pre-clinical & clinical data analysis

Our team will manage your R&D service from start to finish.