Project description

Target-Specific Machine Learning Models for Improved Protein-Ligand Binding Affinity Predictions

Proteins play a crucial role in the correct functioning of the organism, and their dysfunction often results in disease. Accordingly, most drug discovery processes consist of identifying small molecules capable of interacting with the target protein by activating, modulating or inhibiting its activity. Structure-based virtual screening is a computational approach used to reduce the time and costs associated with hit identification campaigns, and it is therefore used extensively in the early phases of a drug discovery project. These methods regard the interaction between a protein and a ligand as a lock that requires the right key to open it. One commonly used technique is molecular docking, which combines an algorithm that explores the position and orientation of the ligand in the protein binding cavity (the ligand pose) with a scoring function (SF) that assigns a numerical value to the strength of the protein-ligand interaction for that pose.

Many SFs rely on the physical and biological characteristics of the protein and the ligand to obtain an overall estimate for the system, and they have been key to the successful identification of small-molecule hits in numerous structure-based virtual screening campaigns. However, the performance of SFs has been shown to depend largely on the protein system, and the lack of correlation between SF values and the experimental binding affinities of ligands, particularly within the same chemical series, precludes their use for hit and lead optimisation.
To alleviate these intrinsic problems, attempts have been made in recent years to develop target-specific SFs, which in general outperform the more traditional, general SFs when applied to the same protein system. In this respect, target-specific machine learning (ML) models have been shown to compete with physics-based SFs. ML-based SFs of this type have recently gained popularity, owing both to the significant increase in available protein-ligand binding affinity data and to their good performance compared with other SFs.
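The correlation between SF values and experimental binding affinities mentioned above is commonly quantified with a correlation coefficient. The following is a minimal sketch in plain Python, assuming Pearson's r as the metric (the project text does not name one); the function name `pearson_r` is introduced here for illustration only.

```python
import math
from typing import Sequence


def pearson_r(predicted: Sequence[float], experimental: Sequence[float]) -> float:
    """Pearson correlation between predicted scores and measured affinities.

    Assumption: both sequences are aligned, i.e. item i of each refers to
    the same protein-ligand complex.
    """
    if len(predicted) != len(experimental) or len(predicted) < 2:
        raise ValueError("need two aligned sequences with at least 2 values")

    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_e = sum(experimental) / n

    # Covariance and variances (unnormalised; the 1/n factors cancel out).
    cov = sum((p - mean_p) * (e - mean_e) for p, e in zip(predicted, experimental))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_e = sum((e - mean_e) ** 2 for e in experimental)

    if var_p == 0 or var_e == 0:
        raise ValueError("correlation is undefined for constant input")

    return cov / math.sqrt(var_p * var_e)
```

Computing this value separately for each protein system, or for each chemical series, would expose the system dependence described above: a scoring function can rank ligands well for one target and poorly for another.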

The novelty and relevance of this investigation are justified by the following points:

A. Despite the enormous possibilities that this line of research currently offers, only very few target-specific machine learning models have been developed and published recently, leaving room for significant improvements.
B. Most models have been trained on binding affinity data available in the public domain. Since this work is intended to be carried out within the remit of an Industrial Doctorate, we will have access to several times more binding affinity data, which we expect to result in improved binding affinity models.
C. This project will build on our recent work on Extended Connectivity Interaction Features (ECIF) and the models derived from them, which were recently shown to achieve state-of-the-art performance.
D. The project should result in the identification of more active and more selective small-molecule hits, with fewer false positives.

Objectives:

1. Creation and evaluation of target-specific machine learning models for proteins with a substantial amount of binding affinity data
2. Creation and evaluation of family-specific machine learning models for protein families in which no single protein has enough binding affinity data, but the data are sufficient when all members of the family are considered together
3. Creation and evaluation of an improved general machine learning model to be applied to proteins for which neither a protein-specific model nor a protein-family model is available
4. Experimental confirmation of computational predictions using in vitro binding affinity data for both commercially available and de novo designed small molecules
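Objectives 1 to 3 imply a fallback order: a target-specific model where data allow, otherwise a family-level model, otherwise the general model. A minimal sketch of that selection logic follows. The function name and both thresholds are hypothetical, chosen only for illustration; the project description does not state how much data counts as "substantial" or "enough".

```python
from typing import Mapping


def choose_model_level(
    target: str,
    family: str,
    points_per_target: Mapping[str, int],
    points_per_family: Mapping[str, int],
    min_target_points: int = 300,   # hypothetical threshold, not from the project
    min_family_points: int = 1000,  # hypothetical threshold, not from the project
) -> str:
    """Return which model level to use: 'target', 'family' or 'general'."""
    # Objective 1: enough binding affinity data for this specific protein.
    if points_per_target.get(target, 0) >= min_target_points:
        return "target"
    # Objective 2: not enough for the protein, but enough across its family.
    if points_per_family.get(family, 0) >= min_family_points:
        return "family"
    # Objective 3: fall back to the general model.
    return "general"
```

The counts passed in would come from whatever binding affinity data set is available; keeping the thresholds as parameters makes it straightforward to test how sensitive the choice is to them.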



MORE INFORMATION

If you are interested in this offer, fill in the PDF with your details and send it to doctorats.industrials.recerca@gencat.cat