3DSIG meeting in Berlin

I just came back from this year’s 3DSIG, a satellite meeting of the ISMB/ECCB conference focusing on structural bioinformatics and drug discovery applications. The programme was packed with fantastic talks and posters. I was inspired by Tom Blundell’s keynote talk and by the historical perspective you gain from listening to someone with a nearly 50-year record of top science.

Robert Preissner’s talk on the many uses of similarity search to characterise target space was brilliant. I am glad he has accepted our invitation to give a seminar at EBI soon. Another particularly interesting talk came from Michael Schroeder, who presented his structure-based work on repurposing a herpes drug for cancer.

A machine learning approach to docking: improving SFC models

The Scoring Function Consortium (SFC) was a collaborative effort between various pharmaceutical companies and the Cambridge Crystallographic Data Centre (CCDC). Its aim was to compile structural data and then use it to set up different training sets for the parameterization of new scoring functions. Over 60 different descriptors were evaluated for all complexes, which led to the most accurate scoring functions at that time.

Recently, the group of one of the leading authors in the SFC, Prof. Sotriffer, published a paper with improved SFC scoring functions. By applying the machine learning approach we proposed, they increased the correlation of SFC scoring functions from 0.64 to 0.78 on the PDBbind benchmark. This is a very large improvement for this problem, especially taking into account that only the regression model was changed (descriptors, training set and test set remained the same). The performance on the diverse CSAR-NRC set was also very high. It would be very interesting to see how this new scoring function compares to the other functions tested on the CSAR-NRC test set once the same training set is used for all of them. As was explained here, the anonymous scoring functions tested on CSAR-NRC used different training sets, whose composition was not disclosed but is known to overlap with complexes in this test set. More information about SFC scoring functions can be found in the slides of a talk for the 3rd Strasbourg Summer School on Chemoinformatics.
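To make the "only the regression model was changed" point concrete, here is a minimal sketch of that kind of comparison. Everything in it is an assumption for illustration: the data are synthetic (a single made-up descriptor with a nonlinear relation to a made-up affinity), the k-nearest-neighbour regressor is just a simple stand-in for a nonparametric machine learning model, and the helper names (`pearson`, `fit_linear`, `fit_knn`, `make_data`) are hypothetical. It is not the SFC descriptors, the PDBbind benchmark or the method used in the paper.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def fit_linear(xs, ys):
    """Least-squares straight line on one descriptor (the 'classical' model)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    intercept = my - slope * mx
    return lambda x: intercept + slope * x


def fit_knn(xs, ys, k=5):
    """k-nearest-neighbour regression (stand-in for a nonparametric model)."""
    pairs = list(zip(xs, ys))

    def predict(x):
        nearest = sorted(pairs, key=lambda p: abs(p[0] - x))[:k]
        return sum(y for _, y in nearest) / len(nearest)

    return predict


def make_data(n, rng):
    """Synthetic descriptor/affinity pairs with a nonlinear relationship."""
    xs = [rng.uniform(-3.0, 3.0) for _ in range(n)]
    ys = [x + x ** 2 + rng.gauss(0.0, 0.5) for x in xs]
    return xs, ys


rng = random.Random(0)
train_x, train_y = make_data(200, rng)  # same training set for both models
test_x, test_y = make_data(100, rng)    # same test set for both models

# Only the regression model differs between the two entries.
models = {
    "linear": fit_linear(train_x, train_y),
    "knn": fit_knn(train_x, train_y),
}
results = {
    name: pearson([model(x) for x in test_x], test_y)
    for name, model in models.items()
}

for name, r in results.items():
    print(f"{name}: test-set correlation = {r:.2f}")
```

Because the descriptor, training set and test set are identical for both models, any difference in test-set correlation can be attributed to the regression model alone, which is the reasoning behind the comparison above.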