The Scoring Function Consortium (SFC) was a collaborative effort between several pharmaceutical companies and the Cambridge Crystallographic Data Centre (CCDC). Its aim was to compile structural data and then use it to set up different training sets for the parameterization of new scoring functions. Over 60 different descriptors were evaluated for all complexes, which led to the most accurate scoring functions at that time.
Recently, the group of Prof. Sotriffer, one of the leading authors in the SFC, published a paper with improved SFC scoring functions. By applying our proposed machine learning approach, the correlation between predicted and measured binding affinities for SFC scoring functions increased from 0.64 to 0.78 on the PDBbind benchmark. This is a very large improvement for this problem, especially given that only the regression model was changed (the descriptors, training set, and test set remained the same). The performance on the diverse CSAR-NRC set was also very high. It would be very interesting to see how this new scoring function compares with the functions previously tested on the CSAR-NRC test set, once the same training set is used for all functions. As explained here, the anonymous scoring functions tested on CSAR-NRC used different training sets, whose composition was not disclosed but which are known to overlap with complexes in this test set. More information about SFC scoring functions can be found in the slides of a talk given at the 3rd Strasbourg Summer School on Chemoinformatics.
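The evaluation protocol described above (same descriptors, same training set, same test set, only the regression model swapped) can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the data are synthetic with a single made-up descriptor, the function names are hypothetical, and a k-nearest-neighbour regressor stands in for the machine learning model, which this text does not name. The resulting correlations say nothing about real SFC or PDBbind performance.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # correlation is undefined for constant input; report 0
    return cov / (sd_x * sd_y)


def fit_linear(x_train, y_train):
    """Ordinary least squares on one descriptor (the linear baseline)."""
    n = len(x_train)
    mean_x = sum(x_train) / n
    mean_y = sum(y_train) / n
    var_x = sum((x - mean_x) ** 2 for x in x_train)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(x_train, y_train))
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


def fit_knn(x_train, y_train, k=5):
    """k-nearest-neighbour regression (hypothetical nonlinear stand-in)."""
    pairs = list(zip(x_train, y_train))

    def predict(x):
        nearest = sorted(pairs, key=lambda p: abs(p[0] - x))[:k]
        return sum(y for _, y in nearest) / len(nearest)

    return predict


def make_set(rng, n):
    """Synthetic 'descriptor -> affinity' data with a nonlinear relationship."""
    xs = [rng.uniform(-3.0, 3.0) for _ in range(n)]
    ys = [x * x + rng.gauss(0.0, 0.5) for x in xs]
    return xs, ys


rng = random.Random(0)
x_train, y_train = make_set(rng, 200)  # fixed training set
x_test, y_test = make_set(rng, 100)    # fixed, disjoint test set

# Only the regression model changes between the two runs.
results = {}
for name, fit in [("linear", fit_linear), ("knn", fit_knn)]:
    model = fit(x_train, y_train)
    predictions = [model(x) for x in x_test]
    results[name] = pearson(predictions, y_test)

print(results)
```

Because the training and test sets are generated once and shared by both models, any difference in test-set correlation can be attributed to the regression model alone, which is the point made above about the SFC comparison.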