Precision and recall oncology: combining multiple gene mutations for improved identification of drug-sensitive tumours

Clinical practice is increasingly incorporating genomic markers to personalize the treatment of cancer patients or to decide their inclusion in clinical trials. Unfortunately, only a small fraction of responsive patients can currently be identified before the treatment is administered. Therefore, there is a need for new methods that can better discriminate between sensitive and resistant tumours. In vitro pharmacogenomics data sets are now large enough to investigate such methods.

Our paper presents a first-of-its-kind large-scale comparison of the performance of single-gene markers and multi-gene machine-learning markers across 127 drugs in the common clinical scenario where only genomic data is available. We are also the first, to our knowledge, to test genomic markers on a truly independent data set. From the results of this rigorous validation, we conclude that combining multiple gene mutations via machine learning provides better discrimination than single-gene markers for about half of these drugs (e.g. Temsirolimus, 17-AAG or Methotrexate).

In light of these results, we discuss why personalized cancer treatment in the clinic only manages to treat a fraction of the patients who were expected to be helped by single-gene markers, and how in some cases this could be greatly alleviated by multi-gene markers without the need to acquire further data. To this end, we stress the importance of not only the precision of a marker but also its recall (sensitivity), which we have found to be greatly improved by multi-gene machine-learning markers. As genomic markers become more widely used in clinical settings, more attention needs to be paid to the recall of the predictive models used to identify responsive tumours, as part of a precision and recall oncology approach enabled by machine-learning modelling.
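
To make the distinction between precision and recall concrete, the following is a minimal sketch in Python. The function name `precision_recall` and the ten-tumour toy data are hypothetical and chosen purely for illustration; they are not taken from the study's data or results. Precision is the fraction of tumours predicted sensitive that are truly sensitive, and recall is the fraction of truly sensitive tumours that the marker identifies.

```python
def precision_recall(predicted_sensitive, truly_sensitive):
    """Return (precision, recall) for a binary marker.

    Both arguments are equal-length sequences of booleans, one per tumour.
    """
    pairs = list(zip(predicted_sensitive, truly_sensitive))
    tp = sum(1 for p, t in pairs if p and t)        # correctly flagged as sensitive
    fp = sum(1 for p, t in pairs if p and not t)    # flagged, but actually resistant
    fn = sum(1 for p, t in pairs if not p and t)    # sensitive, but missed by the marker
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy example (illustrative only): 10 tumours, 5 of them truly sensitive.
truly_sensitive = [True] * 5 + [False] * 5

# Hypothetical single-gene marker: flags only 2 tumours, both truly sensitive.
single_gene = [True, True] + [False] * 8

# Hypothetical multi-gene marker: flags 5 tumours, 4 of them truly sensitive.
multi_gene = [True, True, True, True, False, True] + [False] * 4

single_precision, single_recall = precision_recall(single_gene, truly_sensitive)
multi_precision, multi_recall = precision_recall(multi_gene, truly_sensitive)
```

In this toy setting the single-gene marker is perfectly precise but misses three of the five sensitive tumours, whereas the multi-gene marker gives up some precision and identifies four of the five. This is the kind of trade-off that reporting precision alone would hide.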

This study is freely available at