Making complex prediction rules applicable for readers: Current practice in random forest literature and recommendations

Anne-Laure Boulesteix, Silke Janitza, Roman Hornung, Philipp Probst, Hannah Busen, Alexander Hapfelmeier

Research output: Contribution to journal › Article › peer-review

12 Scopus citations

Abstract

Ideally, prediction rules should be published in such a way that readers may apply them, for example, to make predictions for their own data. While this is straightforward for simple prediction rules, such as those based on the logistic regression model, it is much more difficult for complex prediction rules derived by machine learning tools. We conducted a survey of articles reporting prediction rules that were constructed using the random forest algorithm and published in PLOS ONE in 2014–2015 in the field “medical and health sciences”, with the aim of identifying issues related to their applicability. Making a prediction rule reproducible is a possible way to ensure that it is applicable; thus reproducibility is also examined in our survey. The presented prediction rules were applicable in only 2 of 30 identified papers, while for a further eight prediction rules it was possible to obtain the necessary information by contacting the authors. Various problems, such as nonresponse of the authors, hampered the applicability of prediction rules in the other cases. Based on our experiences from this illustrative survey, we formulate a set of recommendations for authors who aim to make complex prediction rules applicable for readers. All data, including the description of the considered studies and analysis codes, are available as supplementary materials.

Original language: English
Pages (from-to): 1314-1328
Number of pages: 15
Journal: Biometrical Journal
Volume: 61
Issue number: 5
State: Published - 1 Sep 2019

Keywords

  • logistic regression
  • machine learning
  • prediction rule
  • reproducibility
  • reproducible research
