Decision Trees and Random Forests: Machine Learning Techniques to Classify Rare Events

Publication: Contribution to journal, Article, Peer-reviewed

31 citations (Scopus)

Abstract

The article introduces machine learning algorithms for political scientists. These approaches should not be seen as a new method for old problems. Rather, it is important to understand the different logic of the machine learning approach. Here, data is analyzed without theoretical assumptions about possible causalities. Models are optimized according to their accuracy and robustness. While the computer can do this work more or less alone, it is the researcher's duty to make sense of these models afterward. Visualization of machine learning results therefore becomes very important and is the focus of this paper. The methods presented and compared are decision trees, bagging, and random forests. The latter two are more advanced versions of the first, relying on bootstrapping procedures. To demonstrate these methods, extreme shifts in the US budget and their connection to the attention of political actors are analyzed. The paper compares the accuracy of different models based on ROC curves and shows how to interpret random forest models with the help of visualizations. The aim of the paper is to provide an example of how these methods can be used in political science and to highlight possible pitfalls as well as advantages of machine learning.
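
The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' code or data: the synthetic rare-event data set, all function names, and all parameter values below are assumptions made for illustration. To stay dependency-free, each "tree" is only a depth-1 stump. Bagging averages stumps fitted on bootstrap samples, and the random forest variant additionally restricts each stump to a random subset of features. Models are compared by the area under the ROC curve (AUC).

```python
import random

def make_rare_event_data(n, rate=0.1, seed=0):
    """Synthetic stand-in data (assumption): label 1 is the rare event."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = 1 if rng.random() < rate else 0
        X.append([rng.gauss(1.5 * label, 1.0),   # informative feature
                  rng.gauss(0.5 * label, 1.0),   # weakly informative feature
                  rng.gauss(0.0, 1.0)])          # pure noise
        y.append(label)
    return X, y

def fit_stump(X, y, features, n_thresholds=20):
    """Depth-1 tree: pick the feature/threshold with the lowest Gini impurity."""
    n = len(y)
    base = sum(y) / n
    # Fallback "no split" model that predicts the base rate everywhere.
    best = (2 * base * (1 - base), features[0], float("inf"), base, base)
    for f in features:
        vs = sorted(row[f] for row in X)
        for i in range(1, n_thresholds + 1):
            thr = vs[i * n // (n_thresholds + 1)]  # quantile-based candidate
            left = [lab for row, lab in zip(X, y) if row[f] <= thr]
            right = [lab for row, lab in zip(X, y) if row[f] > thr]
            if not right:
                continue
            pl, pr = sum(left) / len(left), sum(right) / len(right)
            gini = (len(left) * 2 * pl * (1 - pl)
                    + len(right) * 2 * pr * (1 - pr)) / n
            if gini < best[0]:
                best = (gini, f, thr, pl, pr)
    return best[1:]  # (feature, threshold, P(event | left), P(event | right))

def fit_ensemble(X, y, n_trees, n_features, bootstrap, rng):
    """Fit n_trees stumps, optionally on bootstrap samples and feature subsets."""
    n, d = len(X), len(X[0])
    stumps = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)] if bootstrap else list(range(n))
        feats = rng.sample(range(d), n_features)
        stumps.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], feats))
    return stumps

def predict(stumps, X):
    """Average the event probabilities predicted by all stumps."""
    return [sum(pl if row[f] <= thr else pr for f, thr, pl, pr in stumps)
            / len(stumps) for row in X]

def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (ties count half)."""
    pos = [s for s, lab in zip(scores, labels) if lab == 1]
    neg = [s for s, lab in zip(scores, labels) if lab == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

X_train, y_train = make_rare_event_data(400, seed=1)
X_test, y_test = make_rare_event_data(400, seed=2)
rng = random.Random(42)
configs = {
    "single tree": dict(n_trees=1, n_features=3, bootstrap=False),
    "bagging": dict(n_trees=50, n_features=3, bootstrap=True),
    "random forest": dict(n_trees=50, n_features=2, bootstrap=True),
}
results = {}
for name, cfg in configs.items():
    stumps = fit_ensemble(X_train, y_train, rng=rng, **cfg)
    results[name] = auc(predict(stumps, X_test), y_test)
    print(f"{name}: AUC = {results[name]:.3f}")
```

In a full random forest the feature subset is redrawn at every split of a deep tree. With depth-1 stumps there is only one split per tree, so drawing the subset once per tree is equivalent here. In practice a library implementation with fully grown trees would usually be preferred over this sketch.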

Original language: English
Pages (from - to): 98-120
Number of pages: 23
Journal: European Policy Analysis
Volume: 2
Issue number: 1
Publication status: Published - 1 March 2016
