Predicting Large-Scale Violence in Criminal Wars




Skigin, Natán, “Predicting Large-Scale Violence in Criminal Wars.” In progress.


Because earlier efforts to forecast the onset of civil wars have largely failed, students of conflict have recently turned to more complex models to improve predictive accuracy. Although the rise of criminal wars in Latin America and elsewhere has led many scholars to examine this relatively new phenomenon, we still know comparatively little about the ultimate causes of such episodes. More importantly, traditional statistical methods typically fare poorly at prediction, which limits policymakers’ ability to prevent conflict and to develop early-warning systems to aid the population. This paper attempts to fill that gap by predicting subnational criminal violence in Mexico, a country that has experienced a major rise in inter-cartel and state-cartel wars, with a range of machine learning techniques. I exploit rich socioeconomic, demographic, and political data to compare the performance of several forms of logistic regression (classic, rare-event, and lasso) with random forests, gradient boosted machines, and AdaBoosted trees. The results indicate that gradient boosted trees predict out-of-sample episodes of criminal violence more accurately than the other machine learning approaches, and have significantly stronger predictive power than logistic regressions.
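The kind of out-of-sample comparison described above can be illustrated with a minimal sketch. It is not the paper's code or data. It assumes scikit-learn, a synthetic data set with roughly 5% positive cases standing in for rare violent episodes, default hyperparameters, and ROC AUC as the evaluation metric. The function name `compare_models` is hypothetical, and rare-event logistic regression is left out because scikit-learn has no built-in implementation of it.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    AdaBoostClassifier,
    GradientBoostingClassifier,
    RandomForestClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def compare_models(seed=0):
    """Fit several classifiers and return their held-out ROC AUC scores."""
    # Synthetic stand-in for the real covariates (assumption): about 5% of
    # observations belong to the positive class, mimicking a rare outcome.
    X, y = make_classification(
        n_samples=2000,
        n_features=20,
        n_informative=6,
        weights=[0.95, 0.05],
        random_state=seed,
    )
    # Hold out 30% of the observations, preserving the class balance.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=seed
    )
    models = {
        "logit": make_pipeline(
            StandardScaler(), LogisticRegression(max_iter=1000)
        ),
        # L1-penalized ("lasso") logistic regression; C=0.1 is an arbitrary choice.
        "lasso_logit": make_pipeline(
            StandardScaler(),
            LogisticRegression(penalty="l1", solver="liblinear", C=0.1),
        ),
        "random_forest": RandomForestClassifier(
            n_estimators=200, random_state=seed
        ),
        "gbm": GradientBoostingClassifier(random_state=seed),
        "adaboost": AdaBoostClassifier(random_state=seed),
    }
    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        # Score on the held-out set using predicted probabilities.
        prob = model.predict_proba(X_test)[:, 1]
        scores[name] = roc_auc_score(y_test, prob)
    return scores


if __name__ == "__main__":
    for name, auc in sorted(compare_models().items(), key=lambda kv: -kv[1]):
        print(f"{name:15s} held-out ROC AUC = {auc:.3f}")
```

The ranking this sketch prints on synthetic data says nothing about the paper's findings. With rare outcomes, a threshold-free metric such as ROC AUC is often more informative than raw accuracy, since a model that always predicts "no violence" would already be correct about 95% of the time here.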