Cantu, F and Saiegh, SM (2010)

A Supervised Machine Learning Procedure to Detect Electoral Fraud Using Digital Analysis

Preprint posted on SSRN; last accessed August 5, 2021.

ISSN/ISBN: Not available at this time. DOI: 10.2139/ssrn.1594406

Abstract: This paper introduces a naive Bayes classifier to detect electoral fraud using digit patterns in vote counts with authentic and synthetic data. The procedure is the following: (1) we create 10,000 simulated electoral contests between two parties using Monte Carlo methods. This training set is composed of two disjoint subsets: one containing electoral returns that follow a Benford distribution, and another where the vote counts are purposively “manipulated” by electoral tampering – a percentage of votes are taken away from one party and given to the other; (2) we calibrate membership values of the simulated elections (i.e. clean or fraudulent) using logistic regression; (3) we recover class-conditional densities using the relative frequencies from the training set; (4) we apply Bayes’ rule to class-conditional probabilities and class priors to establish the membership probabilities of authentic observations. To illustrate our technique, we examine elections in the province of Buenos Aires (Argentina) between 1932 and 1942, a period with a checkered history of fraud. Our analysis allows us to successfully classify electoral contests according to their degree of fraud. More generally, our findings indicate that Benford’s Law is an effective tool for identifying fraud, even when minimal information (i.e. electoral returns) is available.
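Illustrative sketch (not the authors' code): the mechanics of steps (1), (3) and (4) in the abstract can be approximated in a few lines of Python. Everything specific here is an assumption made for illustration: the function names, the use of first digits as the only features, the log-uniform draw used to produce Benford-like counts, the 30% tampering share, the add-one smoothing, and the equal class priors. The logistic-regression calibration of step (2) is omitted, and the sketch makes no claim about how well such a classifier separates the two classes.

```python
import math
import random
from collections import Counter


def first_digit(n):
    """Leading decimal digit of a positive integer."""
    return int(str(n)[0])


def simulate_contest(rng, tamper_share=0.0):
    """First digits of two parties' vote counts in one simulated contest.

    Counts are drawn log-uniformly over whole decades, so their leading
    digits follow Benford's law. A share of party A's votes is then moved
    to party B (assumed tampering mechanism).
    """
    a = int(10 ** rng.uniform(2, 5))
    b = int(10 ** rng.uniform(2, 5))
    moved = int(tamper_share * a)
    return first_digit(a - moved), first_digit(b + moved)


def fit(samples):
    """Per-feature relative frequencies of digits 1-9, with add-one smoothing."""
    tables = []
    for j in range(len(samples[0])):
        counts = Counter(s[j] for s in samples)
        total = len(samples) + 9
        tables.append({d: (counts[d] + 1) / total for d in range(1, 10)})
    return tables


def posterior_fraud(x, clean_tables, fraud_tables, prior_fraud=0.5):
    """Bayes' rule under the naive (conditional independence) assumption."""
    log_clean = math.log(1 - prior_fraud) + sum(
        math.log(t[d]) for t, d in zip(clean_tables, x))
    log_fraud = math.log(prior_fraud) + sum(
        math.log(t[d]) for t, d in zip(fraud_tables, x))
    m = max(log_clean, log_fraud)  # subtract the max for numerical stability
    p_clean, p_fraud = math.exp(log_clean - m), math.exp(log_fraud - m)
    return p_fraud / (p_clean + p_fraud)


rng = random.Random(0)
# Two disjoint training subsets of 5,000 contests each (10,000 in total).
clean = [simulate_contest(rng) for _ in range(5000)]
fraud = [simulate_contest(rng, tamper_share=0.3) for _ in range(5000)]
clean_tables, fraud_tables = fit(clean), fit(fraud)

p = posterior_fraud((1, 4), clean_tables, fraud_tables)
print(f"P(fraud | first digits = (1, 4)) = {p:.3f}")
```

The posterior is computed in log space so that products of many small relative frequencies do not underflow if more digit features are added.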

@misc{,
  author = {Francisco Cantu and Sebastian M. Saiegh},
  title = {A Supervised Machine Learning Procedure to Detect Electoral Fraud Using Digital Analysis},
  year = {2010},
  doi = {10.2139/ssrn.1594406},
}

Reference Type: Preprint

Subject Area(s): Voting Fraud