Sometimes (most of the time), a data scientist’s life may seem like fun and games. But sometimes, we have to deal with the graver topics in life. Like armed conflicts.

One of the most interesting public data sources around was created by the Armed Conflict Location & Event Data Project (ACLED). They maintain a very accurate and ever-growing data set which, at the time of this writing, covers some 100k individual events around armed conflicts, primarily in Africa, plus a few countries in South and Southeast Asia. Their purpose is to

  • Record acts of violence between rebels, militias and governments.

  • Record acts of violence against civilians.

  • Map out strongholds of armed groups.

  • Collect data on riots and protests.

As you can see, it’s grave business, but at the same time extremely important work. I’d encourage you to check out their website. The data go back to 1997 and contain, for each event,

  • The country.

  • The date.

  • The two main parties involved.

  • The geographical location.

  • The number of fatalities.

  • Notes about the conflict.

So what could one do with this kind of data? The first step is, as usual, to get an overview. We do this again in R. For displaying geospatial data, we will use the excellent ggmap package.
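Before drawing anything, a quick tabular overview can already tell you a lot. The following is only a minimal sketch on a tiny invented table, not the real ACLED file: the column name COUNTRY and all values are assumptions made for illustration, while YEAR and FATALITIES are the column names used in the script further down.

```r
library(dplyr)

# Toy stand-in for the ACLED table; every value here is invented.
# COUNTRY is an assumed column name, "A" and "B" are placeholders.
events <- tibble(
  COUNTRY    = c("A", "A", "B", "B", "B"),
  YEAR       = c(1997, 1998, 1997, 1997, 2015),
  FATALITIES = c(0, 12, 3, 0, 7)
)

# Per country: number of events, total fatalities and the years covered,
# sorted so that the country with the most events comes first.
overview <- events %>%
  group_by(COUNTRY) %>%
  summarise(n_events   = n(),
            fatalities = sum(FATALITIES),
            first_year = min(YEAR),
            last_year  = max(YEAR)) %>%
  arrange(desc(n_events))

print(overview)
```

On the real data you would replace the toy table with the data frame read from the CSV file and check the actual column names first, for example with names().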

We would like to plot all the recorded incidents on a map, distinguishing the time of each event and its number of fatalities. We will represent each event by a dot, with the size representing the number of fatalities and the color indicating the year. All this can be done with a few lines of R code. You can download the script from GitHub.

library(readr)
library(ggmap)
library(dplyr)
conflict.data <- read_csv(
    "ACLED Version 6 All Africa 1997-2015_csv_dyadic.csv.gz")
bbox <- make_bbox(LONGITUDE,
                  LATITUDE,
                  data=conflict.data,
                  f=0.2)
africa <- get_map(bbox)
ggmap(africa) +
    geom_point(aes(x=LONGITUDE,
                   y=LATITUDE,
                   color=YEAR,
                   size=FATALITIES),
               data=conflict.data %>%
                   filter(FATALITIES < 6000),
               alpha=0.5) +
    xlim(-20, 40) +
    ylim(-35, 35) +
    scale_color_gradient(limits=c(1997, 2015),
                         low="orangered1",
                         high="red4")

You see that we had to filter on the number of fatalities in line 17, which effectively just removes one sad event from the First Congo War where a mass grave was found. I’ve warned you that we’re dealing with a grave topic. But now you know how to make graphs like the one below, inform people about what’s going wrong in the world, and maybe do your part to help improve matters.
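If you want to see what a filter like this throws away before you apply it, you can split the data at the same threshold and look at both halves. Again, this is just a sketch on an invented toy table; the variable names toy, threshold, dropped and kept are made up for this example, and only the column names and the cut-off of 6000 come from the script.

```r
library(dplyr)

# Toy stand-in for conflict.data; every value here is invented.
toy <- tibble(
  YEAR       = c(1997, 2003, 2010, 2015),
  FATALITIES = c(25000, 4, 0, 120)
)

threshold <- 6000  # same cut-off as in the script

# Rows the filter would drop; worth reading before discarding them.
dropped <- toy %>% filter(FATALITIES >= threshold)

# Rows that remain and would end up on the map.
kept <- toy %>% filter(FATALITIES < threshold)

print(dropped)
print(summary(kept$FATALITIES))
```

Keep in mind that filter() also drops rows for which the condition evaluates to NA, so events with a missing fatality count would silently disappear from the plot as well.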

Conflicts, Data By Google Maps

I think ACLED did an amazing job providing us with a fascinating data set. Stay tuned for a deeper dive into it.