
Helicopter Prison Break - Finding the Prison Escape Patterns | Python

  • Writer: Igli Ferati
  • Jul 11, 2023
  • 4 min read

Updated: Jul 28, 2023

Introduction

In this project, we analyse a dataset of helicopter prison escapes (1971 - 2020) obtained from Wikipedia to find patterns in helicopter prison breaks. The results reflect the dataset as it stood at the time of analysis; therefore, the conclusions may differ from the current version of the Wikipedia article.

We use basic Python techniques to analyse the data and Matplotlib to visualise some results. We find the patterns based on years, countries and escapees.

The original dataset contains six self-explanatory columns:

  • Date

  • Prison name

  • Country

  • Succeeded

  • Escapee(s)

  • Details

This project aims to answer the questions below:

  • Which year showed the maximum number of helicopter prison break attempts?

  • Which countries recorded the highest number of helicopter prison break attempts?

  • Which countries recorded the greatest chance of success for helicopter prison breaks?

  • How did the number of escapees affect success?

  • Which escapees have done it more than once?

Preparing the data

We use one of Dataquest's helper functions from the helper.py file to get our data into Python.


from helper import *

Get the Data

Now, let's get the data from the [List of helicopter prison escapes](https://en.wikipedia.org/wiki/List_of_helicopter_prison_escapes) Wikipedia article.


url = "https://en.wikipedia.org/wiki/List_of_helicopter_prison_escapes"
data = data_from_url(url)

Let's print the first three rows:


for row in data[:3]:
    print(row)
[1971, 'Santa Martha Acatitla', 'Mexico', 'Yes', 'Joel David Kaplan Carlos Antonio Contreras Castro']
[1973, 'Mountjoy Jail', 'Ireland', 'Yes', "JB O'Hagan Seamus TwomeyKevin Mallon"]
[1978, 'United States Penitentiary, Marion', 'United States', 'No', 'Garrett Brock TrapnellMartin Joseph McNallyJames Kenneth Johnson']

Removing the Details

We initialize an 'index' variable with the value '0'. This variable tracks which row we're modifying, so that each row can be replaced by a copy of itself without its last entry (row[:-1]).



index = 0
for row in data:
    data[index] = row[:-1]
    index += 1
print(data[:3])
[[1971, 'Santa Martha Acatitla', 'Mexico', 'Yes'], [1973, 'Mountjoy Jail', 'Ireland', 'Yes'], [1978, 'United States Penitentiary, Marion', 'United States', 'No']]
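The same trimming can be written without a manual counter. The sketch below is one possible alternative rather than the project's own code; 'sample_rows' is a small stand-in for 'data', and the last entry of each row is a made-up placeholder for the column being dropped.

```python
# Stand-in for `data`; the final entry of each row is a made-up placeholder.
sample_rows = [
    [1971, "Santa Martha Acatitla", "Mexico", "Yes", "placeholder"],
    [1973, "Mountjoy Jail", "Ireland", "Yes", "placeholder"],
]

# enumerate() yields the position and the row together,
# so no separate index variable has to be maintained.
for index, row in enumerate(sample_rows):
    sample_rows[index] = row[:-1]

print(sample_rows)
# [[1971, 'Santa Martha Acatitla', 'Mexico', 'Yes'], [1973, 'Mountjoy Jail', 'Ireland', 'Yes']]
```

Assigning through 'sample_rows[index]' matters here: rebinding 'row' inside the loop would leave the list itself unchanged.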

Extracting the Year

In the code cell below, we iterate over data using the loop variable row:

  • With every occurrence of row[0], we refer to the first entry of row, i.e., the date.

  • Thus, with date = fetch_year(row[0]), we extract the year from the date in row[0] and assign it to the variable date.

  • We then replace the value of row[0] with the year that we just extracted.


for row in data:
    date = fetch_year(row[0])
    row[0] = date
print(data[:3])
[[1971, 'Santa Martha Acatitla', 'Mexico', 'Yes'], [1973, 'Mountjoy Jail', 'Ireland', 'Yes'], [1978, 'United States Penitentiary, Marion', 'United States', 'No']]
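The helper 'fetch_year' comes from helper.py and its implementation is not shown here. As a rough idea of what such a function might do, here is a minimal sketch; the name 'fetch_year_sketch', the regular expression and the example date string are all assumptions for illustration, not the helper's actual code.

```python
import re


def fetch_year_sketch(date_string):
    """Return the first four-digit number in date_string as an int (hypothetical helper)."""
    match = re.search(r"\d{4}", date_string)
    if match is None:
        raise ValueError(f"no four-digit year found in {date_string!r}")
    return int(match.group())


# Illustrative input only; the real dataset's date format may differ.
print(fetch_year_sketch("October 31, 1973"))
# 1973
```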

Attempts per Year


min_year = min(data, key=lambda x: x[0])[0]
max_year = max(data, key=lambda x: x[0])[0]

Before we move on, let's check the earliest and latest years in our dataset.


print(min_year)
print(max_year)
1971
2020

Now we'll create a list of all the years ranging from 'min_year' to 'max_year'. Our goal is to then determine how many prison break attempts there were for each year. Since years in which there weren't any prison breaks aren't present in the dataset, this will make sure we capture them.


years = []
for year in range(min_year, max_year + 1):
    years.append(year)

Let's take a look at 'years' to see if it looks like we expected.


print(years)
[1971, 1972, 1973, 1974, 1975, 1976, 1977, 1978, 1979, 1980, 1981, 1982, 1983, 1984, 1985, 1986, 1987, 1988, 1989, 1990, 1991, 1992, 1993, 1994, 1995, 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020]

Now we create a list where each element looks like '[year, 0]'.


attempts_per_year = []
for year in years:
    attempts_per_year.append([year, 0])
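The two loops that build the list of years and the zero-initialised counts can also be collapsed into a single list comprehension. This is only a sketch of an equivalent construction; the names 'first_year', 'last_year' and 'attempts_template' are made up, and the range is shortened to keep the output readable.

```python
# Shortened, illustrative range; the project uses min_year and max_year.
first_year = 1971
last_year = 1975

# range() excludes its end point, hence the + 1.
attempts_template = [[year, 0] for year in range(first_year, last_year + 1)]

print(attempts_template)
# [[1971, 0], [1972, 0], [1973, 0], [1974, 0], [1975, 0]]
```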

And finally we increment the second entry (the one at index '1', which starts out as '0') by '1' each time a year appears in the data.



for row in data:
    for year_attempt in attempts_per_year:
        year = year_attempt[0]
        if row[0] == year:
            year_attempt[1] += 1
print(attempts_per_year)
[[1971, 1], [1972, 0], [1973, 1], [1974, 0], [1975, 0], [1976, 0], [1977, 0], [1978, 1], [1979, 0], [1980, 0], [1981, 2], [1982, 0], [1983, 1], [1984, 0], [1985, 2], [1986, 3], [1987, 1], [1988, 1], [1989, 2], [1990, 1], [1991, 1], [1992, 2], [1993, 1], [1994, 0], [1995, 0], [1996, 1], [1997, 1], [1998, 0], [1999, 1], [2000, 2], [2001, 3], [2002, 2], [2003, 1], [2004, 0], [2005, 2], [2006, 1], [2007, 3], [2008, 0], [2009, 3], [2010, 1], [2011, 0], [2012, 1], [2013, 2], [2014, 1], [2015, 0], [2016, 1], [2017, 0], [2018, 1], [2019, 0], [2020, 1]]

%matplotlib inline 
barplot(attempts_per_year)

The years in which the most helicopter prison break attempts occurred were 1986, 2001, 2007 and 2009, with a total of three attempts each.
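The counting and the search for the peak years can also be sketched with 'collections.Counter' from the standard library. This is an alternative to the nested loops used in this project, not the approach taken above, and 'sample_years' is a small made-up list rather than the real dataset.

```python
from collections import Counter

# Made-up sample of attempt years, one entry per attempt.
sample_years = [1986, 1986, 1986, 2001, 2001, 2001, 1999, 2005, 2005]

counts = Counter(sample_years)

# Find the highest count, then every year that reaches it.
highest = max(counts.values())
peak_years = sorted(year for year, count in counts.items() if count == highest)

print(highest, peak_years)
# 3 [1986, 2001]

# A Counter reports 0 for years that never appear in the data.
print(counts[1972])
# 0
```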

Attempts by Country


# df is assumed to be the pandas DataFrame of the dataset made available by helper.py
countries_frequency = df['Country'].value_counts()

We use 'print_pretty_table()' from the helper file to print the frequency table.

print_pretty_table(countries_frequency)
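For readers working with plain lists instead of a DataFrame, a similar frequency table can be sketched with 'collections.Counter'. The list 'sample_countries' below is made up for illustration and does not reflect the real counts.

```python
from collections import Counter

# Made-up country column, one entry per attempt.
sample_countries = ["France", "Mexico", "France", "Ireland", "France", "Mexico"]

# most_common() returns (country, count) pairs sorted from most to least frequent.
country_counts = Counter(sample_countries).most_common()

for country, count in country_counts:
    print(country, count)
# France 3
# Mexico 2
# Ireland 1
```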

Conclusion

The goal of this project is to study the patterns of helicopter prison breaks between 1971 and 2020. Using basic Python techniques and Matplotlib, our findings show that:

  • The highest number of prison break attempts in a single year is three, which occurred in 1986, 2001, 2007 and 2009.

  • France, with a total of 15 attempts, is the top country for helicopter prison breaks.

  • We could not statistically determine which countries have the greatest chance of success, as all the countries with a 100% success rate had only one or two prison break attempts.

  • The longer the escapees' names, the greater the chance of prison break success. This is based on the assumption that the number of escapees correlates positively with the length of their combined names.

  • Michel Vaujour and Pascal Payet, both from France, have each carried out a helicopter prison break twice.
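To illustrate why a 100% success rate says little when a country has only one or two attempts, here is a minimal sketch that reports each success rate next to its attempt count. The rows, prison names and country names are entirely made up; only the '[year, prison, country, succeeded]' layout follows the data used in this project.

```python
from collections import defaultdict

# Made-up rows in the form [year, prison, country, succeeded].
sample_rows = [
    [2000, "Prison A", "Country X", "Yes"],
    [2001, "Prison B", "Country Y", "Yes"],
    [2002, "Prison C", "Country Y", "No"],
    [2003, "Prison D", "Country Y", "Yes"],
    [2004, "Prison E", "Country Y", "Yes"],
]

attempts = defaultdict(int)
successes = defaultdict(int)
for _, _, country, succeeded in sample_rows:
    attempts[country] += 1
    if succeeded == "Yes":
        successes[country] += 1

# Success rate per country, kept alongside the number of attempts behind it.
success_rates = {country: successes[country] / attempts[country] for country in attempts}

for country, rate in success_rates.items():
    print(f"{country}: {rate:.0%} of {attempts[country]} attempt(s)")
# Country X: 100% of 1 attempt(s)
# Country Y: 75% of 4 attempt(s)
```

In this made-up sample, Country X tops the ranking on the strength of a single attempt, which is the situation described in the third finding.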



©2023 by Igli Ferati
