Analyzing the Global Impact of COVID-19 | SQL & Tableau
- Igli Ferati
- Jul 11, 2023
- 5 min read
Updated: Jul 28, 2023
Introduction:
In the midst of the unprecedented COVID-19 pandemic, data analysis has become crucial in understanding the magnitude of its impact worldwide. As a passionate data enthusiast, I embarked on a project that involved harnessing the power of SQL and Tableau to create insightful reports on COVID-19 data. This blog post aims to showcase the results of my work, highlighting the countries with the highest infection rates, total deaths on a global scale, and the percentage of populations affected by the virus in various countries.
Exploring Countries with the Highest Infection Rates:
Using SQL queries, I retrieved and analyzed COVID-19 data from reliable sources to identify the countries with the highest infection rates. By aggregating the number of confirmed cases, I was able to determine which nations were most severely impacted by the virus. Through Tableau visualizations, I illustrated these findings, enabling viewers to grasp the scale of the pandemic's spread across different regions.
SELECT *
FROM [COVID 19 PROJECT]..CovidDeaths$
WHERE continent is not null
ORDER BY 3,4
--SELECT *
--FROM [COVID 19 PROJECT]..CovidVaccinations$
--ORDER BY 3,4
--Select data that we are going to be using in our project
SELECT location, date, total_cases, new_cases, total_deaths, population
FROM [COVID 19 PROJECT]..CovidDeaths$
WHERE continent is not null
ORDER BY 1,2
-- Looking at Total Cases vs Total Deaths
-- Shows likelihood of dying people that contracted covid in your country
SELECT location, date, total_cases, new_cases, total_deaths, (total_deaths/total_cases)*100 AS DeathPercentage
FROM [COVID 19 PROJECT]..CovidDeaths$
WHERE location like '%albania%'
AND continent is not null
ORDER BY 1,2
-- Looking at the Total Cases vs Population
-- Shows what percentage of population got Covid
SELECT location, date,population, new_cases, total_cases, (total_cases/population)*100 AS PercentPopulationInfected
FROM [COVID 19 PROJECT]..CovidDeaths$
--WHERE location like '%albania%'
WHERE continent is not null
ORDER BY 1,2
-- Looking at Countries with Highest Infection Rate compared to Population
SELECT location,population, MAX(total_cases) AS HighestInfectionCount, MAX((total_cases/population))*100 AS PercentPopulationInfected
FROM [COVID 19 PROJECT]..CovidDeaths$
--WHERE location like '%albania%'
WHERE continent is not null
GROUP BY location,population
ORDER BY PercentPopulationInfected DESC
-- Showing Countries with Highest Death Count per Population
SELECT location, MAX(CAST(Total_deaths as int)) AS TotalDeathCount
FROM [COVID 19 PROJECT]..CovidDeaths$
--WHERE location like '%albania%'
WHERE continent is not null
GROUP BY location
ORDER BY TotalDeathCount DESC
--Breaking things down by Continent
-- Showing contintents with the highest death count per population
SELECT continent, MAX(CAST(Total_deaths as int)) AS TotalDeathCount
FROM [COVID 19 PROJECT]..CovidDeaths$
--WHERE location like '%albania%'
WHERE continent is not null
GROUP BY continent
ORDER BY TotalDeathCount DESC
-- Global Numbers
-- Total Death Percentage
SELECT SUM(new_cases) AS Total_Cases, SUM(CAST(new_deaths AS int)) AS Total_Deaths, SUM(CAST(new_deaths AS int))/SUM(new_cases)*100 AS DeathPercentage
FROM [COVID 19 PROJECT]..CovidDeaths$
WHERE continent is not null
--GROUP BY date
ORDER BY 1,2
--Total Death Percentage by Date
SELECT date, SUM(new_cases) AS Total_Cases, SUM(CAST(new_deaths AS int)) AS Total_Deaths, ROUND(SUM(CAST(new_deaths AS int))/SUM(new_cases)*100,2) AS DeathPercentage
FROM [COVID 19 PROJECT]..CovidDeaths$
WHERE continent is not null
GROUP BY date
ORDER BY 1,2
-- Looking at Total Population vs Vaccinations
-- What is Total Population in the World getting Vaccinated?
SELECT dea.continent, dea.location, dea.date, dea.population, vac.new_vaccinations,
SUM(CAST(vac.new_vaccinations as bigint)) OVER (Partition by dea.Location ORDER by dea.location,
dea.Date) AS RollingPeopleVaccinated
---, (RollingPeopleVaccinated/population)*100
FROM [COVID 19 PROJECT]..CovidVaccinations$ vac
JOIN [COVID 19 PROJECT]..CovidDeaths$ dea
ON dea.location = vac.location
and dea.date = vac.date
WHERE dea.continent is not null
ORDER BY 2,3
-- USE CTE
With PopvsVac (Continent, Location, Date, Population, New_Vaccinations, RollingPeopleVaccinated)
AS
(
SELECT dea.continent, dea.location, dea.date, dea.population, vac.new_vaccinations,
SUM(CAST(vac.new_vaccinations as bigint)) OVER (Partition by dea.Location ORDER by dea.location,
dea.Date) AS RollingPeopleVaccinated
---, (RollingPeopleVaccinated/population)*100
FROM [COVID 19 PROJECT]..CovidVaccinations$ vac
JOIN [COVID 19 PROJECT]..CovidDeaths$ dea
ON dea.location = vac.location
and dea.date = vac.date
WHERE dea.continent is not null
)
SELECT *, (RollingPeopleVaccinated/Population)*100
FROM PopvsVac
-- TEMP TABLE
Create Table #PercentPopulationVaccinated
(
Continent nvarchar(255),
Location nvarchar(255),
Date datetime,
Population numeric,
New_vaccinations numeric,
RollingPeopleVaccinated numeric
)
Insert into #PercentPopulationVaccinated
SELECT dea.continent, dea.location, dea.date, dea.population, vac.new_vaccinations,
SUM(CAST(vac.new_vaccinations as bigint)) OVER (Partition by dea.Location ORDER by dea.location,
dea.Date) AS RollingPeopleVaccinated
---, (RollingPeopleVaccinated/population)*100
FROM [COVID 19 PROJECT]..CovidVaccinations$ vac
JOIN [COVID 19 PROJECT]..CovidDeaths$ dea
ON dea.location = vac.location
and dea.date = vac.date
--WHERE dea.continent is not null
SELECT *, (RollingPeopleVaccinated/Population)*100
FROM #PercentPopulationVaccinated
-- Crreating View to store data for later visualizations
Create View PercentPopulationVaccinated as
SELECT dea.continent, dea.location, dea.date, dea.population, vac.new_vaccinations,
SUM(CAST(vac.new_vaccinations as bigint)) OVER (Partition by dea.Location ORDER by dea.location,
dea.Date) AS RollingPeopleVaccinated
---, (RollingPeopleVaccinated/population)*100
FROM [COVID 19 PROJECT]..CovidVaccinations$ vac
JOIN [COVID 19 PROJECT]..CovidDeaths$ dea
ON dea.location = vac.location
and dea.date = vac.date
WHERE dea.continent is not null
--ORDER BY 2,3
SELECT *
FROM PercentPopulationVaccinated
Understanding Total Deaths on a Global Scale:
Another crucial aspect of my project involved examining the total number of deaths caused by COVID-19 worldwide. By extracting data on fatalities using SQL, I analyzed the impact of the virus in terms of human lives lost. Visualizing this data in Tableau allowed me to present a clear picture of the devastating consequences of the pandemic.
Calculating the Percentage of Population Infected:
To gain a deeper understanding of the virus's impact, I used SQL to calculate the percentage of populations affected by COVID-19 in various countries. By comparing the number of confirmed cases to the total population, I was able to uncover the severity of the virus's spread relative to the size of each country. Tableau visualizations helped depict this information in a concise and impactful manner.
/*
Queries used for Tableau Project
*/
-- 1.
Select SUM(new_cases) as total_cases, SUM(cast(new_deaths as int)) as total_deaths, SUM(cast(new_deaths as int))/SUM(New_Cases)*100 as DeathPercentage
From [COVID 19 PROJECT]..CovidDeaths$
--Where location like '%states%'
where continent is not null
--Group By date
order by 1,2
-- 2.
-- We take these out as they are not inluded in the above queries and want to stay consistent
-- European Union is part of Europe
Select location, SUM(cast(new_deaths as int)) as TotalDeathCount
From [COVID 19 PROJECT]..CovidDeaths$
--Where location like '%states%'
Where continent is null
and location not in ('World', 'European Union', 'International')
Group by location
order by TotalDeathCount desc
-- 3.
Select Location, Population, MAX(total_cases) as HighestInfectionCount, Max((total_cases/population))*100 as PercentPopulationInfected
From [COVID 19 PROJECT]..CovidDeaths$
--Where location like '%states%'
Group by Location, Population
order by PercentPopulationInfected desc
-- 4.
Select Location, Population,date, MAX(total_cases) as HighestInfectionCount, Max((total_cases/population))*100 as PercentPopulationInfected
From [COVID 19 PROJECT]..CovidDeaths$
--Where location like '%states%'
Group by Location, Population, date
order by PercentPopulationInfected desc
Conclusion:
Through my project on COVID-19 data analysis using SQL and Tableau, I was able to provide valuable insights into the global impact of the pandemic. By identifying countries with the highest infection rates, visualizing total deaths on a global scale, and calculating the percentage of population infected, I shed light on the severity of COVID-19 and its ramifications worldwide. This project not only allowed me to showcase my technical skills in SQL and Tableau but also deepened my understanding of data analysis and its importance in addressing global challenges.
As we continue to navigate the ongoing pandemic, the ability to analyze and interpret data becomes increasingly valuable. By leveraging the power of SQL and visualization tools like Tableau, we can contribute to a better understanding of COVID-19 and help inform decision-making processes. I am proud to have undertaken this project and believe it is a valuable addition to my portfolio, demonstrating my ability to harness data for impactful insights.
Note: Please ensure that you use accurate and up-to-date COVID-19 data sources and give credit to the sources used in your project.