Analysis Notebooks:
  1. Part-1 COVID-19 trends in US nursing homes - Part 1/2
  2. Part-2 COVID-19 trends in US nursing homes - Part 2/2
  3. Part-3 Covid Data Preparation

Key Libraries: ArcGIS API for Python, ArcPy, Matplotlib, Seaborn, Statsmodels, pmdarima.

Language: Python

Project Details:

The Covid-19 pandemic has swept through long-term care facilities in the United States, which accounted for nearly 4% of all cases (1.4M) and 31% of all deaths (182K) as of June 2021. The impact on nursing homes is even worse: nearly 1 in 10 residents have died.

In this 2-part study, we use the Nursing Home Covid-19 data provided by the Centers for Medicare and Medicaid Services. Each record in the data holds information about an individual nursing home in the United States for one week between May 25, 2020 and June 6, 2021. We will:

  • Explore the data to understand the distribution of average weekly resident deaths per 1000 residents.
  • Cluster the data using Time Series Clustering to better understand spatiotemporal patterns in our data.
  • Generate forecasts for average weekly resident deaths per 1000 residents using:
    • Exponential Smoothing Forecast model
    • Forest-based Forecast model
    • ARIMA model
  • Create plots to visualize the fitted values, forecasted values and their confidence intervals, and actual data values for all models.
  • Plot forecast errors generated by all models.

Part-1 COVID-19 trends in US nursing homes - Part 1/2
In this notebook, we will:

  • Start by exploring the data to understand the distribution of average weekly resident cases and deaths per 1000 residents.
  • Create a Space Time Cube to structure the data into a netCDF data format and aggregate resident death rates at the US county geographic level.
  • Cluster the data using Time Series Clustering to better understand spatiotemporal patterns in our data.
  • Generate forecasts for average weekly resident deaths per 1000 residents using Exponential Smoothing Forecast and Forest-based Forecast tools.
  • Plot forecast and validation errors as generated by the two models.
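
The forecasting and validation steps above can be illustrated with a minimal pure-Python sketch of simple exponential smoothing. This is not the ArcGIS Exponential Smoothing Forecast tool used in the notebook; the function names, the smoothing factor `alpha`, and the sample values are assumptions made only for illustration. The last few weeks are held out, a flat forecast is produced from the final smoothed level, and the validation errors are the differences between the held-out values and that forecast.

```python
def simple_exponential_smoothing(series, alpha):
    """Return (fitted, level) for simple exponential smoothing.

    fitted[t] is the one-step-ahead prediction for series[t];
    level is the smoothed level after the last observation.
    """
    if not series:
        raise ValueError("series must not be empty")
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    level = series[0]            # initialise the level with the first value
    fitted = [level]
    for y in series[1:]:
        fitted.append(level)     # prediction for this week is the prior level
        level = alpha * y + (1 - alpha) * level
    return fitted, level


def holdout_forecast(series, alpha, n_holdout):
    """Fit on all but the last n_holdout points and forecast the rest.

    Returns (forecast, errors) where errors[i] = actual[i] - forecast[i].
    """
    if not 0 < n_holdout < len(series):
        raise ValueError("n_holdout must be between 1 and len(series) - 1")
    train, actual = series[:-n_holdout], series[-n_holdout:]
    _, level = simple_exponential_smoothing(train, alpha)
    forecast = [level] * n_holdout   # simple smoothing gives a flat forecast
    errors = [a - f for a, f in zip(actual, forecast)]
    return forecast, errors


# Made-up weekly deaths per 1000 residents for a single county.
weekly_rate = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]
fitted, level = simple_exponential_smoothing(weekly_rate[:4], alpha=0.5)
forecast, errors = holdout_forecast(weekly_rate, alpha=0.5, n_holdout=2)
```

A flat forecast lags behind a trending series, which is why the held-out errors in this toy example are large. Variants that add trend and seasonal components address this, at the cost of more parameters.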

Part-2 COVID-19 trends in US nursing homes - Part 2/2
In this notebook, we will:

  • Build ARIMA models to generate forecasts for average weekly resident deaths per 1000 residents in each county.
  • Combine ARIMA results with the results of the Exponential Smoothing and Forest-based Forecast models and with the actual data.
  • Create plots to visualize the fitted values, forecasted values and their confidence intervals, and actual data values for all models.
  • Plot forecast errors generated by all models.
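
As a rough illustration of how the forecast errors of several models can be compared before plotting, the sketch below ranks models by root mean squared error (RMSE) against the actual values. The model names, the helper `rank_models`, and all numbers are hypothetical and are not taken from the notebook results.

```python
from math import sqrt


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return sqrt(sum(squared) / len(squared))


def rank_models(actual, forecasts):
    """Return [(model_name, rmse), ...] sorted from lowest to highest RMSE."""
    scores = {name: rmse(actual, preds) for name, preds in forecasts.items()}
    return sorted(scores.items(), key=lambda item: item[1])


# Made-up validation data for one county (deaths per 1000 residents).
actual = [3.0, 2.0, 1.0]
forecasts = {
    "exponential_smoothing": [3.0, 3.0, 3.0],
    "forest_based": [2.0, 2.0, 2.0],
    "arima": [3.0, 2.0, 2.0],
}
ranking = rank_models(actual, forecasts)
best_model = ranking[0][0]
```

RMSE penalizes large misses more heavily than small ones, so a model with one badly missed week can rank below a model that is slightly off every week.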

Part-3 Covid Data Preparation
This notebook cleans and engineers the data used in the Covid-19 Nursing Home Resident Deaths study. We downloaded the data as a .csv file in which each record holds information about an individual nursing home in the United States for one week between May 25, 2020 and June 6, 2021.
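
One way to derive weekly resident deaths per 1000 residents at the county level from facility records is sketched below, using only the Python standard library. The column names (`week_ending`, `county`, `resident_deaths`, `residents`), the sample rows, and the choice to skip records with missing counts are assumptions for illustration; the actual CMS file and the notebook's cleaning rules may differ.

```python
import csv
import io
from collections import defaultdict

# Made-up sample with hypothetical column names; one row per facility per week.
SAMPLE_CSV = """week_ending,county,resident_deaths,residents
2020-05-31,Alpha,1,100
2020-05-31,Alpha,3,300
2020-05-31,Beta,0,50
2020-06-07,Alpha,2,400
2020-06-07,Beta,,50
"""


def county_weekly_death_rate(csv_text):
    """Return {(county, week_ending): deaths per 1000 residents}."""
    deaths = defaultdict(float)
    residents = defaultdict(float)
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Assumption: records with a missing count are dropped, not imputed.
        if not row["resident_deaths"] or not row["residents"]:
            continue
        key = (row["county"], row["week_ending"])
        deaths[key] += float(row["resident_deaths"])
        residents[key] += float(row["residents"])
    # Sum facilities first, then divide, so large homes weigh more.
    return {
        key: 1000.0 * deaths[key] / residents[key]
        for key in deaths
        if residents[key] > 0
    }


rates = county_weekly_death_rate(SAMPLE_CSV)
```

Summing deaths and residents across facilities before dividing weights each county's rate by facility size. Averaging per-facility rates instead would give small homes the same influence as large ones.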