- 👟 Ready To Run!
- 🔬 Data Science
- 🚀 Data Exploration and Cleaning
- 📈 Statistics & Graphing
- ⚕️ Healthcare
Requirements
- 🗺️ Living Atlas configured
In this part of the study, we will explore shortage of Mental Healthcare Providers and OB-GYN Providers. We will study:
# Import Libraries
from IPython.display import display
# Import arcgis
import arcgis
from arcgis.gis import GIS
from arcgis.features import FeatureLayer
from arcgis.mapping import WebMap
# Import libraries for data exploration
import pandas as pd
pd.set_option('display.max_columns', 500)
import numpy as np
# Import plotting libraries
%matplotlib inline
import matplotlib.pyplot as plt
import seaborn as sns
# Import library for time
import time
# Create a GIS connection
gis = GIS("home")
Provider data
was geocoded using the GeoAnalytics server. We will get the geocoded provider data feature layer
. We will also get County level demographic data
from the 2018 USA Population Density layer.
# Search the feature layer
data_groups = gis.groups.search('"ArcGIS Sample Notebooks Data" owner:esri_notebook',
outside_org=True)
group_query = f"group: {data_groups[0].id}" if data_groups else ""
provider_data_item = gis.content.search(f'provider_data_geocoded_7_30 {group_query}',
item_type = "Feature Service",
outside_org=True)[0]
provider_data_item
# Get the layer needed for analysis
provider_data_layer = provider_data_item.layers[0]
provider_data_layer
# Look at the first 5 fields and their data types
for f in provider_data_layer.properties.fields[:5]:
print(f['name'], "{0:>35}".format(f['type']))
# Search for and get Population data layer
popdensity = gis.content.get('b9095ebdf5e8442588ab3f269dc7ee5e') #2020 layer
popdensity
# Check first 5 layers in population Density
popdensity.layers[:5]
# Look at first few field names for county layer
county_layer = popdensity.layers[46]
print('FIELD NAME', "{0:>50}".format('FIELD ALIAS'))
for field in county_layer.properties.fields[:10]:
print(field['name'], "{0:>50}".format(field['alias']))
%%time
# Get specific attributes for Counties
county_layer = popdensity.layers[46]
county_df = pd.DataFrame()
out_fields = ['Shape','ST_ABBREV','NAME','ASIAN_CY','AMERIND_CY','AVGHHSZ_CY','AVGHINC_CY','BLACK_CY',
'EDUCBASECY','HISPPOP_CY','MEDAGE_CY','MINORITYCY','OTHRACE_CY','PCI_CY','POPDENS_CY',
'UNEMPRT_CY','WHITE_CY','SMCOLL_CY','ASSCDEG_CY','BACHDEG_CY','GRADDEG_CY','TOTPOP_CY']
offset = 0
while offset <= 3000:
chunk_df = county_layer.query(out_fields=out_fields,return_all_records=False,
result_offset=offset,result_record_count=750,as_df=True)
county_df = pd.concat([chunk_df, county_df], ignore_index=True)
offset += 750
county_df.shape
The table below shows a list of providers shortlisted as Mental healthcare providers. Provider Taxonomy codes were filtered using this reference.
Taxonomy Code | Description | Taxonomy Code | Description |
---|---|---|---|
207QG0300X | Family Medicine Geriatric Medicine | 273R00000X | Hospital Units Psychiatric Unit |
103T00000X | Psychologist | 103TA0400X | Psychologist Addiction (Substance Abuse Disorder) |
103TA0700X | Psychologist Adult Development & Aging | 103TC0700X | Psychologist Clinical |
103TC2200X | Psychologist Clinical Child & Adolescent | 103TB0200X | Psychologist Cognitive & Behavioral |
103TC1900X | Psychologist Counseling | 103TE1000X | Psychologist Educational |
103TE1100X | Psychologist Exercise & Sports | 103TF0000X | Psychologist Family |
103TF0200X | Psychologist Forensic | 103TP2701X | Psychologist Group Psychotherapy |
103TH0004X | Psychologist Health | 103TH0100X | Psychologist Health Service |
103TM1700X | Psychologist Men & Masculinity | 103TM1800X | Psychologist Mental Retardation |
103TP0016X | Psychologist Prescribing (Medical) | 103TP0814X | Psychologist Psychoanalysis |
103TP2700X | Psychologist Psychotherapy | 103TR0400X | Psychologist Rehabilitation |
103TS0200X | Psychologist School | 103TW0100X | Psychologist Women |
106E00000X | Assistant Behavior Analyst | 106S00000X | Technician |
103TC0700X | Psychologist Clinical | 2084A0401X | Physicians Psychiatry & Neurology |
2084P0802X | Physicians Addiction Psychiatry | 2084B0002X | Physicians Bariatric Medicine |
2084P0804X | Physicians Child & Adolescent Psychiatry | 2084N0600X | Physicians Clinical Neurophysiology |
2084D0003X | Physicians Diagnostic Neuroimaging | 2084F0202X | Physicians Forensic Psychiatry |
2084P0805X | Physicians Geriatric Psychiatry | 2084H0002X | Physicians Hospice & Palliative Medicine |
2084P0005X | Physicians Neurodevelopmental Disabilities | 2084N0400X | Physicians Neurology |
2084N0402X | Physicians Child Neurology | 2084N0008X | Physicians Neuromuscular Medicine |
2084P2900X | Physicians Pain Medicine | 2084P0800X | Physicians Psychiatry |
2084P0015X | Physicians Psychosomatic Medicine | 2084S0012X | Physicians Sleep Medicine |
2084S0010X | Physicians Sports Medicine | 2084V0102X | Physicians Vascular Neurology |
364SA2200X | Nurse/Assiatant Adult Health | 364SP0808X | Nurse/Assiatant Psychiatric/Mental Health |
364SP0809X | Nurse/Assiatant Mental Health Adult | 364SP0807X | Nurse/Assiatant Mental Health Child & Adolescent |
364SP0810X | Nurse/Assiatant Mental Health Child & Family | 364SP0811X | Nurse/Assiatant Mental Health Chronically Ill |
364SP0812X | Nurse/Assiatant Mental Health Community | 364SP0813X | Nurse/Assiatant Mental Health Geropsychiatric |
283Q00000X | Hospitals Psychiatric Hospital | 261QM0801X | Ambulatory Health Care Facilities/Clinic/Center Mental Health |
2084P0800X | Physician Psychiatry |
Let's explore the distribution of mental health providers in the US using a heatmap.
# Create a map
mental_map = gis.map('USA', 4)
mental_map
This map paints a grim picture of the availability of mental healthcare providers. There are states in every region with vast areas of no mental health providers or very few providers. To list, some of these states include:
# Add provider data to map
renderer = {"renderer": "autocast", #This tells python to use JS autocasting
"type": "heatmap",
"blurRadius":1, # changes the size of the clusters
"maxPixelIntensity":2,
"minPixelIntensity":0,
"field":None}
renderer["colorStops"] = [{"ratio":0,"color":[63, 40, 102, 0]},
{"ratio":0.25,"color":[167,97,170,179]},
{"ratio":0.50,"color":"#7b3ce9"},
{"ratio":0.75,"color":[222,102,0,179]},
{"ratio":1,"color":[244,204,0,179]}]
mental_map.add_layer(provider_data_layer,
{ "type": "FeatureLayer",
"renderer": renderer,
"definition_expression" : "user_taxonomy_code_1 in ('2084P0800X','207QG0300X','273R00000X','103T00000X','103TA0400X','103TA0700X','103TC0700X','103TC2200X','103TB0200X','103TC1900X','103TE1000X','103TE1100X','103TF0000X','103TF0200X','103TP2701X','103TH0004X','103TH0100X','103TM1700X','103TM1800X','103TP0016X','103TP0814X','103TP2700X','103TR0400X','103TS0200X','103TW0100X','106E00000X','106S00000X','2084A0401X','2084P0802X','2084B0002X','2084P0804X','2084N0600X','2084D0003X','2084F0202X','2084P0805X','2084H0002X','2084P0005X','2084N0400X','2084N0402X','2084N0008X','2084P2900X','2084P0015X','2084S0012X','2084S0010X','2084V0102X','364SP0808X','364SP0809X','364SP0807X','364SP0810X','364SP0811X','364SP0812X','364SP0813X','283Q00000X','261QM0801X')"
})
# Add Legend
mental_map.legend = True
# Save as a web map item
mental_map_item = mental_map.save({'title':'mental_heatmap',
'snippet':'Heat map',
'tags':'Mental Health',
'typeKeywords':'WebMap'})
mental_map_item
Let's find out the ratio of population to mental healthcare providers to understand which states have the least number of providers.
We will use provider_data_layer
to subset mental healthcare providers.
%%time
# Get provider data for mental healthcare providers only
taxonomy_codes = ['2084P0800X','207QG0300X','273R00000X','103T00000X','103TA0400X','103TA0700X',
'103TC0700X','103TC2200X','103TB0200X','103TC1900X','103TE1000X','103TE1100X',
'103TF0000X','103TF0200X','103TP2701X','103TH0004X','103TH0100X','103TM1700X',
'103TM1800X','103TP0016X','103TP0814X','103TP2700X','103TR0400X','103TS0200X',
'103TW0100X','106E00000X','106S00000X','2084A0401X','2084P0802X','2084B0002X',
'2084P0804X','2084N0600X','2084D0003X','2084F0202X','2084P0805X','2084H0002X',
'2084P0005X','2084N0400X','2084N0402X','2084N0008X','2084P2900X','2084P0015X',
'2084S0012X','2084S0010X','2084V0102X','364SP0808X','364SP0809X','364SP0807X',
'364SP0810X','364SP0811X','364SP0812X','364SP0813X','283Q00000X','261QM0801X']
out_fields = ['user_npi','user_entity_type','user_provider_gender','user_taxonomy_code_1',
'user_full_address','postal','city','subregion','region','regionabbr']
where_str = "user_taxonomy_code_1 in ({})".format(",".join(f"'{x}'" for x in taxonomy_codes))
out_str = "{}".format(",".join(f"{x}" for x in out_fields))
mental_df = provider_data_layer.query(where=where_str,
out_fields=out_str, as_df=True)
# Create dataframe of mental healthcare provider counts by state
mental_count_df = pd.DataFrame(mental_df['regionabbr'].value_counts().reset_index().values,
columns=['regionabbr','Provider_Count'])
# Plot mental healthcare Providers by State
plt.figure(figsize=(25,12))
sns.barplot(mental_count_df['regionabbr'].iloc[:-19],
mental_count_df['Provider_Count'].iloc[:-19],
color='lightcoral')
plt.title('Mental Healthcare Providers by State', fontsize=22)
plt.xlabel('States', fontsize=18)
plt.ylabel('Provider Count', fontsize=18);
We will use the population density layer at state level to create this dataframe
# State population dataframe
state_layer = popdensity.layers[43]
state_df = state_layer.query(out_fields='STATE_NAME,ST_ABBREV,TOTPOP_CY', as_df=True)
state_df.head()
# Merge provider count and population at state level
# Merge with State data on left to preserve 'polygon' geometry
state_mental_df = pd.merge(state_df,mental_count_df,right_on='regionabbr',
left_on='ST_ABBREV',how='inner')
state_mental_df.head()
# Create new columns that shows people per provider
state_mental_df['people_per_prov'] = state_mental_df['TOTPOP_CY']/state_mental_df['Provider_Count']
# Arrange dataframe by people_per_prov descending
state_mental_df = state_mental_df.sort_values(by=['people_per_prov'], ascending=False)
state_mental_df.head()
# Check geometry type
state_mental_df.spatial.geometry_type
Let's plot the number of people per mental healthcare provider accross all states on a map.
mental_shortage_map = gis.map('USA', zoomlevel=4)
mental_shortage_map
From this map we can see that, Mississippi is the worst with highest no. of people per mental healthcare provider followed by Alabama, Idaho, Texas, Arkansas and Iowa.
mental_shortage_map = gis.map('USA', zoomlevel=4)
mental_shortage_map
# Define Renderer
esriTest = {"renderer": { #This tells python to use JS autocasting
"type": "classBreaks",
"field":"people_per_prov",
"transparency":.5,
"minValue":1}}
# Define Manual Class breaks
esriTest['renderer']["classBreakInfos"] = [{
"classMaxValue": 1000.00,
"label": "0 - 1000.00",
"description": "0 - 1000.00",
"symbol": {
"type": "esriSFS",
"style": "esriSFSSolid",
"color": [255,247,236,178.5],
"outline": {
"style": "esriSLSSolid",
"type": "esriSLS",
"color": [128,128,128,255],
"width": 2
}
}
}, {
"classMaxValue": 1500.00,
"label": "1000.001 - 1500.00",
"description": "1000.001 - 1500.00",
"symbol": {
"type": "esriSFS",
"style": "esriSFSSolid",
"color": [253,220,174,178.5],
"outline": {
"style": "esriSLSSolid",
"type": "esriSLS",
"color": [128,128,128,255],
"width": 2
}
}
}, {
"classMaxValue": 2000.00,
"label": "1500.001 - 2000.00",
"description": "1500.001 - 2000.00",
"symbol": {
"type": "esriSFS",
"style": "esriSFSSolid",
"color": [252,177,123,178.5],
"outline": {
"style": "esriSLSSolid",
"type": "esriSLS",
"color": [128,128,128,255],
"width": 2
}
}
}, {
"classMaxValue": 2500.00,
"label": "2000.001 - 2500.00",
"description": "2000.001 - 2500.00",
"symbol": {
"type": "esriSFS",
"style": "esriSFSSolid",
"color": [241,109,75,178.5],
"outline": {
"style": "esriSLSSolid",
"type": "esriSLS",
"color": [128,128,128,255],
"width": 2
}
}
}, {
"classMaxValue": 3000.00,
"label": "2500.001 - 3000.00",
"description": "2500.001 - 3000.00",
"symbol": {
"type": "esriSFS",
"style": "esriSFSSolid",
"color": [200,28,18,178.5],
"outline": {
"style": "esriSLSSolid",
"type": "esriSLS",
"color": [128,128,128,255],
"width": 2
}
}
}]
# Plot Map using defined Renderer
state_mental_df.spatial.plot(map_widget=mental_shortage_map, renderer=esriTest['renderer'])
# Add Legend
mental_shortage_map.legend=True
# Save as a Web Map Item
mental_shortage_item = mental_shortage_map.save({'title':'mental_shortage_map',
'snippet':'shortage map',
'tags':'Mental Health',
'typeKeywords':'WebMap'})
mental_shortage_item
# Plot No. of People per Mental Heathcare Provider by State
plt.figure(figsize=(25,12))
sns.barplot(state_mental_df['regionabbr'], state_mental_df['people_per_prov'], color='lightcoral')
plt.title('No. of People per Mental Heathcare Provider by State', fontsize=22)
plt.xlabel('States', fontsize=18)
plt.ylabel('No. of People', fontsize=18)
plt.xticks(fontsize=15)
plt.yticks(fontsize=15);
On average, there is 1 mental healthcare provider for ~2754 people in Mississippi compared to 1 mental healthcare provider for ~364 people per provider in DC. The difference is drastic.
Mississippi has the highest number of people per mental healthcare provider. Let's explore Mississippi to find out which counties have the lowest number of providers.
# County population df
county_layer = popdensity.layers[46]
MS_pop_df = county_layer.query(where="ST_ABBREV='MS'",
out_fields='ST_ABBREV,NAME,TOTPOP_CY', as_df=True)
MS_pop_df.head()
# Get provider data for obgyn providers only
mental_MS_df = mental_df[mental_df['regionabbr']=='MS']
mental_MS_df.head()
mental_MS_df.shape
# Create dataframe of provider counts by county
mental_MScounty_df = pd.DataFrame(mental_MS_df['subregion'].value_counts().reset_index().values,
columns=['County','Provider_Count'])
mental_MScounty_df.head()
# Merge provider count and women data at county level for ND
county_mental_df = pd.merge(MS_pop_df,mental_MScounty_df,
left_on='NAME', right_on='County',how='left')
county_mental_df.head()
# Look at null values
county_mental_df[county_mental_df['Provider_Count'].isnull()]
We can see that,
- 18 counties in Mississippi do not have any mental healthcare provider.
We will replace NaN values for Provider Count in these counties with 1 to plot them on the map to see which counties have highest number of people per provider.
county_mental_df['Provider_Count'].replace(np.nan,1,inplace=True)
# Create new columns that shows Mental Healthcare provider by population
county_mental_df['people_per_prov'] = county_mental_df['TOTPOP_CY']/county_mental_df['Provider_Count']
Let's plot the number of people per provider for all counties in Mississippi on a map.
mental_countyshortage_map = gis.map('Mississippi, USA', zoomlevel=7)
mental_countyshortage_map.layout.height="650px"
mental_countyshortage_map
We can see that,
- Marshall County seems to be the worst with 1 provider for 37137 people. Some other counties with high population per provider are Pontotoc, Neshoba, Scott and Union.
# Define Renderer
mentalCountyTest = {"renderer": { #This tells python to use JS autocasting
"type": "classBreaks",
"field":"people_per_prov",
"transparency":.5,
"minValue":1}}
# Define Manual Class breaks
mentalCountyTest['renderer']["classBreakInfos"] = [{
"classMaxValue": 7500.00,
"label": "0 - 7500.00",
"description": "0 - 7500.00",
"symbol": {
"type": "esriSFS",
"style": "esriSFSSolid",
"color": [255,247,236,178.5],
"outline": {
"style": "esriSLSSolid",
"type": "esriSLS",
"color": [128,128,128,255],
"width": 2
}
}
}, {
"classMaxValue": 15000.00,
"label": "7500.001 - 15000.00",
"description": "7500.001 - 15000.00",
"symbol": {
"type": "esriSFS",
"style": "esriSFSSolid",
"color": [253,220,174,178.5],
"outline": {
"style": "esriSLSSolid",
"type": "esriSLS",
"color": [128,128,128,255],
"width": 2
}
}
}, {
"classMaxValue": 22500.00,
"label": "15000.001 - 22500.00",
"description": "15000.001 - 22500.00",
"symbol": {
"type": "esriSFS",
"style": "esriSFSSolid",
"color": [252,177,123,178.5],
"outline": {
"style": "esriSLSSolid",
"type": "esriSLS",
"color": [128,128,128,255],
"width": 2
}
}
}, {
"classMaxValue": 30000.00,
"label": "22500.001 - 30000.00",
"description": "22500.001 - 30000.00",
"symbol": {
"type": "esriSFS",
"style": "esriSFSSolid",
"color": [241,109,75,178.5],
"outline": {
"style": "esriSLSSolid",
"type": "esriSLS",
"color": [128,128,128,255],
"width": 2
}
}
}, {
"classMaxValue": 37500.00,
"label": "30000.001 - 37500.00",
"description": "30000.001 - 37500.00",
"symbol": {
"type": "esriSFS",
"style": "esriSFSSolid",
"color": [200,28,18,178.5],
"outline": {
"style": "esriSLSSolid",
"type": "esriSLS",
"color": [128,128,128,255],
"width": 2
}
}
}]
# Plot Map using defined Renderer
county_mental_df.spatial.plot(map_widget=mental_countyshortage_map, renderer=mentalCountyTest['renderer'])
# Plot legend
mental_countyshortage_map.legend = True
# Save as a Web Map Item
mental_cs_item = mental_countyshortage_map.save({'title':'mental_countyshortage_map',
'snippet':'shortage map',
'tags':'Mental Health',
'typeKeywords':'WebMap'})
mental_cs_item
The table below shows top 10 counties with highest number of people per mental healthcare provider
# Counties with highest number of people per mental healthcare provider
county_ppp = county_mental_df.sort_values(by='people_per_prov', ascending=False)[['NAME','TOTPOP_CY','Provider_Count','people_per_prov']]
county_ppp.head(10)
To visualize high level key facts about a geography and gain greater insights, various infographics can be created using ArcGIS Business Analyst. Learn more about ArcGIS Business Analyst.
The image below shows an infographic of key facts for Marshall County, MS
.
To summarize, Mississippi has the highest number of people per Mental Healthcare Provider. Also:
The table below shows a list of providers shortlisted as OB-GYN providers. Provider Taxonomy codes were filtered for OB-GYN providers using this reference.
Taxonomy Code | Description |
---|---|
207V00000X | Obstetrics & Gynecology |
207VC0200X | Critical Care Medicine |
207VF0040X | Female Pelvic Medicine and Reconstructive Surgery |
207VX0201X | Gynecologic Oncology |
207VG0400X | Gynecology |
207VH0002X | Hospice and Palliative Medicine |
207VM0101X | Maternal & Fetal Medicine |
207VB0002X | Obesity Medicine |
207VX0000X | Obstetrics |
207VE0102X | Reproductive Endocrinology |
363LX0001X | Nurse Practitioner-Obstetrics & Gynecology |
163WR1000X | Registered Nurse-Reproductive Endocrinology/Infertility |
163WW0101X | Registered Nurse-Women's Health Care, Ambulatory |
282NW0100X | General Acute Care Hospital - Women |
Let's explore the distribution of OB-GYN providers in US using a heatmap.
# Create Map
women_map = gis.map('USA', 4)
women_map
This map paints a grim picture of the availability of OBGYN healthcare providers. We can see vast areas in Midwest and West with NO or very few OBGYN providers.
# Define renderer and add provider data for OBGYN providers
renderer = {"renderer": "autocast", #This tells python to use JS autocasting
"type": "heatmap",
"blurRadius":1, # changes the size of the clusters
"maxPixelIntensity":2,
"minPixelIntensity":0,
"field":None}
renderer["colorStops"] = [{"ratio":0,"color":[63, 40, 102, 0]},
{"ratio":0.25,"color":[167,97,170,179]},
{"ratio":0.50,"color":"#7b3ce9"},
{"ratio":0.75,"color":[222,102,0,179]},
{"ratio":1,"color":[244,204,0,179]}]
women_map.add_layer(provider_data_layer,
{ "type": "FeatureLayer",
"renderer": renderer,
"definition_expression" : "user_taxonomy_code_1 in ('207V00000X','207VC0200X','207VF0040X','207VX0201X','207VG0400X','207VH0002X','207VM0101X','207VB0002X','207VX0000X','207VE0102X','363LX0001X','163WR1000X','163WW0101X','282NW0100X')"
# "definition_expression" : "user_taxonomy_code_1 = '207VC0200X' or user_taxonomy_code_1 = '207V00000X'"
})
# Add Legend
women_map.legend = True
# Save as a web map item
women_map_item = women_map.save({'title':'women_heatmap',
'snippet':'Heat map',
'tags':'Women Health',
'typeKeywords':'WebMap'})
women_map_item
In this section our goal is to identify which states have the highest mother's to providers ratio. Let's get data from the American Community Survey (ACS) about fertility in past 12 months by age of mother using ACS_Fertility_by_Age_Boundaries layer at State, County and Tract level. We will merge OBGYN providers data with Fertility data to achieve this.
fertility_item = gis.content.get('7995cd11748249fb8e23c4a5ad901dde')
fertility_item
fertility_item.layers
# Define layers for State, County and Tract (Percent of women 15 to 50 who had a birth in the past 12 months)
fertility_state = fertility_item.layers[0]
fertility_county = fertility_item.layers[1]
fertility_tract = fertility_item.layers[2]
We will use the fertility layer at state level to create this dataframe
# State population dataframe
fertility_df = fertility_state.query(where="B13016_001E>1", as_df=True)
fertility_df.head()
We will use provider_data_layer
to subset OB-GYN providers
%%time
# Get provider data for obgyn providers only
taxonomy_codes = ['207V00000X','207VC0200X','207VF0040X','207VX0201X','207VG0400X',
'207VH0002X','207VM0101X','207VB0002X','207VX0000X','207VE0102X',
'363LX0001X','163WR1000X','163WW0101X','282NW0100X']
out_fields = ['user_npi','user_entity_type','user_provider_gender','user_taxonomy_code_1',
'user_full_address','postal','city','subregion','region','regionabbr']
where_str = "user_taxonomy_code_1 in ({})".format(",".join(f"'{x}'" for x in taxonomy_codes))
out_str = "{}".format(",".join(f"{x}" for x in out_fields))
obgyn_df = provider_data_layer.query(where=where_str,
out_fields=out_str, as_df=True)
obgyn_df.head()
# Create dataframe of obgyn provider counts by state
obgyn_count_df = pd.DataFrame(obgyn_df['regionabbr'].value_counts().reset_index().values,
columns=['regionabbr','Provider_Count'])
# Plot OB-GYN Providers by State
plt.figure(figsize=(25,12))
sns.barplot(obgyn_count_df['regionabbr'].iloc[:-8],
obgyn_count_df['Provider_Count'].iloc[:-8],
color='lightcoral')
plt.title('OB-GYN Providers by State', fontsize=22)
plt.xlabel('States', fontsize=18)
plt.ylabel('Provider Count', fontsize=18);
# Merge provider count and women_df at state level
state_obgyn_df = pd.merge(fertility_df,obgyn_count_df,right_on='regionabbr',
left_on='STUSPS',how='inner')
Create 2 new attributes: women_per_prov
and mother_per_prov
to capture the number of women and mothers per OB-GYN provider
# Create new columns that shows provider by women pop
state_obgyn_df['women_per_prov'] = state_obgyn_df['B13016_001E']/state_obgyn_df['Provider_Count']
# Create new columns that shows provider by mother pop
state_obgyn_df['mother_per_prov'] = state_obgyn_df['B13016_002E']/state_obgyn_df['Provider_Count']
# Arrange dataframe by mother_per_prov descending
state_obgyn_df = state_obgyn_df.sort_values(by=['mother_per_prov'], ascending=False)
# Subset dataframe for specific columns
state_obgyn_df = state_obgyn_df.loc[:,['OBJECTID','SHAPE','regionabbr','Provider_Count',
'women_per_prov','mother_per_prov']]
state_obgyn_df.head()
# Check geometry type
state_obgyn_df.spatial.geometry_type
Let's plot the number of Mothers per OBGYN provider accross all states on a map.
obgyn_shortage_map = gis.map('USA', zoomlevel=4)
obgyn_shortage_map
From this map, we can see that number of women (ages 15 to 50) who had a birth in the past 12 months per OBGYN provider is high in Utah, North Dakota, South Dakota and Arkansas.
State | No. of Mothers per OBGYN provider |
---|---|
ND | 112 |
UT | 100 |
AR | 93 |
SD | 93 |
ID | 92 |
# Define Renderer
stateObgynTest = {"renderer": { #This tells python to use JS autocasting
"type": "classBreaks",
"field":"mother_per_prov",
"transparency":.5,
"minValue":1}}
# Define Manual Class breaks
stateObgynTest['renderer']["classBreakInfos"] = [{
"classMaxValue": 23.00,
"label": "0 - 23.00",
"description": "0 - 23.00",
"symbol": {
"type": "esriSFS",
"style": "esriSFSSolid",
"color": [255,247,236,178.5],
"outline": {
"style": "esriSLSSolid",
"type": "esriSLS",
"color": [128,128,128,255],
"width": 2
}
}
}, {
"classMaxValue": 46.00,
"label": "23.001 - 46.00",
"description": "23.001 - 46.00",
"symbol": {
"type": "esriSFS",
"style": "esriSFSSolid",
"color": [253,220,174,178.5],
"outline": {
"style": "esriSLSSolid",
"type": "esriSLS",
"color": [128,128,128,255],
"width": 2
}
}
}, {
"classMaxValue": 69.00,
"label": "46.001 - 69.00",
"description": "46.001 - 69.00",
"symbol": {
"type": "esriSFS",
"style": "esriSFSSolid",
"color": [252,177,123,178.5],
"outline": {
"style": "esriSLSSolid",
"type": "esriSLS",
"color": [128,128,128,255],
"width": 2
}
}
}, {
"classMaxValue": 92.00,
"label": "69.001 - 92.00",
"description": "69.001 - 92.00",
"symbol": {
"type": "esriSFS",
"style": "esriSFSSolid",
"color": [241,109,75,178.5],
"outline": {
"style": "esriSLSSolid",
"type": "esriSLS",
"color": [128,128,128,255],
"width": 2
}
}
}, {
"classMaxValue": 115.00,
"label": "92.001 - 115.00",
"description": "92.001 - 115.00",
"symbol": {
"type": "esriSFS",
"style": "esriSFSSolid",
"color": [200,28,18,178.5],
"outline": {
"style": "esriSLSSolid",
"type": "esriSLS",
"color": [128,128,128,255],
"width": 2
}
}
}]
# Plot Map using defined Renderer
state_obgyn_df.spatial.plot(map_widget=obgyn_shortage_map, renderer=stateObgynTest['renderer'])
# Add Legend
obgyn_shortage_map.legend=True
# Plot No. of Mothers (age 15 to 50) per OB-GYN Provider by State
plt.figure(figsize=(25,12))
sns.barplot(state_obgyn_df['regionabbr'], state_obgyn_df['mother_per_prov'],color='lightcoral')
plt.title('No. of Mothers (age 15 to 50) per OB-GYN Provider by State', fontsize=22)
plt.xlabel('States', fontsize=18)
plt.ylabel('No. of Mothers (age 15 to 50)', fontsize=18)
plt.xticks(fontsize=15)
plt.yticks(fontsize=15);
On average, there were ~112 mothers and 1655 women per provider in ND compared to ~31 mothers and 662 women per provider in DC. The difference is drastic.
North Dakota has the highest number of mothers and women per OB-GYN provider. From the fertility map above, we also saw that percent of women (ages 15 to 50) who had a birth in the past 12 months is high in North Dakota.
Let's explore North Dakota to find out which counties have the lowest number of OBGYN providers.
# State population df
ND_fertility_df = fertility_county.query(where="STATE='North Dakota'", as_df=True)
ND_fertility_df.head()
obgyn_ND_df = obgyn_df[obgyn_df['regionabbr']=='ND']
obgyn_ND_df.head()
obgyn_ND_df.shape
# Create dataframe of obgyn provider counts by county
obgyn_NDcounty_df = pd.DataFrame(obgyn_ND_df['subregion'].value_counts().reset_index().values, columns=['County','Provider_Count'])
obgyn_NDcounty_df.head()
# Merge provider count and women data at county level for ND
county_obgyn_df = pd.merge(ND_fertility_df,obgyn_NDcounty_df,left_on='NAME', right_on='County',how='left')
# Look at null values
county_obgyn_df[county_obgyn_df['Provider_Count'].isnull()][['NAME','Provider_Count']]
county_obgyn_df[county_obgyn_df['Provider_Count'].isnull()].shape
We can see that:
- 43 out of 53 counties in ND do not have any Obgyn healthcare providers.
We will replace NaN values for Provider Count in these counties with 1 to plot them on the map to see which counties have highest number of mothers per provider.
# Replace NaN with 1
county_obgyn_df['Provider_Count'].replace(np.nan,1,inplace=True)
# Create new columns that shows provider by women pop
county_obgyn_df['women_per_prov'] = county_obgyn_df['B13016_001E']/county_obgyn_df['Provider_Count']
# Create new columns that shows provider by mother pop
county_obgyn_df['mother_per_prov'] = county_obgyn_df['B13016_002E']/county_obgyn_df['Provider_Count']
# Arrange dataframe by mother_per_prov descending and then B13016_002E (women who gave birht) descending
county_obgyn_df = county_obgyn_df.sort_values(by=['mother_per_prov','B13016_002E'],
ascending=False)[['OBJECTID','SHAPE','NAME','Provider_Count',
'B13016_001E','B13016_002E','women_per_prov',
'mother_per_prov']]
county_obgyn_df.columns = ['OBJECTID','SHAPE','Name','Provider_Count', "Total Women (15 to 50)",
"Women who had birth (past 12 months)", 'women_per_prov', 'mother_per_prov']
county_obgyn_df.head()
Let's plot the number of Mothers per OBGYN provider for all counties in North Dakota on a map.
obgyn_countyshortage_map = gis.map('North Dakota, USA', zoomlevel=6)
obgyn_countyshortage_map
We can see that:
- Morton County seems to be the worst with 555 women who gave birth but no provider.
# Define Renderer
obgynCountyTest = {"renderer": { #This tells python to use JS autocasting
"type": "classBreaks",
"field":"mother_per_prov",
"transparency":.5,
"minValue":1}}
# Define Manual Class breaks
obgynCountyTest['renderer']["classBreakInfos"] = [{
"classMaxValue": 100.00,
"label": "0 - 100.00",
"description": "0 - 100.00",
"symbol": {
"type": "esriSFS",
"style": "esriSFSSolid",
"color": [255,247,236,178.5],
"outline": {
"style": "esriSLSSolid",
"type": "esriSLS",
"color": [128,128,128,255],
"width": 2
}
}
}, {
"classMaxValue": 200.00,
"label": "100.001 - 200.00",
"description": "100.001 - 200.00",
"symbol": {
"type": "esriSFS",
"style": "esriSFSSolid",
"color": [253,220,174,178.5],
"outline": {
"style": "esriSLSSolid",
"type": "esriSLS",
"color": [128,128,128,255],
"width": 2
}
}
}, {
"classMaxValue": 300.00,
"label": "200.001 - 300.00",
"description": "200.001 - 300.00",
"symbol": {
"type": "esriSFS",
"style": "esriSFSSolid",
"color": [252,177,123,178.5],
"outline": {
"style": "esriSLSSolid",
"type": "esriSLS",
"color": [128,128,128,255],
"width": 2
}
}
}, {
"classMaxValue": 400.00,
"label": "300.001 - 400.00",
"description": "300.001 - 400.00",
"symbol": {
"type": "esriSFS",
"style": "esriSFSSolid",
"color": [241,109,75,178.5],
"outline": {
"style": "esriSLSSolid",
"type": "esriSLS",
"color": [128,128,128,255],
"width": 2
}
}
}, {
"classMaxValue": 700.00,
"label": "400.001 - 700.00",
"description": "400.001 - 700.00",
"symbol": {
"type": "esriSFS",
"style": "esriSFSSolid",
"color": [200,28,18,178.5],
"outline": {
"style": "esriSLSSolid",
"type": "esriSLS",
"color": [128,128,128,255],
"width": 2
}
}
}]
# Plot Map using defined Renderer
county_obgyn_df.spatial.plot(map_widget=obgyn_countyshortage_map, renderer=obgynCountyTest['renderer'])
# Add legend
obgyn_countyshortage_map.legend=True
# Save as a Web Map Item
obgyn_cs_item = obgyn_countyshortage_map.save({'title':'obgyn_countyshortage_map',
'snippet':'shortage map',
'tags':'Women Health',
'typeKeywords':'WebMap'})
obgyn_cs_item
To visualize high level key facts about a geography and gain greater insights, various infographics can be created using ArcGIS Business Analyst. Learn more about ArcGIS Business Analyst.
The image below shows an infographic of key facts for Morton County, ND
.
North Dakota has the highest number of women (ages 15 to 50) who had a birth in the past 12 months per OBGYN provider. Also:
To summarize, we explored the shortage of Mental Health and OBGYN Providers accross United States. We saw that:
[1] https://www.samhsa.gov/data/sites/default/files/NSDUH-FFR1-2015/NSDUH-FFR1-2015/NSDUH-FFR1-2015.pdf
[2] https://www.nami.org/Learn-More/Mental-Health-By-the-Numbers
[3] https://www.nimh.nih.gov/health/statistics/suicide.shtml
[4] https://news.aamc.org/patient-care/article/labor-pains-obgyn-shortage/
[5] https://www.medpagetoday.com/obgyn/generalobgyn/73785
[6] http://fortune.com/2018/06/27/these-cities-are-most-likely-to-face-an-ob-gyn-shortage-by-2020/