# Import Libraries
from arcgis.gis import GIS
from arcgis.geoenrichment import Country
# Create a GIS Connection
gis = GIS(profile='your_online_profile')
Each country has several named statistical areas in a hierarchy of geography levels (such as states, counties, zip codes, etc). The Country
class can be used to discover the data collections, sub-geographies and available reports for a country. When working with a particular country, you will find it convenient to get a reference to it using the Country.get()
method. This method accepts the country name or its 2 letter abbreviation or ISO3 code and returns an instance of that country.
Let's look into some countries and explore their subgeographies.
# Get US as a country
usa = Country.get('US')
type(usa)
arcgis.geoenrichment.enrichment.Country
Now that we have a country, let's look at different levels of geographies available for U.S.
us_levels = [level for level in usa.levels]
us_levels
[{'id': 'US.WholeUSA', 'name': 'Entire Country', 'isWholeCountry': True, 'adminLevel': 'Admin1', 'singularName': 'United States of America', 'pluralName': 'United States of America'}, {'id': 'US.States', 'name': 'States', 'isWholeCountry': False, 'adminLevel': 'Admin2', 'singularName': 'State', 'pluralName': 'States'}, {'id': 'US.DMA', 'name': 'DMAs', 'isWholeCountry': False, 'adminLevel': '', 'singularName': 'DMA', 'pluralName': 'DMAs', 'description': 'A Designated Market Area (DMA), also referred to as a media market, is a region of the United States that is used to define television and radio markets. DMAs are determined by the Nielsen Company and are usually defined based on metropolitan areas, with suburbs often being combined within. Esri creates these boundaries by dissolving the TIGER 2020 Block Groups with Esri modified Shoreline.'}, {'id': 'US.CD', 'name': 'Congressional Districts', 'isWholeCountry': False, 'adminLevel': '', 'singularName': 'Congressional District', 'pluralName': 'Congressional Districts'}, {'id': 'US.CBSA', 'name': 'CBSAs', 'isWholeCountry': False, 'adminLevel': '', 'singularName': 'CBSA', 'pluralName': 'CBSAs', 'description': 'Core-Based Statistical Areas (CBSAs) are comprised of Metropolitan and Micropolitan Statistical Areas. Metropolitan statistical areas have at least one urbanized area of 50,000 or more population and Micropolitan statistical areas have at least one urban cluster of at least 10,000 but less than 50,000 population. CBSAs are defined by the Office of Management and Budget (OMB) and are comprised of whole counties and Esri created these boundaries by dissolving the TIGER 2020 County Boundaries with Esri modified shoreline.'}, {'id': 'US.Counties', 'name': 'Counties', 'isWholeCountry': False, 'adminLevel': 'Admin3', 'singularName': 'County', 'pluralName': 'Counties'}, {'id': 'US.CSD', 'name': 'County Subdivisions', 'isWholeCountry': False, 'adminLevel': '', 'singularName': 'County Subdivision', 'pluralName': 'County Subdivisions'}, {'id': 'US.ZIP5', 'name': 'ZIP Codes', 'isWholeCountry': False, 'adminLevel': 'Admin4', 'singularName': 'ZIP Code', 'pluralName': 'ZIP Codes'}, {'id': 'US.Places', 'name': 'Cities and Towns (Places)', 'isWholeCountry': False, 'adminLevel': '', 'singularName': 'Place', 'pluralName': 'Places'}, {'id': 'US.Tracts', 'name': 'Census Tracts', 'isWholeCountry': False, 'adminLevel': '', 'singularName': 'Census Tract', 'pluralName': 'Census Tracts'}, {'id': 'US.BlockGroups', 'name': 'Block Groups', 'isWholeCountry': False, 'adminLevel': '', 'singularName': 'Block Group', 'pluralName': 'Block Groups'}]
We can see various levels of subgeographies for the U.S. such as states, counties, cities, places being returned as a dictionary object, and the key-value pair of the dictionary includes id, name, adminLevel
etc.
The subgeographies
property of a country can be used to discover these standard geographic/statistical areas within that country.
This information is available through a heirarchy of dynamic properties (like states, counties, tracts, zip-codes,...). Each such dynamic property reflects the geographical levels within that country, with subgeographies grouped logically under the higher levels of geographies. The properties are dictionaries containing the names of the standard geographic places and their values are instances of NamedArea
class. The NamedArea
objects can be used as study areas in the enrich()
method.
Note: Setting the `IPCompleter.greedy=True` configuration option in Jupyter notebook enables you to dynamically discover the various levels of subgeographies using intellisense, as in the example below:
%config IPCompleter.greedy=True
usa.subgeographies.states
returns a dictionary of different state names along with their details. To view all states and their details you may use usa.subgeographies.states
in the code cell. We will only look at the first few records.
us_states = [state for state in usa.subgeographies.states.items()][:5]
us_states
[('Alabama', <NamedArea name:"Alabama" area_id="01", level="US.States", country="United States">), ('Alaska', <NamedArea name:"Alaska" area_id="02", level="US.States", country="United States">), ('Arizona', <NamedArea name:"Arizona" area_id="04", level="US.States", country="United States">), ('Arkansas', <NamedArea name:"Arkansas" area_id="05", level="US.States", country="United States">), ('California', <NamedArea name:"California" area_id="06", level="US.States", country="United States">)]
We see that various states are listed along with the "NamedArea" object that contains different properties such as name, area_id, level etc. Let's select a state and further explore its subgeographies.
# Select a state
us_state = usa.subgeographies.states['California']
us_state
<NamedArea name:"California" area_id="06", level="US.States", country="United States">
Visualize on a map
# Create a map
state_map = gis.map('California')
state_map
# Plot on a map
state_map.draw(us_state.geometry)
Let's look at different counties in California. Following a similar pattern as above, we will look at the first few counties. To look at all counties you may use usa.subgeographies.states['California'].counties
.
ca_counties = [county for county in usa.subgeographies.states['California'].counties.items()][:5]
ca_counties
[('Alameda_County', <NamedArea name:"Alameda County" area_id="06001", level="US.Counties", country="United States">), ('Alpine_County', <NamedArea name:"Alpine County" area_id="06003", level="US.Counties", country="United States">), ('Amador_County', <NamedArea name:"Amador County" area_id="06005", level="US.Counties", country="United States">), ('Butte_County', <NamedArea name:"Butte County" area_id="06007", level="US.Counties", country="United States">), ('Calaveras_County', <NamedArea name:"Calaveras County" area_id="06009", level="US.Counties", country="United States">)]
Let's select a county in California. Notice how a county can be selected using its key i.e. Alameda_County
or San_Bernardino_County
.
sb_county = usa.subgeographies.states['California'].counties['San_Bernardino_County']
sb_county
<NamedArea name:"San Bernardino County" area_id="06071", level="US.Counties", country="United States">
Visualize on a map
# Create a map
county_map = gis.map('California',6)
county_map
# Plot on a map
county_map.draw(sb_county.geometry)
Let's select a county and look at different tracts in the county. We will look at the first few tracts in San Bernardino
county. To look at all tracts you may use usa.subgeographies.states['California'].counties['San_Bernardino_County'].tracts
.
sb_tracts = [tract for tract in usa.subgeographies.states['California'].counties
['San_Bernardino_County'].tracts.items()][:5]
sb_tracts
[('060710002.03', <NamedArea name:"060710002.03" area_id="06071000203", level="US.Tracts", country="United States">), ('060710002.05', <NamedArea name:"060710002.05" area_id="06071000205", level="US.Tracts", country="United States">), ('060710002.07', <NamedArea name:"060710002.07" area_id="06071000207", level="US.Tracts", country="United States">), ('060710002.08', <NamedArea name:"060710002.08" area_id="06071000208", level="US.Tracts", country="United States">), ('060710004.01', <NamedArea name:"060710004.01" area_id="06071000401", level="US.Tracts", country="United States">)]
Now, let's select a particular tract using its key.
usa.subgeographies.states['California'].counties['San_Bernardino_County'].tracts['060710002.03']
<NamedArea name:"060710002.03" area_id="06071000203", level="US.Tracts", country="United States">
Cities and Towns in a state can be accessed using Places
subgeography. We will look at the first few places in California. To view all places, you may use usa.subgeographies.states['California'].places
.
ca_places = [place for place in usa.subgeographies.states['California'].places.items()][:5]
ca_places
[('Acalanes_Ridge_CDP', <NamedArea name:"Acalanes Ridge CDP" area_id="0600135", level="US.Places", country="United States">), ('Acampo_CDP', <NamedArea name:"Acampo CDP" area_id="0600156", level="US.Places", country="United States">), ('Acton_CDP', <NamedArea name:"Acton CDP" area_id="0600212", level="US.Places", country="United States">), ('Adelanto_city', <NamedArea name:"Adelanto city" area_id="0600296", level="US.Places", country="United States">), ('Adin_CDP', <NamedArea name:"Adin CDP" area_id="0600310", level="US.Places", country="United States">)]
A particular city or town can be selected using its key. So we can use census Acalanes_Ridge_CDP
key to get the details.
CDP
here stands for census designated places. Census designated places (CDPs) are statistical geographic entities representing closely settled, unincorporated communities that are locally recognized and identified by name. They are the statistical equivalents of incorporated places, with the primary differences being the lack of a legally defined boundary and an active, functioning governmental structure.
usa.subgeographies.states['California'].places['Fontana_city']
<NamedArea name:"Fontana city" area_id="0624680", level="US.Places", country="United States">
Zip codes can be accessed at both the state and county levels. zip5
is used to select a zip code subgeography. Let's look at how zip codes can be selected.
Let's view a few zip codes for California and then select one.
zip_ca = [zip_code for zip_code in usa.subgeographies.states['California'].zip5.items()][:5]
zip_ca
[('90001', <NamedArea name:"Los Angeles" area_id="90001", level="US.ZIP5", country="United States">), ('90002', <NamedArea name:"Los Angeles" area_id="90002", level="US.ZIP5", country="United States">), ('90003', <NamedArea name:"Los Angeles" area_id="90003", level="US.ZIP5", country="United States">), ('90004', <NamedArea name:"Los Angeles" area_id="90004", level="US.ZIP5", country="United States">), ('90005', <NamedArea name:"Los Angeles" area_id="90005", level="US.ZIP5", country="United States">)]
zip_92373 = usa.subgeographies.states['California'].zip5['92373']
zip_92373
<NamedArea name:"Redlands" area_id="92373", level="US.ZIP5", country="United States">
Visualize on a map
# Create a map
zip_map = gis.map('Redlands, CA')
zip_map
# Plot on a map
zip_map.draw(zip_92373.geometry)
Now let's view a few zip codes for San Bernardino county and then select one.
zip_sb = [zip_code for zip_code in usa.subgeographies.states['California'].
counties['San_Bernardino_County'].zip5.items()][:5]
zip_sb
[('91708', <NamedArea name:"Chino" area_id="91708", level="US.ZIP5", country="United States">), ('91709', <NamedArea name:"Chino Hills" area_id="91709", level="US.ZIP5", country="United States">), ('91739', <NamedArea name:"Rancho Cucamonga" area_id="91739", level="US.ZIP5", country="United States">), ('91759', <NamedArea name:"Mt Baldy" area_id="91759", level="US.ZIP5", country="United States">), ('91761', <NamedArea name:"Ontario" area_id="91761", level="US.ZIP5", country="United States">)]
zip_92336 = usa.subgeographies.states['California'].counties['San_Bernardino_County'].zip5['92336']
zip_92336
<NamedArea name:"Fontana" area_id="92336", level="US.ZIP5", country="United States">
The subgeography levels and their hierarchies are different for each country, and can also be different for the different datasets (which can be set using the Country.dataset
property).
india = Country.get('India')
Look at the different levels of subgeographies for India
india.levels
[{'id': 'IN.Country', 'name': 'Country', 'isWholeCountry': True, 'adminLevel': 'Admin1', 'singularName': 'Country', 'pluralName': 'Country'}, {'id': 'IN.States', 'name': 'States', 'isWholeCountry': False, 'adminLevel': 'Admin2', 'singularName': 'State', 'pluralName': 'States'}, {'id': 'IN.Districts', 'name': 'Districts', 'isWholeCountry': False, 'adminLevel': 'Admin3', 'singularName': 'District', 'pluralName': 'Districts'}, {'id': 'IN.Subdistricts', 'name': 'Subdistricts', 'isWholeCountry': False, 'adminLevel': 'Admin4', 'singularName': 'Subdistrict', 'pluralName': 'Subdistricts'}]
Here we see that subgeographies in India are different from those in U.S. India has districts and sub-districts
instead of counties and tracts
. Let's inspect a few states. Following a similar pattern of exploration as above, all states can be viewed using india.subgeographies.states
.
in_states = [state for state in india.subgeographies.states.items()][:5]
in_states
[('Jammu_&_Kashmir', <NamedArea name:"Jammu & Kashmir" area_id="01", level="IN.States", country="India">), ('Himachal_Pradesh', <NamedArea name:"Himachal Pradesh" area_id="02", level="IN.States", country="India">), ('Punjab', <NamedArea name:"Punjab" area_id="03", level="IN.States", country="India">), ('Chandigarh', <NamedArea name:"Chandigarh" area_id="04", level="IN.States", country="India">), ('Uttarakhand', <NamedArea name:"Uttarakhand" area_id="05", level="IN.States", country="India">)]
We see that various states are listed along with their properties such as name, area_id, level etc. Let's select a state and further explore its subgeographies.
state = india.subgeographies.states['Uttar_Pradesh']
state.geometry.type
'Polygon'
Let's look at different districts in the state of Uttar Pradesh. Following a similar pattern as above, we will look at the first few districts. To look at all districts you may use usa.subgeographies.states['Uttar_Pradesh'].districts
.
in_districts = [district for district in india.subgeographies.states['Uttar_Pradesh'].districts.items()][:5]
in_districts
[('Saharanpur', <NamedArea name:"Saharanpur" area_id="09132", level="IN.Districts", country="India">), ('Muzaffarnagar', <NamedArea name:"Muzaffarnagar" area_id="09133", level="IN.Districts", country="India">), ('Bijnor', <NamedArea name:"Bijnor" area_id="09134", level="IN.Districts", country="India">), ('Moradabad', <NamedArea name:"Moradabad" area_id="09135", level="IN.Districts", country="India">), ('Rampur', <NamedArea name:"Rampur" area_id="09136", level="IN.Districts", country="India">)]
district = india.subgeographies.states['Uttar_Pradesh'].districts['Ghaziabad']
Visualize on a map
# Create a map
district_map = gis.map('Uttar Pradesh, India', 7)
district_map
# Plot on a map
district_map.draw(district.geometry)
Let's select a district and look at different subdistricts in it. We will look at the first few subdistricts in Baghpat
district. To look at all subdistricts you may use india.subgeographies.states['Uttar_Pradesh'].districts['Ghaziabad'].subdistricts
.
in_subdistrict = [sub for sub in india.subgeographies.states['Uttar_Pradesh'].
districts['Ghaziabad'].subdistricts.items()][:5]
in_subdistrict
[('Modinagar', <NamedArea name:"Modinagar" area_id="0914000738", level="IN.Subdistricts", country="India">), ('Ghaziabad', <NamedArea name:"Ghaziabad" area_id="0914000739", level="IN.Subdistricts", country="India">), ('Loni', <NamedArea name:"Loni" area_id="0914090000", level="IN.Subdistricts", country="India">)]
Now, let's select a particular subdistrict.
india.subgeographies.states['Uttar_Pradesh'].districts['Ghaziabad'].subdistricts['Modinagar']
<NamedArea name:"Modinagar" area_id="0914000738", level="IN.Subdistricts", country="India">
Named statistical areas can also be searched within a country. Let's search for areas named Riverside
in the U.S.
riversides_in_usa = usa.search('Riverside')
print("number of riversides in the US: " + str(len(riversides_in_usa)))
# list a few of them
riversides_in_usa[:10]
number of riversides in the US: 81
[<NamedArea name:"Riverside" area_id="147435", level="Cities", country="United States">, <NamedArea name:"Riverside" area_id="147436", level="Cities", country="United States">, <NamedArea name:"Riverside" area_id="147437", level="Cities", country="United States">, <NamedArea name:"Riverside" area_id="147438", level="Cities", country="United States">, <NamedArea name:"Riverside" area_id="147439", level="Cities", country="United States">, <NamedArea name:"Riverside" area_id="147440", level="Cities", country="United States">, <NamedArea name:"Riverside" area_id="147441", level="Cities", country="United States">, <NamedArea name:"Riverside" area_id="147442", level="Cities", country="United States">, <NamedArea name:"Riverside" area_id="147443", level="Cities", country="United States">, <NamedArea name:"Riverside" area_id="147444", level="Cities", country="United States">]
The named areas can also be drawn on a map, as they include a geometry
property
For instance, you can make a map of all the riversides in the US
usamap = gis.map('United States', zoomlevel=4)
usamap
for riverside in riversides_in_usa:
usamap.draw(riverside.geometry)
In this part of the arcgis.geoenrichment
module guide series, you explored several named statistical areas in a hierarchy of geography levels. You saw, in detail, how different geography levels can be queried so it can then be used for enrichment. You were introduced to Country
class and Country.get()
method which can be used to discover sub-geographies, data collections and available reports for a country.
In the subsequent pages, you will learn about Data Collections and GeoEnrichment coverage, Generating Reports and Standard Geography Queries.