diff --git a/DESCRIPTION b/DESCRIPTION
index c6bea3c..675f30d 100644
--- a/DESCRIPTION
+++ b/DESCRIPTION
@@ -1,6 +1,6 @@
 Package: cmapgeo
 Title: R-Friendly Geodata for the Chicago Region
-Version: 0.1.3
+Version: 0.2.0
 Authors@R: c(
     person("Noel", "Peterson",
            role = c("aut", "cre"),
@@ -25,10 +25,12 @@ LazyData: true
 LazyDataCompression: gzip
 Depends:
     R (>= 2.10)
+Imports:
+    sf,
+    tibble
 Suggests:
     dplyr,
     ggplot2,
-    sf,
     tidycensus
 Roxygen: list(markdown = TRUE)
 RoxygenNote: 7.1.2
diff --git a/NAMESPACE b/NAMESPACE
index 2597ca8..639dd09 100644
--- a/NAMESPACE
+++ b/NAMESPACE
@@ -2,3 +2,4 @@
 
 export(cmap_crs)
 export(county_fips_codes)
+import(sf)
diff --git a/NEWS.md b/NEWS.md
index 68071d2..f3b4297 100644
--- a/NEWS.md
+++ b/NEWS.md
@@ -1,3 +1,21 @@
+# cmapgeo 0.2.0
+February 23, 2022
+
+* `block_sf`, `blockgroup_sf` and `tract_sf` now represent the 2020 census
+  geographies. (2010 census geographies are still available with a `_2010`
+  suffix for use with ACS data from 2010 through 2019.)
+* Crosswalk tables (`xwalk_*`) have also been updated to use 2020 Census data,
+  and a new employment allocation factor (based on 2019
+  [LEHD](https://lehd.ces.census.gov/data) data) has been added to each. The
+  prior crosswalks based on the 2010 Census data are still available with a
+  `_2010` suffix for use with ACS data from 2010 through 2019 (although they
+  still lack an employment allocation factor).
+* All datasets based on the Census Bureau's TIGER/Line boundaries have been
+  updated with the 2021 vintage. (The exception is the 2010 blocks, block groups
+  and tracts, which are still based on the 2019 vintage.)
+* Added `sf` and `tibble` packages as requirements instead of suggestions.
+ + # cmapgeo 0.1.3 November 3, 2021 diff --git a/R/cmapgeo.R b/R/cmapgeo.R index 0643be2..dd1fb00 100644 --- a/R/cmapgeo.R +++ b/R/cmapgeo.R @@ -13,6 +13,7 @@ #' #' @name cmapgeo #' @docType package +#' @import sf #' @keywords internal "_PACKAGE" diff --git a/R/data.R b/R/data.R index 51649b7..b2d5445 100644 --- a/R/data.R +++ b/R/data.R @@ -1,32 +1,40 @@ # Geodata created with data-raw/load_census_api.R ------------------------ -#' Census Blocks (2019 vintage) +# NOTE: Census geography glossary is located at +# + + +#' Census Blocks #' #' The Census Blocks within the 7-county Chicago Metropolitan Agency for #' Planning (CMAP) region. From the US Census Bureau's TIGER/Line shapefiles, -#' 2019 vintage. **Use this version with data from the 2010 decennial census or -#' the American Community Survey (ACS) from 2010 through 2019. For data from the -#' 2020 decennial census, use `block_sf_2020` (which will replace this dataset -#' once the 2016-2020 ACS 5-year data is published).** +#' 2021 vintage. **Use `block_sf` for data from the 2020 decennial census or the +#' American Community Survey (ACS) from 2020 onward. For data from the 2010 +#' decennial census or ACS from 2010 through 2019, use `block_sf_2010`.** #' #' Census Bureau description: #' -#' *"Blocks are statistical areas bounded by visible features, such as streets, -#' roads, streams, and railroad tracks, and by nonvisible boundaries, such as -#' selected property lines and city, township, school district, and county -#' limits and short line-of-sight extensions of streets and roads. Generally, -#' census blocks are small in area; for example, a block in a city bounded on -#' all sides by streets. Census blocks in suburban and rural areas may be large, -#' irregular, and bounded by a variety of features, such as roads, streams, and -#' transmission lines. In remote areas, census blocks may encompass hundreds of -#' square miles. 
Census blocks cover the entire territory of the United States, -#' Puerto Rico, and the Island Areas. Census blocks nest within all other -#' tabulated census geographic entities and are the basis for all tabulated -#' data."* -#' -#' @format -#' A multipolygon `sf` object with `r nrow(block_sf)` rows and -#' `r ncol(block_sf)` variables: +#' *"Blocks (Census Blocks or Tabulation Blocks) are statistical areas bounded +#' by visible features, such as streets, roads, streams, and railroad tracks, +#' and by nonvisible boundaries, such as selected property lines and city, +#' township, school district, and county limits and short line-of-sight +#' extensions of streets and roads. Generally, blocks are small in area; for +#' example, a city block bounded on all sides by streets. Blocks in suburban and +#' rural areas may be larger, more irregular in shape, and bounded by a variety +#' of features, such as roads, streams, and transmission lines. In remote areas, +#' blocks may even encompass hundreds of square miles. Blocks cover the entire +#' territory of the United States, Puerto Rico, and the Island Areas. Blocks +#' nest within all other tabulated census geographic entities at the time of the +#' decennial census and are the basis for all tabulated data from that census. +#' Census Block Numbers—Blocks are numbered uniquely with a four-digit census +#' block number from 0000 to 9999 within census tract, which nest within state +#' and county. The first digit of the census block number identifies the block +#' group. Block numbers beginning with a zero (in Block Group 0) are intended to +#' include only water area, but not all water-only blocks have block numbers +#' beginning with 0 (zero)."* +#' +#' @format `block_sf` is a multipolygon `sf` object with `r nrow(block_sf)` rows +#' and `r ncol(block_sf)` variables: #' \describe{ #' \item{geoid_block}{Unique 15-digit block ID, assigned by the Census Bureau. 
#' The parent tract and block group can be identified from the first 11 and 12 @@ -37,8 +45,7 @@ #' \item{geometry}{Feature geometry. `sf` multipolygon.} #' } #' -#' @source -#' US Census Bureau +#' @source US Census Bureau #' [TIGER/Line](https://www.census.gov/geographies/mapping-files/time-series/geo/tiger-line-file.html) #' #' @examples @@ -47,63 +54,20 @@ #' ggplot(block_sf) + geom_sf(lwd = 0.1) + theme_void() "block_sf" - -#' Census Blocks (2020 vintage) -#' -#' The Census Blocks within the 7-county Chicago Metropolitan Agency for -#' Planning (CMAP) region. From the US Census Bureau's TIGER/Line shapefiles, -#' 2020 vintage. **Use this version for data from the 2020 decennial census. For -#' data from the 2010 decennial census or the American Community Survey (ACS) -#' from 2010 through 2019, use `block_sf` (which will be replaced by this -#' dataset once the 2016-2020 ACS 5-year data is published).** -#' -#' Census Bureau description: -#' -#' *"Blocks are statistical areas bounded by visible features, such as streets, -#' roads, streams, and railroad tracks, and by nonvisible boundaries, such as -#' selected property lines and city, township, school district, and county -#' limits and short line-of-sight extensions of streets and roads. Generally, -#' census blocks are small in area; for example, a block in a city bounded on -#' all sides by streets. Census blocks in suburban and rural areas may be large, -#' irregular, and bounded by a variety of features, such as roads, streams, and -#' transmission lines. In remote areas, census blocks may encompass hundreds of -#' square miles. Census blocks cover the entire territory of the United States, -#' Puerto Rico, and the Island Areas. 
Census blocks nest within all other -#' tabulated census geographic entities and are the basis for all tabulated -#' data."* -#' -#' @format -#' A multipolygon `sf` object with `r nrow(block_sf_2020)` rows and -#' `r ncol(block_sf_2020)` variables: -#' \describe{ -#' \item{geoid_block}{Unique 15-digit block ID, assigned by the Census Bureau. -#' The parent tract and block group can be identified from the first 11 and 12 -#' digits, respectively. Character.} -#' \item{county_fips}{Unique 5-digit FIPS code of the county the block is in. -#' Character.} -#' \item{sqmi}{Area in square miles. Double.} -#' \item{geometry}{Feature geometry. `sf` multipolygon.} -#' } -#' -#' @source -#' US Census Bureau -#' [TIGER/Line](https://www.census.gov/geographies/mapping-files/time-series/geo/tiger-line-file.html) -#' -#' @examples -#' # Display the blocks with ggplot2 -#' library(ggplot2) -#' ggplot(block_sf_2020) + geom_sf(lwd = 0.1) + theme_void() -"block_sf_2020" +#' @rdname block_sf +#' @format `block_sf_2010` is a multipolygon `sf` object with +#' `r nrow(block_sf_2010)` rows and `r ncol(block_sf_2010)` variables. +"block_sf_2010" -#' Census Block Groups (2019 vintage) +#' Census Block Groups #' #' The Census Block Groups within the 7-county Chicago Metropolitan Agency for #' Planning (CMAP) region. From the US Census Bureau's TIGER/Line shapefiles, -#' 2019 vintage. **Use this version with data from the 2010 decennial census or -#' the American Community Survey (ACS) from 2010 through 2019. For data from the -#' 2020 decennial census, use `blockgroup_sf_2020` (which will replace this -#' dataset once the 2016-2020 ACS 5-year data is published).** +#' 2021 vintage. **Use `blockgroup_sf` for data from the 2020 decennial census +#' or the American Community Survey (ACS) from 2020 onward. 
For data from the +#' 2010 decennial census or ACS from 2010 through 2019, use +#' `blockgroup_sf_2010`.** #' #' Census Bureau description: #' @@ -113,24 +77,23 @@ #' of blocks within the same census tract that have the same first digit of #' their four-digit census block number. For example, blocks 3001, 3002, 3003, #' ..., 3999 in census tract 1210.02 belong to BG 3 in that census tract. Most -#' BGs were delineated by local participants in the Census Bureau's Participant -#' Statistical Areas Program. The Census Bureau delineated BGs only where a -#' local or tribal government declined to participate, and a regional -#' organization or State Data Center was not available to participate.* -#' -#' *"A BG usually covers a contiguous area. Each census tract contains at least -#' one BG, and BGs are uniquely numbered within the census tract. Within the -#' standard census geographic hierarchy, BGs never cross state, county, or -#' census tract boundaries but may cross the boundaries of any other geographic -#' entity. Tribal census tracts and tribal BGs are separate and unique -#' geographic areas defined within federally recognized American Indian -#' reservations and can cross state and county boundaries. The tribal census -#' tracts and tribal block groups may be completely different from the census -#' tracts and block groups defined by state and county."* +#' BGs were delineated by local participants in the Census Bureau’s Participant +#' Statistical Areas Program (PSAP). The Census Bureau delineated BGs only where +#' a local or tribal government declined to participate in PSAP, and a regional +#' organization or the State Data Center was not available to participate. A BG +#' usually covers a contiguous area. Each census tract contains at least one BG, +#' and BGs are uniquely numbered within the census tract. 
Within the standard +#' census geographic hierarchy, BGs never cross state, county, or census tract +#' boundaries, but may cross the boundaries of any other geographic entity. +#' Tribal census tracts and tribal BGs are separate and unique geographic areas +#' defined within federally recognized American Indian reservations and can +#' cross state and county boundaries. The tribal census tracts and tribal block +#' groups may be completely different from the standard county-based census +#' tracts and block groups defined for the same area."* #' #' @format -#' A polygon `sf` object with `r nrow(blockgroup_sf)` rows and -#' `r ncol(blockgroup_sf)` variables: +#' `blockgroup_sf` is a multipolygon `sf` object with `r nrow(blockgroup_sf)` +#' rows and `r ncol(blockgroup_sf)` variables: #' \describe{ #' \item{geoid_blkgrp}{Unique 12-digit block group ID, assigned by the Census #' Bureau. The parent tract can be identified from the first 11 digits. @@ -138,7 +101,7 @@ #' \item{county_fips}{Unique 5-digit FIPS code of the county the block group #' is in. Character.} #' \item{sqmi}{Area in square miles. Double.} -#' \item{geometry}{Feature geometry. `sf` polygon.} +#' \item{geometry}{Feature geometry. `sf` multipolygon.} #' } #' #' @source @@ -151,81 +114,30 @@ #' ggplot(blockgroup_sf) + geom_sf(lwd = 0.1) + theme_void() "blockgroup_sf" +#' @rdname blockgroup_sf +#' @format `blockgroup_sf_2010` is a polygon `sf` object with +#' `r nrow(blockgroup_sf_2010)` rows and `r ncol(blockgroup_sf_2010)` variables. +"blockgroup_sf_2010" -#' Census Block Groups (2020 vintage) -#' -#' The Census Block Groups within the 7-county Chicago Metropolitan Agency for -#' Planning (CMAP) region. From the US Census Bureau's TIGER/Line shapefiles, -#' 2020 vintage. **Use this version for data from the 2020 decennial census. 
For -#' data from the 2010 decennial census or the American Community Survey (ACS) -#' from 2010 through 2019, use `blockgroup_sf` (which will be replaced by this -#' dataset once the 2016-2020 ACS 5-year data is published).** -#' -#' Census Bureau description: -#' -#' *"Block Groups (BGs) are statistical divisions of census tracts, are -#' generally defined to contain between 600 and 3,000 people, and are used to -#' present data and control block numbering. A block group consists of clusters -#' of blocks within the same census tract that have the same first digit of -#' their four-digit census block number. For example, blocks 3001, 3002, 3003, -#' ..., 3999 in census tract 1210.02 belong to BG 3 in that census tract. Most -#' BGs were delineated by local participants in the Census Bureau's Participant -#' Statistical Areas Program. The Census Bureau delineated BGs only where a -#' local or tribal government declined to participate, and a regional -#' organization or State Data Center was not available to participate.* -#' -#' *"A BG usually covers a contiguous area. Each census tract contains at least -#' one BG, and BGs are uniquely numbered within the census tract. Within the -#' standard census geographic hierarchy, BGs never cross state, county, or -#' census tract boundaries but may cross the boundaries of any other geographic -#' entity. Tribal census tracts and tribal BGs are separate and unique -#' geographic areas defined within federally recognized American Indian -#' reservations and can cross state and county boundaries. The tribal census -#' tracts and tribal block groups may be completely different from the census -#' tracts and block groups defined by state and county."* -#' -#' @format -#' A polygon `sf` object with `r nrow(blockgroup_sf_2020)` rows and -#' `r ncol(blockgroup_sf_2020)` variables: -#' \describe{ -#' \item{geoid_blkgrp}{Unique 12-digit block group ID, assigned by the Census -#' Bureau. 
The parent tract can be identified from the first 11 digits. -#' Character.} -#' \item{county_fips}{Unique 5-digit FIPS code of the county the block group -#' is in. Character.} -#' \item{sqmi}{Area in square miles. Double.} -#' \item{geometry}{Feature geometry. `sf` polygon.} -#' } -#' -#' @source -#' US Census Bureau -#' [TIGER/Line](https://www.census.gov/geographies/mapping-files/time-series/geo/tiger-line-file.html) -#' -#' @examples -#' # Display the block groups with ggplot2 -#' library(ggplot2) -#' ggplot(blockgroup_sf_2020) + geom_sf(lwd = 0.1) + theme_void() -"blockgroup_sf_2020" - -#' Census Tracts (2019 vintage) +#' Census Tracts #' #' The Census Tracts within the 7-county Chicago Metropolitan Agency for #' Planning (CMAP) region. From the US Census Bureau's TIGER/Line shapefiles, -#' 2019 vintage. **Use this version with data from the 2010 decennial census or -#' the American Community Survey (ACS) from 2010 through 2019. For data from the -#' 2020 decennial census, use `tract_sf_2020` (which will replace this dataset -#' once the 2016-2020 ACS 5-year data is published).** +#' 2021 vintage. **Use `tract_sf` for data from the 2020 decennial census or the +#' American Community Survey (ACS) from 2020 onward. For data from the 2010 +#' decennial census or ACS from 2010 through 2019, use `tract_sf_2010`.** #' #' Census Bureau description: #' #' *"Census Tracts are small, relatively permanent statistical subdivisions of a -#' county or equivalent entity that are updated by local participants prior to -#' each decennial census as part of the Census Bureau's Participant Statistical -#' Areas Program. The Census Bureau delineates census tracts in situations where -#' no local participant existed or where state, local, or tribal governments -#' declined to participate. 
The primary purpose of census tracts is to provide a -#' stable set of geographic units for the presentation of statistical data.* +#' county or statistically equivalent entity that can be updated by local +#' participants prior to each decennial census as part of the Census Bureau’s +#' Participant Statistical Areas Program (PSAP). The Census Bureau delineates +#' census tracts in situations where no local participant responded or where +#' state, local, or tribal governments declined to participate. The primary +#' purpose of census tracts is to provide a stable set of geographic units for +#' the presentation of statistical data. #' #' *"Census tracts generally have a population size between 1,200 and 8,000 #' people, with an optimum size of 4,000 people. A census tract usually covers a @@ -234,30 +146,30 @@ #' delineated with the intention of being maintained over a long time so that #' statistical comparisons can be made from census to census. Census tracts #' occasionally are split due to population growth or merged as a result of -#' substantial population decline.* +#' substantial population decline. #' #' *"Census tract boundaries generally follow visible and identifiable features. #' They may follow nonvisible legal boundaries, such as minor civil division #' (MCD) or incorporated place boundaries in some states and situations, to -#' allow for census-tract-to-governmental-unit relationships where the +#' allow for census tract-to-governmental unit relationships where the #' governmental boundaries tend to remain unchanged between censuses. State and #' county boundaries always are census tract boundaries in the standard census #' geographic hierarchy. Tribal census tracts are a unique geographic entity #' defined within federally recognized American Indian reservations and -#' off-reservation trust lands and can cross state and county boundaries. 
Tribal -#' census tracts may be completely different from the census tracts and block -#' groups defined by state and county."* +#' off-reservation trust lands and can cross state and county boundaries. The +#' tribal census tracts may be completely different from the standard +#' county-based census tracts defined for the same area."* #' #' @format -#' A polygon `sf` object with `r nrow(tract_sf)` rows and `r ncol(tract_sf)` -#' variables: +#' `tract_sf` is a multipolygon `sf` object with `r nrow(tract_sf)` rows and +#' `r ncol(tract_sf)` variables: #' \describe{ #' \item{geoid_tract}{Unique 11-digit tract ID, assigned by the Census #' Bureau. Character.} #' \item{county_fips}{Unique 5-digit FIPS code of the county the tract is in. #' Character.} #' \item{sqmi}{Area in square miles. Double.} -#' \item{geometry}{Feature geometry. `sf` polygon.} +#' \item{geometry}{Feature geometry. `sf` multipolygon.} #' } #' #' @source @@ -270,75 +182,19 @@ #' ggplot(tract_sf) + geom_sf(lwd = 0.1) + theme_void() "tract_sf" - -#' Census Tracts (2020 vintage) -#' -#' The Census Tracts within the 7-county Chicago Metropolitan Agency for -#' Planning (CMAP) region. From the US Census Bureau's TIGER/Line shapefiles, -#' 2020 vintage. **Use this version for data from the 2020 decennial census. For -#' data from the 2010 decennial census or the American Community Survey (ACS) -#' from 2010 through 2019, use `tract_sf` (which will be replaced by this -#' dataset once the 2016-2020 ACS 5-year data is published).** -#' -#' Census Bureau description: -#' -#' *"Census Tracts are small, relatively permanent statistical subdivisions of a -#' county or equivalent entity that are updated by local participants prior to -#' each decennial census as part of the Census Bureau's Participant Statistical -#' Areas Program. The Census Bureau delineates census tracts in situations where -#' no local participant existed or where state, local, or tribal governments -#' declined to participate. 
The primary purpose of census tracts is to provide a -#' stable set of geographic units for the presentation of statistical data.* -#' -#' *"Census tracts generally have a population size between 1,200 and 8,000 -#' people, with an optimum size of 4,000 people. A census tract usually covers a -#' contiguous area; however, the spatial size of census tracts varies widely -#' depending on the density of settlement. Census tract boundaries are -#' delineated with the intention of being maintained over a long time so that -#' statistical comparisons can be made from census to census. Census tracts -#' occasionally are split due to population growth or merged as a result of -#' substantial population decline.* -#' -#' *"Census tract boundaries generally follow visible and identifiable features. -#' They may follow nonvisible legal boundaries, such as minor civil division -#' (MCD) or incorporated place boundaries in some states and situations, to -#' allow for census-tract-to-governmental-unit relationships where the -#' governmental boundaries tend to remain unchanged between censuses. State and -#' county boundaries always are census tract boundaries in the standard census -#' geographic hierarchy. Tribal census tracts are a unique geographic entity -#' defined within federally recognized American Indian reservations and -#' off-reservation trust lands and can cross state and county boundaries. Tribal -#' census tracts may be completely different from the census tracts and block -#' groups defined by state and county."* -#' -#' @format -#' A polygon `sf` object with `r nrow(tract_sf_2020)` rows and -#' `r ncol(tract_sf_2020)` variables: -#' \describe{ -#' \item{geoid_tract}{Unique 11-digit tract ID, assigned by the Census -#' Bureau. Character.} -#' \item{county_fips}{Unique 5-digit FIPS code of the county the tract is in. -#' Character.} -#' \item{sqmi}{Area in square miles. Double.} -#' \item{geometry}{Feature geometry. 
`sf` multipolygon.} -#' } -#' -#' @source -#' US Census Bureau -#' [TIGER/Line](https://www.census.gov/geographies/mapping-files/time-series/geo/tiger-line-file.html) -#' -#' @examples -#' # Display the tracts with ggplot2 -#' library(ggplot2) -#' ggplot(tract_sf_2020) + geom_sf(lwd = 0.1) + theme_void() -"tract_sf_2020" +#' @rdname tract_sf +#' @format `tract_sf_2010` is a polygon `sf` object with `r nrow(tract_sf_2010)` +#' rows and `r ncol(tract_sf_2010)` variables. +"tract_sf_2010" #' Census Public Use Microdata Areas (PUMAs) #' #' The Census PUMAs covering the 7-county Chicago Metropolitan Agency for #' Planning (CMAP) region. From the US Census Bureau's TIGER/Line shapefiles, -#' 2019 vintage. +#' 2021 vintage. **These PUMAs are valid for use with ACS PUMS data from 2012 +#' through 2021. They will be superseded by the 2020 PUMAs when the 2022 ACS +#' data is published.** #' #' Census Bureau description: #' @@ -383,11 +239,11 @@ #' #' The Census ZCTAs covering the 7-county Chicago Metropolitan Agency for #' Planning (CMAP) region. From the US Census Bureau's TIGER/Line shapefiles, -#' 2019 vintage. +#' 2021 vintage. #' #' Census Bureau description: #' -#' *ZIP Code Tabulation Areas (ZCTAs) are approximate area representations of +#' *"ZIP Code Tabulation Areas (ZCTAs) are approximate area representations of #' U.S. Postal Service (USPS) five-digit ZIP Code service areas that the Census #' Bureau creates using whole blocks to present statistical data from censuses #' and surveys. The Census Bureau defines ZCTAs by allocating each block that @@ -399,25 +255,13 @@ #' buffering performed between multiple ZCTAs. The Census Bureau identifies #' five-digit ZCTAs using a five-character numeric code that represents the most #' frequently occurring USPS ZIP Code within that ZCTA, and this code may -#' contain leading zeros.* -#' -#' *There are significant changes to the 2010 ZCTA delineation from that used in -#' 2000. 
Coverage was extended to include the Island Areas for 2010 so that the -#' United States, Puerto Rico, and the Island Areas have ZCTAs. Unlike 2000, -#' when areas that could not be assigned to a ZCTA were given a generic code -#' ending in "XX" (land area) or "HH" (water area), for 2010 there is no -#' universal coverage by ZCTAs, and only legitimate five-digit areas are -#' defined. The 2010 ZCTAs will better represent the actual Zip Code service -#' areas because the Census Bureau initiated a process before creation of 2010 -#' blocks to add block boundaries that split polygons with large numbers of -#' addresses using different ZIP Codes.* -#' -#' *Data users should not use ZCTAs to identify the official USPS ZIP Code for -#' mail delivery. The USPS makes periodic changes to ZIP Codes to support more -#' efficient mail delivery. The ZCTAs process used primarily residential -#' addresses and was biased towards ZIP Codes used for city-style mail delivery, -#' thus there may be ZIP Codes that are primarily nonresidential or boxes only -#' that may not have a corresponding ZCTA.* +#' contain leading zeros. Not all ZIP Codes in use by the USPS may have a ZCTA +#' delineated to represent them, The USPS makes periodic changes to ZIP Codes to +#' support more efficient mail delivery. In addition, the ZCTA delineation +#' process primarily uses residential addresses and has a bias towards ZIP Codes +#' used for city-style mail delivery, thus there may be ZIP Codes that are +#' primarily nonresidential or used for PO boxes only that may not have a +#' corresponding ZCTA. ZIP Code is a trademark of the U.S. Postal Service."* #' #' @format #' A multipolygon `sf` object with `r nrow(zcta_sf)` rows and `r ncol(zcta_sf)` @@ -443,7 +287,10 @@ #' Illinois State Senate Districts #' #' The Illinois General Assembly Senate Districts. From the US Census Bureau's -#' TIGER/Line shapefiles, 2019 vintage. +#' TIGER/Line shapefiles, 2021 vintage. 
**These districts were in effect for +#' elections from 2012 through 2020 (i.e. the 98th through 102nd General +#' Assemblies). They will be superseded by new district boundaries for the 2022 +#' election (for the 103rd General Assembly).** #' #' Census Bureau description: #' @@ -451,23 +298,21 @@ #' elected to state legislatures. The Census Bureau first reported data for SLDs #' as part of the 2000 Public Law (P.L.) 94-171 Redistricting Data File.* #' -#' *"Current SLDs (2010 Election Cycle) -- States participating in Phase 1 of -#' the 2010 Census Redistricting Data Program voluntarily provided the Census -#' Bureau with the 2006 election cycle boundaries, codes, and, in some cases, -#' names for their SLDs. All 50 states, plus the District of Columbia and Puerto -#' Rico, participated in Phase 1, State Legislative District Project (SLDP) of -#' the 2010 Census Redistricting Data Program. States subsequently provided -#' legal changes to those plans through the Redistricting Data Office and/or -#' corrections as part of Phase 2 of the 2010 Census Redistricting Data Program, -#' as needed.* -#' -#' *"The SLDs embody the upper (senate) and lower (house) chambers of the state -#' legislature. A unique three-character census code, identified by state -#' participants, is assigned to each SLD within a state. In Connecticut, Hawaii, -#' Illinois, Louisiana, Maine, Massachusetts, New Jersey, Ohio, and Puerto Rico, +#' *"Current SLDs (2018 Election Cycle)—States participating in Phase 4 of the +#' 2020 Census Redistricting Data Program voluntarily provided the Census Bureau +#' with the 2018 election cycle boundaries, codes, and, in some cases, names for +#' their SLDs. All 50 states, plus the District of Columbia and Puerto Rico, +#' participated in Phase 4's State Legislative District Project (SLDP) of the +#' 2020 Census Redistricting Data Program. 
States subsequently provided +#' corrections to those plans through the Redistricting Data Office during Phase +#' 2 of the 2020 Census Redistricting Data Program, if needed. +#' +#' "The SLDs embody the upper (senate—SLDU) and lower (house—SLDL) chambers of +#' the state legislature. A unique three-character census code, identified by +#' state participants, is assigned to each SLD within a state. In some states, #' state officials did not define the SLDs to cover all of the state or state #' equivalent area (usually bodies of water). In these areas with no SLDs -#' defined, the code "ZZZ" has been assigned, which is treated within state as a +#' defined, the code 'ZZZ' has been assigned, which is treated within state as a #' single SLD for purposes of data presentation."* #' #' Note: The aforementioned "ZZZ" district, which comprises the Illinois portion @@ -498,7 +343,10 @@ #' Illinois State House Districts #' #' The Illinois General Assembly House Districts. From the US Census Bureau's -#' TIGER/Line shapefiles, 2019 vintage. +#' TIGER/Line shapefiles, 2021 vintage. **These districts were in effect for +#' elections from 2012 through 2020 (i.e. the 98th through 102nd General +#' Assemblies). They will be superseded by new district boundaries for the 2022 +#' election (for the 103rd General Assembly).** #' #' Census Bureau description: #' @@ -506,23 +354,21 @@ #' elected to state legislatures. The Census Bureau first reported data for SLDs #' as part of the 2000 Public Law (P.L.) 94-171 Redistricting Data File.* #' -#' *"Current SLDs (2010 Election Cycle) -- States participating in Phase 1 of -#' the 2010 Census Redistricting Data Program voluntarily provided the Census -#' Bureau with the 2006 election cycle boundaries, codes, and, in some cases, -#' names for their SLDs. All 50 states, plus the District of Columbia and Puerto -#' Rico, participated in Phase 1, State Legislative District Project (SLDP) of -#' the 2010 Census Redistricting Data Program. 
States subsequently provided -#' legal changes to those plans through the Redistricting Data Office and/or -#' corrections as part of Phase 2 of the 2010 Census Redistricting Data Program, -#' as needed.* -#' -#' *"The SLDs embody the upper (senate) and lower (house) chambers of the state -#' legislature. A unique three-character census code, identified by state -#' participants, is assigned to each SLD within a state. In Connecticut, Hawaii, -#' Illinois, Louisiana, Maine, Massachusetts, New Jersey, Ohio, and Puerto Rico, +#' *"Current SLDs (2018 Election Cycle)—States participating in Phase 4 of the +#' 2020 Census Redistricting Data Program voluntarily provided the Census Bureau +#' with the 2018 election cycle boundaries, codes, and, in some cases, names for +#' their SLDs. All 50 states, plus the District of Columbia and Puerto Rico, +#' participated in Phase 4's State Legislative District Project (SLDP) of the +#' 2020 Census Redistricting Data Program. States subsequently provided +#' corrections to those plans through the Redistricting Data Office during Phase +#' 2 of the 2020 Census Redistricting Data Program, if needed. +#' +#' "The SLDs embody the upper (senate—SLDU) and lower (house—SLDL) chambers of +#' the state legislature. A unique three-character census code, identified by +#' state participants, is assigned to each SLD within a state. In some states, #' state officials did not define the SLDs to cover all of the state or state #' equivalent area (usually bodies of water). In these areas with no SLDs -#' defined, the code "ZZZ" has been assigned, which is treated within state as a +#' defined, the code 'ZZZ' has been assigned, which is treated within state as a #' single SLD for purposes of data presentation."* #' #' Note: The aforementioned "ZZZ" district, which comprises the Illinois portion @@ -553,7 +399,10 @@ #' U.S. Congressional Districts #' #' The United States Congressional Districts in the state of Illinois. 
From the -#' US Census Bureau's TIGER/Line shapefiles, 2019 vintage. +#' US Census Bureau's TIGER/Line shapefiles, 2021 vintage. **These districts +#' were in effect for elections from 2012 through 2020 (i.e. the 113th through +#' 117th Congresses). They will be superseded by new district boundaries for the +#' 2022 election (for the 118th Congress).** #' #' Census Bureau description: #' @@ -567,20 +416,18 @@ #' each Island Area, a separate code is used to identify the entire areas of #' these state-equivalent entities as having a single nonvoting delegate."* #' -#' @format -#' A multipolygon `sf` object with `r nrow(congress_sf)` rows and `r ncol(congress_sf)` -#' variables: +#' @format A multipolygon `sf` object with `r nrow(congress_sf)` rows and +#' `r ncol(congress_sf)` variables: #' \describe{ #' \item{dist_num}{Congressional District number. Integer.} #' \item{dist_name}{Name of the district (full). Character.} #' \item{dist_name_short}{Name of the district (short). Character.} #' \item{cmap}{Does the district overlap the 7-county CMAP region? Logical.} -#' \item{sqmi}{Area in square miles. Double.} -#' \item{geometry}{Feature geometry. `sf` multipolygon.} +#' \item{sqmi}{Area in square miles. Double.} \item{geometry}{Feature +#' geometry. `sf` multipolygon.} #' } #' -#' @source -#' US Census Bureau +#' @source US Census Bureau #' [TIGER/Line](https://www.census.gov/geographies/mapping-files/time-series/geo/tiger-line-file.html) #' #' @examples @@ -595,23 +442,22 @@ #' The 284 municipalities (also referred to as "incorporated places" in Census #' Bureau terminology) that are at least partially within the 7-county Chicago #' Metropolitan Agency for Planning (CMAP) region. From the US Census Bureau's -#' TIGER/Line shapefiles, 2019 vintage. +#' TIGER/Line shapefiles, 2021 vintage. 
#' #' Census Bureau description: #' #' *"Incorporated Places are those reported to the Census Bureau as legally in -#' existence as of January 1, 2010, as reported in the latest Boundary and -#' Annexation Survey (BAS), under the laws of their respective states. An -#' incorporated place is established to provide governmental functions for a -#' concentration of people as opposed to a minor civil division, which generally -#' is created to provide services or administer an area without regard, -#' necessarily, to population. Places always are within a single state or -#' equivalent entity, but may extend across county and county subdivision -#' boundaries. An incorporated place usually is a city, town, village, or -#' borough, but can have other legal descriptions."* -#' -#' @format -#' A multipolygon `sf` object with `r nrow(municipality_sf)` rows and +#' existence as of January 1, as reported in the latest Boundary and Annexation +#' Survey (BAS), under the laws of their respective states. An incorporated +#' place is established to provide governmental functions for a concentration of +#' people as opposed to a minor civil division (MCD), which generally is created +#' to provide services or administer an area without regard, necessarily, to +#' population. Places always are within a single state or equivalent entity, but +#' may extend across county and county subdivision boundaries. An incorporated +#' place usually is a city, town, village, or borough, but can have other legal +#' descriptions."* +#' +#' @format A multipolygon `sf` object with `r nrow(municipality_sf)` rows and #' `r ncol(municipality_sf)` variables: #' \describe{ #' \item{geoid_place}{Unique 7-digit place/municipality ID, assigned by the @@ -621,8 +467,7 @@ #' \item{geometry}{Feature geometry. 
`sf` multipolygon.} #' } #' -#' @source -#' US Census Bureau +#' @source US Census Bureau #' [TIGER/Line](https://www.census.gov/geographies/mapping-files/time-series/geo/tiger-line-file.html) #' #' @examples @@ -638,7 +483,7 @@ #' Bureau terminology) that are within the CMAP Metropolitan Planning Area #' (MPA). (The MPA includes the 7 CMAP counties, plus Aux Sable Township in #' Grundy County and Sandwich & Somonauk Townships in DeKalb County.) From the -#' US Census Bureau's TIGER/Line shapefiles, 2019 vintage. +#' US Census Bureau's TIGER/Line shapefiles, 2021 vintage. #' #' Census Bureau description: #' @@ -647,16 +492,14 @@ #' divisions, and unorganized territories and can be classified as either legal #' or statistical. Each county subdivision is assigned a five-character numeric #' Federal Information Processing Series (FIPS) code based on alphabetical -#' sequence within state and an eight-digit National Standard feature -#' identifier."* +#' sequence within state, and an eight-digit National Standard (NS) code."* #' #' Note: The entire City of Chicago (other than the portion of O'Hare in DuPage #' County) is included as a single township in this dataset, and has not been #' subdivided into the eight theoretical townships defined by the Cook County #' Clerk's Office for the purposes of collecting property tax. #' -#' @format -#' A multipolygon `sf` object with `r nrow(township_sf)` rows and +#' @format A multipolygon `sf` object with `r nrow(township_sf)` rows and #' `r ncol(township_sf)` variables: #' \describe{ #' \item{geoid_cousub}{Unique 10-digit county subdivision/township ID, @@ -668,8 +511,7 @@ #' \item{geometry}{Feature geometry. 
`sf` multipolygon.} #' } #' -#' @source -#' US Census Bureau +#' @source US Census Bureau #' [TIGER/Line](https://www.census.gov/geographies/mapping-files/time-series/geo/tiger-line-file.html) #' #' @examples @@ -684,23 +526,23 @@ #' The counties that are within the CMAP travel modeling area **or** the #' "Chicago-Naperville-Elgin, IL-IN-WI" Metropolitan Statistical Area (as #' defined by the United States Office of Management and Budget). From the US -#' Census Bureau's TIGER/Line shapefiles, 2019 vintage. +#' Census Bureau's TIGER/Line shapefiles, 2021 vintage. #' #' Census Bureau description: #' -#' *"Counties are the primary legal divisions of most states. Most counties are -#' functioning governmental units, whose powers and functions vary from state to -#' state. Legal changes to county boundaries or names are typically infrequent, -#' but do occur from time to time."* +#' *"The primary legal divisions of most states are termed counties. Each county +#' or statistically equivalent entity is assigned a three-character numeric +#' Federal Information Processing Series (FIPS) code based on alphabetical +#' sequence that is unique within state, and an eight-digit National Standard +#' (NS) code."* #' #' Note: The Illinois counties of LaSalle, Lee and Ogle are included in their #' entirety, although only portions of these counties are part of the CMAP #' travel modeling area. The precise geographic extent of the CMAP travel #' modeling area is reflected in `zone_sf` and `subzone_sf`. #' -#' @format -#' A polygon `sf` object with `r nrow(county_sf)` rows and `r ncol(county_sf)` -#' variables: +#' @format A polygon `sf` object with `r nrow(county_sf)` rows and +#' `r ncol(county_sf)` variables: #' \describe{ #' \item{geoid_county}{Unique 5-digit county ID (a.k.a. FIPS code), assigned #' by the Census Bureau. 
Character.} @@ -718,6 +560,7 @@ #' @source US Census Bureau #' [TIGER/Line](https://www.census.gov/geographies/mapping-files/time-series/geo/tiger-line-file.html) #' +#' #' @examples #' # Display the counties with ggplot2 #' library(ggplot2) @@ -734,7 +577,7 @@ #' by the Illinois Department of Transportation (IDOT). Includes a column #' indicating which of the five transportation regions each district belongs to. #' Created using the county boundaries in the US Census Bureau's TIGER/Line -#' shapefiles, 2019 vintage. +#' shapefiles, 2021 vintage. #' #' @format #' A polygon `sf` object with `r nrow(idot_sf)` rows and `r ncol(idot_sf)` @@ -787,8 +630,8 @@ #' Chicago Wards #' -#' The official boundaries of the current Chicago wards (established in May of -#' 2015). Obtained 3/24/2021. +#' The official boundaries of the Chicago wards established in May of 2015. +#' Obtained 3/24/2021. #' #' @format #' A multipolygon `sf` object with `r nrow(ward_sf)` rows and `r ncol(ward_sf)` @@ -866,8 +709,8 @@ #' Northwest subregional councils; in this case, the subregional boundary #' follows the county boundary through Buffalo Grove. #' -#' It is important to note here that the portions of COM boundaries, defined by -#' municipalities, are fluid: they change as a village annexes adjacent +#' It is important to note here that the portions of COM boundaries defined by +#' municipalities are fluid: they change as a village annexes adjacent #' unincorporated land. The boundaries depicted in this dataset reflect #' municipal boundaries of varying vintages and sources, and cannot be #' considered “true” for any given point in time. @@ -916,7 +759,8 @@ #' variables: #' \describe{ #' \item{geoid_tract}{Unique 11-digit tract ID, assigned by the Census -#' Bureau. Character.} +#' Bureau. **These correspond to tract boundaries from 2010, not 2020.** +#' Character.} #' \item{county_fips}{Unique 5-digit FIPS code of the county the tract is in. 
#' Character.} #' \item{area_type}{Description of the tract's combined EDA and disinvested @@ -1090,32 +934,36 @@ #' apportioning housing unit, household, and population attributes. All factors #' were determined by calculating the percentage of a tract's housing units, #' households and population that were located in each of its component blocks, -#' according to the 2010 Decennial Census, and then assigning each block to a -#' CCA (based on the location of the block's centroid point). +#' according to the 2020 Decennial Census, and then assigning each block to a +#' CCA (based on the location of the block's centroid point). **Use +#' `xwalk_tract2cca` for data from the 2020 decennial census or the American +#' Community Survey (ACS) from 2020 onward. For data from the 2010 decennial +#' census or ACS from 2010 through 2019, use `xwalk_tract2cca_2010`.** #' #' Generally speaking, tract boundaries align neatly with CCA boundaries as they #' tend to follow similar features (e.g. rivers, major roads, rail lines) but -#' there are cases where the population, households and/or housing units in a -#' tract are split across multiple CCAs, or else are partially within the City -#' of Chicago and partially outside of it. For that reason, it is not +#' there are cases where the jobs, population, households and/or housing units +#' in a tract are split across multiple CCAs, or else are partially within the +#' City of Chicago and partially outside of it. For that reason, it is not #' appropriate to use a one-to-one tract-to-CCA assignment to apportion Census #' data among CCAs, and this crosswalk should be used instead. #' #' To use this crosswalk effectively, Census data should be joined to it (not #' vice versa, since tract IDs appear multiple times in this table). 
Once the #' data is joined, it should be multiplied by the appropriate factor (depending -#' whether the data of interest is measured at the housing unit, household or -#' person level), and then the result should be summed by CCA. If calculating -#' rates, this should only be done after the counts have been summed to CCA. The -#' resulting table can then be joined to `cca_sf` for mapping, if desired. +#' whether the data of interest is measured at the housing unit, household, +#' person or job level), and then the result should be summed by CCA. If +#' calculating rates, this should only be done after the counts have been summed +#' to CCA. The resulting table can then be joined to `cca_sf` for mapping, if +#' desired. #' #' If your data is also available at the block group level, it is recommended #' that you use that with `xwalk_blockgroup2cca` instead of the tract-level #' allocation. #' #' @format -#' A tibble with `r nrow(xwalk_tract2cca)` rows and `r ncol(xwalk_tract2cca)` -#' variables: +#' `xwalk_tract2cca` is a tibble with `r nrow(xwalk_tract2cca)` rows and +#' `r ncol(xwalk_tract2cca)` variables: #' \describe{ #' \item{geoid_tract}{Unique 11-digit tract ID, assigned by the Census #' Bureau. Corresponds to `tract_sf`. Character.} @@ -1133,29 +981,31 @@ #' quarters) living in the specified CCA. Multiply this by a tract-level #' measure of a population attribute (e.g. race/ethnicity) to estimate the #' CCA's portion. Double.} +#' \item{emp_pct}{Proportion of the tract's total jobs located in the +#' specified CCA. Multiply this by a tract-level measure of an employment +#' attribute (e.g. retail jobs) to estimate the CCA's portion. 
+#' **Not available in `xwalk_tract2cca_2010`.** Double.} #' } #' #' @examples #' suppressPackageStartupMessages(library(dplyr)) #' -#' # View the tracts with population not fully contained in a single CCA +#' # View the tracts with population split between multiple CCAs #' filter(xwalk_tract2cca, pop_pct < 1) #' -#' # Estimate CCA-level transit mode share from tract-level ACS data -#' df_tract <- tidycensus::get_acs( -#' "tract", state = "IL", county = "031", table = "B08006", -#' year = 2019, survey = "acs5", output = "wide", cache_table = TRUE +#' # Estimate CCA-level population density from tract-level Census data +#' df_tract <- tidycensus::get_decennial( +#' geography = "tract", variables = c("P1_001N"), +#' year = 2020, state = "IL", county = c("031", "043"), output = "wide" #' ) %>% -#' rename(workers = B08006_001E, transit = B08006_008E) %>% -#' select(GEOID, workers, transit) +#' suppressMessages() %>% # Hide tidycensus messages +#' select(geoid_tract = GEOID, pop = P1_001N) #' #' df_cca <- xwalk_tract2cca %>% -#' left_join(df_tract, by = c("geoid_tract" = "GEOID")) %>% -#' mutate(transit = transit * pop_pct, -#' workers = workers * pop_pct) %>% +#' left_join(df_tract, by = "geoid_tract") %>% +#' mutate(pop = pop * pop_pct) %>% #' group_by(cca_num) %>% -#' summarize_at(vars(transit, workers), sum) %>% -#' mutate(transit_commute_pct = transit / workers) +#' summarize(pop = sum(pop)) #' df_cca #' #' # Join to cca_sf for mapping @@ -1163,11 +1013,17 @@ #' cca_sf %>% #' left_join(df_cca, by = "cca_num") %>% #' ggplot() + -#' geom_sf(aes(fill = transit_commute_pct), lwd = 0.1) + -#' scale_fill_viridis_c() + +#' geom_sf(aes(fill = pop / sqmi), lwd = 0.1) + +#' scale_fill_viridis_c(direction = -1) + #' theme_void() "xwalk_tract2cca" +#' @rdname xwalk_tract2cca +#' @format `xwalk_tract2cca_2010` is a tibble with +#' `r nrow(xwalk_tract2cca_2010)` rows and `r ncol(xwalk_tract2cca_2010)` +#' variables (no `emp_pct`). 
+"xwalk_tract2cca_2010" + #' Block Group-to-CCA Crosswalk #' @@ -1176,32 +1032,37 @@ #' apportioning housing unit, household, and population attributes. All factors #' were determined by calculating the percentage of a block group's housing #' units, households and population that were located in each of its component -#' blocks, according to the 2010 Decennial Census, and then assigning each block -#' to a CCA (based on the location of the block's centroid point). +#' blocks, according to the 2020 Decennial Census, and then assigning each block +#' to a CCA (based on the location of the block's centroid point). **Use +#' `xwalk_blockgroup2cca` for data from the 2020 decennial census or the +#' American Community Survey (ACS) from 2020 onward. For data from the 2010 +#' decennial census or ACS from 2010 through 2019, use +#' `xwalk_blockgroup2cca_2010`.** #' #' Generally speaking, block group boundaries align neatly with CCA boundaries #' as they tend to follow similar features (e.g. rivers, major roads, rail -#' lines) but there are cases where the population, households and/or housing -#' units in a block group are split across multiple CCAs, or else are partially -#' within the City of Chicago and partially outside of it. For that reason, it -#' is not appropriate to use a one-to-one block group-to-CCA assignment to -#' apportion Census data among CCAs, and this crosswalk should be used instead. +#' lines) but there are cases where the jobs, population, households and/or +#' housing units in a block group are split across multiple CCAs, or else are +#' partially within the City of Chicago and partially outside of it. For that +#' reason, it is not appropriate to use a one-to-one block group-to-CCA +#' assignment to apportion Census data among CCAs, and this crosswalk should be +#' used instead. #' #' To use this crosswalk effectively, Census data should be joined to it (not #' vice versa, since block group IDs appear multiple times in this table). 
Once #' the data is joined, it should be multiplied by the appropriate factor #' (depending whether the data of interest is measured at the housing unit, -#' household or person level), and then the result should be summed by CCA. If -#' calculating rates, this should only be done after the counts have been summed -#' to CCA. The resulting table can then be joined to `cca_sf` for mapping, if -#' desired. +#' household, person or job level), and then the result should be summed by CCA. +#' If calculating rates, this should only be done after the counts have been +#' summed to CCA. The resulting table can then be joined to `cca_sf` for +#' mapping, if desired. #' #' If your data is only available at the tract level, you can use #' `xwalk_tract2cca` for a tract-level allocation instead. #' #' @format -#' A tibble with `r nrow(xwalk_blockgroup2cca)` rows and -#' `r ncol(xwalk_blockgroup2cca)` variables: +#' `xwalk_blockgroup2cca` is a tibble with `r nrow(xwalk_blockgroup2cca)` rows +#' and `r ncol(xwalk_blockgroup2cca)` variables: #' \describe{ #' \item{geoid_blkgrp}{Unique 12-digit block group ID, assigned by the Census #' Bureau. Corresponds to `blockgroup_sf`. Character.} @@ -1219,29 +1080,33 @@ #' group quarters) living in the specified CCA. Multiply this by a block #' group-level measure of a population attribute (e.g. race/ethnicity) to #' estimate the CCA's portion. Double.} +#' \item{emp_pct}{Proportion of the block group's total jobs located in the +#' specified CCA. Multiply this by a block group-level measure of an +#' employment attribute (e.g. retail jobs) to estimate the CCA's portion. 
+#' **Not available in `xwalk_blockgroup2cca_2010`.** Double.} #' } #' #' @examples #' suppressPackageStartupMessages(library(dplyr)) #' -#' # View the block groups with households not fully contained in a single CCA -#' filter(xwalk_blockgroup2cca, hh_pct < 1) +#' # View the block groups with housing units split between multiple CCAs +#' filter(xwalk_blockgroup2cca, hu_pct < 1) #' -#' # Estimate CCA-level unemployment rate from block group-level ACS data -#' df_blkgrp <- tidycensus::get_acs( -#' "block group", state = "IL", county = "031", table = "B23025", -#' year = 2019, survey = "acs5", output = "wide", cache_table = TRUE +#' # Estimate CCA-level housing vacancy rates from block group-level Census data +#' df_blkgrp <- tidycensus::get_decennial( +#' geography = "block group", variables = c("H1_001N", "H1_003N"), +#' year = 2020, state = "IL", county = c("031", "043"), output = "wide" #' ) %>% -#' rename(civ_lf = B23025_003E, unemp = B23025_005E) %>% -#' select(GEOID, civ_lf, unemp) +#' suppressMessages() %>% # Hide tidycensus messages +#' select(geoid_blkgrp = GEOID, hu_tot = H1_001N, hu_vac = H1_003N) #' #' df_cca <- xwalk_blockgroup2cca %>% -#' left_join(df_blkgrp, by = c("geoid_blkgrp" = "GEOID")) %>% -#' mutate(civ_lf = civ_lf * pop_pct, -#' unemp = unemp * pop_pct) %>% +#' left_join(df_blkgrp, by = "geoid_blkgrp") %>% +#' mutate(hu_tot = hu_tot * hu_pct, +#' hu_vac = hu_vac * hu_pct) %>% #' group_by(cca_num) %>% -#' summarize_at(vars(civ_lf, unemp), sum) %>% -#' mutate(unemp_rate = unemp / civ_lf) +#' summarize_at(vars(hu_tot, hu_vac), sum) %>% +#' mutate(vac_rate = hu_vac / hu_tot) #' df_cca #' #' # Join to cca_sf for mapping @@ -1249,11 +1114,17 @@ #' cca_sf %>% #' left_join(df_cca, by = "cca_num") %>% #' ggplot() + -#' geom_sf(aes(fill = unemp_rate), lwd = 0.1) + +#' geom_sf(aes(fill = vac_rate), lwd = 0.1) + #' scale_fill_viridis_c(direction = -1) + #' theme_void() "xwalk_blockgroup2cca" +#' @rdname xwalk_blockgroup2cca +#' @format 
`xwalk_blockgroup2cca_2010` is a tibble with +#' `r nrow(xwalk_blockgroup2cca_2010)` rows and +#' `r ncol(xwalk_blockgroup2cca_2010)` variables (no `emp_pct`). +"xwalk_blockgroup2cca_2010" + #' Tract-to-Subzone Crosswalk #' @@ -1262,24 +1133,26 @@ #' apportioning housing unit, household, and population attributes. All factors #' were determined by calculating the percentage of a tract's housing units, #' households and population that were located in each of its component blocks, -#' according to the 2010 Decennial Census, and then assigning each block to a +#' according to the 2020 Decennial Census, and then assigning each block to a #' subzone (based on the location of the block's centroid point). Subzones that #' do not contain the centroid of any blocks with at least one housing unit, -#' household or person are not present in this table, and should be considered -#' unpopulated. +#' household, person or job are *not* present in this table. **Use +#' `xwalk_tract2subzone` for data from the 2020 decennial census or the American +#' Community Survey (ACS) from 2020 onward. For data from the 2010 decennial +#' census or ACS from 2010 through 2019, use `xwalk_tract2subzone_2010`.** #' #' Other than in certain areas of Chicago, tracts tend to be significantly #' larger than subzones and have highly irregular boundaries, so in most cases -#' the population, households and/or housing units in a tract are split across -#' multiple subzones. For that reason, it is not appropriate to use a one-to-one -#' tract-to-subzone assignment to apportion Census data among subzones, and this -#' crosswalk should be used instead. +#' the jobs, population, households and/or housing units in a tract are split +#' across multiple subzones. For that reason, it is not appropriate to use a +#' one-to-one tract-to-subzone assignment to apportion Census data among +#' subzones, and this crosswalk should be used instead. 
#' #' To use this crosswalk effectively, Census data should be joined to it (not #' vice versa, since tract IDs appear multiple times in this table). Once the #' data is joined, it should be multiplied by the appropriate factor (depending -#' whether the data of interest is measured at the housing unit, household or -#' person level), and then the result should be summed by subzone ID. If +#' whether the data of interest is measured at the housing unit, household, +#' person or job level), and then the result should be summed by subzone ID. If #' calculating rates, this should only be done after the counts have been summed #' to subzone. The resulting table can then be joined to `subzone_sf` for #' mapping, if desired. @@ -1290,7 +1163,7 @@ #' use zones instead with `xwalk_tract2zone` or `xwalk_blockgroup2zone`. #' #' @format -#' A tibble with `r nrow(xwalk_tract2subzone)` rows and +#' `xwalk_tract2subzone` is a tibble with `r nrow(xwalk_tract2subzone)` rows and #' `r ncol(xwalk_tract2subzone)` variables: #' \describe{ #' \item{geoid_tract}{Unique 11-digit tract ID, assigned by the Census Bureau. @@ -1309,13 +1182,17 @@ #' quarters) living in the specified subzone. Multiply this by a tract-level #' measure of a population attribute (e.g. race/ethnicity) to estimate the #' subzone's portion. Double.} +#' \item{emp_pct}{Proportion of the tract's total jobs located in the +#' specified subzone. Multiply this by a tract-level measure of an employment +#' attribute (e.g. retail jobs) to estimate the subzone's portion. +#' **Not available in `xwalk_tract2subzone_2010`.** Double.} #' } #' #' @examples #' # View the tract allocations for subzone17 == 1 #' dplyr::filter(xwalk_tract2subzone, subzone17 == 1) #' -#' # Map the subzones missing from xwalk_tract2subzone (i.e. no HU/HH/pop) +#' # Map the subzones missing from xwalk_tract2subzone (i.e. 
no HU/HH/pop/emp) #' library(ggplot2) #' ggplot(dplyr::anti_join(subzone_sf, xwalk_tract2subzone)) + #' geom_sf(fill = "red", lwd = 0.1) + @@ -1323,6 +1200,12 @@ #' theme_void() "xwalk_tract2subzone" +#' @rdname xwalk_tract2subzone +#' @format `xwalk_tract2subzone_2010` is a tibble with +#' `r nrow(xwalk_tract2subzone_2010)` rows and +#' `r ncol(xwalk_tract2subzone_2010)` variables (no `emp_pct`). +"xwalk_tract2subzone_2010" + #' Block Group-to-Subzone Crosswalk #' @@ -1331,27 +1214,30 @@ #' for apportioning housing unit, household, and population attributes. All #' factors were determined by calculating the percentage of a block group's #' housing units, households and population that were located in each of its -#' component blocks, according to the 2010 Decennial Census, and then assigning +#' component blocks, according to the 2020 Decennial Census, and then assigning #' each block to a subzone (based on the location of the block's centroid #' point). Subzones that do not contain the centroid of any blocks with at least -#' one housing unit, household or person are not present in this table, and -#' should be considered unpopulated. +#' one housing unit, household, person or job are *not* present in this table. +#' **Use `xwalk_blockgroup2subzone` for data from the 2020 decennial census or +#' the American Community Survey (ACS) from 2020 onward. For data from the 2010 +#' decennial census or ACS from 2010 through 2019, use +#' `xwalk_blockgroup2subzone_2010`.** #' #' Other than in certain areas of Chicago, block groups tend to be significantly #' larger than subzones and have highly irregular boundaries, so in most cases -#' the population, households and/or housing units in a block group are split -#' across multiple subzones. 
For that reason, it is not appropriate to use a -#' one-to-one block group-to-subzone assignment to apportion Census data among +#' the jobs, population, households and/or housing units in a block group are +#' split across multiple subzones. For that reason, it is not appropriate to use +#' a one-to-one block group-to-subzone assignment to apportion Census data among #' subzones, and this crosswalk should be used instead. #' #' To use this crosswalk effectively, Census data should be joined to it (not #' vice versa, since block group IDs appear multiple times in this table). Once #' the data is joined, it should be multiplied by the appropriate factor #' (depending whether the data of interest is measured at the housing unit, -#' household or person level), and then the result should be summed by subzone -#' ID. If calculating rates, this should only be done after the counts have been -#' summed to subzone. The resulting table can then be joined to `subzone_sf` for -#' mapping, if desired. +#' household, person or job level), and then the result should be summed by +#' subzone ID. If calculating rates, this should only be done after the counts +#' have been summed to subzone. The resulting table can then be joined to +#' `subzone_sf` for mapping, if desired. #' #' If your data is only available at the tract level, you can use #' `xwalk_tract2subzone` for a tract-level allocation instead. If the subzone @@ -1359,7 +1245,8 @@ #' `xwalk_blockgroup2zone` or `xwalk_tract2zone`. #' #' @format -#' A tibble with `r nrow(xwalk_blockgroup2subzone)` rows and +#' `xwalk_blockgroup2subzone` is a tibble with +#' `r nrow(xwalk_blockgroup2subzone)` rows and #' `r ncol(xwalk_blockgroup2subzone)` variables: #' \describe{ #' \item{geoid_blkgrp}{Unique 12-digit block group ID, assigned by the Census @@ -1378,13 +1265,17 @@ #' group quarters) living in the specified subzone. Multiply this by a block #' group-level measure of a population attribute (e.g. 
race/ethnicity) to #' estimate the subzone's portion. Double.} +#' \item{emp_pct}{Proportion of the block group's total jobs located in the +#' specified subzone. Multiply this by a block group-level measure of an +#' employment attribute (e.g. retail jobs) to estimate the subzone's portion. +#' **Not available in `xwalk_blockgroup2subzone_2010`.** Double.} #' } #' #' @examples #' # View the block group allocations for subzone17 == 1 #' dplyr::filter(xwalk_blockgroup2subzone, subzone17 == 1) #' -#' # Map the subzones missing from xwalk_blockgroup2subzone (i.e. no HU/HH/pop) +#' # Map the subzones missing from xwalk_blockgroup2subzone (i.e. no HU/HH/pop/emp) #' library(ggplot2) #' ggplot(dplyr::anti_join(subzone_sf, xwalk_blockgroup2subzone)) + #' geom_sf(fill = "red", lwd = 0.1) + @@ -1392,6 +1283,12 @@ #' theme_void() "xwalk_blockgroup2subzone" +#' @rdname xwalk_blockgroup2subzone +#' @format `xwalk_blockgroup2subzone_2010` is a tibble with +#' `r nrow(xwalk_blockgroup2subzone_2010)` rows and +#' `r ncol(xwalk_blockgroup2subzone_2010)` variables (no `emp_pct`). +"xwalk_blockgroup2subzone_2010" + #' Tract-to-Zone Crosswalk #' @@ -1400,14 +1297,16 @@ #' apportioning housing unit, household, and population attributes. All factors #' were determined by calculating the percentage of a tract's housing units, #' households and population that were located in each of its component blocks, -#' according to the 2010 Decennial Census, and then assigning each block to a -#' zone (based on the location of the block's centroid point). Zones that -#' do not contain the centroid of any blocks with at least one housing unit, -#' household or person are not present in this table, and should be considered -#' unpopulated. +#' according to the 2020 Decennial Census, and then assigning each block to a +#' zone (based on the location of the block's centroid point). 
Zones that do not +#' contain the centroid of any blocks with at least one housing unit, household, +#' person or job are *not* present in this table. **Use `xwalk_tract2zone` for +#' data from the 2020 decennial census or the American Community Survey (ACS) +#' from 2020 onward. For data from the 2010 decennial census or ACS from 2010 +#' through 2019, use `xwalk_tract2zone_2010`.** #' #' Other than in certain areas of Chicago, tracts tend to be larger than zones -#' and have highly irregular boundaries, so in most cases the population, +#' and have highly irregular boundaries, so in most cases the jobs, population, #' households and/or housing units in a tract are split across multiple zones. #' For that reason, it is not appropriate to use a one-to-one tract-to-zone #' assignment to apportion Census data among zones, and this crosswalk should be @@ -1416,8 +1315,8 @@ #' To use this crosswalk effectively, Census data should be joined to it (not #' vice versa, since tract IDs appear multiple times in this table). Once the #' data is joined, it should be multiplied by the appropriate factor (depending -#' whether the data of interest is measured at the housing unit, household or -#' person level), and then the result should be summed by zone ID. If +#' whether the data of interest is measured at the housing unit, household, +#' person or job level), and then the result should be summed by zone ID. If #' calculating rates, this should only be done after the counts have been summed #' to zone. The resulting table can then be joined to `zone_sf` for mapping, if #' desired. @@ -1428,8 +1327,8 @@ #' subzones instead with `xwalk_tract2subzone` or `xwalk_blockgroup2subzone`. 
#' +#' @format -#' A tibble with `r nrow(xwalk_tract2zone)` rows and `r ncol(xwalk_tract2zone)` -#' variables: +#' `xwalk_tract2zone` is a tibble with `r nrow(xwalk_tract2zone)` rows and +#' `r ncol(xwalk_tract2zone)` variables: #' \describe{ #' \item{geoid_tract}{Unique 11-digit tract ID, assigned by the Census Bureau. #' Corresponds to `tract_sf` (although that only includes the tracts in the @@ -1447,13 +1346,17 @@ #' quarters) living in the specified zone. Multiply this by a tract-level #' measure of a population attribute (e.g. race/ethnicity) to estimate the #' zone's portion. Double.} +#' \item{emp_pct}{Proportion of the tract's total jobs located in the +#' specified zone. Multiply this by a tract-level measure of an employment +#' attribute (e.g. retail jobs) to estimate the zone's portion. +#' **Not available in `xwalk_tract2zone_2010`.** Double.} #' } #' #' @examples #' # View the tract allocations for zone17 == 55 #' dplyr::filter(xwalk_tract2zone, zone17 == 55) #' -#' # Map the zones missing from xwalk_tract2zone (i.e. no HU/HH/pop) +#' # Map the zones missing from xwalk_tract2zone (i.e. no HU/HH/pop/emp) #' library(ggplot2) #' ggplot(dplyr::anti_join(zone_sf, xwalk_tract2zone)) + #' geom_sf(fill = "red", lwd = 0.1) + @@ -1461,33 +1364,42 @@ #' theme_void() "xwalk_tract2zone" +#' @rdname xwalk_tract2zone +#' @format `xwalk_tract2zone_2010` is a tibble with +#' `r nrow(xwalk_tract2zone_2010)` rows and `r ncol(xwalk_tract2zone_2010)` +#' variables (no `emp_pct`). +"xwalk_tract2zone_2010" + #' Block Group-to-Zone Crosswalk #' #' This table contains a set of factors to apportion Census block group-level #' data among the CMAP travel modeling zones. Separate factors are provided for -#' apportioning housing unit, household, and population attributes.
All factors -#' were determined by calculating the percentage of a block group's housing -#' units, households and population that were located in each of its component -#' blocks, according to the 2010 Decennial Census, and then assigning each block -#' to a zone (based on the location of the block's centroid point). Zones that -#' do not contain the centroid of any blocks with at least one housing unit, -#' household or person are not present in this table, and should be considered -#' unpopulated. +#' apportioning housing unit, household, population and employment attributes. +#' All factors were determined by calculating the percentage of a block group's +#' housing units, households, population and employment that were located in +#' each of its component blocks, according to the 2020 Decennial Census and 2019 +#' LEHD, and then assigning each block to a zone (based on the location of the +#' block's centroid point). Zones that do not contain the centroid of any blocks +#' with at least one housing unit, household, person or job are *not* present in +#' this table. **Use `xwalk_blockgroup2zone` for data from the 2020 decennial +#' census or the American Community Survey (ACS) from 2020 onward. For data from +#' the 2010 decennial census or ACS from 2010 through 2019, use +#' `xwalk_blockgroup2zone_2010`.** #' #' Other than in certain areas of Chicago, block groups tend to be larger than -#' zones and have highly irregular boundaries, so in most cases the population, -#' households and/or housing units in a block group are split across multiple -#' zones. For that reason, it is not appropriate to use a one-to-one block -#' group-to-zone assignment to apportion Census data among zones, and this +#' zones and have highly irregular boundaries, so in most cases the jobs, +#' population, households and/or housing units in a block group are split across +#' multiple zones. 
For that reason, it is not appropriate to use a one-to-one +#' block group-to-zone assignment to apportion Census data among zones, and this #' crosswalk should be used instead. #' #' To use this crosswalk effectively, Census data should be joined to it (not #' vice versa, since block group IDs appear multiple times in this table). Once #' the data is joined, it should be multiplied by the appropriate factor #' (depending whether the data of interest is measured at the housing unit, -#' household or person level), and then the result should be summed by zone ID. -#' If calculating rates, this should only be done after the counts have been +#' household, person or job level), and then the result should be summed by zone +#' ID. If calculating rates, this should only be done after the counts have been #' summed to zone. The resulting table can then be joined to `zone_sf` for #' mapping, if desired. #' @@ -1497,8 +1409,8 @@ #' `xwalk_blockgroup2subzone` or `xwalk_tract2subzone`. #' #' @format -#' A tibble with `r nrow(xwalk_blockgroup2zone)` rows and -#' `r ncol(xwalk_blockgroup2zone)` variables: +#' `xwalk_blockgroup2zone` is a tibble with `r nrow(xwalk_blockgroup2zone)` rows +#' and `r ncol(xwalk_blockgroup2zone)` variables: #' \describe{ #' \item{geoid_blkgrp}{Unique 12-digit block group ID, assigned by the Census #' Bureau. Corresponds to `blockgroup_sf` (although that only includes the @@ -1516,16 +1428,26 @@ #' group quarters) living in the specified zone. Multiply this by a block #' group-level measure of a population attribute (e.g. race/ethnicity) to #' estimate the zone's portion. Double.} +#' \item{emp_pct}{Proportion of the block group's total jobs located in the +#' specified zone. Multiply this by a block group-level measure of an +#' employment attribute (e.g. retail jobs) to estimate the zone's portion. 
+#' **Not available in `xwalk_blockgroup2zone_2010`.** Double.} #' } #' #' @examples #' # View the block group allocations for zone17 == 55 #' dplyr::filter(xwalk_blockgroup2zone, zone17 == 55) #' -#' # Map the zones missing from xwalk_blockgroup2zone (i.e. no HU/HH/pop) +#' # Map the zones missing from xwalk_blockgroup2zone (i.e. no HU/HH/pop/emp) #' library(ggplot2) #' ggplot(dplyr::anti_join(zone_sf, xwalk_blockgroup2zone)) + #' geom_sf(fill = "red", lwd = 0.1) + #' geom_sf(data = zone_sf, fill = NA, lwd = 0.1) + #' theme_void() "xwalk_blockgroup2zone" + +#' @rdname xwalk_blockgroup2zone +#' @format `xwalk_blockgroup2zone_2010` is a tibble with +#' `r nrow(xwalk_blockgroup2zone_2010)` rows and +#' `r ncol(xwalk_blockgroup2zone_2010)` variables (no `emp_pct`). +"xwalk_blockgroup2zone_2010" diff --git a/README.Rmd b/README.Rmd index 2803493..1db4c32 100644 --- a/README.Rmd +++ b/README.Rmd @@ -57,8 +57,7 @@ format, and includes boundaries for: - Municipalities - Chicago community areas (CCAs) and wards - Census tracts, block groups, blocks, public use microdata areas (PUMAs) and - ZIP code tabulation areas (ZCTAs) — *2019 vintage, for use with most recent ACS data* -- Census tracts, block groups and blocks — *2020 vintage, for use with 2020 Census data* + ZIP code tabulation areas (ZCTAs) - CMAP travel modeling zones and subzones - CMAP subregional Councils of Mayors (COMs) - Legislative districts (state and federal) @@ -72,7 +71,7 @@ Run the following to install or update cmapgeo: ```{r install, eval=FALSE, message=FALSE, warning=FALSE} ## Install current version from GitHub -devtools::install_github("CMAP-REPOS/cmapgeo", build_vignettes=TRUE) +devtools::install_github("CMAP-REPOS/cmapgeo") ## Then load the package as you would any other library(cmapgeo) diff --git a/README.md b/README.md index f786136..d034456 100644 --- a/README.md +++ b/README.md @@ -47,10 +47,7 @@ format, and includes boundaries for: - Municipalities - Chicago community areas (CCAs) and wards 
- Census tracts, block groups, blocks, public use microdata areas - (PUMAs) and ZIP code tabulation areas (ZCTAs) — *2019 vintage, for - use with most recent ACS data* -- Census tracts, block groups and blocks — *2020 vintage, for use with - 2020 Census data* + (PUMAs) and ZIP code tabulation areas (ZCTAs) - CMAP travel modeling zones and subzones - CMAP subregional Councils of Mayors (COMs) - Legislative districts (state and federal) @@ -63,7 +60,7 @@ Run the following to install or update cmapgeo: ``` r ## Install current version from GitHub -devtools::install_github("CMAP-REPOS/cmapgeo", build_vignettes=TRUE) +devtools::install_github("CMAP-REPOS/cmapgeo") ## Then load the package as you would any other library(cmapgeo) diff --git a/data-raw/generate_xwalks.R b/data-raw/generate_xwalks.R index d1b4a3b..1891286 100644 --- a/data-raw/generate_xwalks.R +++ b/data-raw/generate_xwalks.R @@ -2,20 +2,23 @@ library(tidyverse) devtools::load_all() # Set parameters -census_year <- 2010 # Latest Decennial Census -census_vars <- c("H003001", "H003002", "P001001") # HU, HH, POP +tigerline_year <- 2021 # Latest TIGER/Line vintage +lehd_year <- 2019 # Latest LEHD year available, for employment data +census_year <- 2020 # Latest Decennial Census +census_vars <- c("H1_001N", "H1_002N", "P1_001N") # HU, HH, POP vars in 2020 redistricting data counties_il <- str_sub(c(county_fips_codes$cmap, county_fips_codes$xil), 3, 5) counties_in <- str_sub(county_fips_codes$xin, 3, 5) counties_wi <- str_sub(county_fips_codes$xwi, 3, 5) + # Get Census block geometries for entire modeling area -block_21co_sf <- tigris::blocks(state = "IL", county = counties_il) %>% - bind_rows(tigris::blocks(state = "IN", county = counties_in)) %>% - bind_rows(tigris::blocks(state = "WI", county = counties_wi)) %>% - filter(TRACTCE10 != "990000") %>% # Exclude Lake Michigan tracts +block_21co_sf <- tigris::blocks(state = "IL", county = counties_il, year = tigerline_year) %>% + bind_rows(tigris::blocks(state = 
"IN", county = counties_in, year = tigerline_year)) %>% + bind_rows(tigris::blocks(state = "WI", county = counties_wi, year = tigerline_year)) %>% + filter(TRACTCE20 != "990000") %>% # Exclude Lake Michigan tracts sf::st_transform(cmap_crs) %>% - rename(geoid_block = GEOID10) %>% - mutate(county_fips = paste0(STATEFP10, COUNTYFP10), + rename(geoid_block = GEOID20) %>% + mutate(county_fips = paste0(STATEFP20, COUNTYFP20), sqmi = unclass(sf::st_area(geometry) / sqft_per_sqmi)) %>% select(geoid_block, county_fips, sqmi) %>% arrange(geoid_block) @@ -36,11 +39,56 @@ block_data <- tidycensus::get_decennial( select(-NAME) %>% rename( geoid_block = GEOID, - hu = H003001, - hh = H003002, - pop = P001001 - ) %>% - filter(pop > 0 | hh > 0 | hu > 0) %>% + hu = H1_001N, + hh = H1_002N, + pop = P1_001N + ) + +# Get LEHD block-level employment data (2019 LEHD USES 2010 BLOCKS) +lehd_data <- lehdr::grab_lodes( + state = c("IL", "IN", "WI"), year = lehd_year, lodes_type = "wac", + job_type = "JT00", segment = "S000", state_part = "main" +) %>% + mutate(county_fips = stringr::str_sub(w_geocode, 1, 5)) %>% + filter(county_fips %in% unlist(county_fips_codes[c("cmap", "xil", "xin", "xwi")])) %>% + select(w_geocode, emp = C000) + + +### PROCESS LEHD EMPLOYMENT DATA USING 2010 CENSUS BLOCKS ### +# Note: this section is only needed until LEHD switches to use the 2020 blocks. +# As of the 2019 LEHD, it still uses the 2010 blocks. Once it switches, remove +# this section and the LEHD data will be joined to block_data directly. 
+ +# Get 2010 Census block geometries for entire modeling area +block2010_21co_sf <- tigris::blocks(state = "IL", county = counties_il, year = 2019) %>% + bind_rows(tigris::blocks(state = "IN", county = counties_in, year = 2019)) %>% + bind_rows(tigris::blocks(state = "WI", county = counties_wi, year = 2019)) %>% + filter(TRACTCE10 != "990000") %>% # Exclude Lake Michigan tracts + sf::st_transform(cmap_crs) %>% + rename(w_geocode = GEOID10) %>% + select(w_geocode) %>% + arrange(w_geocode) + +# Convert 2010 blocks to centroids and spatially join to 2020 block polygons +sf::st_geometry(block2010_21co_sf) <- sf::st_point_on_surface(sf::st_geometry(block2010_21co_sf)) +block10_block20_assign <- block2010_21co_sf %>% + mutate(geoid_block = block_21co_sf$geoid_block[sf::st_nearest_feature(., block_21co_sf)]) %>% + as_tibble() %>% + select(w_geocode, geoid_block) + +# Summarize LEHD data by 2020 block +lehd_data <- lehd_data %>% + left_join(block10_block20_assign) %>% + group_by(geoid_block) %>% + summarize(emp = sum(emp)) + +### END LEHD 2010 BLOCK PROCESSING ### + + +block_data <- block_data %>% + left_join(lehd_data) %>% # Specify `by=c("geoid_block"="w_geocode")` when joining LEHD directly + mutate(emp = if_else(is.na(emp), 0, emp)) %>% + filter(pop > 0 | hh > 0 | hu > 0 | emp > 0) %>% mutate( geoid_blkgrp = str_sub(geoid_block, 1, 12), geoid_tract = str_sub(geoid_block, 1, 11), @@ -55,6 +103,7 @@ blockgroup_data <- block_data %>% hu_blkgrp = sum(hu), hh_blkgrp = sum(hh), pop_blkgrp = sum(pop), + emp_blkgrp = sum(emp), sqmi_blkgrp = sum(sqmi), .groups = "drop" ) @@ -65,6 +114,7 @@ tract_data <- block_data %>% hu_tract = sum(hu), hh_tract = sum(hh), pop_tract = sum(pop), + emp_tract = sum(emp), sqmi_tract = sum(sqmi), .groups = "drop" ) @@ -76,19 +126,21 @@ block_pct <- block_data %>% mutate( hu_pct_blkgrp = if_else(hu_blkgrp > 0, hu/hu_blkgrp, sqmi/sqmi_blkgrp), hh_pct_blkgrp = if_else(hh_blkgrp > 0, hh/hh_blkgrp, sqmi/sqmi_blkgrp), - pop_pct_blkgrp = 
if_else(pop_blkgrp > 0, pop/pop_blkgrp, sqmi/sqmi_blkgrp) + pop_pct_blkgrp = if_else(pop_blkgrp > 0, pop/pop_blkgrp, sqmi/sqmi_blkgrp), + emp_pct_blkgrp = if_else(emp_blkgrp > 0, emp/emp_blkgrp, sqmi/sqmi_blkgrp) ) %>% left_join(tract_data) %>% mutate( hu_pct_tract = if_else(hu_tract > 0, hu/hu_tract, sqmi/sqmi_tract), hh_pct_tract = if_else(hh_tract > 0, hh/hh_tract, sqmi/sqmi_tract), - pop_pct_tract = if_else(pop_tract > 0, pop/pop_tract, sqmi/sqmi_tract) + pop_pct_tract = if_else(pop_tract > 0, pop/pop_tract, sqmi/sqmi_tract), + emp_pct_tract = if_else(emp_tract > 0, emp/emp_tract, sqmi/sqmi_tract) ) %>% - select(geoid_block, hu, hh, pop, + select(geoid_block, hu, hh, pop, emp, geoid_blkgrp, ends_with("pct_blkgrp"), geoid_tract, ends_with("pct_tract")) -# Create block centroids (for blocks with HU/HH/pop only) +# Create block centroids (for blocks with HU/HH/pop/emp only) block_pt_sf <- block_21co_sf %>% select(geoid_block) %>% semi_join(block_data) @@ -107,12 +159,11 @@ block_cca_assign <- block_pt_sf %>% ### MAKE MANUAL ADJUSTMENTS TO BLOCK-CCA ASSIGNMENT HERE: ### mutate(cca_num = as.integer(case_when( - geoid_block == "170312705001009" ~ 27, # Centroid in 26 - geoid_block == "170315206001058" ~ 55, # Centroid in 52 - geoid_block == "170315206001059" ~ 55, # Centroid in 52 - geoid_block == "170317112004057" ~ 71, # Centroid in 70 - geoid_block == "170317304002008" ~ 73, # Centroid in 72 - geoid_block == "170317706023016" ~ NA_real_, # Centroid in chi_census_sf, but not cca_sf + geoid_block == "170312301001004" ~ 23, # Centroid in 22 + geoid_block == "170315206001056" ~ 55, # Centroid in 52 + geoid_block == "170315206001057" ~ 55, # Centroid in 52 + geoid_block == "170317304002017" ~ 73, # Centroid in 72 + geoid_block == "170319801001004" ~ 56, # Centroid in 64 TRUE ~ as.numeric(cca_num) # Leave everything else alone ))) %>% filter(!is.na(cca_num)) @@ -122,20 +173,26 @@ model_area_sf <- rmapshaper::ms_dissolve(subzone_sf) block_subzone_assign <- block_pt_sf 
%>% filter(apply(sf::st_intersects(., sf::st_buffer(model_area_sf, 100)), 1, any)) %>% + + ### MANUALLY RE-ADD IN-REGION BLOCKS WHOSE CENTROIDS ARE OUTSIDE: ### + bind_rows(filter(block_pt_sf, geoid_block %in% c("170978622003054"))) %>% + mutate(subzone17 = subzone_sf$subzone17[sf::st_nearest_feature(., subzone_sf)]) %>% as_tibble() %>% select(geoid_block, subzone17) %>% ### MAKE MANUAL ADJUSTMENTS TO BLOCK-SUBZONE ASSIGNMENT HERE: ### mutate(subzone17 = as.integer(case_when( - geoid_block == "170318300011022" ~ 1100, # Centroid in subzone from wrong county - geoid_block == "170370018001115" ~ 16598, # Centroid in subzone from wrong county - geoid_block == "170438446021003" ~ 4736, # Centroid in subzone from wrong county - geoid_block == "170978642031006" ~ 9261, # Centroid in subzone from wrong county - geoid_block == "170978642031039" ~ 9263, # Centroid in subzone from wrong county - geoid_block == "170978642043055" ~ 9295, # Centroid in subzone from wrong county - geoid_block == "181270510071032" ~ 17204, # Centroid in subzone from wrong county - geoid_block == "551010027021039" ~ 17304, # Centroid in subzone from wrong county + geoid_block == "170318300012001" ~ 1100, # Centroid in subzone from wrong county + geoid_block == "170370018001079" ~ 16598, # Centroid in subzone from wrong county + geoid_block == "170978622003054" ~ 10096, # Centroid on breakwater in Lake Michigan + geoid_block == "171118708122011" ~ 11831, # Centroid in subzone from wrong county + geoid_block == "171118708123021" ~ 11845, # Centroid in subzone from wrong county + geoid_block == "171118708123029" ~ 11843, # Centroid in subzone from wrong county + geoid_block == "171118713053000" ~ 11175, # Centroid in subzone from wrong county + geoid_block == "181270510111015" ~ 17204, # Centroid in subzone from wrong county + geoid_block == "551010027021035" ~ 17304, # Centroid in subzone from wrong county + geoid_block == "551270010003065" ~ 17392, # Centroid in subzone from wrong county TRUE ~ 
as.numeric(subzone17) # Leave everything else alone ))) @@ -144,6 +201,7 @@ block_zone_assign <- block_subzone_assign %>% left_join(subzone_sf) %>% select(geoid_block, zone17) + # Create the xwalks xwalk_blockgroup2cca <- block_pct %>% select(geoid_block, geoid_blkgrp, ends_with("pct_blkgrp")) %>% @@ -153,6 +211,7 @@ xwalk_blockgroup2cca <- block_pct %>% hu_pct = sum(hu_pct_blkgrp), hh_pct = sum(hh_pct_blkgrp), pop_pct = sum(pop_pct_blkgrp), + emp_pct = sum(emp_pct_blkgrp), .groups = "drop" ) summary(xwalk_blockgroup2cca) @@ -165,6 +224,7 @@ xwalk_tract2cca <- block_pct %>% hu_pct = sum(hu_pct_tract), hh_pct = sum(hh_pct_tract), pop_pct = sum(pop_pct_tract), + emp_pct = sum(emp_pct_tract), .groups = "drop" ) summary(xwalk_tract2cca) @@ -177,6 +237,7 @@ xwalk_blockgroup2subzone <- block_pct %>% hu_pct = sum(hu_pct_blkgrp), hh_pct = sum(hh_pct_blkgrp), pop_pct = sum(pop_pct_blkgrp), + emp_pct = sum(emp_pct_blkgrp), .groups = "drop" ) summary(xwalk_blockgroup2subzone) @@ -189,6 +250,7 @@ xwalk_tract2subzone <- block_pct %>% hu_pct = sum(hu_pct_tract), hh_pct = sum(hh_pct_tract), pop_pct = sum(pop_pct_tract), + emp_pct = sum(emp_pct_tract), .groups = "drop" ) summary(xwalk_tract2subzone) @@ -201,6 +263,7 @@ xwalk_blockgroup2zone <- block_pct %>% hu_pct = sum(hu_pct_blkgrp), hh_pct = sum(hh_pct_blkgrp), pop_pct = sum(pop_pct_blkgrp), + emp_pct = sum(emp_pct_blkgrp), .groups = "drop" ) summary(xwalk_blockgroup2zone) @@ -213,6 +276,7 @@ xwalk_tract2zone <- block_pct %>% hu_pct = sum(hu_pct_tract), hh_pct = sum(hh_pct_tract), pop_pct = sum(pop_pct_tract), + emp_pct = sum(emp_pct_tract), .groups = "drop" ) summary(xwalk_tract2zone) @@ -226,149 +290,176 @@ usethis::use_data(xwalk_tract2subzone, overwrite = TRUE) usethis::use_data(xwalk_tract2zone, overwrite = TRUE) -# ## QC TO IDENTIFY NECESSARY MANUAL ADJUSTMENTS -# library(tmap) # Load tmap for interactive mapping -# tmap_mode("view") -# -# # Map tracts that straddle 2+ CCAs (and their component blocks/centroids) -# 
multi_cca_tracts <- xwalk_tract2cca %>% -# group_by(geoid_tract) %>% -# summarize(n = n()) %>% -# filter(n > 1) -# block_21co_sf %>% -# inner_join(block_data, by="geoid_block") %>% -# filter(substr(geoid_block, 1, 11) %in% multi_cca_tracts$geoid_tract) %>% -# left_join(block_cca_assign) %>% -# mutate(cca_num = as_factor(cca_num)) %>% -# tm_shape(.) + -# tm_polygons(col="cca_num", alpha=0.5, popup.vars=c("geoid_block", "cca_num", "pop", "hh", "hu")) + -# tm_shape(cca_sf) + -# tm_borders(col="red", lwd=4, alpha=0.5) + -# tm_shape(filter(tract_sf, geoid_tract %in% multi_cca_tracts$geoid_tract)) + -# tm_borders(lwd=2, col="black") + -# tm_shape(filter(block_pt_sf, substr(geoid_block, 1, 11) %in% multi_cca_tracts$geoid_tract)) + -# tm_dots() -# -# # Map tracts that are only partially in Chicago -# chi_partial_tracts <- xwalk_tract2cca %>% -# group_by(geoid_tract) %>% -# summarize_at(vars(hu_pct, hh_pct, pop_pct), sum) %>% -# filter(hu_pct < 0.99999 | hh_pct < 0.99999 | pop_pct < 0.99999) -# block_21co_sf %>% -# inner_join(block_data, by="geoid_block") %>% -# filter(substr(geoid_block, 1, 11) %in% chi_partial_tracts$geoid_tract) %>% -# left_join(block_cca_assign) %>% -# mutate(cca_num = as_factor(cca_num)) %>% -# tm_shape(.) 
+ -# tm_polygons(col="cca_num", alpha=0.5, popup.vars=c("geoid_block", "cca_num", "pop", "hh", "hu")) + -# tm_shape(cca_sf) + -# tm_borders(col="red", lwd=4, alpha=0.5) + -# tm_shape(filter(tract_sf, geoid_tract %in% chi_partial_tracts$geoid_tract)) + -# tm_borders(lwd=2, col="black") + -# tm_shape(filter(block_pt_sf, substr(geoid_block, 1, 11) %in% chi_partial_tracts$geoid_tract)) + -# tm_dots() -# -# # Map block groups that straddle 2+ CCAs (and their component blocks/centroids) -# multi_cca_blkgrps <- xwalk_blockgroup2cca %>% -# group_by(geoid_blkgrp) %>% -# summarize(n = n()) %>% -# filter(n > 1) -# block_21co_sf %>% -# inner_join(block_data, by="geoid_block") %>% -# filter(substr(geoid_block, 1, 12) %in% multi_cca_blkgrps$geoid_blkgrp) %>% -# left_join(block_cca_assign) %>% -# mutate(cca_num = as_factor(cca_num)) %>% -# tm_shape(.) + -# tm_polygons(col="cca_num", alpha=0.5, popup.vars=c("geoid_block", "cca_num", "pop", "hh", "hu")) + -# tm_shape(cca_sf) + -# tm_borders(col="red", lwd=4, alpha=0.5) + -# tm_shape(filter(blockgroup_sf, geoid_blkgrp %in% multi_cca_blkgrps$geoid_blkgrp)) + -# tm_borders(lwd=2, col="black") + -# tm_shape(filter(block_pt_sf, substr(geoid_block, 1, 12) %in% multi_cca_blkgrps$geoid_blkgrp)) + -# tm_dots() -# -# # Map block groups that are only partially in Chicago -# chi_partial_blkgrps <- xwalk_blockgroup2cca %>% -# group_by(geoid_blkgrp) %>% -# summarize_at(vars(hu_pct, hh_pct, pop_pct), sum) %>% -# filter(hu_pct < 0.99999 | hh_pct < 0.99999 | pop_pct < 0.99999) -# block_21co_sf %>% -# inner_join(block_data, by="geoid_block") %>% -# filter(substr(geoid_block, 1, 12) %in% chi_partial_blkgrps$geoid_blkgrp) %>% -# left_join(block_cca_assign) %>% -# mutate(cca_num = as_factor(cca_num)) %>% -# tm_shape(.) 
+ -# tm_polygons(col="cca_num", alpha=0.5, popup.vars=c("geoid_block", "cca_num", "pop", "hh", "hu")) + -# tm_shape(cca_sf) + -# tm_borders(col="red", lwd=4, alpha=0.5) + -# tm_shape(filter(blockgroup_sf, geoid_blkgrp %in% chi_partial_blkgrps$geoid_blkgrp)) + -# tm_borders(lwd=2, col="black") + -# tm_shape(filter(block_pt_sf, substr(geoid_block, 1, 12) %in% chi_partial_blkgrps$geoid_blkgrp)) + -# tm_dots() -# -# # Map blocks that were assigned to subzones but centroid is outside model area -# fringe_block_sf <- block_21co_sf %>% -# inner_join(block_subzone_assign) %>% -# filter(!apply(sf::st_intersects(sf::st_point_on_surface(sf::st_geometry(.)), model_area_sf), 1, any)) -# ggplot(fringe_block_sf) + -# geom_sf(color="red", lwd=1) + -# geom_sf(data=inner_join(block_pt_sf, as.data.frame(fringe_block_sf), by = "geoid_block")) -# -# # Map subzones with no HU/HH/pop assigned to them -# subzone_sf %>% -# anti_join(block_subzone_assign) %>% -# tm_shape() + -# tm_polygons(col="red", alpha=0.3) -# -# # Map zones with no HU/HH/pop assigned to them -# zone_sf %>% -# anti_join(block_zone_assign) %>% -# tm_shape() + -# tm_polygons(col="red", alpha=0.3) -# -# # Map blocks in one county assigned to subzone/zone in another county -# subzone_county <- subzone_sf %>% -# as.data.frame() %>% -# select(subzone17, county_fips) -# block_bad_county <- block_21co_sf %>% -# left_join(block_subzone_assign, by = "geoid_block") %>% -# left_join(subzone_county, by = "subzone17", suffix = c("_block", "_subzone")) %>% -# filter(!is.na(county_fips_subzone), county_fips_block != county_fips_subzone) -# tm_shape(subzone_sf) + -# tm_polygons(id = "subzone17", col = "county_fips", border.col = "white", alpha = 0.5) + -# tm_shape(block_bad_county) + -# tm_polygons(col = "red", border.col = "maroon", alpha = 0.3) + -# tm_shape(filter(block_pt_sf, geoid_block %in% block_bad_county$geoid_block)) + -# tm_dots() -# -# # Map tracts that are only partially in modeling area -# model_partial_tracts <- 
xwalk_tract2subzone %>% -# group_by(geoid_tract) %>% -# summarize_at(vars(hu_pct, hh_pct, pop_pct), sum) %>% -# filter(hu_pct < 0.99999 | hh_pct < 0.99999 | pop_pct < 0.99999) -# block_21co_sf %>% -# inner_join(block_data, by="geoid_block") %>% -# filter(substr(geoid_block, 1, 11) %in% model_partial_tracts$geoid_tract) %>% -# left_join(block_subzone_assign) %>% -# tm_shape(.) + -# tm_polygons(col="geoid_tract", alpha=0.5, popup.vars=c("geoid_block", "geoid_tract", "subzone17", "pop", "hh", "hu")) + -# tm_shape(model_area_sf) + -# tm_borders(lwd=4, col="red") + -# tm_shape(filter(block_pt_sf, substr(geoid_block, 1, 11) %in% model_partial_tracts$geoid_tract)) + -# tm_dots() -# -# # Map block groups that are only partially in modeling area -# model_partial_blkgrps <- xwalk_blockgroup2subzone %>% -# group_by(geoid_blkgrp) %>% -# summarize_at(vars(hu_pct, hh_pct, pop_pct), sum) %>% -# filter(hu_pct < 0.99999 | hh_pct < 0.99999 | pop_pct < 0.99999) -# block_21co_sf %>% -# inner_join(block_data, by="geoid_block") %>% -# filter(substr(geoid_block, 1, 12) %in% model_partial_blkgrps$geoid_blkgrp) %>% -# left_join(block_subzone_assign) %>% -# tm_shape(.) 
+ -# tm_polygons(col="geoid_blkgrp", alpha=0.5, popup.vars=c("geoid_block", "geoid_blkgrp", "subzone17", "pop", "hh", "hu")) + -# tm_shape(model_area_sf) + -# tm_borders(lwd=4, col="red") + -# tm_shape(filter(block_pt_sf, substr(geoid_block, 1, 12) %in% model_partial_blkgrps$geoid_blkgrp)) + -# tm_dots() +## QC TO IDENTIFY ANY NECESSARY MANUAL ADJUSTMENTS +library(tmap) # Load tmap for interactive mapping +tmap_mode("view") + +# Map tracts that straddle 2+ CCAs (and their component blocks/centroids) +multi_cca_tracts <- xwalk_tract2cca %>% + group_by(geoid_tract) %>% + summarize(n = n()) %>% + filter(n > 1) +block_21co_sf %>% + inner_join(block_data, by="geoid_block") %>% + filter(substr(geoid_block, 1, 11) %in% multi_cca_tracts$geoid_tract) %>% + left_join(block_cca_assign) %>% + mutate(cca_num = as_factor(cca_num)) %>% + tm_shape(.) + + tm_polygons(col="cca_num", alpha=0.5, popup.vars=c("geoid_block", "cca_num", "pop", "hh", "hu", "emp")) + + tm_shape(chi_census_sf) + + tm_borders(col="blue", lwd=4, alpha=0.5) + + tm_shape(cca_sf) + + tm_borders(col="red", lwd=4, alpha=0.5) + + tm_shape(filter(tract_sf, geoid_tract %in% multi_cca_tracts$geoid_tract)) + + tm_borders(lwd=2, col="black") + + tm_shape(filter(block_pt_sf, substr(geoid_block, 1, 11) %in% multi_cca_tracts$geoid_tract)) + + tm_dots() + +# Map tracts that are only partially in Chicago +chi_partial_tracts <- xwalk_tract2cca %>% + group_by(geoid_tract) %>% + summarize_at(vars(hu_pct, hh_pct, pop_pct), sum) %>% + filter(hu_pct < 0.99999 | hh_pct < 0.99999 | pop_pct < 0.99999) +block_21co_sf %>% + inner_join(block_data, by="geoid_block") %>% + filter(substr(geoid_block, 1, 11) %in% chi_partial_tracts$geoid_tract) %>% + left_join(block_cca_assign) %>% + mutate(cca_num = as_factor(cca_num)) %>% + tm_shape(.) 
+ + tm_polygons(col="cca_num", alpha=0.5, popup.vars=c("geoid_block", "cca_num", "pop", "hh", "hu", "emp")) + + tm_shape(chi_census_sf) + + tm_borders(col="blue", lwd=4, alpha=0.5) + + tm_shape(cca_sf) + + tm_borders(col="red", lwd=4, alpha=0.5) + + tm_shape(filter(tract_sf, geoid_tract %in% chi_partial_tracts$geoid_tract)) + + tm_borders(lwd=2, col="black") + + tm_shape(filter(block_pt_sf, substr(geoid_block, 1, 11) %in% chi_partial_tracts$geoid_tract)) + + tm_dots() + +# Map block groups that straddle 2+ CCAs (and their component blocks/centroids) +multi_cca_blkgrps <- xwalk_blockgroup2cca %>% + group_by(geoid_blkgrp) %>% + summarize(n = n()) %>% + filter(n > 1) +block_21co_sf %>% + inner_join(block_data, by="geoid_block") %>% + filter(substr(geoid_block, 1, 12) %in% multi_cca_blkgrps$geoid_blkgrp) %>% + left_join(block_cca_assign) %>% + mutate(cca_num = as_factor(cca_num)) %>% + tm_shape(.) + + tm_polygons(col="cca_num", alpha=0.5, popup.vars=c("geoid_block", "cca_num", "pop", "hh", "hu", "emp")) + + tm_shape(chi_census_sf) + + tm_borders(col="blue", lwd=4, alpha=0.5) + + tm_shape(cca_sf) + + tm_borders(col="red", lwd=4, alpha=0.5) + + tm_shape(filter(blockgroup_sf, geoid_blkgrp %in% multi_cca_blkgrps$geoid_blkgrp)) + + tm_borders(lwd=2, col="black") + + tm_shape(filter(block_pt_sf, substr(geoid_block, 1, 12) %in% multi_cca_blkgrps$geoid_blkgrp)) + + tm_dots() + +# Map block groups that are only partially in Chicago +chi_partial_blkgrps <- xwalk_blockgroup2cca %>% + group_by(geoid_blkgrp) %>% + summarize_at(vars(hu_pct, hh_pct, pop_pct), sum) %>% + filter(hu_pct < 0.99999 | hh_pct < 0.99999 | pop_pct < 0.99999) +block_21co_sf %>% + inner_join(block_data, by="geoid_block") %>% + filter(substr(geoid_block, 1, 12) %in% chi_partial_blkgrps$geoid_blkgrp) %>% + left_join(block_cca_assign) %>% + mutate(cca_num = as_factor(cca_num)) %>% + tm_shape(.) 
+ + tm_polygons(col="cca_num", alpha=0.5, popup.vars=c("geoid_block", "cca_num", "pop", "hh", "hu", "emp")) + + tm_shape(chi_census_sf) + + tm_borders(col="blue", lwd=4, alpha=0.5) + + tm_shape(cca_sf) + + tm_borders(col="red", lwd=4, alpha=0.5) + + tm_shape(filter(blockgroup_sf, geoid_blkgrp %in% chi_partial_blkgrps$geoid_blkgrp)) + + tm_borders(lwd=2, col="black") + + tm_shape(filter(block_pt_sf, substr(geoid_block, 1, 12) %in% chi_partial_blkgrps$geoid_blkgrp)) + + tm_dots() + +# Map blocks that were assigned to subzones but centroid is outside model area +fringe_block_sf <- block_21co_sf %>% + inner_join(block_subzone_assign) %>% + filter(!apply(sf::st_intersects(sf::st_point_on_surface(sf::st_geometry(.)), model_area_sf), 1, any)) +ggplot(fringe_block_sf) + + geom_sf(color="red", lwd=1) + + geom_sf(data=inner_join(block_pt_sf, as.data.frame(fringe_block_sf), by = "geoid_block")) + +# Map subzones with no HU/HH/pop/emp assigned to them +subzone_sf %>% + anti_join(block_subzone_assign) %>% + tm_shape() + + tm_polygons(col="red", alpha=0.3) + +# Map zones with no HU/HH/pop/emp assigned to them +zone_sf %>% + anti_join(block_zone_assign) %>% + tm_shape() + + tm_polygons(col="red", alpha=0.3) + +# Map blocks in one county assigned to subzone/zone in another county +subzone_county <- subzone_sf %>% + as.data.frame() %>% + select(subzone17, county_fips) +block_bad_county <- block_21co_sf %>% + left_join(block_subzone_assign, by = "geoid_block") %>% + left_join(subzone_county, by = "subzone17", suffix = c("_block", "_subzone")) %>% + filter(!is.na(county_fips_subzone), county_fips_block != county_fips_subzone) +tm_shape(subzone_sf) + + tm_polygons(id = "subzone17", col = "county_fips", border.col = "white", alpha = 0.5) + +tm_shape(block_bad_county) + + tm_polygons(col = "red", border.col = "maroon", alpha = 0.3) + +tm_shape(filter(block_pt_sf, geoid_block %in% block_bad_county$geoid_block)) + + tm_dots() + +# Map tracts that are only partially in modeling area 
+model_partial_tracts <- xwalk_tract2subzone %>% + group_by(geoid_tract) %>% + summarize_at(vars(hu_pct, hh_pct, pop_pct, emp_pct), sum) %>% + filter(hu_pct < 0.99999 | hh_pct < 0.99999 | pop_pct < 0.99999 | emp_pct < 0.99999) +block_21co_sf %>% + inner_join(block_data, by="geoid_block") %>% + filter(substr(geoid_block, 1, 11) %in% model_partial_tracts$geoid_tract) %>% + left_join(block_subzone_assign) %>% + tm_shape(.) + + tm_polygons(col="geoid_tract", alpha=0.5, popup.vars=c("geoid_block", "geoid_tract", "subzone17", "pop", "hh", "hu", "emp")) + + tm_shape(model_area_sf) + + tm_borders(col="red", lwd=4, alpha=0.5) + + tm_shape(filter(block_pt_sf, substr(geoid_block, 1, 11) %in% model_partial_tracts$geoid_tract)) + + tm_dots() + +# Map block groups that are only partially in modeling area +model_partial_blkgrps <- xwalk_blockgroup2subzone %>% + group_by(geoid_blkgrp) %>% + summarize_at(vars(hu_pct, hh_pct, pop_pct, emp_pct), sum) %>% + filter(hu_pct < 0.99999 | hh_pct < 0.99999 | pop_pct < 0.99999 | emp_pct < 0.99999) +block_21co_sf %>% + inner_join(block_data, by="geoid_block") %>% + filter(substr(geoid_block, 1, 12) %in% model_partial_blkgrps$geoid_blkgrp) %>% + left_join(block_subzone_assign) %>% + tm_shape(.) 
+ + tm_polygons(col="geoid_blkgrp", alpha=0.5, popup.vars=c("geoid_block", "geoid_blkgrp", "subzone17", "pop", "hh", "hu", "emp")) + + tm_shape(model_area_sf) + + tm_borders(col="red", lwd=4, alpha=0.5) + + tm_shape(filter(block_pt_sf, substr(geoid_block, 1, 12) %in% model_partial_blkgrps$geoid_blkgrp)) + + tm_dots() + +# Check CCA sums against Census' Chicago total +chi_data <- tidycensus::get_decennial( + "place", census_vars, year = census_year, sumfile = "sf1", output = "wide", + state = "IL" +) %>% + filter(NAME == "Chicago city, Illinois") %>% + rename( + hu = H1_001N, + hh = H1_002N, + pop = P1_001N + ) +block_cca_sums <- block_data %>% + semi_join(block_cca_assign) %>% + select(hu, hh, pop) %>% + summarize_all(sum) +block_cca_sums$hu - chi_data$hu # Should return 0 +block_cca_sums$hh - chi_data$hh # Should return 0 +block_cca_sums$pop - chi_data$pop # Should return 0 diff --git a/data-raw/load_census_api.R b/data-raw/load_census_api.R index 3fc3d40..62b1bc4 100644 --- a/data-raw/load_census_api.R +++ b/data-raw/load_census_api.R @@ -2,20 +2,21 @@ library(tidyverse) devtools::load_all() # Set common parameters +BASE_YEAR <- 2021 # TIGER/Line vintage to use by default STATE <- "17" # Illinois COUNTIES_7CO <- c("031", "043", "089", "093", "097", "111", "197") # CMAP 7 COUNTIES_MPO <- c(COUNTIES_7CO, "063", "037") # CMAP 7, plus Grundy and DeKalb # Get CMAP counties for spatial filtering (not saved) -- includes Lake Michigan -temp_cmap_sf <- tigris::counties(state = STATE) %>% +temp_cmap_sf <- tigris::counties(state = STATE, year = BASE_YEAR) %>% filter(COUNTYFP %in% COUNTIES_7CO) %>% sf::st_transform(cmap_crs) %>% select(GEOID) # Get Lake Michigan tracts for erasing -temp_lakemich_sf <- tigris::tracts(state = "17") %>% - bind_rows(tigris::tracts(state = "18")) %>% - bind_rows(tigris::tracts(state = "55")) %>% +temp_lakemich_sf <- tigris::tracts(state = "17", year = BASE_YEAR) %>% + bind_rows(tigris::tracts(state = "18", year = BASE_YEAR)) %>% + 
bind_rows(tigris::tracts(state = "55", year = BASE_YEAR)) %>% filter(TRACTCE == "990000") %>% # Water tracts only sf::st_transform(cmap_crs) %>% rmapshaper::ms_dissolve() @@ -30,7 +31,7 @@ intersects_cmap <- function(in_sf) { # Process Census counties keep_counties <- unique(unlist(county_fips_codes)) state_fips <- unique(substr(keep_counties, 1, 2)) -county_sf <- tigris::counties(state = state_fips) %>% +county_sf <- tigris::counties(state = state_fips, year = BASE_YEAR) %>% filter(GEOID %in% keep_counties) %>% sf::st_transform(cmap_crs) %>% rmapshaper::ms_erase(temp_lakemich_sf) %>% # Erase Lake Michigan @@ -47,7 +48,7 @@ county_sf <- tigris::counties(state = state_fips) %>% arrange(geoid_county) # Process Census county subdivisions (a.k.a. political townships) -township_sf <- tigris::county_subdivisions(state = STATE, county = COUNTIES_MPO) %>% +township_sf <- tigris::county_subdivisions(state = STATE, county = COUNTIES_MPO, year = BASE_YEAR) %>% filter(COUSUBFP != "00000", # Exclude Lake Michigan "townships" COUNTYFP %in% COUNTIES_7CO | NAME %in% c("Aux Sable", "Sandwich", "Somonauk")) %>% sf::st_transform(cmap_crs) %>% @@ -59,7 +60,7 @@ township_sf <- tigris::county_subdivisions(state = STATE, county = COUNTIES_MPO) arrange(geoid_cousub) # Process Census places (a.k.a. 
municipalities) -municipality_sf <- tigris::places(state = STATE) %>% +municipality_sf <- tigris::places(state = STATE, year = BASE_YEAR) %>% filter(!str_detect(NAMELSAD, " CDP")) %>% # Incorporated places only sf::st_transform(cmap_crs) %>% filter(intersects_cmap(.)) %>% # Restrict to CMAP region @@ -69,51 +70,9 @@ municipality_sf <- tigris::places(state = STATE) %>% select(geoid_place, municipality, sqmi) %>% arrange(geoid_place) -# Process Census tracts -tract_sf <- tigris::tracts(state = STATE, county = COUNTIES_7CO) %>% - filter(TRACTCE != "990000") %>% # Exclude Lake Michigan tracts - sf::st_transform(cmap_crs) %>% - rename(geoid_tract = GEOID) %>% - mutate(county_fips = paste0(STATEFP, COUNTYFP), - sqmi = unclass(sf::st_area(geometry) / sqft_per_sqmi)) %>% - select(geoid_tract, county_fips, sqmi) %>% - arrange(geoid_tract) - -# Process Census block groups -blockgroup_sf <- tigris::block_groups(state = STATE, county = COUNTIES_7CO) %>% - filter(TRACTCE != "990000") %>% # Exclude Lake Michigan tracts - sf::st_transform(cmap_crs) %>% - rename(geoid_blkgrp = GEOID) %>% - mutate(county_fips = paste0(STATEFP, COUNTYFP), - sqmi = unclass(sf::st_area(geometry) / sqft_per_sqmi)) %>% - select(geoid_blkgrp, county_fips, sqmi) %>% - arrange(geoid_blkgrp) - -# Process Census blocks -block_sf <- tigris::blocks(state = STATE, county = COUNTIES_7CO) %>% - filter(TRACTCE10 != "990000") %>% # Exclude Lake Michigan tracts - sf::st_transform(cmap_crs) %>% - rename(geoid_block = GEOID10) %>% - mutate(county_fips = paste0(STATEFP10, COUNTYFP10), - sqmi = unclass(sf::st_area(geometry) / sqft_per_sqmi)) %>% - select(geoid_block, county_fips, sqmi) %>% - arrange(geoid_block) - -# Process Census PUMAs -puma_sf <- tigris::pumas(state = STATE) %>% - sf::st_transform(cmap_crs) %>% - filter(intersects_cmap(.)) %>% # Restrict to CMAP region - rmapshaper::ms_erase(temp_lakemich_sf) %>% # Erase Lake Michigan - rename(geoid_puma = GEOID10, - name = NAMELSAD10) %>% - mutate(sqmi = 
unclass(sf::st_area(geometry) / sqft_per_sqmi)) %>% - select(geoid_puma, name, sqmi) %>% - arrange(geoid_puma) %>% - ## Manually exclude one PUMA, which appears to mistakenly include a *tiny* block in McHenry - filter(geoid_puma != "1702901") - # Process Congressional Districts (U.S. House of Representatives) -congress_sf <- tigris::congressional_districts() %>% +# Note: still 2011 districts, as of 2020 TIGER/Line +congress_sf <- tigris::congressional_districts(year = BASE_YEAR) %>% filter(STATEFP == STATE, LSAD == "C2") %>% sf::st_transform(cmap_crs) %>% mutate(dist_num = as.integer(CD116FP), @@ -126,8 +85,9 @@ congress_sf <- tigris::congressional_districts() %>% select(dist_num, dist_name, dist_name_short, cmap, sqmi) %>% arrange(dist_num) -# Process IL House Districts -ilga_house_sf <- tigris::state_legislative_districts(state = STATE, house = "lower") %>% +# Process IL House Districts (Illinois General Assembly) +# Note: still 2011 districts, as of 2020 TIGER/Line +ilga_house_sf <- tigris::state_legislative_districts(state = STATE, house = "lower", year = BASE_YEAR) %>% filter(LSAD == "LL") %>% sf::st_transform(cmap_crs) %>% rename(dist_name = NAMELSAD) %>% @@ -137,8 +97,9 @@ ilga_house_sf <- tigris::state_legislative_districts(state = STATE, house = "low select(dist_num, dist_name, cmap, sqmi) %>% arrange(dist_num) -# Process IL Senate Districts -ilga_senate_sf <- tigris::state_legislative_districts(state = STATE, house = "upper") %>% +# Process IL Senate Districts (Illinois General Assembly) +# Note: still 2011 districts, as of 2020 TIGER/Line +ilga_senate_sf <- tigris::state_legislative_districts(state = STATE, house = "upper", year = BASE_YEAR) %>% filter(LSAD == "LU") %>% sf::st_transform(cmap_crs) %>% rename(dist_name = NAMELSAD) %>% @@ -148,11 +109,25 @@ ilga_senate_sf <- tigris::state_legislative_districts(state = STATE, house = "up select(dist_num, dist_name, cmap, sqmi) %>% arrange(dist_num) +# Process Public Use Microdata Areas (PUMAs) +# Note: 
still 2010 boundaries, as of 2020 TIGER/Line +puma_sf <- tigris::pumas(state = STATE, year = BASE_YEAR) %>% + sf::st_transform(cmap_crs) %>% + filter(intersects_cmap(.)) %>% # Restrict to CMAP region + rmapshaper::ms_erase(temp_lakemich_sf) %>% # Erase Lake Michigan + rename(geoid_puma = GEOID10, + name = NAMELSAD10) %>% + mutate(sqmi = unclass(sf::st_area(geometry) / sqft_per_sqmi)) %>% + select(geoid_puma, name, sqmi) %>% + arrange(geoid_puma) %>% + ## Manually exclude one PUMA, which appears to mistakenly include a *tiny* block in McHenry + filter(geoid_puma != "1702901") + # Process ZIP Code Tabulation Areas (ZCTAs) -zcta_sf <- tigris::zctas(starts_with = "6") %>% +zcta_sf <- tigris::zctas(starts_with = "6", year = BASE_YEAR) %>% sf::st_transform(cmap_crs) %>% filter(intersects_cmap(.)) %>% # Restrict to CMAP region - rename(geoid_zcta = GEOID10) %>% + rename(geoid_zcta = GEOID20) %>% mutate(sqmi = unclass(sf::st_area(geometry) / sqft_per_sqmi)) %>% select(geoid_zcta, sqmi) %>% arrange(geoid_zcta) @@ -169,7 +144,7 @@ county_district = c( `17005`="D8", `17083`="D8", `17163`="D8", `17189`="D8", `17027`="D8", `17061`="D8", `17157`="D8", `17013`="D8", `17121`="D8", `17119`="D8", `17133`="D8", `17151`="D9", `17055`="D9", `17059`="D9", `17065`="D9", `17193`="D9", `17181`="D9", `17199`="D9", `17069`="D9", `17127`="D9", `17145`="D9", `17087`="D9", `17153`="D9", `17003`="D9", `17165`="D9", `17081`="D9", `17077`="D9" ) -idot_sf <- tigris::counties(state = STATE) %>% +idot_sf <- tigris::counties(state = STATE, year = BASE_YEAR) %>% sf::st_transform(cmap_crs) %>% rmapshaper::ms_erase(temp_lakemich_sf) %>% # Erase Lake Michigan mutate(district = recode(GEOID, !!!county_district), @@ -183,9 +158,8 @@ idot_sf <- tigris::counties(state = STATE) %>% summarize(sqmi = sum(sqmi), .groups = "drop") -# Process 2020 Census geographies (block, block group, tract). -# (These should replace the 2019 vintage once 2020 ACS data is published.) 
-tract_sf_2020 <- tigris::tracts(state = STATE, county = COUNTIES_7CO, year = 2020) %>% +# Process 2020 Census geographies (block, block group, tract) +tract_sf <- tigris::tracts(state = STATE, county = COUNTIES_7CO, year = BASE_YEAR) %>% filter(TRACTCE != "990000") %>% # Exclude Lake Michigan tracts sf::st_transform(cmap_crs) %>% rename(geoid_tract = GEOID) %>% @@ -194,7 +168,7 @@ tract_sf_2020 <- tigris::tracts(state = STATE, county = COUNTIES_7CO, year = 202 select(geoid_tract, county_fips, sqmi) %>% arrange(geoid_tract) -blockgroup_sf_2020 <- tigris::block_groups(state = STATE, county = COUNTIES_7CO, year = 2020) %>% +blockgroup_sf <- tigris::block_groups(state = STATE, county = COUNTIES_7CO, year = BASE_YEAR) %>% filter(TRACTCE != "990000") %>% # Exclude Lake Michigan tracts sf::st_transform(cmap_crs) %>% rename(geoid_blkgrp = GEOID) %>% @@ -203,7 +177,7 @@ blockgroup_sf_2020 <- tigris::block_groups(state = STATE, county = COUNTIES_7CO, select(geoid_blkgrp, county_fips, sqmi) %>% arrange(geoid_blkgrp) -block_sf_2020 <- tigris::blocks(state = STATE, county = COUNTIES_7CO, year = 2020) %>% +block_sf <- tigris::blocks(state = STATE, county = COUNTIES_7CO, year = BASE_YEAR) %>% filter(TRACTCE20 != "990000") %>% # Exclude Lake Michigan tracts sf::st_transform(cmap_crs) %>% rename(geoid_block = GEOID20) %>% @@ -213,19 +187,49 @@ block_sf_2020 <- tigris::blocks(state = STATE, county = COUNTIES_7CO, year = 202 arrange(geoid_block) +# Process 2010 Census geographies (block, block group, tract) +# Note: remove these datasets from cmapgeo once 2021 1-year ACS is released +tract_sf_2010 <- tigris::tracts(state = STATE, county = COUNTIES_7CO, year = 2019) %>% + filter(TRACTCE != "990000") %>% # Exclude Lake Michigan tracts + sf::st_transform(cmap_crs) %>% + rename(geoid_tract = GEOID) %>% + mutate(county_fips = paste0(STATEFP, COUNTYFP), + sqmi = unclass(sf::st_area(geometry) / sqft_per_sqmi)) %>% + select(geoid_tract, county_fips, sqmi) %>% + arrange(geoid_tract) + 
+blockgroup_sf_2010 <- tigris::block_groups(state = STATE, county = COUNTIES_7CO, year = 2019) %>% + filter(TRACTCE != "990000") %>% # Exclude Lake Michigan tracts + sf::st_transform(cmap_crs) %>% + rename(geoid_blkgrp = GEOID) %>% + mutate(county_fips = paste0(STATEFP, COUNTYFP), + sqmi = unclass(sf::st_area(geometry) / sqft_per_sqmi)) %>% + select(geoid_blkgrp, county_fips, sqmi) %>% + arrange(geoid_blkgrp) + +block_sf_2010 <- tigris::blocks(state = STATE, county = COUNTIES_7CO, year = 2019) %>% + filter(TRACTCE10 != "990000") %>% # Exclude Lake Michigan tracts + sf::st_transform(cmap_crs) %>% + rename(geoid_block = GEOID10) %>% + mutate(county_fips = paste0(STATEFP10, COUNTYFP10), + sqmi = unclass(sf::st_area(geometry) / sqft_per_sqmi)) %>% + select(geoid_block, county_fips, sqmi) %>% + arrange(geoid_block) + + # Save processed data to package's data dir usethis::use_data(county_sf, overwrite = TRUE) usethis::use_data(township_sf, overwrite = TRUE) usethis::use_data(municipality_sf, overwrite = TRUE) -usethis::use_data(tract_sf, overwrite = TRUE) -usethis::use_data(blockgroup_sf, overwrite = TRUE) -usethis::use_data(block_sf, overwrite = TRUE) -usethis::use_data(puma_sf, overwrite = TRUE) usethis::use_data(congress_sf, overwrite = TRUE) usethis::use_data(ilga_house_sf, overwrite = TRUE) usethis::use_data(ilga_senate_sf, overwrite = TRUE) +usethis::use_data(puma_sf, overwrite = TRUE) usethis::use_data(zcta_sf, overwrite = TRUE) usethis::use_data(idot_sf, overwrite = TRUE) -usethis::use_data(tract_sf_2020, overwrite = TRUE) -usethis::use_data(blockgroup_sf_2020, overwrite = TRUE) -usethis::use_data(block_sf_2020, overwrite = TRUE) +usethis::use_data(tract_sf, overwrite = TRUE) +usethis::use_data(blockgroup_sf, overwrite = TRUE) +usethis::use_data(block_sf, overwrite = TRUE) +usethis::use_data(tract_sf_2010, overwrite = TRUE) +usethis::use_data(blockgroup_sf_2010, overwrite = TRUE) +usethis::use_data(block_sf_2010, overwrite = TRUE) diff --git a/data/block_sf.rda 
b/data/block_sf.rda index 8ec990a..d175eab 100644 Binary files a/data/block_sf.rda and b/data/block_sf.rda differ diff --git a/data/block_sf_2020.rda b/data/block_sf_2010.rda similarity index 73% rename from data/block_sf_2020.rda rename to data/block_sf_2010.rda index dfce6d6..7fb0201 100644 Binary files a/data/block_sf_2020.rda and b/data/block_sf_2010.rda differ diff --git a/data/blockgroup_sf.rda b/data/blockgroup_sf.rda index c1e4959..74fe409 100644 Binary files a/data/blockgroup_sf.rda and b/data/blockgroup_sf.rda differ diff --git a/data/blockgroup_sf_2010.rda b/data/blockgroup_sf_2010.rda new file mode 100644 index 0000000..759d8cc Binary files /dev/null and b/data/blockgroup_sf_2010.rda differ diff --git a/data/blockgroup_sf_2020.rda b/data/blockgroup_sf_2020.rda deleted file mode 100644 index ff8577d..0000000 Binary files a/data/blockgroup_sf_2020.rda and /dev/null differ diff --git a/data/congress_sf.rda b/data/congress_sf.rda index b3cab01..1a6d928 100644 Binary files a/data/congress_sf.rda and b/data/congress_sf.rda differ diff --git a/data/county_sf.rda b/data/county_sf.rda index 274ea46..6f3eeaa 100644 Binary files a/data/county_sf.rda and b/data/county_sf.rda differ diff --git a/data/idot_sf.rda b/data/idot_sf.rda index f00cd12..f260c66 100644 Binary files a/data/idot_sf.rda and b/data/idot_sf.rda differ diff --git a/data/ilga_house_sf.rda b/data/ilga_house_sf.rda index 0497102..8357396 100644 Binary files a/data/ilga_house_sf.rda and b/data/ilga_house_sf.rda differ diff --git a/data/ilga_senate_sf.rda b/data/ilga_senate_sf.rda index 67a37f1..dd9934f 100644 Binary files a/data/ilga_senate_sf.rda and b/data/ilga_senate_sf.rda differ diff --git a/data/municipality_sf.rda b/data/municipality_sf.rda index 064e41c..65ca738 100644 Binary files a/data/municipality_sf.rda and b/data/municipality_sf.rda differ diff --git a/data/puma_sf.rda b/data/puma_sf.rda index 3af6f4e..dd0b3ae 100644 Binary files a/data/puma_sf.rda and b/data/puma_sf.rda differ diff 
--git a/data/township_sf.rda b/data/township_sf.rda index 709279a..54216d5 100644 Binary files a/data/township_sf.rda and b/data/township_sf.rda differ diff --git a/data/tract_sf.rda b/data/tract_sf.rda index a03816a..fa3334a 100644 Binary files a/data/tract_sf.rda and b/data/tract_sf.rda differ diff --git a/data/tract_sf_2010.rda b/data/tract_sf_2010.rda new file mode 100644 index 0000000..fe8a8e2 Binary files /dev/null and b/data/tract_sf_2010.rda differ diff --git a/data/tract_sf_2020.rda b/data/tract_sf_2020.rda deleted file mode 100644 index c738ca8..0000000 Binary files a/data/tract_sf_2020.rda and /dev/null differ diff --git a/data/xwalk_blockgroup2cca.rda b/data/xwalk_blockgroup2cca.rda index 0bbc1dc..15a2c5d 100644 Binary files a/data/xwalk_blockgroup2cca.rda and b/data/xwalk_blockgroup2cca.rda differ diff --git a/data/xwalk_blockgroup2cca_2010.rda b/data/xwalk_blockgroup2cca_2010.rda new file mode 100644 index 0000000..cff6a68 Binary files /dev/null and b/data/xwalk_blockgroup2cca_2010.rda differ diff --git a/data/xwalk_blockgroup2subzone.rda b/data/xwalk_blockgroup2subzone.rda index face753..c6b5a97 100644 Binary files a/data/xwalk_blockgroup2subzone.rda and b/data/xwalk_blockgroup2subzone.rda differ diff --git a/data/xwalk_blockgroup2subzone_2010.rda b/data/xwalk_blockgroup2subzone_2010.rda new file mode 100644 index 0000000..ef88ce1 Binary files /dev/null and b/data/xwalk_blockgroup2subzone_2010.rda differ diff --git a/data/xwalk_blockgroup2zone.rda b/data/xwalk_blockgroup2zone.rda index 6853279..04176e5 100644 Binary files a/data/xwalk_blockgroup2zone.rda and b/data/xwalk_blockgroup2zone.rda differ diff --git a/data/xwalk_blockgroup2zone_2010.rda b/data/xwalk_blockgroup2zone_2010.rda new file mode 100644 index 0000000..4fc4f1e Binary files /dev/null and b/data/xwalk_blockgroup2zone_2010.rda differ diff --git a/data/xwalk_tract2cca.rda b/data/xwalk_tract2cca.rda index 8bcfab2..7e0802f 100644 Binary files a/data/xwalk_tract2cca.rda and 
b/data/xwalk_tract2cca.rda differ diff --git a/data/xwalk_tract2cca_2010.rda b/data/xwalk_tract2cca_2010.rda new file mode 100644 index 0000000..c412845 Binary files /dev/null and b/data/xwalk_tract2cca_2010.rda differ diff --git a/data/xwalk_tract2subzone.rda b/data/xwalk_tract2subzone.rda index 69a8d0b..07ce3a4 100644 Binary files a/data/xwalk_tract2subzone.rda and b/data/xwalk_tract2subzone.rda differ diff --git a/data/xwalk_tract2subzone_2010.rda b/data/xwalk_tract2subzone_2010.rda new file mode 100644 index 0000000..1c35cbb Binary files /dev/null and b/data/xwalk_tract2subzone_2010.rda differ diff --git a/data/xwalk_tract2zone.rda b/data/xwalk_tract2zone.rda index af3853c..536e178 100644 Binary files a/data/xwalk_tract2zone.rda and b/data/xwalk_tract2zone.rda differ diff --git a/data/xwalk_tract2zone_2010.rda b/data/xwalk_tract2zone_2010.rda new file mode 100644 index 0000000..c4118e3 Binary files /dev/null and b/data/xwalk_tract2zone_2010.rda differ diff --git a/data/zcta_sf.rda b/data/zcta_sf.rda index cdd2e3f..cc9657a 100644 Binary files a/data/zcta_sf.rda and b/data/zcta_sf.rda differ diff --git a/man/block_sf.Rd b/man/block_sf.Rd index 2876750..ea61913 100644 --- a/man/block_sf.Rd +++ b/man/block_sf.Rd @@ -3,10 +3,11 @@ \docType{data} \name{block_sf} \alias{block_sf} -\title{Census Blocks (2019 vintage)} +\alias{block_sf_2010} +\title{Census Blocks} \format{ -A multipolygon \code{sf} object with 169469 rows and -4 variables: +\code{block_sf} is a multipolygon \code{sf} object with 144082 rows +and 4 variables: \describe{ \item{geoid_block}{Unique 15-digit block ID, assigned by the Census Bureau. The parent tract and block group can be identified from the first 11 and 12 @@ -16,6 +17,9 @@ Character.} \item{sqmi}{Area in square miles. Double.} \item{geometry}{Feature geometry. \code{sf} multipolygon.} } + +\code{block_sf_2010} is a multipolygon \code{sf} object with +169469 rows and 4 variables. 
} \source{ US Census Bureau @@ -23,30 +27,37 @@ US Census Bureau } \usage{ block_sf + +block_sf_2010 } \description{ The Census Blocks within the 7-county Chicago Metropolitan Agency for Planning (CMAP) region. From the US Census Bureau's TIGER/Line shapefiles, -2019 vintage. \strong{Use this version with data from the 2010 decennial census or -the American Community Survey (ACS) from 2010 through 2019. For data from the -2020 decennial census, use \code{block_sf_2020} (which will replace this dataset -once the 2016-2020 ACS 5-year data is published).} +2021 vintage. \strong{Use \code{block_sf} for data from the 2020 decennial census or the +American Community Survey (ACS) from 2020 onward. For data from the 2010 +decennial census or ACS from 2010 through 2019, use \code{block_sf_2010}.} } \details{ Census Bureau description: -\emph{"Blocks are statistical areas bounded by visible features, such as streets, -roads, streams, and railroad tracks, and by nonvisible boundaries, such as -selected property lines and city, township, school district, and county -limits and short line-of-sight extensions of streets and roads. Generally, -census blocks are small in area; for example, a block in a city bounded on -all sides by streets. Census blocks in suburban and rural areas may be large, -irregular, and bounded by a variety of features, such as roads, streams, and -transmission lines. In remote areas, census blocks may encompass hundreds of -square miles. Census blocks cover the entire territory of the United States, -Puerto Rico, and the Island Areas. 
Census blocks nest within all other -tabulated census geographic entities and are the basis for all tabulated -data."} +\emph{"Blocks (Census Blocks or Tabulation Blocks) are statistical areas bounded +by visible features, such as streets, roads, streams, and railroad tracks, +and by nonvisible boundaries, such as selected property lines and city, +township, school district, and county limits and short line-of-sight +extensions of streets and roads. Generally, blocks are small in area; for +example, a city block bounded on all sides by streets. Blocks in suburban and +rural areas may be larger, more irregular in shape, and bounded by a variety +of features, such as roads, streams, and transmission lines. In remote areas, +blocks may even encompass hundreds of square miles. Blocks cover the entire +territory of the United States, Puerto Rico, and the Island Areas. Blocks +nest within all other tabulated census geographic entities at the time of the +decennial census and are the basis for all tabulated data from that census. +Census Block Numbers—Blocks are numbered uniquely with a four-digit census +block number from 0000 to 9999 within census tract, which nest within state +and county. The first digit of the census block number identifies the block +group. 
Block numbers beginning with a zero (in Block Group 0) are intended to +include only water area, but not all water-only blocks have block numbers +beginning with 0 (zero)."} } \examples{ # Display the blocks with ggplot2 diff --git a/man/block_sf_2020.Rd b/man/block_sf_2020.Rd deleted file mode 100644 index 7901597..0000000 --- a/man/block_sf_2020.Rd +++ /dev/null @@ -1,56 +0,0 @@ -% Generated by roxygen2: do not edit by hand -% Please edit documentation in R/data.R -\docType{data} -\name{block_sf_2020} -\alias{block_sf_2020} -\title{Census Blocks (2020 vintage)} -\format{ -A multipolygon \code{sf} object with 144082 rows and -4 variables: -\describe{ -\item{geoid_block}{Unique 15-digit block ID, assigned by the Census Bureau. -The parent tract and block group can be identified from the first 11 and 12 -digits, respectively. Character.} -\item{county_fips}{Unique 5-digit FIPS code of the county the block is in. -Character.} -\item{sqmi}{Area in square miles. Double.} -\item{geometry}{Feature geometry. \code{sf} multipolygon.} -} -} -\source{ -US Census Bureau -\href{https://www.census.gov/geographies/mapping-files/time-series/geo/tiger-line-file.html}{TIGER/Line} -} -\usage{ -block_sf_2020 -} -\description{ -The Census Blocks within the 7-county Chicago Metropolitan Agency for -Planning (CMAP) region. From the US Census Bureau's TIGER/Line shapefiles, -2020 vintage. \strong{Use this version for data from the 2020 decennial census. 
For -data from the 2010 decennial census or the American Community Survey (ACS) -from 2010 through 2019, use \code{block_sf} (which will be replaced by this -dataset once the 2016-2020 ACS 5-year data is published).} -} -\details{ -Census Bureau description: - -\emph{"Blocks are statistical areas bounded by visible features, such as streets, -roads, streams, and railroad tracks, and by nonvisible boundaries, such as -selected property lines and city, township, school district, and county -limits and short line-of-sight extensions of streets and roads. Generally, -census blocks are small in area; for example, a block in a city bounded on -all sides by streets. Census blocks in suburban and rural areas may be large, -irregular, and bounded by a variety of features, such as roads, streams, and -transmission lines. In remote areas, census blocks may encompass hundreds of -square miles. Census blocks cover the entire territory of the United States, -Puerto Rico, and the Island Areas. Census blocks nest within all other -tabulated census geographic entities and are the basis for all tabulated -data."} -} -\examples{ -# Display the blocks with ggplot2 -library(ggplot2) -ggplot(block_sf_2020) + geom_sf(lwd = 0.1) + theme_void() -} -\keyword{datasets} diff --git a/man/blockgroup_sf.Rd b/man/blockgroup_sf.Rd index d941cf2..0e34409 100644 --- a/man/blockgroup_sf.Rd +++ b/man/blockgroup_sf.Rd @@ -3,10 +3,11 @@ \docType{data} \name{blockgroup_sf} \alias{blockgroup_sf} -\title{Census Block Groups (2019 vintage)} +\alias{blockgroup_sf_2010} +\title{Census Block Groups} \format{ -A polygon \code{sf} object with 5880 rows and -4 variables: +\code{blockgroup_sf} is a multipolygon \code{sf} object with 6077 +rows and 4 variables: \describe{ \item{geoid_blkgrp}{Unique 12-digit block group ID, assigned by the Census Bureau. The parent tract can be identified from the first 11 digits. 
@@ -14,8 +15,11 @@ Character.} \item{county_fips}{Unique 5-digit FIPS code of the county the block group is in. Character.} \item{sqmi}{Area in square miles. Double.} -\item{geometry}{Feature geometry. \code{sf} polygon.} +\item{geometry}{Feature geometry. \code{sf} multipolygon.} } + +\code{blockgroup_sf_2010} is a polygon \code{sf} object with +5880 rows and 4 variables. } \source{ US Census Bureau @@ -23,14 +27,16 @@ US Census Bureau } \usage{ blockgroup_sf + +blockgroup_sf_2010 } \description{ The Census Block Groups within the 7-county Chicago Metropolitan Agency for Planning (CMAP) region. From the US Census Bureau's TIGER/Line shapefiles, -2019 vintage. \strong{Use this version with data from the 2010 decennial census or -the American Community Survey (ACS) from 2010 through 2019. For data from the -2020 decennial census, use \code{blockgroup_sf_2020} (which will replace this -dataset once the 2016-2020 ACS 5-year data is published).} +2021 vintage. \strong{Use \code{blockgroup_sf} for data from the 2020 decennial census +or the American Community Survey (ACS) from 2020 onward. For data from the +2010 decennial census or ACS from 2010 through 2019, use +\code{blockgroup_sf_2010}.} } \details{ Census Bureau description: @@ -41,20 +47,19 @@ present data and control block numbering. A block group consists of clusters of blocks within the same census tract that have the same first digit of their four-digit census block number. For example, blocks 3001, 3002, 3003, ..., 3999 in census tract 1210.02 belong to BG 3 in that census tract. Most -BGs were delineated by local participants in the Census Bureau's Participant -Statistical Areas Program. The Census Bureau delineated BGs only where a -local or tribal government declined to participate, and a regional -organization or State Data Center was not available to participate.} - -\emph{"A BG usually covers a contiguous area. 
Each census tract contains at least -one BG, and BGs are uniquely numbered within the census tract. Within the -standard census geographic hierarchy, BGs never cross state, county, or -census tract boundaries but may cross the boundaries of any other geographic -entity. Tribal census tracts and tribal BGs are separate and unique -geographic areas defined within federally recognized American Indian -reservations and can cross state and county boundaries. The tribal census -tracts and tribal block groups may be completely different from the census -tracts and block groups defined by state and county."} +BGs were delineated by local participants in the Census Bureau’s Participant +Statistical Areas Program (PSAP). The Census Bureau delineated BGs only where +a local or tribal government declined to participate in PSAP, and a regional +organization or the State Data Center was not available to participate. A BG +usually covers a contiguous area. Each census tract contains at least one BG, +and BGs are uniquely numbered within the census tract. Within the standard +census geographic hierarchy, BGs never cross state, county, or census tract +boundaries, but may cross the boundaries of any other geographic entity. +Tribal census tracts and tribal BGs are separate and unique geographic areas +defined within federally recognized American Indian reservations and can +cross state and county boundaries. 
The tribal census tracts and tribal block +groups may be completely different from the standard county-based census +tracts and block groups defined for the same area."} } \examples{ # Display the block groups with ggplot2 diff --git a/man/blockgroup_sf_2020.Rd b/man/blockgroup_sf_2020.Rd deleted file mode 100644 index 9e6bcab..0000000 --- a/man/blockgroup_sf_2020.Rd +++ /dev/null @@ -1,64 +0,0 @@ -% Generated by roxygen2: do not edit by hand -% Please edit documentation in R/data.R -\docType{data} -\name{blockgroup_sf_2020} -\alias{blockgroup_sf_2020} -\title{Census Block Groups (2020 vintage)} -\format{ -A polygon \code{sf} object with 6077 rows and -4 variables: -\describe{ -\item{geoid_blkgrp}{Unique 12-digit block group ID, assigned by the Census -Bureau. The parent tract can be identified from the first 11 digits. -Character.} -\item{county_fips}{Unique 5-digit FIPS code of the county the block group -is in. Character.} -\item{sqmi}{Area in square miles. Double.} -\item{geometry}{Feature geometry. \code{sf} polygon.} -} -} -\source{ -US Census Bureau -\href{https://www.census.gov/geographies/mapping-files/time-series/geo/tiger-line-file.html}{TIGER/Line} -} -\usage{ -blockgroup_sf_2020 -} -\description{ -The Census Block Groups within the 7-county Chicago Metropolitan Agency for -Planning (CMAP) region. From the US Census Bureau's TIGER/Line shapefiles, -2020 vintage. \strong{Use this version for data from the 2020 decennial census. For -data from the 2010 decennial census or the American Community Survey (ACS) -from 2010 through 2019, use \code{blockgroup_sf} (which will be replaced by this -dataset once the 2016-2020 ACS 5-year data is published).} -} -\details{ -Census Bureau description: - -\emph{"Block Groups (BGs) are statistical divisions of census tracts, are -generally defined to contain between 600 and 3,000 people, and are used to -present data and control block numbering. 
A block group consists of clusters -of blocks within the same census tract that have the same first digit of -their four-digit census block number. For example, blocks 3001, 3002, 3003, -..., 3999 in census tract 1210.02 belong to BG 3 in that census tract. Most -BGs were delineated by local participants in the Census Bureau's Participant -Statistical Areas Program. The Census Bureau delineated BGs only where a -local or tribal government declined to participate, and a regional -organization or State Data Center was not available to participate.} - -\emph{"A BG usually covers a contiguous area. Each census tract contains at least -one BG, and BGs are uniquely numbered within the census tract. Within the -standard census geographic hierarchy, BGs never cross state, county, or -census tract boundaries but may cross the boundaries of any other geographic -entity. Tribal census tracts and tribal BGs are separate and unique -geographic areas defined within federally recognized American Indian -reservations and can cross state and county boundaries. The tribal census -tracts and tribal block groups may be completely different from the census -tracts and block groups defined by state and county."} -} -\examples{ -# Display the block groups with ggplot2 -library(ggplot2) -ggplot(blockgroup_sf_2020) + geom_sf(lwd = 0.1) + theme_void() -} -\keyword{datasets} diff --git a/man/com_sf.Rd b/man/com_sf.Rd index ad41b93..4e460ea 100644 --- a/man/com_sf.Rd +++ b/man/com_sf.Rd @@ -41,8 +41,8 @@ Will County. Example 2: Buffalo Grove belongs to both the Lake County and Northwest subregional councils; in this case, the subregional boundary follows the county boundary through Buffalo Grove. 
-It is important to note here that the portions of COM boundaries, defined by -municipalities, are fluid: they change as a village annexes adjacent +It is important to note here that the portions of COM boundaries defined by +municipalities are fluid: they change as a village annexes adjacent unincorporated land. The boundaries depicted in this dataset reflect municipal boundaries of varying vintages and sources, and cannot be considered “true” for any given point in time. diff --git a/man/congress_sf.Rd b/man/congress_sf.Rd index 384289b..6bb51d7 100644 --- a/man/congress_sf.Rd +++ b/man/congress_sf.Rd @@ -5,15 +5,15 @@ \alias{congress_sf} \title{U.S. Congressional Districts} \format{ -A multipolygon \code{sf} object with 18 rows and 6 -variables: +A multipolygon \code{sf} object with 18 rows and +6 variables: \describe{ \item{dist_num}{Congressional District number. Integer.} \item{dist_name}{Name of the district (full). Character.} \item{dist_name_short}{Name of the district (short). Character.} \item{cmap}{Does the district overlap the 7-county CMAP region? Logical.} -\item{sqmi}{Area in square miles. Double.} -\item{geometry}{Feature geometry. \code{sf} multipolygon.} +\item{sqmi}{Area in square miles. Double.} \item{geometry}{Feature +geometry. \code{sf} multipolygon.} } } \source{ @@ -25,7 +25,10 @@ congress_sf } \description{ The United States Congressional Districts in the state of Illinois. From the -US Census Bureau's TIGER/Line shapefiles, 2019 vintage. +US Census Bureau's TIGER/Line shapefiles, 2021 vintage. \strong{These districts +were in effect for elections from 2012 through 2020 (i.e. the 113th through +117th Congresses). 
They will be superseded by new district boundaries for the +2022 election (for the 118th Congress).} } \details{ Census Bureau description: diff --git a/man/county_sf.Rd b/man/county_sf.Rd index 1eed1ca..eb80280 100644 --- a/man/county_sf.Rd +++ b/man/county_sf.Rd @@ -5,8 +5,8 @@ \alias{county_sf} \title{Counties} \format{ -A polygon \code{sf} object with 23 rows and 8 -variables: +A polygon \code{sf} object with 23 rows and +8 variables: \describe{ \item{geoid_county}{Unique 5-digit county ID (a.k.a. FIPS code), assigned by the Census Bureau. Character.} @@ -32,15 +32,16 @@ county_sf The counties that are within the CMAP travel modeling area \strong{or} the "Chicago-Naperville-Elgin, IL-IN-WI" Metropolitan Statistical Area (as defined by the United States Office of Management and Budget). From the US -Census Bureau's TIGER/Line shapefiles, 2019 vintage. +Census Bureau's TIGER/Line shapefiles, 2021 vintage. } \details{ Census Bureau description: -\emph{"Counties are the primary legal divisions of most states. Most counties are -functioning governmental units, whose powers and functions vary from state to -state. Legal changes to county boundaries or names are typically infrequent, -but do occur from time to time."} +\emph{"The primary legal divisions of most states are termed counties. Each county +or statistically equivalent entity is assigned a three-character numeric +Federal Information Processing Series (FIPS) code based on alphabetical +sequence that is unique within state, and an eight-digit National Standard +(NS) code."} Note: The Illinois counties of LaSalle, Lee and Ogle are included in their entirety, although only portions of these counties are part of the CMAP diff --git a/man/eda_sf.Rd b/man/eda_sf.Rd index a8583ee..57b2433 100644 --- a/man/eda_sf.Rd +++ b/man/eda_sf.Rd @@ -9,7 +9,8 @@ A polygon \code{sf} object with 856 rows and 7 variables: \describe{ \item{geoid_tract}{Unique 11-digit tract ID, assigned by the Census -Bureau. Character.} +Bureau. 
\strong{These correspond to tract boundaries from 2010, not 2020.} +Character.} \item{county_fips}{Unique 5-digit FIPS code of the county the tract is in. Character.} \item{area_type}{Description of the tract's combined EDA and disinvested diff --git a/man/idot_sf.Rd b/man/idot_sf.Rd index 382bc5d..7970ecd 100644 --- a/man/idot_sf.Rd +++ b/man/idot_sf.Rd @@ -28,7 +28,7 @@ The nine highway districts covering the entire state of Illinois, as defined by the Illinois Department of Transportation (IDOT). Includes a column indicating which of the five transportation regions each district belongs to. Created using the county boundaries in the US Census Bureau's TIGER/Line -shapefiles, 2019 vintage. +shapefiles, 2021 vintage. } \examples{ # Display the IDOT districts/regions with ggplot2 diff --git a/man/ilga_house_sf.Rd b/man/ilga_house_sf.Rd index 104ca2c..6b2ea8e 100644 --- a/man/ilga_house_sf.Rd +++ b/man/ilga_house_sf.Rd @@ -24,7 +24,10 @@ ilga_house_sf } \description{ The Illinois General Assembly House Districts. From the US Census Bureau's -TIGER/Line shapefiles, 2019 vintage. +TIGER/Line shapefiles, 2021 vintage. \strong{These districts were in effect for +elections from 2012 through 2020 (i.e. the 98th through 102nd General +Assemblies). They will be superseded by new district boundaries for the 2022 +election (for the 103rd General Assembly).} } \details{ Census Bureau description: @@ -33,24 +36,22 @@ Census Bureau description: elected to state legislatures. The Census Bureau first reported data for SLDs as part of the 2000 Public Law (P.L.) 94-171 Redistricting Data File.} -\emph{"Current SLDs (2010 Election Cycle) -- States participating in Phase 1 of -the 2010 Census Redistricting Data Program voluntarily provided the Census -Bureau with the 2006 election cycle boundaries, codes, and, in some cases, -names for their SLDs. 
All 50 states, plus the District of Columbia and Puerto -Rico, participated in Phase 1, State Legislative District Project (SLDP) of -the 2010 Census Redistricting Data Program. States subsequently provided -legal changes to those plans through the Redistricting Data Office and/or -corrections as part of Phase 2 of the 2010 Census Redistricting Data Program, -as needed.} +\emph{"Current SLDs (2018 Election Cycle)—States participating in Phase 4 of the +2020 Census Redistricting Data Program voluntarily provided the Census Bureau +with the 2018 election cycle boundaries, codes, and, in some cases, names for +their SLDs. All 50 states, plus the District of Columbia and Puerto Rico, +participated in Phase 4's State Legislative District Project (SLDP) of the +2020 Census Redistricting Data Program. States subsequently provided +corrections to those plans through the Redistricting Data Office during Phase +2 of the 2020 Census Redistricting Data Program, if needed.} -\emph{"The SLDs embody the upper (senate) and lower (house) chambers of the state -legislature. A unique three-character census code, identified by state -participants, is assigned to each SLD within a state. In Connecticut, Hawaii, -Illinois, Louisiana, Maine, Massachusetts, New Jersey, Ohio, and Puerto Rico, +\emph{"The SLDs embody the upper (senate—SLDU) and lower (house—SLDL) chambers of +the state legislature. A unique three-character census code, identified by +state participants, is assigned to each SLD within a state. In some states, state officials did not define the SLDs to cover all of the state or state equivalent area (usually bodies of water). 
In these areas with no SLDs -defined, the code "ZZZ" has been assigned, which is treated within state as a +defined, the code 'ZZZ' has been assigned, which is treated within state as a single SLD for purposes of data presentation."} Note: The aforementioned "ZZZ" district, which comprises the Illinois portion of Lake Michigan, has been excluded from this dataset. diff --git a/man/ilga_senate_sf.Rd b/man/ilga_senate_sf.Rd index 93a11f3..b018b5d 100644 --- a/man/ilga_senate_sf.Rd +++ b/man/ilga_senate_sf.Rd @@ -24,7 +24,10 @@ ilga_senate_sf } \description{ The Illinois General Assembly Senate Districts. From the US Census Bureau's -TIGER/Line shapefiles, 2019 vintage. +TIGER/Line shapefiles, 2021 vintage. \strong{These districts were in effect for +elections from 2012 through 2020 (i.e. the 98th through 102nd General +Assemblies). They will be superseded by new district boundaries for the 2022 +election (for the 103rd General Assembly).} } \details{ Census Bureau description: @@ -33,24 +36,22 @@ Census Bureau description: elected to state legislatures. The Census Bureau first reported data for SLDs as part of the 2000 Public Law (P.L.) 94-171 Redistricting Data File.} -\emph{"Current SLDs (2010 Election Cycle) -- States participating in Phase 1 of -the 2010 Census Redistricting Data Program voluntarily provided the Census -Bureau with the 2006 election cycle boundaries, codes, and, in some cases, -names for their SLDs. All 50 states, plus the District of Columbia and Puerto -Rico, participated in Phase 1, State Legislative District Project (SLDP) of -the 2010 Census Redistricting Data Program. 
States subsequently provided -legal changes to those plans through the Redistricting Data Office and/or -corrections as part of Phase 2 of the 2010 Census Redistricting Data Program, -as needed.} +\emph{"Current SLDs (2018 Election Cycle)—States participating in Phase 4 of the +2020 Census Redistricting Data Program voluntarily provided the Census Bureau +with the 2018 election cycle boundaries, codes, and, in some cases, names for +their SLDs. All 50 states, plus the District of Columbia and Puerto Rico, +participated in Phase 4's State Legislative District Project (SLDP) of the +2020 Census Redistricting Data Program. States subsequently provided +corrections to those plans through the Redistricting Data Office during Phase +2 of the 2020 Census Redistricting Data Program, if needed.} -\emph{"The SLDs embody the upper (senate) and lower (house) chambers of the state -legislature. A unique three-character census code, identified by state -participants, is assigned to each SLD within a state. In Connecticut, Hawaii, -Illinois, Louisiana, Maine, Massachusetts, New Jersey, Ohio, and Puerto Rico, +\emph{"The SLDs embody the upper (senate—SLDU) and lower (house—SLDL) chambers of +the state legislature. A unique three-character census code, identified by +state participants, is assigned to each SLD within a state. In some states, state officials did not define the SLDs to cover all of the state or state equivalent area (usually bodies of water). In these areas with no SLDs -defined, the code "ZZZ" has been assigned, which is treated within state as a +defined, the code 'ZZZ' has been assigned, which is treated within state as a single SLD for purposes of data presentation."} Note: The aforementioned "ZZZ" district, which comprises the Illinois portion of Lake Michigan, has been excluded from this dataset. 
diff --git a/man/municipality_sf.Rd b/man/municipality_sf.Rd index 8ca7255..b661e5b 100644 --- a/man/municipality_sf.Rd +++ b/man/municipality_sf.Rd @@ -26,21 +26,21 @@ municipality_sf The 284 municipalities (also referred to as "incorporated places" in Census Bureau terminology) that are at least partially within the 7-county Chicago Metropolitan Agency for Planning (CMAP) region. From the US Census Bureau's -TIGER/Line shapefiles, 2019 vintage. +TIGER/Line shapefiles, 2021 vintage. } \details{ Census Bureau description: \emph{"Incorporated Places are those reported to the Census Bureau as legally in -existence as of January 1, 2010, as reported in the latest Boundary and -Annexation Survey (BAS), under the laws of their respective states. An -incorporated place is established to provide governmental functions for a -concentration of people as opposed to a minor civil division, which generally -is created to provide services or administer an area without regard, -necessarily, to population. Places always are within a single state or -equivalent entity, but may extend across county and county subdivision -boundaries. An incorporated place usually is a city, town, village, or -borough, but can have other legal descriptions."} +existence as of January 1, as reported in the latest Boundary and Annexation +Survey (BAS), under the laws of their respective states. An incorporated +place is established to provide governmental functions for a concentration of +people as opposed to a minor civil division (MCD), which generally is created +to provide services or administer an area without regard, necessarily, to +population. Places always are within a single state or equivalent entity, but +may extend across county and county subdivision boundaries. 
An incorporated +place usually is a city, town, village, or borough, but can have other legal +descriptions."} } \examples{ # Display the municipalities with ggplot2 diff --git a/man/puma_sf.Rd b/man/puma_sf.Rd index 06bbd29..7f7deb4 100644 --- a/man/puma_sf.Rd +++ b/man/puma_sf.Rd @@ -26,7 +26,9 @@ puma_sf \description{ The Census PUMAs covering the 7-county Chicago Metropolitan Agency for Planning (CMAP) region. From the US Census Bureau's TIGER/Line shapefiles, -2019 vintage. +2021 vintage. \strong{These PUMAs are valid for use with ACS PUMS data from 2012 +through 2021. They will be superseded by the 2020 PUMAs when the 2022 ACS +data is published.} } \details{ Census Bureau description: diff --git a/man/township_sf.Rd b/man/township_sf.Rd index a4b4f34..c4a93cb 100644 --- a/man/township_sf.Rd +++ b/man/township_sf.Rd @@ -29,7 +29,7 @@ The political townships (also referred to as "county subdivisions" in Census Bureau terminology) that are within the CMAP Metropolitan Planning Area (MPA). (The MPA includes the 7 CMAP counties, plus Aux Sable Township in Grundy County and Sandwich & Somonauk Townships in DeKalb County.) From the -US Census Bureau's TIGER/Line shapefiles, 2019 vintage. +US Census Bureau's TIGER/Line shapefiles, 2021 vintage. } \details{ Census Bureau description: @@ -39,8 +39,7 @@ entities. They include census county divisions, census subareas, minor civil divisions, and unorganized territories and can be classified as either legal or statistical. 
Each county subdivision is assigned a five-character numeric Federal Information Processing Series (FIPS) code based on alphabetical -sequence within state and an eight-digit National Standard feature -identifier."} +sequence within state, and an eight-digit National Standard (NS) code."} Note: The entire City of Chicago (other than the portion of O'Hare in DuPage County) is included as a single township in this dataset, and has not been diff --git a/man/tract_sf.Rd b/man/tract_sf.Rd index bc28d37..28553d2 100644 --- a/man/tract_sf.Rd +++ b/man/tract_sf.Rd @@ -3,18 +3,22 @@ \docType{data} \name{tract_sf} \alias{tract_sf} -\title{Census Tracts (2019 vintage)} +\alias{tract_sf_2010} +\title{Census Tracts} \format{ -A polygon \code{sf} object with 1983 rows and 4 -variables: +\code{tract_sf} is a multipolygon \code{sf} object with 2070 rows and +4 variables: \describe{ \item{geoid_tract}{Unique 11-digit tract ID, assigned by the Census Bureau. Character.} \item{county_fips}{Unique 5-digit FIPS code of the county the tract is in. Character.} \item{sqmi}{Area in square miles. Double.} -\item{geometry}{Feature geometry. \code{sf} polygon.} +\item{geometry}{Feature geometry. \code{sf} multipolygon.} } + +\code{tract_sf_2010} is a polygon \code{sf} object with 1983 +rows and 4 variables. } \source{ US Census Bureau @@ -22,46 +26,48 @@ US Census Bureau } \usage{ tract_sf + +tract_sf_2010 } \description{ The Census Tracts within the 7-county Chicago Metropolitan Agency for Planning (CMAP) region. From the US Census Bureau's TIGER/Line shapefiles, -2019 vintage. \strong{Use this version with data from the 2010 decennial census or -the American Community Survey (ACS) from 2010 through 2019. For data from the -2020 decennial census, use \code{tract_sf_2020} (which will replace this dataset -once the 2016-2020 ACS 5-year data is published).} +2021 vintage. 
\strong{Use \code{tract_sf} for data from the 2020 decennial census or the +American Community Survey (ACS) from 2020 onward. For data from the 2010 +decennial census or ACS from 2010 through 2019, use \code{tract_sf_2010}.} } \details{ Census Bureau description: -\emph{"Census Tracts are small, relatively permanent statistical subdivisions of a -county or equivalent entity that are updated by local participants prior to -each decennial census as part of the Census Bureau's Participant Statistical -Areas Program. The Census Bureau delineates census tracts in situations where -no local participant existed or where state, local, or tribal governments -declined to participate. The primary purpose of census tracts is to provide a -stable set of geographic units for the presentation of statistical data.} +\emph{"Census Tracts are small, relatively permanent statistical subdivisions of a +county or statistically equivalent entity that can be updated by local +participants prior to each decennial census as part of the Census Bureau’s +Participant Statistical Areas Program (PSAP). The Census Bureau delineates +census tracts in situations where no local participant responded or where +state, local, or tribal governments declined to participate. The primary +purpose of census tracts is to provide a stable set of geographic units for +the presentation of statistical data.} \emph{"Census tracts generally have a population size between 1,200 and 8,000 people, with an optimum size of 4,000 people. A census tract usually covers a contiguous area; however, the spatial size of census tracts varies widely depending on the density of settlement. Census tract boundaries are delineated with the intention of being maintained over a long time so that statistical comparisons can be made from census to census. 
Census tracts occasionally are split due to population growth or merged as a result of substantial population decline.} \emph{"Census tract boundaries generally follow visible and identifiable features. They may follow nonvisible legal boundaries, such as minor civil division (MCD) or incorporated place boundaries in some states and situations, to -allow for census-tract-to-governmental-unit relationships where the +allow for census tract-to-governmental unit relationships where the governmental boundaries tend to remain unchanged between censuses. State and county boundaries always are census tract boundaries in the standard census geographic hierarchy. Tribal census tracts are a unique geographic entity defined within federally recognized American Indian reservations and -off-reservation trust lands and can cross state and county boundaries. Tribal -census tracts may be completely different from the census tracts and block -groups defined by state and county."} +off-reservation trust lands and can cross state and county boundaries. The +tribal census tracts may be completely different from the standard +county-based census tracts defined for the same area."} } \examples{ # Display the tracts with ggplot2 diff --git a/man/tract_sf_2020.Rd b/man/tract_sf_2020.Rd deleted file mode 100644 index 7d52fd3..0000000 --- a/man/tract_sf_2020.Rd +++ /dev/null @@ -1,71 +0,0 @@ -% Generated by roxygen2: do not edit by hand -% Please edit documentation in R/data.R -\docType{data} -\name{tract_sf_2020} -\alias{tract_sf_2020} -\title{Census Tracts (2020 vintage)} -\format{ -A polygon \code{sf} object with 2070 rows and -4 variables: -\describe{ -\item{geoid_tract}{Unique 11-digit tract ID, assigned by the Census -Bureau. Character.} -\item{county_fips}{Unique 5-digit FIPS code of the county the tract is in. -Character.} -\item{sqmi}{Area in square miles. Double.} -\item{geometry}{Feature geometry. 
\code{sf} multipolygon.} -} -} -\source{ -US Census Bureau -\href{https://www.census.gov/geographies/mapping-files/time-series/geo/tiger-line-file.html}{TIGER/Line} -} -\usage{ -tract_sf_2020 -} -\description{ -The Census Tracts within the 7-county Chicago Metropolitan Agency for -Planning (CMAP) region. From the US Census Bureau's TIGER/Line shapefiles, -2020 vintage. \strong{Use this version for data from the 2020 decennial census. For -data from the 2010 decennial census or the American Community Survey (ACS) -from 2010 through 2019, use \code{tract_sf} (which will be replaced by this -dataset once the 2016-2020 ACS 5-year data is published).} -} -\details{ -Census Bureau description: - -\emph{"Census Tracts are small, relatively permanent statistical subdivisions of a -county or equivalent entity that are updated by local participants prior to -each decennial census as part of the Census Bureau's Participant Statistical -Areas Program. The Census Bureau delineates census tracts in situations where -no local participant existed or where state, local, or tribal governments -declined to participate. The primary purpose of census tracts is to provide a -stable set of geographic units for the presentation of statistical data.} - -\emph{"Census tracts generally have a population size between 1,200 and 8,000 -people, with an optimum size of 4,000 people. A census tract usually covers a -contiguous area; however, the spatial size of census tracts varies widely -depending on the density of settlement. Census tract boundaries are -delineated with the intention of being maintained over a long time so that -statistical comparisons can be made from census to census. Census tracts -occasionally are split due to population growth or merged as a result of -substantial population decline.} - -\emph{"Census tract boundaries generally follow visible and identifiable features. 
-They may follow nonvisible legal boundaries, such as minor civil division -(MCD) or incorporated place boundaries in some states and situations, to -allow for census-tract-to-governmental-unit relationships where the -governmental boundaries tend to remain unchanged between censuses. State and -county boundaries always are census tract boundaries in the standard census -geographic hierarchy. Tribal census tracts are a unique geographic entity -defined within federally recognized American Indian reservations and -off-reservation trust lands and can cross state and county boundaries. Tribal -census tracts may be completely different from the census tracts and block -groups defined by state and county."} -} -\examples{ -# Display the tracts with ggplot2 -library(ggplot2) -ggplot(tract_sf_2020) + geom_sf(lwd = 0.1) + theme_void() -} -\keyword{datasets} diff --git a/man/ward_sf.Rd b/man/ward_sf.Rd index 377deaf..7d51011 100644 --- a/man/ward_sf.Rd +++ b/man/ward_sf.Rd @@ -20,8 +20,8 @@ variables: ward_sf } \description{ -The official boundaries of the current Chicago wards (established in May of -2015). Obtained 3/24/2021. +The official boundaries of the Chicago wards established in May of 2015. +Obtained 3/24/2021. } \examples{ # Display the wards with ggplot2 diff --git a/man/xwalk_blockgroup2cca.Rd b/man/xwalk_blockgroup2cca.Rd index b7920b1..3a4564b 100644 --- a/man/xwalk_blockgroup2cca.Rd +++ b/man/xwalk_blockgroup2cca.Rd @@ -3,10 +3,11 @@ \docType{data} \name{xwalk_blockgroup2cca} \alias{xwalk_blockgroup2cca} +\alias{xwalk_blockgroup2cca_2010} \title{Block Group-to-CCA Crosswalk} \format{ -A tibble with 2180 rows and -5 variables: +\code{xwalk_blockgroup2cca} is a tibble with 2175 rows +and 6 variables: \describe{ \item{geoid_blkgrp}{Unique 12-digit block group ID, assigned by the Census Bureau. Corresponds to \code{blockgroup_sf}. Character.} @@ -24,10 +25,20 @@ estimate the CCA's portion. 
Double.} group quarters) living in the specified CCA. 
Multiply this by a block group-level measure of a population attribute (e.g. race/ethnicity) to estimate the CCA's portion. Double.} +\item{emp_pct}{Proportion of the block group's total jobs located in the +specified CCA. Multiply this by a block group-level measure of an +employment attribute (e.g. retail jobs) to estimate the CCA's portion. +\strong{Not available in \code{xwalk_blockgroup2cca_2010}.} Double.} } + +\code{xwalk_blockgroup2cca_2010} is a tibble with +2180 rows and +5 variables (no \code{emp_pct}). } \usage{ xwalk_blockgroup2cca + +xwalk_blockgroup2cca_2010 } \description{ This table contains a set of factors to apportion Census block group-level @@ -35,26 +46,31 @@ data among Chicago Community Areas (CCAs). Separate factors are provided for apportioning housing unit, household, and population attributes. All factors were determined by calculating the percentage of a block group's housing units, households and population that were located in each of its component -blocks, according to the 2010 Decennial Census, and then assigning each block -to a CCA (based on the location of the block's centroid point). +blocks, according to the 2020 Decennial Census, and then assigning each block +to a CCA (based on the location of the block's centroid point). \strong{Use +\code{xwalk_blockgroup2cca} for data from the 2020 decennial census or the +American Community Survey (ACS) from 2020 onward. For data from the 2010 +decennial census or ACS from 2010 through 2019, use +\code{xwalk_blockgroup2cca_2010}.} } \details{ Generally speaking, block group boundaries align neatly with CCA boundaries as they tend to follow similar features (e.g. rivers, major roads, rail -lines) but there are cases where the population, households and/or housing -units in a block group are split across multiple CCAs, or else are partially -within the City of Chicago and partially outside of it. 
For that reason, it -is not appropriate to use a one-to-one block group-to-CCA assignment to -apportion Census data among CCAs, and this crosswalk should be used instead. +lines) but there are cases where the jobs, population, households and/or +housing units in a block group are split across multiple CCAs, or else are +partially within the City of Chicago and partially outside of it. For that +reason, it is not appropriate to use a one-to-one block group-to-CCA +assignment to apportion Census data among CCAs, and this crosswalk should be +used instead. To use this crosswalk effectively, Census data should be joined to it (not vice versa, since block group IDs appear multiple times in this table). Once the data is joined, it should be multiplied by the appropriate factor (depending whether the data of interest is measured at the housing unit, -household or person level), and then the result should be summed by CCA. If -calculating rates, this should only be done after the counts have been summed -to CCA. The resulting table can then be joined to \code{cca_sf} for mapping, if -desired. +household, person or job level), and then the result should be summed by CCA. +If calculating rates, this should only be done after the counts have been +summed to CCA. The resulting table can then be joined to \code{cca_sf} for +mapping, if desired. If your data is only available at the tract level, you can use \code{xwalk_tract2cca} for a tract-level allocation instead. 
@@ -62,24 +78,24 @@ If your data is only available at the tract level, you can use \examples{ suppressPackageStartupMessages(library(dplyr)) -# View the block groups with households not fully contained in a single CCA -filter(xwalk_blockgroup2cca, hh_pct < 1) +# View the block groups with housing units split between multiple CCAs +filter(xwalk_blockgroup2cca, hu_pct < 1) -# Estimate CCA-level unemployment rate from block group-level ACS data -df_blkgrp <- tidycensus::get_acs( - "block group", state = "IL", county = "031", table = "B23025", - year = 2019, survey = "acs5", output = "wide", cache_table = TRUE +# Estimate CCA-level housing vacancy rates from block group-level Census data +df_blkgrp <- tidycensus::get_decennial( + geography = "block group", variables = c("H1_001N", "H1_003N"), + year = 2020, state = "IL", county = c("031", "043"), output = "wide" ) \%>\% - rename(civ_lf = B23025_003E, unemp = B23025_005E) \%>\% - select(GEOID, civ_lf, unemp) + suppressMessages() \%>\% # Hide tidycensus messages + select(geoid_blkgrp = GEOID, hu_tot = H1_001N, hu_vac = H1_003N) df_cca <- xwalk_blockgroup2cca \%>\% - left_join(df_blkgrp, by = c("geoid_blkgrp" = "GEOID")) \%>\% - mutate(civ_lf = civ_lf * pop_pct, - unemp = unemp * pop_pct) \%>\% + left_join(df_blkgrp, by = "geoid_blkgrp") \%>\% + mutate(hu_tot = hu_tot * hu_pct, + hu_vac = hu_vac * hu_pct) \%>\% group_by(cca_num) \%>\% - summarize_at(vars(civ_lf, unemp), sum) \%>\% - mutate(unemp_rate = unemp / civ_lf) + summarize_at(vars(hu_tot, hu_vac), sum) \%>\% + mutate(vac_rate = hu_vac / hu_tot) df_cca # Join to cca_sf for mapping @@ -87,7 +103,7 @@ library(ggplot2) cca_sf \%>\% left_join(df_cca, by = "cca_num") \%>\% ggplot() + - geom_sf(aes(fill = unemp_rate), lwd = 0.1) + + geom_sf(aes(fill = vac_rate), lwd = 0.1) + scale_fill_viridis_c(direction = -1) + theme_void() } diff --git a/man/xwalk_blockgroup2subzone.Rd b/man/xwalk_blockgroup2subzone.Rd index 0173758..dbed747 100644 --- a/man/xwalk_blockgroup2subzone.Rd 
+++ b/man/xwalk_blockgroup2subzone.Rd @@ -3,10 +3,12 @@ \docType{data} \name{xwalk_blockgroup2subzone} \alias{xwalk_blockgroup2subzone} +\alias{xwalk_blockgroup2subzone_2010} \title{Block Group-to-Subzone Crosswalk} \format{ -A tibble with 21411 rows and -5 variables: +\code{xwalk_blockgroup2subzone} is a tibble with +22669 rows and +6 variables: \describe{ \item{geoid_blkgrp}{Unique 12-digit block group ID, assigned by the Census Bureau. Corresponds to \code{blockgroup_sf} (although that only includes the @@ -24,10 +26,20 @@ estimate the subzone's portion. Double.} group quarters) living in the specified subzone. Multiply this by a block group-level measure of a population attribute (e.g. race/ethnicity) to estimate the subzone's portion. Double.} +\item{emp_pct}{Proportion of the block group's total jobs located in the +specified subzone. Multiply this by a block group-level measure of an +employment attribute (e.g. retail jobs) to estimate the subzone's portion. +\strong{Not available in \code{xwalk_blockgroup2subzone_2010}.} Double.} } + +\code{xwalk_blockgroup2subzone_2010} is a tibble with +21411 rows and +5 variables (no \code{emp_pct}). } \usage{ xwalk_blockgroup2subzone + +xwalk_blockgroup2subzone_2010 } \description{ This table contains a set of factors to apportion Census block group-level @@ -35,28 +47,31 @@ data among the CMAP travel modeling subzones. Separate factors are provided for apportioning housing unit, household, and population attributes. All factors were determined by calculating the percentage of a block group's housing units, households and population that were located in each of its -component blocks, according to the 2010 Decennial Census, and then assigning +component blocks, according to the 2020 Decennial Census, and then assigning each block to a subzone (based on the location of the block's centroid point). 
Subzones that do not contain the centroid of any blocks with at least -one housing unit, household or person are not present in this table, and -should be considered unpopulated. +one housing unit, household, person or job are \emph{not} present in this table. +\strong{Use \code{xwalk_blockgroup2subzone} for data from the 2020 decennial census or +the American Community Survey (ACS) from 2020 onward. For data from the 2010 +decennial census or ACS from 2010 through 2019, use +\code{xwalk_blockgroup2subzone_2010}.} } \details{ Other than in certain areas of Chicago, block groups tend to be significantly larger than subzones and have highly irregular boundaries, so in most cases -the population, households and/or housing units in a block group are split -across multiple subzones. For that reason, it is not appropriate to use a -one-to-one block group-to-subzone assignment to apportion Census data among +the jobs, population, households and/or housing units in a block group are +split across multiple subzones. For that reason, it is not appropriate to use +a one-to-one block group-to-subzone assignment to apportion Census data among subzones, and this crosswalk should be used instead. To use this crosswalk effectively, Census data should be joined to it (not vice versa, since block group IDs appear multiple times in this table). Once the data is joined, it should be multiplied by the appropriate factor (depending whether the data of interest is measured at the housing unit, -household or person level), and then the result should be summed by subzone -ID. If calculating rates, this should only be done after the counts have been -summed to subzone. The resulting table can then be joined to \code{subzone_sf} for -mapping, if desired. +household, person or job level), and then the result should be summed by +subzone ID. If calculating rates, this should only be done after the counts +have been summed to subzone. 
The resulting table can then be joined to +\code{subzone_sf} for mapping, if desired. If your data is only available at the tract level, you can use \code{xwalk_tract2subzone} for a tract-level allocation instead. If the subzone @@ -67,7 +82,7 @@ geography is too granular for your needs, you can use zones instead with # View the block group allocations for subzone17 == 1 dplyr::filter(xwalk_blockgroup2subzone, subzone17 == 1) -# Map the subzones missing from xwalk_blockgroup2subzone (i.e. no HU/HH/pop) +# Map the subzones missing from xwalk_blockgroup2subzone (i.e. no HU/HH/pop/emp) library(ggplot2) ggplot(dplyr::anti_join(subzone_sf, xwalk_blockgroup2subzone)) + geom_sf(fill = "red", lwd = 0.1) + diff --git a/man/xwalk_blockgroup2zone.Rd b/man/xwalk_blockgroup2zone.Rd index f7038e4..a60810b 100644 --- a/man/xwalk_blockgroup2zone.Rd +++ b/man/xwalk_blockgroup2zone.Rd @@ -3,10 +3,11 @@ \docType{data} \name{xwalk_blockgroup2zone} \alias{xwalk_blockgroup2zone} +\alias{xwalk_blockgroup2zone_2010} \title{Block Group-to-Zone Crosswalk} \format{ -A tibble with 12510 rows and -5 variables: +\code{xwalk_blockgroup2zone} is a tibble with 13289 rows +and 6 variables: \describe{ \item{geoid_blkgrp}{Unique 12-digit block group ID, assigned by the Census Bureau. Corresponds to \code{blockgroup_sf} (although that only includes the @@ -24,37 +25,50 @@ estimate the zone's portion. Double.} group quarters) living in the specified zone. Multiply this by a block group-level measure of a population attribute (e.g. race/ethnicity) to estimate the zone's portion. Double.} +\item{emp_pct}{Proportion of the block group's total jobs located in the +specified zone. Multiply this by a block group-level measure of an +employment attribute (e.g. retail jobs) to estimate the zone's portion. +\strong{Not available in \code{xwalk_blockgroup2zone_2010}.} Double.} } + +\code{xwalk_blockgroup2zone_2010} is a tibble with +12510 rows and +5 variables (no \code{emp_pct}). 
} \usage{ xwalk_blockgroup2zone + +xwalk_blockgroup2zone_2010 } \description{ This table contains a set of factors to apportion Census block group-level data among the CMAP travel modeling zones. Separate factors are provided for -apportioning housing unit, household, and population attributes. All factors -were determined by calculating the percentage of a block group's housing -units, households and population that were located in each of its component -blocks, according to the 2010 Decennial Census, and then assigning each block -to a zone (based on the location of the block's centroid point). Zones that -do not contain the centroid of any blocks with at least one housing unit, -household or person are not present in this table, and should be considered -unpopulated. +apportioning housing unit, household, population and employment attributes. +All factors were determined by calculating the percentage of a block group's +housing units, households, population and employment that were located in +each of its component blocks, according to the 2020 Decennial Census and 2019 +LEHD, and then assigning each block to a zone (based on the location of the +block's centroid point). Zones that do not contain the centroid of any blocks +with at least one housing unit, household, person or job are \emph{not} present in +this table. \strong{Use \code{xwalk_blockgroup2zone} for data from the 2020 decennial +census or the American Community Survey (ACS) from 2020 onward. For data from +the 2010 decennial census or ACS from 2010 through 2019, use +\code{xwalk_blockgroup2zone_2010}.} } \details{ Other than in certain areas of Chicago, block groups tend to be larger than -zones and have highly irregular boundaries, so in most cases the population, -households and/or housing units in a block group are split across multiple -zones. 
For that reason, it is not appropriate to use a one-to-one block -group-to-zone assignment to apportion Census data among zones, and this +zones and have highly irregular boundaries, so in most cases the jobs, +population, households and/or housing units in a block group are split across +multiple zones. For that reason, it is not appropriate to use a one-to-one +block group-to-zone assignment to apportion Census data among zones, and this crosswalk should be used instead. To use this crosswalk effectively, Census data should be joined to it (not vice versa, since block group IDs appear multiple times in this table). Once the data is joined, it should be multiplied by the appropriate factor (depending whether the data of interest is measured at the housing unit, -household or person level), and then the result should be summed by zone ID. -If calculating rates, this should only be done after the counts have been +household, person or job level), and then the result should be summed by zone +ID. If calculating rates, this should only be done after the counts have been summed to zone. The resulting table can then be joined to \code{zone_sf} for mapping, if desired. @@ -67,7 +81,7 @@ geography is too coarse for your needs, you can use subzones instead with # View the block group allocations for zone17 == 55 dplyr::filter(xwalk_blockgroup2zone, zone17 == 55) -# Map the zones missing from xwalk_blockgroup2zone (i.e. no HU/HH/pop) +# Map the zones missing from xwalk_blockgroup2zone (i.e. 
no HU/HH/pop/emp) library(ggplot2) ggplot(dplyr::anti_join(zone_sf, xwalk_blockgroup2zone)) + geom_sf(fill = "red", lwd = 0.1) + diff --git a/man/xwalk_tract2cca.Rd b/man/xwalk_tract2cca.Rd index b410ce3..2ec58de 100644 --- a/man/xwalk_tract2cca.Rd +++ b/man/xwalk_tract2cca.Rd @@ -3,10 +3,11 @@ \docType{data} \name{xwalk_tract2cca} \alias{xwalk_tract2cca} +\alias{xwalk_tract2cca_2010} \title{Tract-to-CCA Crosswalk} \format{ -A tibble with 805 rows and 5 -variables: +\code{xwalk_tract2cca} is a tibble with 807 rows and +6 variables: \describe{ \item{geoid_tract}{Unique 11-digit tract ID, assigned by the Census Bureau. Corresponds to \code{tract_sf}. Character.} @@ -24,10 +25,20 @@ portion. Double.} quarters) living in the specified CCA. Multiply this by a tract-level measure of a population attribute (e.g. race/ethnicity) to estimate the CCA's portion. Double.} +\item{emp_pct}{Proportion of the tract's total jobs located in the +specified CCA. Multiply this by a tract-level measure of an employment +attribute (e.g. retail jobs) to estimate the CCA's portion. +\strong{Not available in \code{xwalk_tract2cca_2010}.} Double.} } + +\code{xwalk_tract2cca_2010} is a tibble with +805 rows and 5 +variables (no \code{emp_pct}). } \usage{ xwalk_tract2cca + +xwalk_tract2cca_2010 } \description{ This table contains a set of factors to apportion Census tract-level data @@ -35,25 +46,29 @@ among Chicago Community Areas (CCAs). Separate factors are provided for apportioning housing unit, household, and population attributes. All factors were determined by calculating the percentage of a tract's housing units, households and population that were located in each of its component blocks, -according to the 2010 Decennial Census, and then assigning each block to a -CCA (based on the location of the block's centroid point). +according to the 2020 Decennial Census, and then assigning each block to a +CCA (based on the location of the block's centroid point). 
\strong{Use +\code{xwalk_tract2cca} for data from the 2020 decennial census or the American +Community Survey (ACS) from 2020 onward. For data from the 2010 decennial +census or ACS from 2010 through 2019, use \code{xwalk_tract2cca_2010}.} } \details{ Generally speaking, tract boundaries align neatly with CCA boundaries as they tend to follow similar features (e.g. rivers, major roads, rail lines) but -there are cases where the population, households and/or housing units in a -tract are split across multiple CCAs, or else are partially within the City -of Chicago and partially outside of it. For that reason, it is not +there are cases where the jobs, population, households and/or housing units +in a tract are split across multiple CCAs, or else are partially within the +City of Chicago and partially outside of it. For that reason, it is not appropriate to use a one-to-one tract-to-CCA assignment to apportion Census data among CCAs, and this crosswalk should be used instead. To use this crosswalk effectively, Census data should be joined to it (not vice versa, since tract IDs appear multiple times in this table). Once the data is joined, it should be multiplied by the appropriate factor (depending -whether the data of interest is measured at the housing unit, household or -person level), and then the result should be summed by CCA. If calculating -rates, this should only be done after the counts have been summed to CCA. The -resulting table can then be joined to \code{cca_sf} for mapping, if desired. +whether the data of interest is measured at the housing unit, household, +person or job level), and then the result should be summed by CCA. If +calculating rates, this should only be done after the counts have been summed +to CCA. The resulting table can then be joined to \code{cca_sf} for mapping, if +desired. 
If your data is also available at the block group level, it is recommended that you use that with \code{xwalk_blockgroup2cca} instead of the tract-level @@ -62,24 +77,22 @@ allocation. \examples{ suppressPackageStartupMessages(library(dplyr)) -# View the tracts with population not fully contained in a single CCA +# View the tracts with population split between multiple CCAs filter(xwalk_tract2cca, pop_pct < 1) -# Estimate CCA-level transit mode share from tract-level ACS data -df_tract <- tidycensus::get_acs( - "tract", state = "IL", county = "031", table = "B08006", - year = 2019, survey = "acs5", output = "wide", cache_table = TRUE +# Estimate CCA-level population density from tract-level Census data +df_tract <- tidycensus::get_decennial( + geography = "tract", variables = c("P1_001N"), + year = 2020, state = "IL", county = c("031", "043"), output = "wide" ) \%>\% - rename(workers = B08006_001E, transit = B08006_008E) \%>\% - select(GEOID, workers, transit) + suppressMessages() \%>\% # Hide tidycensus messages + select(geoid_tract = GEOID, pop = P1_001N) df_cca <- xwalk_tract2cca \%>\% - left_join(df_tract, by = c("geoid_tract" = "GEOID")) \%>\% - mutate(transit = transit * pop_pct, - workers = workers * pop_pct) \%>\% + left_join(df_tract, by = "geoid_tract") \%>\% + mutate(pop = pop * pop_pct) \%>\% group_by(cca_num) \%>\% - summarize_at(vars(transit, workers), sum) \%>\% - mutate(transit_commute_pct = transit / workers) + summarize(pop = sum(pop)) df_cca # Join to cca_sf for mapping @@ -87,8 +100,8 @@ library(ggplot2) cca_sf \%>\% left_join(df_cca, by = "cca_num") \%>\% ggplot() + - geom_sf(aes(fill = transit_commute_pct), lwd = 0.1) + - scale_fill_viridis_c() + + geom_sf(aes(fill = pop / sqmi), lwd = 0.1) + + scale_fill_viridis_c(direction = -1) + theme_void() } \keyword{datasets} diff --git a/man/xwalk_tract2subzone.Rd b/man/xwalk_tract2subzone.Rd index 88e798b..e84a20e 100644 --- a/man/xwalk_tract2subzone.Rd +++ b/man/xwalk_tract2subzone.Rd @@ -3,10 +3,11 
@@ \docType{data} \name{xwalk_tract2subzone} \alias{xwalk_tract2subzone} +\alias{xwalk_tract2subzone_2010} \title{Tract-to-Subzone Crosswalk} \format{ -A tibble with 15713 rows and -5 variables: +\code{xwalk_tract2subzone} is a tibble with 16800 rows and +6 variables: \describe{ \item{geoid_tract}{Unique 11-digit tract ID, assigned by the Census Bureau. Corresponds to \code{tract_sf} (although that only includes the tracts in the @@ -24,10 +25,20 @@ subzone's portion. Double.} quarters) living in the specified subzone. Multiply this by a tract-level measure of a population attribute (e.g. race/ethnicity) to estimate the subzone's portion. Double.} +\item{emp_pct}{Proportion of the tract's total jobs located in the +specified subzone. Multiply this by a tract-level measure of an employment +attribute (e.g. retail jobs) to estimate the subzone's portion. +\strong{Not available in \code{xwalk_tract2subzone_2010}.} Double.} } + +\code{xwalk_tract2subzone_2010} is a tibble with +15713 rows and +5 variables (no \code{emp_pct}). } \usage{ xwalk_tract2subzone + +xwalk_tract2subzone_2010 } \description{ This table contains a set of factors to apportion Census tract-level data @@ -35,25 +46,27 @@ among the CMAP travel modeling subzones. Separate factors are provided for apportioning housing unit, household, and population attributes. All factors were determined by calculating the percentage of a tract's housing units, households and population that were located in each of its component blocks, -according to the 2010 Decennial Census, and then assigning each block to a +according to the 2020 Decennial Census, and then assigning each block to a subzone (based on the location of the block's centroid point). Subzones that do not contain the centroid of any blocks with at least one housing unit, -household or person are not present in this table, and should be considered -unpopulated. +household, person or job are \emph{not} present in this table. 
\strong{Use +\code{xwalk_tract2subzone} for data from the 2020 decennial census or the American +Community Survey (ACS) from 2020 onward. For data from the 2010 decennial +census or ACS from 2010 through 2019, use \code{xwalk_tract2subzone_2010}.} } \details{ Other than in certain areas of Chicago, tracts tend to be significantly larger than subzones and have highly irregular boundaries, so in most cases -the population, households and/or housing units in a tract are split across -multiple subzones. For that reason, it is not appropriate to use a one-to-one -tract-to-subzone assignment to apportion Census data among subzones, and this -crosswalk should be used instead. +the jobs, population, households and/or housing units in a tract are split +across multiple subzones. For that reason, it is not appropriate to use a +one-to-one tract-to-subzone assignment to apportion Census data among +subzones, and this crosswalk should be used instead. To use this crosswalk effectively, Census data should be joined to it (not vice versa, since tract IDs appear multiple times in this table). Once the data is joined, it should be multiplied by the appropriate factor (depending -whether the data of interest is measured at the housing unit, household or -person level), and then the result should be summed by subzone ID. If +whether the data of interest is measured at the housing unit, household, +person or job level), and then the result should be summed by subzone ID. If calculating rates, this should only be done after the counts have been summed to subzone. The resulting table can then be joined to \code{subzone_sf} for mapping, if desired. @@ -67,7 +80,7 @@ use zones instead with \code{xwalk_tract2zone} or \code{xwalk_blockgroup2zone}. # View the tract allocations for subzone17 == 1 dplyr::filter(xwalk_tract2subzone, subzone17 == 1) -# Map the subzones missing from xwalk_tract2subzone (i.e. no HU/HH/pop) +# Map the subzones missing from xwalk_tract2subzone (i.e. 
no HU/HH/pop/emp) library(ggplot2) ggplot(dplyr::anti_join(subzone_sf, xwalk_tract2subzone)) + geom_sf(fill = "red", lwd = 0.1) + diff --git a/man/xwalk_tract2zone.Rd b/man/xwalk_tract2zone.Rd index 8086aa1..4ea6372 100644 --- a/man/xwalk_tract2zone.Rd +++ b/man/xwalk_tract2zone.Rd @@ -3,10 +3,11 @@ \docType{data} \name{xwalk_tract2zone} \alias{xwalk_tract2zone} +\alias{xwalk_tract2zone_2010} \title{Tract-to-Zone Crosswalk} \format{ -A tibble with 6910 rows and 5 -variables: +\code{xwalk_tract2zone} is a tibble with 7474 rows and +6 variables: \describe{ \item{geoid_tract}{Unique 11-digit tract ID, assigned by the Census Bureau. Corresponds to \code{tract_sf} (although that only includes the tracts in the @@ -24,10 +25,20 @@ portion. Double.} quarters) living in the specified zone. Multiply this by a tract-level measure of a population attribute (e.g. race/ethnicity) to estimate the zone's portion. Double.} +\item{emp_pct}{Proportion of the tract's total jobs located in the +specified zone. Multiply this by a tract-level measure of an employment +attribute (e.g. retail jobs) to estimate the zone's portion. +\strong{Not available in \code{xwalk_tract2zone_2010}.} Double.} } + +\code{xwalk_tract2zone_2010} is a tibble with +6910 rows and 5 +variables (no \code{emp_pct}). } \usage{ xwalk_tract2zone + +xwalk_tract2zone_2010 } \description{ This table contains a set of factors to apportion Census tract-level data @@ -35,15 +46,17 @@ among the CMAP travel modeling zones. Separate factors are provided for apportioning housing unit, household, and population attributes. All factors were determined by calculating the percentage of a tract's housing units, households and population that were located in each of its component blocks, -according to the 2010 Decennial Census, and then assigning each block to a -zone (based on the location of the block's centroid point). 
Zones that -do not contain the centroid of any blocks with at least one housing unit, -household or person are not present in this table, and should be considered -unpopulated. +according to the 2020 Decennial Census, and then assigning each block to a +zone (based on the location of the block's centroid point). Zones that do not +contain the centroid of any blocks with at least one housing unit, household, +person or job are \emph{not} present in this table. \strong{Use \code{xwalk_tract2zone} for +data from the 2020 decennial census or the American Community Survey (ACS) +from 2020 onward. For data from the 2010 decennial census or ACS from 2010 +through 2019, use \code{xwalk_tract2zone_2010}.} } \details{ Other than in certain areas of Chicago, tracts tend to be larger than zones -and have highly irregular boundaries, so in most cases the population, +and have highly irregular boundaries, so in most cases the jobs, population, households and/or housing units in a tract are split across multiple zones. For that reason, it is not appropriate to use a one-to-one tract-to-zone assignment to apportion Census data among zones, and this crosswalk should be @@ -52,8 +65,8 @@ used instead. To use this crosswalk effectively, Census data should be joined to it (not vice versa, since tract IDs appear multiple times in this table). Once the data is joined, it should be multiplied by the appropriate factor (depending -whether the data of interest is measured at the housing unit, household or -person level), and then the result should be summed by zone ID. If +whether the data of interest is measured at the housing unit, household, +person or job level), and then the result should be summed by zone ID. If calculating rates, this should only be done after the counts have been summed to zone. The resulting table can then be joined to \code{zone_sf} for mapping, if desired. 
@@ -67,7 +80,7 @@ subzones instead with \code{xwalk_tract2subzone} or \code{xwalk_blockgroup2subzo # View the tract allocations for zone17 == 55 dplyr::filter(xwalk_tract2zone, zone17 == 55) -# Map the zones missing from xwalk_tract2zone (i.e. no HU/HH/pop) +# Map the zones missing from xwalk_tract2zone (i.e. no HU/HH/pop/emp) library(ggplot2) ggplot(dplyr::anti_join(zone_sf, xwalk_tract2zone)) + geom_sf(fill = "red", lwd = 0.1) + diff --git a/man/zcta_sf.Rd b/man/zcta_sf.Rd index 8ddb7ba..7640701 100644 --- a/man/zcta_sf.Rd +++ b/man/zcta_sf.Rd @@ -5,7 +5,7 @@ \alias{zcta_sf} \title{Census ZIP Code Tabulation Areas (ZCTAs)} \format{ -A multipolygon \code{sf} object with 316 rows and 3 +A multipolygon \code{sf} object with 320 rows and 3 variables: \describe{ \item{geoid_zcta}{Unique 5-digit ZCTA ID, corresponding to a 5-digit USPS @@ -24,12 +24,12 @@ zcta_sf \description{ The Census ZCTAs covering the 7-county Chicago Metropolitan Agency for Planning (CMAP) region. From the US Census Bureau's TIGER/Line shapefiles, -2019 vintage. +2021 vintage. } \details{ Census Bureau description: -\emph{ZIP Code Tabulation Areas (ZCTAs) are approximate area representations of +\emph{"ZIP Code Tabulation Areas (ZCTAs) are approximate area representations of U.S. Postal Service (USPS) five-digit ZIP Code service areas that the Census Bureau creates using whole blocks to present statistical data from censuses and surveys. The Census Bureau defines ZCTAs by allocating each block that @@ -41,25 +41,13 @@ surrounded by multiple ZCTAs will be added to a single ZCTA based on limited buffering performed between multiple ZCTAs. The Census Bureau identifies five-digit ZCTAs using a five-character numeric code that represents the most frequently occurring USPS ZIP Code within that ZCTA, and this code may -contain leading zeros.} - -\emph{There are significant changes to the 2010 ZCTA delineation from that used in -2000. 
Coverage was extended to include the Island Areas for 2010 so that the -United States, Puerto Rico, and the Island Areas have ZCTAs. Unlike 2000, -when areas that could not be assigned to a ZCTA were given a generic code -ending in "XX" (land area) or "HH" (water area), for 2010 there is no -universal coverage by ZCTAs, and only legitimate five-digit areas are -defined. The 2010 ZCTAs will better represent the actual Zip Code service -areas because the Census Bureau initiated a process before creation of 2010 -blocks to add block boundaries that split polygons with large numbers of -addresses using different ZIP Codes.} - -\emph{Data users should not use ZCTAs to identify the official USPS ZIP Code for -mail delivery. The USPS makes periodic changes to ZIP Codes to support more -efficient mail delivery. The ZCTAs process used primarily residential -addresses and was biased towards ZIP Codes used for city-style mail delivery, -thus there may be ZIP Codes that are primarily nonresidential or boxes only -that may not have a corresponding ZCTA.} +contain leading zeros. Not all ZIP Codes in use by the USPS may have a ZCTA +delineated to represent them. The USPS makes periodic changes to ZIP Codes to +support more efficient mail delivery. In addition, the ZCTA delineation +process primarily uses residential addresses and has a bias towards ZIP Codes +used for city-style mail delivery, thus there may be ZIP Codes that are +primarily nonresidential or used for PO boxes only that may not have a +corresponding ZCTA. ZIP Code is a trademark of the U.S. Postal Service."} } \examples{ # Display the ZCTAs with ggplot2