--- title: "Workflow with tidycensus" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{2) Workflow with tidycensus} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) ``` ```{r setup, warning = FALSE, message = FALSE} library(zctaCrosswalk) library(tidycensus) library(dplyr) ``` `zctaCrosswalk` was designed to work well with the `tidycensus` package. `tidycensus` is currently the most popular way to access Census data in R. Here is an example of using it to get Median Household Income on all ZCTAs in the US: ```{r} zcta_income = get_acs( geography = "zcta", variables = "B19013_001", year = 2021) head(zcta_income) ``` Note that `?get_acs` returns data for all ZCTAs in the US. It does not provide an option to get data on ZCTAs by State or County. And the dataframe it returns does not provide enough metadata to allow you to do this subselection yourself. A primary motivation for creating the `zctaCrosswalk` package was to support this type of analysis. Note that `?get_acs` returns the ZCTA in a column called `GEOID`. We can combine this fact with `?dplyr::filter`, `?get_zctas_by_county` and `?get_zctas_by_state` to subset to any states or counties we choose. Here we filter `zcta_income` to ZCTAs in San Francisco County, California: ```{r} nrow(zcta_income) sf_zcta_income = zcta_income |> dplyr::filter(GEOID %in% get_zctas_by_county("06075")) nrow(sf_zcta_income) head(sf_zcta_income) ``` ## Mapping the Result A primary motivation in creating this workflow (and indeed, this package) was to create demographic maps at the ZCTA level for selected states and counties. If this interests you as well, I encourage you to copy the below code into R and view the output yourself. (Unfortunately, R package vignettes do not seem to handle map output from the `mapview` package well). This is a powerful and elegant pattern for visualizing ZCTA demographics in R: ```{r eval = FALSE} library(zctaCrosswalk) library(tidycensus) library(dplyr) library(mapview) all_zctas = get_acs( geography = "zcta", variables = "B19013_001", year = 2021, geometry = TRUE) filtered_zctas = filter(all_zctas, GEOID %in% get_zctas_by_county(6075)) mapview(filtered_zctas, zcol = "estimate") ```