class: center, middle, inverse, title-slide # Census Data in R:
tidycensus
+
tidyverse
## Oct 8 Session | LBJ Data Studio
###
Ethan Tenison
| Project Manager for Data Initiatives, UT Austin - RGK Center ###
Matt Worthington
| Sr. Project Manager for Data Initiatives, UT Austin - LBJ ###
October 08, 2021
--- background-image: url('assets/images/dir_stat.png') background-size: cover class: center, bottom, inverse --- class: middle .pull-left[ ### Personal Background * .orange[**Personal Background**]: Born and raised in San Antonio, Texas. Live in Austin with my wife and three kids. * .orange[**Education Backgrounds**]: English Studies, Special Education, and Public Policy * .orange[**Professional Backgrounds**]: Public School Teacher, School District Administrator, and Data Scientist. ] .pull-right[ ### Quant Background * Grew up very poor * Single parent household * **Struggled with math** in school * Was really good at theatre and mostly got into college because of that. * Once in college, switched over to become an English Major because I didn't want to act professionally... ] --- background-image: url('assets/images/hermione_spell_class.png') background-size: cover class: center, bottom, inverse ## .blue[How to understand R if you are new...] --- background-image: url('assets/images/hp_vs_r.png') background-size: cover class: center, bottom, inverse --- ## What is the US Census? * __Complete count__ of the United States population that takes place every 10 years * __Mandated__ by Article 1, Section 2 of the US Constitution; taken since 1790 * Used to guide federal spending; determines proportional representation in US Congress * __US Census Bureau__: Government agency responsible for conducting the Census; part of the Department of Commerce * [Census 2020 form](https://www2.census.gov/programs-surveys/decennial/2020/technical-documentation/questionnaires-and-instructions/questionnaires/2020-informational-questionnaire.pdf) .tl.burntorange[ Credit: ([@kyle_e_walker](https://twitter.com/kyle_e_walker)) ] --- ## American Community Survey * Up through 2000: detailed socio-economic characteristics of the US population collected by the Census __long form__ * American Community Survey (ACS): yearly survey of approximately 3 million households; replaces the long form * [Sample ACS form](https://www2.census.gov/programs-surveys/acs/methodology/questionnaires/2021/quest21.pdf) .tl.burntorange[ Credit: ([@kyle_e_walker](https://twitter.com/kyle_e_walker)) ] --- class: center ## Geography Heiarchies in Census Data .rmd-small[![](assets/images/census_hierarchies.png)] --- # What is `tidycensus`? **tidycensus** was first released in 2017 by Dr. Kyle Walker, a professor of Demography at Texas Christian University. An R package, **tidycensus** was designed to facilitate the process of acquiring and working with US Census Bureau population data in the R environment. It is now maintained by Dr. Walker and Matt Herman. While **tidycensus** allows you to access a massive amount of data, it doesn't facilitate access to every single product. And while there are several functions available in **tidycensus**, tonight we're focusing on three core functions in **tidycensus**: **Primary** - `load_variables()`, an interface to navigate through all of the available variables inside of the ACS and Decennial Census products. - `get_acs()`, which requests data from the 1-year and 5-year American Community Survey samples. Data are available from the 1-year ACS back to 2005 and the 5-year ACS back to 2005-2009. **Secondary** - `get_decennial()`, which requests data from the US Decennial Census APIs for 2000 and 2010. When 2020 Census data are released via the API, R users will be able to access it with this function as well. .tl.burntorange[ Credit: ([@kyle_e_walker](https://twitter.com/kyle_e_walker)) ] --- # In Action: `tidycensus` .pull-left[ ```r library(tidycensus) # Create A List of Variables demvars <- c(White = "P1_003N", Black = "P1_004N", Asian = "P1_006N", Hispanic = "P2_002N") # Pull The Census Data harris <- get_decennial( geography = "tract", variables = demvars, year = 2020, state = "TX", county = "Harris County", geometry = TRUE, summary_var = "P1_001N" ) |> # Create pct column mutate(pct = 100 * (value / summary_value)) # Draw A Chart harris |> ggplot(aes(fill = pct)) + facet_wrap(~variable) + geom_sf(color = NA) + coord_sf(crs = 26915, datum=NA) + scale_fill_distiller(palette = "Blues", direction = 1) + theme_minimal() + labs(title="Population Demographics in Harris County", subtitle="US Census | 2020 Decennial Census") ``` ] .pull-right[ <img src="index_files/figure-html/unnamed-chunk-3-1.png" width="504" /> ] --- # How is Census Data used in R? .pull-left[ [**News Coverage**](https://twitter.com/jburnmurdoch/status/981074810020204544?s=20) ![](https://pbs.twimg.com/media/DZ15NFRXUAAA_yJ?format=jpg&name=large) ] .pull-right[ [**Policy Analysis**](https://corymccartan.github.io/redistricting21/posts/2021-06-25-colorado-preliminary-congressional-districts/) ![](https://corymccartan.github.io/redistricting21/posts/2021-09-03-oregon-congressional-districts/oregon-preliminary-congressional-districts_files/figure-html5/best-1.png) ] --- class: inverse middle center section ## You need a Census API Key ### Obtain one at https://api.census.gov/data/key_signup.html --- class: code90 ## `tidycensus::load_variables()` **What it does:** Calls variables from the Census API **Prerequisite**: Your Census API key. Obtain one at https://api.census.gov/data/key_signup.html ```r load_variables(year = "", dataset = "", cache = "") ``` * **year**: The year for which you are requesting variables. Either the year or endyear of the decennial Census or ACS sample. 5-year ACS data is available from 2009 through 2018. 1-year ACS data is available from 2005 through 2019. * **dataset** :One of "sf1", "sf3", "acs1", "acs3", "acs5", "acs1/profile", "acs3/profile, "acs5/profile", "acs1/subject", "acs3/subject", or "acs5/subject". * **cache** :Whether you would like to cache the dataset for future access, or load the dataset from an existing cache. Defaults to FALSE. ```r load_variables(2019, "acs5", cache = TRUE) # Loads 2019 ACS Variables load_variables(2010, "sf1", cache = TRUE) # Loads 2010 Census Variables load_variables(2020, "pl", cache = TRUE) # Loads 2020 Census Variables ``` --- ## `tidycensus::get_acs()` **What it does:** Calls data from the Census API .pull-left[ ```r get_acs( geography, variables = NULL, table = NULL, cache_table = FALSE, year = 2019, output = "tidy", state = NULL, county = NULL, zcta = NULL, geometry = FALSE, summary_var = NULL, key = NULL, moe_level = 90, survey = "acs5", show_call = FALSE, ... ) ``` ] .pull-right-compact[ * **geography **:The geography of your data. * **variables**: Character string or vector of character strings of variable IDs. tidycensus automatically returns the estimate and the margin of error associated with the variable. * **table**: The ACS table for which you would like to request all variables. Uses lookup tables to identify the variables; performs faster when variable table already exists through load_variables(cache = TRUE). Only one table may be requested per call. * **year**: The year, or endyear, of the ACS sample. 5-year ACS data is available from 2009 through 2019. 1-year ACS data is available from 2005 through 2019. Defaults to 2019. * **state**: An optional vector of states for which you are requesting data. State names, postal codes, and FIPS codes are accepted. Defaults to NULL. * **county**: The county for which you are requesting data. County names and FIPS codes are accepted. Must be combined with a value supplied to 'state'. Defaults to NULL. * **zcta**: The zip code tabulation area(s) for which you are requesting data. Specify a single value or a vector of values to get data for more than one ZCTA. Numeric or character ZCTA GEOIDs are accepted. When specifying ZCTAs, geography must be set to '"zcta"' and 'state' must be specified with 'county' left as 'NULL'. Defaults to NULL. * **geometry**: if FALSE (the default), return a regular tibble of ACS data. if TRUE, uses the tigris package to return an sf tibble with simple feature geometry in the 'geometry' column. * **summary_var**: Character string of a "summary variable" from the ACS to be included in your output. Usually a variable (e.g. total population) that you'll want to use as a denominator or comparison. ] --- class: journey-bg, section, center, middle, inverse # Let's get started 👟 --- background-image: url('https://utexas-lbjp-data.github.io/assets/rmarkdown/slide_17.jpeg') background-size: cover class: center, bottom, inverse --- background-image: url('https://utexas-lbjp-data.github.io/assets/rmarkdown/slide_18.jpeg') background-size: cover class: center, bottom, inverse --- background-image: url('assets/images/tidyverse_hex.png') background-size: cover class: center, bottom, inverse --- background-image: url('assets/images/tidyverse_network.png') background-size: cover class: center, bottom, inverse animated fadeIn --- background-image: url('assets/images/tidyverse_data_lifecycle_map.png') background-size: cover class: center, bottom, inverse animated fadeIn --- background-image: url('assets/images/lifecycle_focus.png') background-size: cover class: center, bottom, inverse animated fadeIn --- background-image: url('assets/images/what_todays_focus_will_be.png') background-size: cover class: center, bottom, inverse animated animate fadeIn --- background-image: url('assets/images/marie_kondo.png') background-size: cover class: center, bottom, inverse animated fadeIn --- # What is R Markdown? .pull-left.b--gray.ba.bw2.ma2.pa4.shadow-1[ * ["An authoring framework for data science."](https://rmarkdown.rstudio.com/lesson-1.html) (✔️) * [A document format (.Rmd)](https://bookdown.org/yihui/rmarkdown/). (✔️) * [An R package named rmarkdown](https://rmarkdown.rstudio.com/docs/). (✔️) * ["A file format for making dynamic documents with R."](https://rmarkdown.rstudio.com/articles_intro.html) (✔️) * ["A tool for integrating text, code, and results."](https://r4ds.had.co.nz/communicate-intro.html) (✔️) * ["A computational document."](http://radar.oreilly.com/2011/07/wolframs-computational-documen.html) (✔️) * Wizardry. (🧙️) ] .pull-right[ .rmd-small[![](https://raw.githubusercontent.com/rstudio/hex-stickers/master/PNG/rmarkdown.png)] ] .tl.burntorange[ Courtesy of Alison Presmanes Hill ([@apreshill](https://twitter.com/apreshill)) ] --- background-image: url('https://utexas-lbjp-data.github.io/assets/rmarkdown/slide_15.jpeg') background-size: cover class: center, bottom, inverse --- background-image: url('https://utexas-lbjp-data.github.io/assets/rmarkdown/slide_16.jpeg') background-size: cover class: center, bottom, inverse --- background-image: url('https://utexas-lbjp-data.github.io/assets/rmarkdown/slide_17.jpeg') background-size: cover class: center, bottom, inverse --- background-image: url('https://utexas-lbjp-data.github.io/assets/rmarkdown/slide_18.jpeg') background-size: cover class: center, bottom, inverse --- background-image: url('https://utexas-lbjp-data.github.io/assets/rmarkdown/slide_19.jpeg') background-size: cover class: center, bottom, inverse --- background-image: url('https://utexas-lbjp-data.github.io/assets/rmarkdown/slide_20.jpeg') background-size: cover class: center, bottom, inverse --- background-image: url('https://utexas-lbjp-data.github.io/assets/rmarkdown/slide_21.jpeg') background-size: cover class: center, bottom, inverse --- background-image: url('https://utexas-lbjp-data.github.io/assets/rmarkdown/slide_22.jpeg') background-size: cover class: center, bottom, inverse --- background-image: url('https://utexas-lbjp-data.github.io/assets/rmarkdown/slide_23.jpeg') background-size: cover class: center, bottom, inverse --- background-image: url('https://utexas-lbjp-data.github.io/assets/rmarkdown/slide_24.jpeg') background-size: cover class: center, bottom, inverse --- background-image: url('https://utexas-lbjp-data.github.io/assets/rmarkdown/slide_25.jpeg') background-size: cover class: center, bottom, inverse --- background-image: url('https://utexas-lbjp-data.github.io/assets/rmarkdown/slide_26.jpeg') background-size: cover class: center, bottom, inverse --- background-image: url('https://utexas-lbjp-data.github.io/assets/rmarkdown/slide_27.jpeg') background-size: cover class: center, bottom, inverse --- background-image: url('https://utexas-lbjp-data.github.io/assets/rmarkdown/slide_28.jpeg') background-size: cover class: center, bottom, inverse --- background-image: url('https://utexas-lbjp-data.github.io/assets/rmarkdown/slide_29.jpeg') background-size: cover class: center, bottom, inverse --- background-image: url('https://utexas-lbjp-data.github.io/assets/rmarkdown/slide_30.jpeg') background-size: cover class: center, bottom, inverse --- background-image: url('https://utexas-lbjp-data.github.io/assets/rmarkdown/slide_31.jpeg') background-size: cover class: center, bottom, inverse --- background-image: url('https://utexas-lbjp-data.github.io/assets/rmarkdown/slide_32.jpeg') background-size: cover class: center, bottom, inverse --- background-image: url('https://utexas-lbjp-data.github.io/assets/rmarkdown/slide_33.jpeg') background-size: cover class: center, bottom, inverse --- background-image: url('https://utexas-lbjp-data.github.io/assets/rmarkdown/slide_34.jpeg') background-size: cover class: center, bottom, inverse --- background-image: url('https://utexas-lbjp-data.github.io/assets/rmarkdown/slide_35.jpeg') background-size: cover class: center, bottom, inverse --- background-image: url('https://utexas-lbjp-data.github.io/assets/rmarkdown/slide_36.jpeg') background-size: cover class: center, bottom, inverse --- class: middle .pull-left[ ### Writing in Markdown If you're not familiar with Markdown, that's okay. If you're like me, you grew up using things like Microsoft Office and Google Docs to write, so you're probably not familiar with markdown, but Markdown is a lot easier. So... what is it? Markdown is just an easy way to write formatted text. Like anything else, it takes practice, but is used widely in popular writing apps like [Bear](https://bear.app), [iA Writer](https://ia.net/writer), [Ulysses](https://ulysses.app). For the purposes of this workshop, we'll use it in an RStudio document tool called "[Rmarkdown](https://rmarkdown.rstudio.com)". Here's a quick way to get familiar with writing in Markdown: .bg--burntorange.b--text-color.ba.bw2.br3.shadow-5.ph4.mt5.f7[ *__Note__: If writing in Markdown still feels weird, don't worry. You can also use R Markdown's Visual Markdown Editor.* ] ] .pull-right[ ![Doing the Markdown 10-min Tutorial](https://d33wubrfki0l68.cloudfront.net/59f29676ef5e4d74685e14f801bbc10c2dbd3cef/c0688/lesson-images/markdown-1-markup.png) * **First, visit this link:** Go to [https://commonmark.org/help/tutorial/](https://commonmark.org/help/tutorial/) * **Then, do the 10-minute tutorial:** It's a bit of time, but goes quick and will help you have a better grasp on writing in markdown. * **Finally, bookmark this site**: [https://commonmark.org/help/](https://commonmark.org/help/) ] --- background-image: url('https://utexas-lbjp-data.github.io/assets/rmarkdown/slide_38.jpeg') background-size: cover class: center, bottom, inverse --- background-image: url('https://utexas-lbjp-data.github.io/assets/rmarkdown/slide_39.jpeg') background-size: cover class: center, bottom, inverse --- background-image: url('https://utexas-lbjp-data.github.io/assets/rmarkdown/slide_40.jpeg') background-size: cover class: center, bottom, inverse --- background-image: url('https://utexas-lbjp-data.github.io/assets/rmarkdown/slide_41.jpeg') background-size: cover class: center, bottom, inverse --- background-image: url('https://utexas-lbjp-data.github.io/assets/rmarkdown/slide_42.jpeg') background-size: cover class: center, bottom, inverse --- background-image: url('https://utexas-lbjp-data.github.io/assets/rmarkdown/slide_43.jpeg') background-size: cover class: center, bottom, inverse --- background-image: url('https://utexas-lbjp-data.github.io/assets/rmarkdown/slide_44.jpeg') background-size: cover class: center, bottom, inverse --- background-image: url('https://utexas-lbjp-data.github.io/assets/rmarkdown/slide_45.jpeg') background-size: cover class: center, bottom, inverse --- background-image: url('https://utexas-lbjp-data.github.io/assets/rmarkdown/slide_46.jpeg') background-size: cover class: center, bottom, inverse --- background-image: url('https://utexas-lbjp-data.github.io/assets/rmarkdown/slide_47.jpeg') background-size: cover class: center, bottom, inverse --- class: title center pipe-page # `|>` and `%>%` <code class ='r hljs remark-code'><b>leave_house</b>(<b>get_dressed</b>(<b>get_out_of_bed</b>(<b>wake_up</b>(<span style='color:#bf5700'>me</span>, <span style='color:#005f86'>time</span> = <span style='color:#bf5700'>"8:00"</span>), <span style='color:#005f86'>side</span> = <span style='color:#bf5700'>"correct"</span>), <span style='color:#005f86'>pants</span> = <span style='color:#bf5700'>TRUE</span>, <span style='color:#005f86'>shirt</span> = <span style='color:#bf5700'>TRUE</span>), <span style='color:#005f86'>car</span> = <span style='color:#bf5700'>TRUE</span>, <span style='color:#005f86'>bike</span> = <span style='color:#bf5700'>FALSE</span>)</code> <code class ='r hljs remark-code'>me <span style='background-color:#ffff7f'>|></span> <br> <b>wake_up</b>(<span style='color:#005f86'>time</span> = <span style='color:#bf5700'>"8:00"</span>) <span style='background-color:#ffff7f'>|></span> <br> <b>get_out_of_bed</b>(<span style='color:#005f86'>side</span> = <span style='color:#bf5700'>"correct"</span>) <span style='background-color:#ffff7f'>|></span> <br> <b>get_dressed</b>(<span style='color:#005f86'>pants</span> = <span style='color:#bf5700'>TRUE</span>, <span style='color:#005f86'>shirt</span> = <span style='color:#bf5700'>TRUE</span>) <span style='background-color:#ffff7f'>|></span> <br> <b>leave_house</b>(<span style='color:#005f86'>car</span> = <span style='color:#bf5700'>TRUE</span>, <span style='color:#005f86'>bike</span> = <span style='color:#bf5700'>FALSE</span>)</code> .tl.footnote-small[ Courtesy of Andrew Heiss ([@andrewheiss](https://twitter.com/andrewheiss)) ] --- ## Resources For This Workshop * [RStudio Prework Setup Spefically Tailored to Tidycensus](https://slides.lbjdata.org/course_workshops/wasem/prework/index.html#1) * [ACS 2019 Variables](census.lbjdata.org/acs_2019) * [tidycensus documentation](https://slides.lbjdata.org/course_workshops/wasem/prework/index.html#1) #### Other Noteworthy API Packages * [censusapi](https://www.hrecht.com/censusapi/): Non-spatial accessor to all of Census API data. * [ipumsr](http://tech.popdata.org/ipumsr/): IPUMS provides census and global survey data. * [blscrapeR](https://github.com/keberwein/blscrapeR): Bureau of Labor Statistics Data * [fredr](https://sboysel.github.io/fredr/):Federal Reserve of Economic Data * [idbr](https://github.com/walkerke/idbr): Interface to the US Census Bureau International Data Base * [cancensus](https://mountainmath.github.io/cancensus/index.html): Candian Census Data * [mxmaps](https://www.diegovalle.net/mxmaps/) and [inegiR](https://github.com/Eflores89/inegiR): Tools for working with data in Mexico. * [rKenyaCensus](https://www.rstudio.com/resources/rstudioglobal-2021/rkenyacensus-package/): Kenyan Data * [geobr](https://ipeagit.github.io/geobr/): Brazilian Data * [nomisr](https://docs.ropensci.org/nomisr/): Data from the UK * [WDI](http://vincentarelbundock.github.io/WDI/): World Bank Data --- class: middle section, center <div style='position: relative; padding-bottom: 56.25%; padding-top: 35px; height: 0; overflow: hidden;'><iframe sandbox='allow-scripts allow-same-origin allow-presentation' allowfullscreen='true' allowtransparency='true' frameborder='0' height='315' src='https://www.mentimeter.com/embed/847636dd1da2f15ab7dcbcc201269aa8/6b732a3a617f' style='position: absolute; top: 0; left: 0; width: 100%; height: 100%;' width='420'></iframe></div> --- class: journey-bg, section, left, middle, inverse # 📚 Resources for R 📚 ## Questions? lbjdata.org #### [#rstats hashtag](https://twitter.com/search?q=%23rstats) on twitter #### [#tidycensus hashtag](https://twitter.com/search?q=%23tidycensus) on twitter #### Analyzing US Census Data: Methods, Maps, and Models in R - [Online Book](https://walker-data.com/census-r/index.html) #### R For Data Science - [Online Book](https://r4ds.had.co.nz) | [Hard Copy](https://www.oreilly.com/library/view/r-for-data/9781491910382/) | [Slack Community](https://www.rfordatasci.com) --- class: journey-bg, section, center, middle, inverse # 🙏 Thank you 🙏 ## Questions? lbjdata.org ### 📧: matthew.worthington@austin.utexas.edu ### 🐦: @mrworthington