R Tutorial - Searching for data with tidycensus

Using tidycensus Functions to Access and Analyze American Community Survey (ACS) and Decennial Census Variables

To work with data from the American Community Survey (ACS) and the decennial censuses, users must supply tidycensus functions with a vector of census variable IDs, so they first need to know how to find and read those IDs. With thousands of variables available, it can be difficult to determine which ones you need. Fortunately, web resources such as Census Reporter, which offers intuitive search tools, can assist with this task.

Aside from web resources, tidycensus includes built-in functionality to search for variables. The load_variables function downloads variables datasets from the Census Bureau website so they can be browsed in R. It has three parameters: year, the year (or end year) of the dataset; dataset, the dataset in question, such as "acs5" for the 5-year ACS; and an optional cache parameter that stores the variables dataset on the user's computer to speed up future browsing.
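A minimal sketch of such a call is shown here. It assumes the tidycensus package is installed and a network connection is available; the year 2019 is an illustrative choice, not a value taken from the lesson.

```r
# Sketch: download the variable list for a 5-year ACS sample.
# Assumption: 2019 is an arbitrary example year.
library(tidycensus)

acs_vars <- load_variables(
  year = 2019,       # year, or end year of the sample
  dataset = "acs5",  # 5-year American Community Survey
  cache = TRUE       # keep a local copy to speed up later calls
)

# Inspect the first few rows of the variables dataset
head(acs_vars)
```

With cache = TRUE, later calls with the same year and dataset can read the stored copy instead of downloading it again.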

Once acquired, a census or ACS variables dataset can be explored with tidyverse tools. The datasets returned by load_variables have three columns: name, the census ID code; label, a description of the variable's characteristics; and concept, the general group to which the variable corresponds. In the example shown in the video, the downloaded dataset is filtered for variables within census table B19001, which covers household income.
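The filtering step might look like the sketch below. To keep it runnable offline, it uses a small hand-built data frame as a stand-in for the output of load_variables; the rows, abbreviated labels, and the name `vars` are assumptions for illustration only.

```r
library(dplyr)

# Hand-built stand-in for a variables dataset (name, label, concept).
# Labels and concepts are abbreviated; real ones vary by year.
vars <- data.frame(
  name = c("B01001_001", "B19001_001", "B19001_002",
           "B19001A_001", "B19013_001"),
  label = c("Estimate!!Total",
            "Estimate!!Total",
            "Estimate!!Total!!Less than $10,000",
            "Estimate!!Total",
            "Estimate!!Median household income in the past 12 months"),
  concept = c("SEX BY AGE",
              "HOUSEHOLD INCOME IN THE PAST 12 MONTHS",
              "HOUSEHOLD INCOME IN THE PAST 12 MONTHS",
              "HOUSEHOLD INCOME IN THE PAST 12 MONTHS (WHITE ALONE HOUSEHOLDER)",
              "MEDIAN HOUSEHOLD INCOME IN THE PAST 12 MONTHS"),
  stringsAsFactors = FALSE
)

# Keep only variables whose ID begins with the table ID B19001 followed by
# an underscore, so related tables such as B19001A are excluded.
income_vars <- vars %>%
  filter(startsWith(name, "B19001_"))

print(income_vars)
```

Matching on the table ID plus the underscore is a deliberate choice here: matching on "B19001" alone would also pick up tables whose IDs merely begin with the same characters.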

Variable ID codes from the ACS can be confusing, so it helps to break one down. Take the variable B19001_002E, which refers to the number of households with an income in the past twelve months of less than ten thousand dollars. The "B" prefix indicates that the variable comes from a base table, which provides the most detail available in the ACS. Other table types available in tidycensus include collapsed tables, denoted by C; data profiles, denoted by DP; and subject tables, denoted by S.

The component 19001 is the table ID; in this case, the variable belongs to a table of related variables that cover different household income bands. The component 002 identifies the specific variable within that table, while the "E" suffix indicates an estimate and is not required by tidycensus functions.
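To make the breakdown concrete, the following sketch splits an ID such as B19001_002E into its prefix, table ID, variable ID, and suffix using base R. The helper `parse_acs_variable` is hypothetical (it is not part of tidycensus), and its pattern is an assumption that covers only base (B) and collapsed (C) table IDs of this shape.

```r
# Hypothetical helper, not part of tidycensus: split a B- or C-table
# variable ID into its components.
parse_acs_variable <- function(id) {
  # prefix, table ID (digits plus optional letters), variable ID,
  # and an optional E (estimate) or M (margin of error) suffix
  pattern <- "^([BC])([0-9]+[A-Z]*)_([0-9]+)([EM]?)$"
  m <- regmatches(id, regexec(pattern, id))[[1]]
  if (length(m) == 0) {
    stop("Unrecognized variable ID: ", id)
  }
  list(prefix = m[2], table = m[3], variable = m[4], suffix = m[5])
}

parts <- parse_acs_variable("B19001_002E")
str(parts)
```

Because the suffix group is optional, the same helper accepts "B19001_002", the form that can be passed to tidycensus functions without the E.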

Almost every variable in the ACS is characterized by a margin of error, and tidycensus is designed to return both the estimate and the margin of error by default. Margin-of-error variables have the suffix "M". These suffixes appear only when data are returned in wide format. By understanding how to find and read census variable IDs, users can use tidycensus functions to access and analyze the data available from the ACS and decennial censuses.
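As a hedged illustration of the wide-format suffixes, the sketch below requests a single variable with get_acs and output = "wide". The geography, the year, and the presence of a Census API key (set beforehand with census_api_key) are assumptions of this example rather than details from the lesson.

```r
library(tidycensus)

# Assumption: a Census API key has already been set, for example with
# census_api_key("YOUR KEY HERE"), where the string is a placeholder.

income_wide <- get_acs(
  geography = "state",       # illustrative choice of geography
  variables = "B19001_002",  # the E suffix is omitted in the request
  year = 2019,               # illustrative end year of the 5-year sample
  survey = "acs5",
  output = "wide"            # one column per estimate and margin of error
)

# Column names in wide format carry the E and M suffixes
names(income_wide)
```

In the default tidy output, the same request instead returns the values in estimate and moe columns, with the variable ID stored without a suffix.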

"WEBVTTKind: captionsLanguage: ento use tidy census functions users must supply a vector of census variable ids in this lesson we'll discuss how to find and use census variable ids and learn about how they are formatted there are thousands of variables available across the American Community Survey and decennial censuses this can make it difficult to figure out how to find the variables you need fortunately there are web resources to assist with this like census reporters intuitive search tools aside from web resources tidy census also includes some built-in functionality to search for variables the load variables function in tidy census helps users download and browse variables datasets from the Census Bureau website the function has three parameters year refers to the year or end year of the data set data set refers to the data set in question which in the example shown here is ACS 5 for 5 year a CS and the optional cash parameter allows users to store the variables dataset on their computer to speed up future browsing once acquired census or ACS variables datasets can be explored with tidy verse tools datasets returned by load variables have three columns named for the census ID code label for a description of the variables characteristics and concept which refers to the general group to which the variable corresponds in the example shown here the downloaded data set is filtered for variables within census table B one nine zero zero one which covers household income understanding variable ID codes from the ACS can be confusing so let's go through it here for the variable B one nine zero zero one underscore zero zero two E which refers to the number of households with an income in the past twelve months less than ten thousand dollars the B prefix refers to the fact that this variable comes from a base table which gives the most detail available in the ACS other available tape and tidy census include collapsed tables denoted by C data profiles denoted by DP and 
subject tables denoted by s the component 1 9 0 0 1 refers to the table ID in this case the variable belongs to a table of related variables which cover different household income bands 0 0 2 then refers to the specific variable ID within that table the suffix e refers to estimate and is not required by tidy census functions almost every variable in the ACS is characterized by a margin of error and tidy census is designed to return both the estimate and margin of error by default margin of error variables have the suffix M you'll only see these suffixes when returning data in wide format let's get some practice searching for variablesto use tidy census functions users must supply a vector of census variable ids in this lesson we'll discuss how to find and use census variable ids and learn about how they are formatted there are thousands of variables available across the American Community Survey and decennial censuses this can make it difficult to figure out how to find the variables you need fortunately there are web resources to assist with this like census reporters intuitive search tools aside from web resources tidy census also includes some built-in functionality to search for variables the load variables function in tidy census helps users download and browse variables datasets from the Census Bureau website the function has three parameters year refers to the year or end year of the data set data set refers to the data set in question which in the example shown here is ACS 5 for 5 year a CS and the optional cash parameter allows users to store the variables dataset on their computer to speed up future browsing once acquired census or ACS variables datasets can be explored with tidy verse tools datasets returned by load variables have three columns named for the census ID code label for a description of the variables characteristics and concept which refers to the general group to which the variable corresponds in the example shown here the downloaded data 
set is filtered for variables within census table B one nine zero zero one which covers household income understanding variable ID codes from the ACS can be confusing so let's go through it here for the variable B one nine zero zero one underscore zero zero two E which refers to the number of households with an income in the past twelve months less than ten thousand dollars the B prefix refers to the fact that this variable comes from a base table which gives the most detail available in the ACS other available tape and tidy census include collapsed tables denoted by C data profiles denoted by DP and subject tables denoted by s the component 1 9 0 0 1 refers to the table ID in this case the variable belongs to a table of related variables which cover different household income bands 0 0 2 then refers to the specific variable ID within that table the suffix e refers to estimate and is not required by tidy census functions almost every variable in the ACS is characterized by a margin of error and tidy census is designed to return both the estimate and margin of error by default margin of error variables have the suffix M you'll only see these suffixes when returning data in wide format let's get some practice searching for variables\n"