R Tutorial - Measurement, validity & reliability

The Objective of Measuring Reliability and Validity in Surveys

The objective of this course is to help you craft surveys that are both reliable and valid. These are complex concepts that you will learn more about throughout the course; this lesson introduces the basic ideas. Measurement can be defined as the process of observing and recording. To do that, we generally need some kind of tool or instrument. A ruler, for example, is a tool we use to measure height. Measuring something like consumer brand perceptions is a little trickier than measuring height: we need to check that our instrument is calibrated, so to speak, which means checking that our survey is reliable and valid.

Reliability means that we are measuring consistently and can reproduce what we are measuring. We have already checked one type of reliability, equivalence, as measured by inter-rater reliability. There are several other types, and we will cover the most common in the remainder of the course. Validity asks whether we are measuring what we claim to measure. As with reliability, there are several types, the so-called three C's of validity. We have already measured the first, content, and will cover the others later in the course. For now, let's touch back briefly on our flow chart. We constructed a set of items to measure customer satisfaction using the tools from the last lesson and collected the data, which puts us between steps two and three.

Now that we have our data, we are in the exploratory data analysis (EDA) phase of the process, where we learn about our data before modeling it. To begin, let's check response frequencies. Were our respondents unanimous in rating each item, or was there a range of opinions? We can get a great visualization of this with the likert package. First, we pass our data frame to the likert function. All items must be converted to factors for this function, so we do that using dplyr's mutate_if function, converting all integer variables to factors.
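The conversion and plotting steps might look like the following minimal sketch. The data frame, its item names, and the random responses are hypothetical stand-ins for the course data. Setting explicit levels 1 to 5 is an added assumption, made so that every item shares the same factor levels.

```r
library(dplyr)
library(likert)

# Hypothetical survey data: three 5-point items stored as integers
set.seed(42)
survey <- data.frame(
  easy_to_use      = sample(1:5, 50, replace = TRUE),
  reliable_site    = sample(1:5, 50, replace = TRUE),
  difficult_to_use = sample(1:5, 50, replace = TRUE)
)

# Convert every integer column to a factor. The explicit levels are an
# assumption of this sketch so that all items have identical levels.
survey_f <- survey %>%
  mutate_if(is.integer, ~ factor(.x, levels = 1:5))

# Summarize the items, then draw the bar chart of response frequencies
result <- likert(survey_f)
plot(result)
```

If an item never receives one of the five responses, a plain as.factor conversion would give it fewer levels than the others, which is why the sketch sets the levels explicitly.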

We can then pass the resulting object to plot to draw a bar chart of our response frequencies. We also need to check for items that are worded in the opposite direction from the other items. As an example, let's examine the items of our customer satisfaction survey. All of these items carry positive connotations except the item "difficult to use." A company certainly doesn't want its site to be difficult to use, so we want to measure the opposite of the responses here. This is called reverse coding.
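To make the idea concrete, reverse coding on a 1 to 5 scale amounts to mirroring each response around the midpoint. The sketch below uses plain arithmetic on a hypothetical response vector. It illustrates the concept only and is not necessarily how the course performs the recoding.

```r
# Hypothetical responses to a negatively worded item on a 1-5 scale
difficult_to_use <- c(1L, 2L, 3L, 4L, 5L, 2L)

# Mirror each response around the midpoint:
# reversed = (min + max) - original, which is 6 - x on a 1-5 scale
reversed <- (1L + 5L) - difficult_to_use

# Show the original and reversed responses side by side
print(rbind(original = difficult_to_use, reversed = reversed))
```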

We create a new variable, "difficult_to_use_R", which is a recoded version of the "difficult to use" item. To do this, we use the recode function from car. This is not to be confused with the recode function from dplyr, so we explicitly call it from the car package. We then specify how to recode the values: ones become fives, twos become fours, and so forth. Threes stay threes, so there is no need to include them in our syntax.
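A minimal sketch of this recoding step follows. The data frame and its values are hypothetical; only the use of recode from car and the mapping itself come from the lesson.

```r
# Hypothetical data frame with one negatively worded 5-point item
survey <- data.frame(difficult_to_use = c(1L, 2L, 3L, 4L, 5L, 2L))

# Call recode() through the car namespace so that dplyr::recode()
# cannot be picked up by mistake
survey$difficult_to_use_R <- car::recode(
  survey$difficult_to_use,
  "1 = 5; 2 = 4; 4 = 2; 5 = 1"  # 3 is omitted, so it stays 3
)

survey
```

Values that match none of the recode specifications are left unchanged, which is why 3 can be omitted from the string.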

We confirm the recoding by using select from dplyr to keep just our two items of interest, then passing them to the response.frequencies function from psych. We also round the output for readability. We can see that the two items are indeed mirror images of each other. With the recoding confirmed, we are ready to explore our survey data.
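The check described above could be sketched as follows. The data are hypothetical, and the choice of two decimal places for rounding is an assumption, since the lesson does not say how many digits it uses.

```r
library(dplyr)

# Hypothetical survey data with a positively and a negatively worded item
survey <- data.frame(
  easy_to_use      = rep(c(5L, 4L, 4L, 3L, 2L, 5L, 4L, 1L), times = 5),
  difficult_to_use = rep(c(1L, 2L, 2L, 3L, 4L, 4L, 4L, 5L), times = 5)
)

# Reverse code the negatively worded item with car::recode()
survey$difficult_to_use_R <- car::recode(
  survey$difficult_to_use,
  "1 = 5; 2 = 4; 4 = 2; 5 = 1"
)

# Keep only the two items of interest, tabulate the proportion of each
# response, and round for readability (two digits is an assumption)
freq <- survey %>%
  dplyr::select(difficult_to_use, difficult_to_use_R) %>%
  psych::response.frequencies() %>%
  round(2)

freq
```

Each row of the result is an item and each column a response value, plus a final column for the proportion missing. Read left to right, the row for the original item should match the row for the recoded item read right to left.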