Intro to statistics with R - Interval & Ratio Variables in R

The Types of Variables in Statistics

Consider people running a race or swimming against each other in a swim meet, where the top finishers take gold, silver, and bronze. If all I know is the rank ordering (first place, second place, third place), that is an ordinal variable. With rankings alone, I can't ask by how much the winner won: was it a really close race between first and second place, or was it a blowout, with the winner way ahead of the person in second? But if I have their times, that is a ratio variable, because time has a true zero. Then I can ask by how much the winner beat the second-place finisher, by how much the second-place finisher beat the third-place finisher, and so on.

As you go down the list from nominal to ordinal to interval to ratio, you're able to ask more detailed questions. That is what we strive for in statistics: variables measured on an interval or ratio scale. It's not always possible, but ideally we use interval or ratio variables because they are the richest in information and allow us to ask the most in-depth questions of our data.
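The race example can be sketched in a few lines of Python. The names and finishing times below are made up for illustration. The sketch shows that two races with the same rank ordering can have very different margins, which only the times reveal.

```python
# Hypothetical finishing times in seconds for two races (illustrative values only).
close_race = {"Ana": 50.0, "Ben": 50.5, "Cal": 51.0}
blowout = {"Ana": 50.0, "Ben": 58.0, "Cal": 59.0}


def rankings(times):
    """Ordinal view: names ordered from fastest to slowest."""
    return sorted(times, key=times.get)


def margins(times):
    """Ratio view: seconds separating each finisher from the next one."""
    ordered = sorted(times.values())
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]


# Both races produce the same rank ordering...
print(rankings(close_race) == rankings(blowout))  # True
# ...but only the times show how close each race was.
print(margins(close_race))  # [0.5, 0.5]
print(margins(blowout))     # [8.0, 1.0]
```

With the rankings alone, the two races are indistinguishable. The margins are available only because the underlying variable is time.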

A Variable: A Tool for Gathering Information

In statistics, a variable is a characteristic or attribute of a population that can be measured and recorded. Variables are used to collect data and make comparisons between groups. There are several types of variables, each with its own strengths and limitations.

Nominal Variables

One type of variable is nominal: its values are category labels with no numerical meaning and no inherent order. Nominal variables describe characteristics such as sex or country of origin. For example, all I might know about a student in a class is that they are from the United States or Canada; that is my only piece of information about their country of origin.

Nominal variables allow only one kind of comparison: whether two entities are equal or not equal. If I take two students at random and know only their country of origin, all I can say is whether they are from the same country or from different countries. I can't say anything about greater than or less than.
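A minimal Python sketch of what a nominal variable supports, assuming a made-up roster of five students. The helper name `same_country` is hypothetical. Beyond the equality check, about all that can be done is count how often each category occurs.

```python
from collections import Counter

# Hypothetical class roster: country of origin is a nominal variable.
countries = ["United States", "Canada", "Canada", "United States", "Canada"]


def same_country(a, b):
    """The only comparison a nominal variable supports: equal or not equal."""
    return a == b


# Counting how often each label occurs is still allowed.
counts = Counter(countries)
most_common_country = counts.most_common(1)[0][0]

print(same_country(countries[0], countries[1]))  # False
print(counts["Canada"])                          # 3
print(most_common_country)                       # Canada
```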

Ordinal Variables

Another type of variable is ordinal, which means its values can be ranked, but the intervals between consecutive ranks are not necessarily equal. Ordinal variables describe rankings such as first place, second place, third place, and so on. We know who won the race or the competition, but we don't know by how much they won.

Ordinal variables allow us to ask questions about rank ordering in addition to equality. We can ask not only whether two students are from the same country or different countries but also, if countries are ranked by population, whether one student's country ranks higher or lower than another's. What we can't say is by how much their populations differ.
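One way to sketch an ordinal variable in Python is an enumeration that supports ordering but not arithmetic. The `Medal` class and the `can_subtract` helper are hypothetical names chosen for this illustration, and the numbers 1 to 3 encode rank only.

```python
from enum import Enum
from functools import total_ordering


@total_ordering
class Medal(Enum):
    """Ordinal variable: the values carry rank, not distance."""
    BRONZE = 1
    SILVER = 2
    GOLD = 3

    def __lt__(self, other):
        if not isinstance(other, Medal):
            return NotImplemented
        return self.value < other.value


def can_subtract(a, b):
    """Report whether a difference between two values is defined."""
    try:
        a - b
    except TypeError:
        return False
    return True


print(Medal.GOLD > Medal.SILVER)                # True: rank comparison works
print(can_subtract(Medal.GOLD, Medal.SILVER))   # False: "by how much" is undefined
```

Using a plain `Enum` rather than integers is a deliberate choice here: integers would allow `3 - 2`, which suggests a distance between gold and silver that the rankings do not contain.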

Interval Variables

An interval variable has equal intervals between consecutive values, so differences are meaningful, but its zero point is arbitrary. Temperature in degrees Celsius or Fahrenheit is the standard example: each degree represents an equal amount, yet zero degrees does not mean "no temperature". With an interval variable, we know not only which value is higher but also by how much.

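Interval variables allow us to ask questions about the size of differences between observations or groups. For example, we can say that one day was 10 degrees warmer than another, but not that it was twice as warm. Test scores are often treated as interval variables in practice, so we can compare students by how many points separate them.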
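A short Python sketch, using made-up temperatures for two days, shows why differences are meaningful on an interval scale while ratios are not. The variable names are illustrative only.

```python
def celsius_to_fahrenheit(c):
    """Convert a Celsius temperature to Fahrenheit."""
    return c * 9 / 5 + 32


# Hypothetical temperatures on two days.
monday_c, tuesday_c = 10.0, 20.0
monday_f = celsius_to_fahrenheit(monday_c)    # 50.0
tuesday_f = celsius_to_fahrenheit(tuesday_c)  # 68.0

# Differences are meaningful: the same gap, expressed in different units.
print(tuesday_c - monday_c)  # 10.0
print(tuesday_f - monday_f)  # 18.0

# Ratios are not: the answer depends on where the scale puts its zero.
print(tuesday_c / monday_c)  # 2.0
print(tuesday_f / monday_f)  # 1.36
```

Tuesday looks "twice as warm" in Celsius but only 1.36 times as warm in Fahrenheit, so neither ratio describes anything real about the two days.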

Ratio Variables

A ratio variable has both equal intervals and a true zero. Ratio variables describe measurements such as weights or race times, where each point on the scale represents an equal amount and zero means none of the quantity at all. Because of the true zero, ratios are meaningful: you can say that one value is twice as large as another.

Ratio variables allow every comparison the other types allow, plus ratios: for example, we can say that one runner took 20% longer than another to finish. Ratio variables give us the richest information of all, allowing us to make precise calculations and comparisons.
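As a rough Python illustration with made-up finishing times, a ratio computed on a ratio variable stays the same when the unit of measurement changes, because zero means the same thing in every unit.

```python
# Hypothetical finishing times in seconds (illustrative values only).
winner_s, runner_up_s = 50.0, 60.0

# Zero seconds means no elapsed time at all, so ratios are meaningful.
ratio_in_seconds = runner_up_s / winner_s
ratio_in_minutes = (runner_up_s / 60) / (winner_s / 60)

print(ratio_in_seconds)  # 1.2
# Changing the unit does not change the ratio (up to floating-point rounding).
print(abs(ratio_in_seconds - ratio_in_minutes) < 1e-12)  # True
```

The runner-up took 1.2 times as long as the winner whether the times are recorded in seconds or in minutes.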

Classifying Variables: A Tool for Statistics

In 1946, Stevens published a classic paper classifying variables into four distinct types. The types are based on the scale of measurement used, which in turn determines what questions can be asked about those variables.

The Four Types of Variables

1. Nominal Variables

2. Ordinal Variables

3. Interval Variables

4. Ratio Variables

Each category has its own strengths and limitations, and understanding these differences is crucial for effective data analysis. By recognizing the type of variable we are dealing with, we can ask more informed questions about our data and make more accurate conclusions.
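The cumulative nature of the four types can be sketched as a small Python lookup. The wording of each question is a paraphrase for illustration, not the text of the table in Stevens (1946), and the names `LEVELS`, `NEW_QUESTION`, and `questions_for` are hypothetical.

```python
# Levels of measurement, ordered from least to most informative.
LEVELS = ["nominal", "ordinal", "interval", "ratio"]

# The question each level adds on top of the levels before it.
NEW_QUESTION = {
    "nominal": "same or different?",
    "ordinal": "ranked higher or lower?",
    "interval": "by how much do they differ?",
    "ratio": "how many times as large?",
}


def questions_for(level):
    """Return every question a variable at this level can answer."""
    position = LEVELS.index(level)
    return [NEW_QUESTION[name] for name in LEVELS[: position + 1]]


for level in LEVELS:
    print(level, "->", questions_for(level))
```

Each level inherits the questions of the levels above it in the list, which is why a ratio variable answers all four.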

In conclusion, variables play a critical role in statistics, allowing us to collect and analyze data. The four types of variables (nominal, ordinal, interval, and ratio) each determine which questions we can ask of our data.

Transcript

In this classic paper by Stevens in 1946, he came up with these four distinct categories, or types of variables, because there are different types of things we can do with these kinds of variables. I know this font is a little fuzzy and hard to see. I purposely cut and pasted this table from Stevens 1946 because I think it's an important paper; everyone should read it. I won't go through the whole slide; you can look at this at your leisure.

The point is that a variable of type, say, nominal only allows us to do certain things. It only allows me to say whether two entities are equal or not equal: are you from the same country or a different country? So if I take two students at random from this course and I just know this nominal variable, country of origin, all I can do is say whether you are from the same country or from a different country. I can't say anything about greater than or less than. I can just say whether you are from the same or different. That's all nominal variables allow us to do.

As you go down this list, from nominal to ordinal to interval to ratio, you're allowed to ask more and more questions of your variables. For ordinal, I can ask not only whether you are the same or different, but also whether you are ranked higher or lower: do you come from the same or different countries, and does your country have a greater or smaller population than the other student's? With interval, I can ask by how much you are different. If I just have rank orderings, so all I know is that you come from the country that has the greatest population and you come from the country that has the second greatest population, I can't ask what the difference in population is. That's what interval and ratio variables allow us to ask.

Just another example, perhaps a little more intuitive: think about people who are running a race or swimming against each other in a swim meet. You can get first place, second place, third place; in the Olympics you get the gold medal, the silver medal, the bronze medal. If all I know is the rank ordering, that's the ordinal variable: just first place, second place, third place; who got gold, who got silver, who got bronze. If that's all I know, then I can't ask by how much the winner won. Was it a really close race between first place and second, or was it just a blowout, so the person who won was way ahead of the person in second? If all I know are the rankings, the ordinal variable, then I can't answer that question about distance. But if I have their time, say it's a running race or a swimming competition, that's a ratio variable; there's a true zero. Then I can ask by how much the winner came in ahead of the second-place finisher, by how much the second-place finisher came in ahead of the third-place finisher, and so on.

So as you go down this list, you're able to ask more detailed questions. In statistics, and in this course, we're going to strive for variables that give us that interval or ratio scale. We can't always do that, but ideally we'll use interval or ratio variables because they're the richest in terms of information. They allow us to ask the most in-depth questions of our data.