Spreadsheets Tutorial - Introduction to Statistics in Spreadsheets

Welcome to the Course: Understanding Statistics

Welcome to the course. I'm Ted, and I'll be your instructor. What is a statistic? It's just a piece of information from a large quantity of data. Simply put, a statistic describes data in some way, and in this course you'll learn about different statistics you can use to extract insights from your data. Let's begin with averages. An average is an information reduction technique: you start with a data population that may be too large to understand, so you need a way to reduce the information to a comprehensible amount.

For example, there are 126 million people living in Japan. That's a lot of people. You could look at a list of every person and their age to intuit something about the population's age; however, it's easier to take a mean average to understand the population. A mean is the sum of all observations in your population or sample divided by the number of observations in that population or sample.

This is a sample of ten Japanese ages. If you sum up the ages, you get 473. Therefore, the mean average is 473 divided by ten, or 47.3. The mean reduces the information from these ten observations to one value. Let's compare Japan's average age, 47.3, to Uganda's average age. The Ugandan sample sums to 158 and the count is ten; thus, the Ugandan mean is 15.8.
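To make the arithmetic concrete, here is a minimal Python sketch of the mean calculation. The individual ages are hypothetical: the tutorial only states each sample's sum (473 and 158) and count (ten), so the lists below are made-up values chosen to match those totals.

```python
# Hypothetical ages: the tutorial gives only each sample's sum and count,
# so these values are invented to match (473 and 158, ten observations each).
japan_ages = [21, 30, 39, 45, 47, 48, 48, 55, 62, 78]
uganda_ages = [3, 7, 9, 12, 14, 15, 18, 21, 25, 34]


def mean(values):
    """Sum of all observations divided by the number of observations."""
    return sum(values) / len(values)


japan_mean = mean(japan_ages)    # 473 / 10
uganda_mean = mean(uganda_ages)  # 158 / 10

print("Japan mean age:", japan_mean)
print("Uganda mean age:", uganda_mean)
```

Each list of ten observations collapses to a single number, which is the information reduction the mean provides.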

Comparing the two samples, Ugandans have a lower average age than the Japanese in this example. You didn't have to compare all twenty observations to arrive at this conclusion; simply taking an average of each sample was enough to help you learn about the sample differences. Another average is the median. The median is the middle number of a data set when sorted from smallest to largest: half the numbers are less than the median, and half the numbers are above the median.

The Japanese ages have been sorted, and to make it clear, the person column has the top and bottom four values crossed off. This leaves 47 and 48 in the middle. The median lies in between 47 and 48, which is 47.5. If our sample had an odd number of observations, such as nine, the middle number would have just been 47. An advantage of the median compared to the mean is that it is more robust to outliers, so if your dataset has any outliers, the median may be a more informative statistic.
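The median steps above can be sketched in Python with the standard library's `statistics.median`. The ages are again hypothetical, chosen so that 47 and 48 sit in the middle as in the tutorial; the outlier value of 780 is an invented data-entry error used only to illustrate robustness.

```python
import statistics

# Hypothetical sorted sample with 47 and 48 as the two middle values.
japan_ages = [21, 30, 39, 45, 47, 48, 48, 55, 62, 78]

# Even count: the median is halfway between the two middle values.
median_even = statistics.median(japan_ages)

# Odd count: drop the oldest person to leave nine observations,
# so the single middle value is the median.
nine_ages = japan_ages[:-1]
median_odd = statistics.median(nine_ages)

# Robustness: replace the oldest age with an invented outlier (780).
with_outlier = japan_ages[:-1] + [780]
mean_with_outlier = statistics.mean(with_outlier)
median_with_outlier = statistics.median(with_outlier)

print("Median of ten ages:", median_even)
print("Median of nine ages:", median_odd)
print("Mean with outlier:", mean_with_outlier)
print("Median with outlier:", median_with_outlier)
```

In this sketch the single outlier pulls the mean far from the typical age, while the median does not move.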

The final statistic we'll cover here is the mode. The mode is the number that appears most often in the data set. This Japanese age sample has 48 listed twice; thus, the mode average is 48.

You can use spreadsheet functions to quickly calculate the mean, median, and mode. Let's go over these now. You can calculate a mean using the AVERAGE function. Here we're passing in a range consisting of the cells B2 through B11 to the AVERAGE function to calculate the mean. To calculate the median, use the MEDIAN function, no surprise there, and use MODE to calculate a data set's mode. Let's practice calculating these statistics in spreadsheets.
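For readers who want to check their spreadsheet results outside the spreadsheet, here is a rough Python sketch that mirrors the three functions using the standard `statistics` module. The ages are hypothetical stand-ins for the values in cells B2 through B11, and the formulas in the comments assume the data sits in that range, as in the tutorial.

```python
import statistics

# Hypothetical stand-ins for the ten ages in cells B2 through B11.
ages = [21, 30, 39, 45, 47, 48, 48, 55, 62, 78]

summary = {
    "mean": statistics.mean(ages),      # spreadsheet: =AVERAGE(B2:B11)
    "median": statistics.median(ages),  # spreadsheet: =MEDIAN(B2:B11)
    "mode": statistics.mode(ages),      # spreadsheet: =MODE(B2:B11)
}

for name, value in summary.items():
    print(name, value)
```

With this made-up sample, 48 is the only value that appears twice, so it is the mode.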
