The Most Boring Time Series You Will Ever See: Understanding White Noise
Time series data can show many kinds of patterns. One type of series stands out for having none of them: white noise. It consists of random, independent and identically distributed observations, which statisticians abbreviate as "iid" data. In other words, there is nothing going on: no trend, no seasonality, no cyclicity, and not even any autocorrelation.
Why "white noise"? The name comes from physics, where white light has some similar mathematical characteristics. Although it appears boring, white noise is a very important type of time series because it is the basis of almost all forecasting models.
The autocorrelation function (ACF) of white noise consists of many small, insignificant spikes. Because the data is simply random, we expect the correlations between observations to be close to zero. The dashed blue lines on the ACF plot show how large a spike has to be before we can consider it significantly different from zero; they are based on the sampling distribution of the autocorrelations, assuming the data are white noise. In this example, the first 15 spikes are all within the blue lines, and even the largest spike, at lag 10, is well within the range expected for a white noise series.
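The source shows only a plot, not code, so the following is a minimal Python sketch of the idea rather than the method behind that plot. It simulates a white noise series, computes its sample autocorrelations, and compares them with approximate 95% limits. The limits of ±1.96/√T (T being the number of observations), the series length of 200, and the seed are assumptions made for this illustration, and `sample_acf` is a hypothetical helper name.

```python
import math
import random


def sample_acf(x, max_lag):
    """Return the sample autocorrelations r_1..r_max_lag of the sequence x."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev)  # lag-0 sum of squares (the normaliser)
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
        for k in range(1, max_lag + 1)
    ]


random.seed(42)  # arbitrary seed, used only so the run is repeatable
n_obs = 200      # assumed series length
white_noise = [random.gauss(0.0, 1.0) for _ in range(n_obs)]

acf = sample_acf(white_noise, max_lag=15)

# Approximate 95% limits for a white noise series (assumed form: 1.96 / sqrt(T)).
bound = 1.96 / math.sqrt(n_obs)

outside = [lag for lag, r in enumerate(acf, start=1) if abs(r) > bound]
print(f"limits: +/-{bound:.3f}")
print("lags outside the limits:", outside)
```

Because the limits are 95% limits, roughly one spike in twenty can fall outside them by chance even when the series really is white noise, so an occasional marginal spike is not strong evidence on its own.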
Any spike within the blue lines should be ignored, as it is not statistically significant. Spikes outside the blue lines might indicate something interesting in the data, or at least suggest that there is some information that could be used in building a forecasting model. For example, consider a time series showing the number of pigs slaughtered each month in my home state of Victoria.
At first glance, this time series looks relatively random, possibly with a slight upward trend. However, it is difficult to see the difference between this and a white noise series based on a simple time plot. But when we look at the ACF plot of the same data, we can see that there is actually some information in the data. The first three spikes are significantly larger than zero, which means we can be confident that this is not a white noise series.
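The pigs data are not reproduced in this text, so the sketch below uses a simulated stand-in instead: a hypothetical autocorrelated series in which each value depends partly on the previous one. The coefficient 0.7, the length of 200, and the ±1.96/√T limits are all assumptions chosen for illustration; the point is only to show how spikes outside the limits reveal structure that a time plot can hide.

```python
import math
import random


def sample_acf(x, max_lag):
    """Return the sample autocorrelations r_1..r_max_lag of the sequence x."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
        for k in range(1, max_lag + 1)
    ]


random.seed(7)  # arbitrary seed, used only so the run is repeatable
n_obs = 200     # assumed series length
phi = 0.7       # hypothetical dependence on the previous value (not from the pigs data)

# Simulated stand-in series: each value is phi times the last one plus fresh noise.
series = []
previous = 0.0
for _ in range(n_obs):
    previous = phi * previous + random.gauss(0.0, 1.0)
    series.append(previous)

acf = sample_acf(series, max_lag=15)
bound = 1.96 / math.sqrt(n_obs)  # approximate 95% limits under white noise

significant = [lag for lag, r in enumerate(acf, start=1) if abs(r) > bound]
print("lags with spikes outside the limits:", significant)
```

For a series like this one, the earliest lags stand well clear of the limits, which is the same kind of signal the first three spikes give for the pigs data.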
Because the pigs series is not white noise, there is some information in the data that can be used in building a forecasting model. Looking at an ACF is useful because it helps us understand the underlying patterns and relationships in the data. However, it is sometimes easier to test all the autocorrelations together rather than consider each one separately. To do this, we can use a Ljung-Box test, which considers the first h autocorrelation values to see if they, as a group, look like what you would expect from a white noise series.
For example, we can apply the Ljung-Box test to the first 24 autocorrelations seen in the ACF plot for the pigs data set. Here the p-value is very small, which again suggests that this is not a white noise series. We can therefore be confident that the data contain patterns that a forecasting model could use.
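The text does not give the formula or any code for the test, so the following is a hedged sketch of the standard Ljung-Box statistic, Q = T(T+2) Σ r_k² / (T−k) summed over lags k = 1..h, compared with a chi-squared distribution with h degrees of freedom. The helper names, the simulated series, and the seed are assumptions for illustration. To stay within the standard library, the p-value uses a closed form that is exact only when h is even, which covers h = 24.

```python
import math
import random


def sample_acf(x, max_lag):
    """Return the sample autocorrelations r_1..r_max_lag of the sequence x."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
        for k in range(1, max_lag + 1)
    ]


def chi2_sf_even_df(x, df):
    """Upper-tail chi-squared probability; this closed form is exact for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive even df")
    half = x / 2.0
    term = 1.0
    total = 1.0
    for k in range(1, df // 2):
        term *= half / k  # builds (x/2)^k / k! incrementally
        total += term
    return math.exp(-half) * total


def ljung_box(x, h):
    """Return (Q, p-value) for the first h autocorrelations of x (h must be even here)."""
    n = len(x)
    acf = sample_acf(x, h)
    q = n * (n + 2) * sum(r * r / (n - k) for k, r in enumerate(acf, start=1))
    return q, chi2_sf_even_df(q, h)


random.seed(1)  # arbitrary seed, used only so the run is repeatable
n_obs = 200     # assumed series length

white_noise = [random.gauss(0.0, 1.0) for _ in range(n_obs)]

# Hypothetical autocorrelated series, used here in place of the pigs data.
ar1 = []
previous = 0.0
for _ in range(n_obs):
    previous = 0.7 * previous + random.gauss(0.0, 1.0)
    ar1.append(previous)

q_wn, p_wn = ljung_box(white_noise, h=24)
q_ar, p_ar = ljung_box(ar1, h=24)
print(f"white noise:    Q = {q_wn:.1f}, p-value = {p_wn:.3g}")
print(f"autocorrelated: Q = {q_ar:.1f}, p-value = {p_ar:.3g}")
```

Using h degrees of freedom is appropriate when the test is applied to the raw series, as here. In practice a statistics library would normally be used for this calculation instead of a hand-written tail probability.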
In summary, white noise is a purely random time series with no trend, seasonality, cyclicity, or autocorrelation. It is often described as "iid" data, meaning the observations are independent and identically distributed. The ACF of white noise shows only small, insignificant spikes that stay within the blue significance lines.
A Ljung-Box test is a convenient way to check whether a time series is white noise. If the p-value is small, the series is unlikely to be white noise, and we can then look at the ACF to see which spikes are the most significant. If the p-value is large, the test gives no evidence against white noise, so there is little sign of autocorrelation that a forecasting model could use.
Now it's your turn to try these ideas on another time series!
Transcript (captions, English):

Let me show you the most boring time series you will ever see. It's just random, independent and identically distributed observations. In shorthand, statisticians often call this iid data. In other words, there's nothing going on: no trend, no seasonality, no cyclicity, not even any autocorrelations. Just randomness. In time series, it is called white noise. The name comes from physics, where white light has some similar mathematical characteristics. Although it appears boring, it is a very important type of time series because it is the basis of almost all forecasting models.

The autocorrelation function of white noise consists of many insignificant spikes. Because the data is simply random, we expect correlations between observations to be close to zero. The dashed blue lines are there to show us how large a spike has to be before we can consider it significantly different from zero. In this example, the first 15 spikes are all within the blue lines, as you would expect. Even the largest spike, at lag 10, is well within the range you would expect for a white noise series. The blue lines are based on the sampling distribution for autocorrelation, assuming the data are white noise. Any spike within the blue lines should be ignored. Spikes outside the blue lines might indicate something interesting in the data. At least, they suggest there may be some information that you could use in building a forecasting model.

Here is a time series showing the number of pigs slaughtered each month in my home state of Victoria. At first glance it looks relatively random; possibly there's a slight upward trend. But it's hard to see the difference between this and a white noise series on the basis of a time plot. When you look at the ACF plot of the same data, you can see there is actually some information in the data. The first three spikes are significantly larger than zero, so you can be confident that this is not a white noise series. In other words, there is some information in the
data that can be used in building a forecasting model.

Looking at an ACF is useful, but sometimes it is easier to test all the autocorrelations together rather than consider each one separately. To do this, you can use a Ljung-Box test. It considers the first h autocorrelation values to see if they, as a group, look like what you would expect from a white noise series. You can apply it to the first 24 autocorrelations you saw in the ACF plot for the pigs data set. Here the p-value is very small, again suggesting that this is not a white noise series.

To summarize: white noise is a purely random time series. Often you will use a Ljung-Box test to see if you have a white noise series. If you don't have white noise, you can then look at the ACF to see which spikes are the most significant. Now it's your turn to try these ideas on another time series.