Manipulating Time Series Data with Pandas
In this video, we will begin to manipulate time series data and learn how to move our data across time so that we can compare values at different points in time. This involves shifting values into the future or creating lags by moving data into the past. We will also learn how to calculate changes between values at different points in time, and finally, we will see how to calculate the change between values in percentage terms, also called the rate of growth. Pandas has built-in methods for these calculations that leverage the `DatetimeIndex`.
To get started, let's import a recent stock price time series for Google using the `read_csv` function from Pandas. The `read_csv` function can do the date parsing for us, so we don't need to call `to_datetime` afterwards. We tell `read_csv` to parse certain columns as dates by passing one or more column labels as a list to the `parse_dates` parameter. Additionally, we can let `read_csv` set a column as the index by providing the `index_col` parameter.
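The import step can be sketched as follows. This is a minimal illustration, not the original data: the column names `date` and `price` and all values are made up, and an in-memory string stands in for the real CSV file.

```python
import io
import pandas as pd

# Hypothetical sample data standing in for the real Google price file;
# the column names 'date' and 'price' and the values are assumptions.
csv_text = """date,price
2017-01-02,100.0
2017-01-03,102.0
2017-01-04,101.0
2017-01-05,105.0
"""

# parse_dates converts the listed columns to datetimes,
# index_col makes the parsed column the index of the DataFrame.
google = pd.read_csv(io.StringIO(csv_text), parse_dates=['date'], index_col='date')

google.info()
```

With a real file, the first argument would be the file path instead of the `io.StringIO` object.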
As a result, we get a properly formatted time series. Our first time series method is `.shift()`, which allows us to move all data in a Series or DataFrame into the past or future. The shifted version of the stock price has all prices moved by one period into the future; as a result, the first value in the series is now missing. In contrast, the lagged version of the stock price is moved one period into the past; in this case, the last value is now missing.
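A small sketch of shifting and lagging, again with made-up prices and hypothetical column names (`shifted`, `lagged`):

```python
import pandas as pd

# Made-up prices on a daily index, for illustration only.
idx = pd.date_range('2017-01-02', periods=4, freq='D')
google = pd.DataFrame({'price': [100.0, 102.0, 101.0, 105.0]}, index=idx)

# Default periods=1: values move one row forward, first row becomes NaN.
google['shifted'] = google['price'].shift()

# Negative periods: values move one row back, last row becomes NaN.
google['lagged'] = google['price'].shift(periods=-1)

print(google)
```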
To shift data into the past, we use negative period numbers. Shifting data is useful to compare data at different points in time. We can, for instance, calculate the rate of change from period to period, which is called the return in finance. The method `div` allows us to divide a series not only by a single value but also by an entire series, for instance, by another column in the same DataFrame. Pandas makes sure that the dates for both series match up and divides the aligned values accordingly.
We get the relative change from the last period for every price, that is, the factor by which you need to multiply the last price to get the current price. We can chain all DataFrame methods that return a DataFrame; the returned DataFrame is used as input for the next calculation. Here, we are subtracting 1 and multiplying the result by 100 to obtain the relative change in percentage terms.
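The division and the chained calculation might look like this. The prices and the column names `change` and `return` are assumptions made for the sketch.

```python
import pandas as pd

# Made-up prices on a daily index, for illustration only.
idx = pd.date_range('2017-01-02', periods=4, freq='D')
google = pd.DataFrame({'price': [100.0, 102.0, 101.0, 105.0]}, index=idx)

google['shifted'] = google['price'].shift()

# Divide one column by another; values are aligned on the date index.
google['change'] = google['price'].div(google['shifted'])

# Chain methods: subtract 1, then multiply by 100 to get percent.
google['return'] = google['change'].sub(1).mul(100)

print(google)
```

Using `sub` and `mul` instead of the `-` and `*` operators keeps the whole calculation in a single method chain.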
Another time series method is `.diff()`, which calculates the change between values at different points in time. By default, the difference of the close price is the difference in value since the last day stocks were traded. We can also use this information to calculate one-period returns: just divide the absolute change by the shifted price and then multiply by 100 to get the same result as before.
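As a rough sketch with made-up prices, the same one-period return can be derived from the absolute change (the column names `abs_change` and `return` are hypothetical):

```python
import pandas as pd

# Made-up prices on a daily index, for illustration only.
idx = pd.date_range('2017-01-02', periods=4, freq='D')
google = pd.DataFrame({'price': [100.0, 102.0, 101.0, 105.0]}, index=idx)

# Absolute change since the previous row (periods=1 by default).
google['abs_change'] = google['price'].diff()

# Divide by the previous price and scale to percent.
google['return'] = google['abs_change'].div(google['price'].shift()).mul(100)

print(google)
```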
Finally, since it is such a common operation, Pandas has a built-in method for you to calculate the percent change directly. Just select the column, call `.pct_change()`, and multiply by 100 to get the same result as before. All these methods have a `periods` keyword that defaults to the value 1. If you provide a higher value, you can calculate returns for data points several periods apart, for example, for prices three trading days apart.
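A short sketch of the built-in method, including a three-period return. The prices and the column names `return` and `return_3d` are made up for illustration.

```python
import pandas as pd

# Made-up prices on a daily index, for illustration only.
idx = pd.date_range('2017-01-02', periods=4, freq='D')
google = pd.DataFrame({'price': [100.0, 102.0, 101.0, 105.0]}, index=idx)

# One-period percent change (periods=1 by default), scaled to percent.
google['return'] = google['price'].pct_change().mul(100)

# Percent change relative to the value three rows earlier.
google['return_3d'] = google['price'].pct_change(periods=3).mul(100)

print(google)
```

With `periods=3`, the first three rows have no earlier value to compare against, so they are missing.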
Let's practice this new time series method now.