Logistic Regression: A Predictive Modeling Technique
Logistic regression is a widely used predictive modeling technique. In this video, you will learn how logistic regression predicts the target from candidate predictors and how to use logistic regression in Python.
A Simple Example Using Age as a Predictor
Recall from previous coding exercises that elderly people are more likely to donate blood. If we plot the target (donation status) as a function of age for all donors in the population, we see that the 1s occur more to the right, where the older donors are. This indicates a positive relationship between age and donation likelihood. We can fit a regression line of the form a times x plus b through these points. In this context, a is called the coefficient of age, and b is called the intercept. If we instead plot the target as a function of the time since the last donation for each donor, it becomes clear that people who donated recently are more likely to donate again; the coefficient of recency in this case is negative.
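As a minimal sketch of this plot, assuming the donors are stored in a pandas DataFrame called basetable with an age column and a binary target column (1 = donated, 0 = did not donate); the names and values are made up for illustration:

```python
import pandas as pd
import matplotlib.pyplot as plt

# Illustrative base table: column names and values are assumptions
basetable = pd.DataFrame({
    "age":    [25, 32, 41, 48, 55, 63, 70],
    "target": [ 0,  0,  0,  1,  0,  1,  1],
})

# Plot donation status as a function of age: the 1s tend to sit to the right
plt.scatter(basetable["age"], basetable["target"])
plt.xlabel("age")
plt.ylabel("target (donated = 1)")
plt.show()
```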
Using Logistic Regression for Predictive Modeling
The regression line constructed can be used as a predictive model; however, its output is a real number that can take any value. It is more convenient to obtain a probability as output: a number between 0 and 1 that expresses how likely it is that someone will donate. Fortunately, we can transform the regression output into a probability using the logistic function (the inverse of the logit). This function takes the regression formula as input and converts it into a probability.
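As a small illustration of this transformation, the sketch below applies the logistic function to a few arbitrary real numbers and shows that the result always lands between 0 and 1:

```python
import numpy as np

def logistic(z):
    # squeezes any real number into the interval (0, 1)
    return 1 / (1 + np.exp(-z))

# the regression output can be any real number; the logistic output cannot
for z in [-5, -1, 0, 1, 5]:
    print(z, round(logistic(z), 3))
```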
Mathematical Trick for Binary Classification
This little mathematical trick allows us to use linear regression for binary classification problems: by passing the regression output through the logistic function, the model predicts probabilities instead of arbitrary real numbers. You can build a logistic regression model using the linear_model module from scikit-learn. First, create a logistic regression model object with the LogisticRegression() constructor.
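A minimal sketch of this first step, assuming scikit-learn is installed, looks like this:

```python
from sklearn import linear_model

# create the (not yet fitted) logistic regression model object
logreg = linear_model.LogisticRegression()
```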
Feeding Data to the Logistic Regression Model
Next, we need to feed data to the logistic regression model so that it can be fit. The predictor and target are stored in two separate objects, X and y, obtained by indexing the base table, and both are passed to the model's fit method. After the model is fit, we can observe the coefficient corresponding to the predictor age by checking the coef_ attribute of the fitted model.
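The sketch below illustrates these steps on a small made-up base table; the column names age and target are assumptions for illustration:

```python
import pandas as pd
from sklearn import linear_model

# Illustrative base table: names and values are assumptions
basetable = pd.DataFrame({
    "age":    [25, 32, 41, 48, 55, 63, 70],
    "target": [ 0,  0,  0,  1,  0,  1,  1],
})

X = basetable[["age"]]    # predictor (double brackets keep it two-dimensional)
y = basetable["target"]   # binary target

logreg = linear_model.LogisticRegression()
logreg.fit(X, y)

print(logreg.coef_)       # coefficient of the predictor age
```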
Deriving the Formula from the Fitted Model
In this case, the coefficient is positive, namely 0.02, as we expected. If you want to derive the entire formula from the fitted model, you can also retrieve the intercept by checking the intercept_ attribute; here it is about -4. Until now, we have assumed there is only one predictor; however, many candidate predictors are available in the base table.
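Combining the coefficient and intercept gives the full formula. The sketch below plugs in the rounded values mentioned above (0.02 and -4) to compute a donation probability for a hypothetical 50-year-old donor:

```python
import numpy as np

a = 0.02   # coefficient of age, as read from logreg.coef_
b = -4.0   # intercept, as read from logreg.intercept_

age = 50   # hypothetical donor
probability = 1 / (1 + np.exp(-(a * age + b)))   # logistic of a*age + b
print(round(probability, 3))                     # a number between 0 and 1
```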
Extending Univariate Logistic Regression to Multivariate Logistic Regression
Extending univariate logistic regression to multivariate logistic regression is straightforward. Instead of using a times x plus b, we add multiple predictors to the formula. In Python, nothing changes apart from the fact that you now need to select multiple variables in the X object. If you output the coefficients, you will see that a coefficient is calculated for each predictor used.
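A sketch of the multivariate case is shown below; the extra predictor names (time_since_last_donation, gifts_last_year) and the data are invented for illustration:

```python
import pandas as pd
from sklearn import linear_model

# Illustrative base table with several candidate predictors (assumed names)
basetable = pd.DataFrame({
    "age":                      [25, 32, 41, 48, 55, 63, 70, 58],
    "time_since_last_donation": [10, 24,  3, 18,  5,  2, 30,  4],
    "gifts_last_year":          [ 0,  1,  3,  1,  2,  4,  0,  3],
    "target":                   [ 0,  0,  1,  0,  1,  1,  0,  1],
})

# Select multiple variables in the X object
X = basetable[["age", "time_since_last_donation", "gifts_last_year"]]
y = basetable["target"]

logreg = linear_model.LogisticRegression()
logreg.fit(X, y)

print(logreg.coef_)   # one coefficient per predictor
```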
Constructing Your First Logistic Regression Model
You should now be ready to construct your first logistic regression model.
"WEBVTTKind: captionsLanguage: enlogistic regression is a widely used predictive modeling technique in this video you will learn how logistic regression predicts the target from candidate predictors and how to use logistic regression in Python recall from the coding exercises that elderly people are more likely to donate indeed if we plot the targets and function of the age for all donors in the population we see that a1 occurs more to the right where the older donors are if we fit a regression line through these points it is of the form a times X plus B with a a positive number a is called coefficient of H and B is called the intercept if we plot a target as a function of the time since the last donation for each donor it can be seen the people who recently donated are more likely to donate in this case the coefficient of recency is negative the regression line constructed can be used as a predictive model however the output is a real number that can be anything it will be more convenient to obtain a probability as output a number between 0 and 1 that expresses how likely it is that someone will donate luckily we can use the logit function to dead-ends this function takes the regression formula as inputs and calculates the probability from it as shown in this graph you can see that the output is indeed a number between 0 and 1 this little mathematical trick allows to use linear regression for binary classification problems you can built a logistic regression model using the module linear model from SK learn first you create a logistic regression model object using the logistic regression function next you need to feed data to the logistic regression model so that it can be fit the predictor and target are stored in two separate objects x and y using indexing both x and y are fed to the fit function that works on logistic regression model after the model is fit you can observe the coefficient that corresponds with the predictor age by checking the KO f value of the fitted model in this case the coefficient is positive namely 0.02 as we expected if you want to derive the entire formula from the fitted you can also retrieve the intercept by checking the intercept value which is about -4 until now we assume that there is only one predictor however many candidate predictors are available in the base table extending univariate logistic regression to multivariate logistic regression is pretty straightforward instead of using a times X plus B we can add multiple predictors in the formula in Python nothing changes apart from the fact that you now need to select multiple variables in the X object if you output the coefficients you see that for each predictor used a coefficient is calculated you should now be ready to construct your first logistic regression model let'slogistic regression is a widely used predictive modeling technique in this video you will learn how logistic regression predicts the target from candidate predictors and how to use logistic regression in Python recall from the coding exercises that elderly people are more likely to donate indeed if we plot the targets and function of the age for all donors in the population we see that a1 occurs more to the right where the older donors are if we fit a regression line through these points it is of the form a times X plus B with a a positive number a is called coefficient of H and B is called the intercept if we plot a target as a function of the time since the last donation for each donor it can be seen the people who recently donated 
are more likely to donate in this case the coefficient of recency is negative the regression line constructed can be used as a predictive model however the output is a real number that can be anything it will be more convenient to obtain a probability as output a number between 0 and 1 that expresses how likely it is that someone will donate luckily we can use the logit function to dead-ends this function takes the regression formula as inputs and calculates the probability from it as shown in this graph you can see that the output is indeed a number between 0 and 1 this little mathematical trick allows to use linear regression for binary classification problems you can built a logistic regression model using the module linear model from SK learn first you create a logistic regression model object using the logistic regression function next you need to feed data to the logistic regression model so that it can be fit the predictor and target are stored in two separate objects x and y using indexing both x and y are fed to the fit function that works on logistic regression model after the model is fit you can observe the coefficient that corresponds with the predictor age by checking the KO f value of the fitted model in this case the coefficient is positive namely 0.02 as we expected if you want to derive the entire formula from the fitted you can also retrieve the intercept by checking the intercept value which is about -4 until now we assume that there is only one predictor however many candidate predictors are available in the base table extending univariate logistic regression to multivariate logistic regression is pretty straightforward instead of using a times X plus B we can add multiple predictors in the formula in Python nothing changes apart from the fact that you now need to select multiple variables in the X object if you output the coefficients you see that for each predictor used a coefficient is calculated you should now be ready to construct your first logistic regression model let's\n"