Creating a Classification Dashboard with Python and Heroku
Let's start by creating our own classification dashboard using Python and Heroku. To do this, we first go back to the "show code" page and copy the code from the example dashboard. We then create a new file called "classification_app.py" in the Atom editor and paste the copied code into it. After saving the file, we run the app by typing "python classification_app.py" in the terminal.
When we run the app, the terminal shows the model being trained and the values behind the various plots being calculated. These calculations run in the background before the dashboard starts, so they are not displayed on the dashboard itself. Once they have finished, the dashboard is available at the URL printed in the terminal.
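The exact code copied from the example page is not reproduced in this tutorial, so the following is only a minimal sketch of what classification_app.py might contain. It assumes the dashboard comes from the explainerdashboard package with its bundled Titanic data and a scikit-learn random forest; the get_port helper and the model settings are illustrative choices, not taken from the original. Heroku passes the port to bind to in the PORT environment variable, which is why the helper reads it.

```python
import os


def get_port(env=None, default=8050):
    """Return the port to serve on (hypothetical helper).

    Heroku sets the PORT environment variable; locally we fall back to a default.
    """
    env = os.environ if env is None else env
    return int(env.get("PORT", default))


def main():
    # Imported here so get_port() can be used without these packages installed.
    # Assumption: the example dashboard is built with explainerdashboard.
    from sklearn.ensemble import RandomForestClassifier
    from explainerdashboard import ClassifierExplainer, ExplainerDashboard
    from explainerdashboard.datasets import titanic_survive

    # Bundled Titanic data, already split into train and test sets.
    X_train, y_train, X_test, y_test = titanic_survive()

    # Illustrative model settings; the tutorial does not state the ones it uses.
    model = RandomForestClassifier(n_estimators=50, max_depth=5)
    model.fit(X_train, y_train)

    # The explainer wraps the model and test data; the dashboard serves the tabs.
    explainer = ClassifierExplainer(model, X_test, y_test)
    ExplainerDashboard(explainer).run(port=get_port())


# Call main() at the end of classification_app.py to start the server.
# It is left uncalled in this sketch so the helper can be imported on its own.
```

With a final main() call added, "python classification_app.py" trains the model, computes the plot values, and then starts serving the dashboard.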
Our dashboard includes a variety of interactive features, such as feature importance, classification stats, individual predictions, and what-if analysis. The feature importance tab shows the relative importance of each feature to the model's predictions. The classification stats tab is highly interactive: we can adjust the classification threshold and see how the metrics and plots change.
The individual predictions tab lets us select a passenger, then scroll down to view how much each feature contributes to that passenger's prediction. This gives us a detailed view of how each feature pushes the prediction up or down. The feature dependence plot shows the relationship between feature values and SHAP values, with the option to remove outliers.
Another interactive feature is the feature interactions tab, which displays the interaction between feature sets and passenger class. We can also highlight specific passengers to see where they fall in the interaction plots. Finally, the decision trees tab lets us view the individual decision trees inside the random forest, giving us insight into how each tree contributes to the final prediction.
The classification explainer dashboard is a powerful tool that provides detailed insights into our machine learning models. By creating and customizing our own dashboards using Python and Heroku, we can gain a deeper understanding of our data science projects and improve our predictive modeling skills.
Classification Stats
The classification stats tab is highly interactive: we can adjust the classification threshold and see how the metrics and plots change. We can also view the confusion matrix, precision plot, classification plot, PR AUC plot, ROC AUC plot, lift curve, and cumulative precision plot, each with a description. The plots are all interactive, allowing us to zoom in and see detailed information.
Here is a description of each plot:
* Confusion Matrix: This plot shows the true positives, false positives, true negatives, and false negatives at the chosen threshold.
* Precision Plot: This plot shows the precision at different thresholds.
* Classification Plot: This plot shows how many observations of each class fall above and below the chosen threshold.
* PR AUC Plot: This plot shows the precision-recall curve across thresholds, together with the area under it.
* ROC AUC Plot: This plot shows the receiver operating characteristic (ROC) curve, the true positive rate against the false positive rate across thresholds, together with the area under it.
* Lift Curve: This plot shows how much better the model is at finding positives than random selection as more of the highest-scored observations are included.
* Cumulative Precision Plot: This plot shows the precision among the highest-scored observations as more of them are included.
All of these plots provide valuable insights into our machine learning model's performance and can help us improve our predictive modeling skills. By using these interactive features, we can gain a deeper understanding of our data science projects and make more informed decisions.
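To make the threshold-dependent quantities behind these plots concrete, here is a small pure-Python sketch. The labels and scores are made up for illustration, and the function names are hypothetical; the dashboard computes its own values, and this only shows the standard definitions.

```python
def confusion_counts(y_true, y_prob, threshold=0.5):
    """Count TP, FP, TN, FN when scores >= threshold are labeled positive."""
    tp = fp = tn = fn = 0
    for label, prob in zip(y_true, y_prob):
        predicted_positive = prob >= threshold
        if predicted_positive and label == 1:
            tp += 1
        elif predicted_positive:
            fp += 1
        elif label == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def precision_recall(y_true, y_prob, threshold=0.5):
    """Precision and recall at a threshold (0.0 when undefined)."""
    tp, fp, tn, fn = confusion_counts(y_true, y_prob, threshold)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def lift_at(y_true, y_prob, fraction):
    """Positive rate in the top `fraction` of scores divided by the overall rate."""
    ranked = sorted(zip(y_prob, y_true), reverse=True)
    k = max(1, round(len(ranked) * fraction))
    top_rate = sum(label for _, label in ranked[:k]) / k  # cumulative precision
    base_rate = sum(y_true) / len(y_true)
    return top_rate / base_rate


# Made-up labels (1 = positive) and predicted probabilities.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_prob = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]

print(confusion_counts(y_true, y_prob, 0.5))    # (TP, FP, TN, FN)
print(precision_recall(y_true, y_prob, 0.75))   # stricter threshold
print(lift_at(y_true, y_prob, 0.5))             # top half of the scores
```

Raising the threshold from 0.5 to 0.75 in this toy data lowers both the number of predicted positives and the recall, which is the kind of trade-off the threshold slider in the tab lets us explore.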
Individual Predictions
The individual predictions tab lets us select a passenger, then scroll down to view how the prediction for that passenger is built up. We can see how much each feature contributes to the prediction for each passenger. This gives us a detailed view of how each feature pushes the prediction up or down for that specific passenger.
Here is an example of what we can see in this tab:
* Passenger ID: This shows the unique identifier for each passenger.
* Predicted Class: This shows the predicted class for each passenger.
* Feature Contributions: This shows the contribution of each feature to the prediction for each passenger.
* Relative Percent Effect: This shows the relative percent effect of each feature on the prediction.
By viewing this information, we can gain a deeper understanding of how our machine learning models are making predictions and identify areas where we can improve their performance.
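The idea of per-feature contributions can be sketched with a much simpler model than the dashboard's. The example below uses a hand-written linear score, where each feature's contribution is its weight times the passenger's deviation from the average; this is a swapped-in illustration, not the method the dashboard uses, and every number and name in it is hypothetical.

```python
def linear_score(intercept, weights, row):
    """Score of a simple linear model for one row of feature values."""
    return intercept + sum(weights[name] * row[name] for name in weights)


def contributions(weights, averages, row):
    """Per-feature contribution: weight * (value - average value)."""
    return {name: weights[name] * (row[name] - averages[name]) for name in weights}


# Hypothetical model and passenger, chosen only for illustration.
intercept = -1.0
weights = {"Age": -0.02, "Fare": 0.01, "Sex_female": 2.0}
averages = {"Age": 30.0, "Fare": 30.0, "Sex_female": 0.5}
passenger = {"Age": 20.0, "Fare": 80.0, "Sex_female": 1.0}

# Score for an "average" passenger, the starting point of the breakdown.
base_value = linear_score(intercept, weights, averages)

contribs = contributions(weights, averages, passenger)

# Largest effects first, as a contributions table would list them.
ranked = sorted(contribs.items(), key=lambda item: abs(item[1]), reverse=True)
for name, value in ranked:
    print(f"{name}: {value:+.2f}")

# The base value plus all contributions reproduces this passenger's score.
print(base_value + sum(contribs.values()), linear_score(intercept, weights, passenger))
```

The useful property to notice is additivity: the starting value plus the contributions adds up to the passenger's score, which is what makes a contributions table readable as an explanation of a single prediction.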
Feature Interactions
The feature interactions tab displays the interaction between feature sets and passenger class. We can also highlight specific passengers to see where they fall in the interaction plots. This gives us insight into how interactions between features affect the model's predictions for different passengers.
Here is an example of what we can see in this tab:
* Feature Sets: This shows the feature sets used in the plot.
* Passenger Class: This shows the passenger class used in the plot.
* Interaction Plot: This shows the interaction between the feature set and passenger class.
Decision Trees
The decision trees tab lets us view the individual decision trees inside the random forest. This gives us insight into how each tree contributes to the final prediction and how the trees are combined to make it.
Here is an example of what we can see in this tab:
* Tree ID: This shows the unique identifier for each tree.
* Decision Rule: This shows the decision rule used by each tree.
* Node Values: This shows the node values used by each tree.
* Leaf Class: This shows the leaf class used by each tree.
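The items listed above can be pictured with a tiny hand-built forest. In this sketch each tree is a nested dictionary of decision rules ending in a leaf value, and the forest's prediction is the average over its trees, which is how a random forest classifier combines them. The trees, thresholds, and passenger values are hypothetical, not taken from the dashboard's model.

```python
def predict_tree(node, row):
    """Follow decision rules from the root until a leaf is reached."""
    while "leaf" not in node:
        go_left = row[node["feature"]] <= node["threshold"]
        node = node["left"] if go_left else node["right"]
    return node["leaf"]


def predict_forest(trees, row):
    """Average the leaf values predicted by every tree."""
    votes = [predict_tree(tree, row) for tree in trees]
    return sum(votes) / len(votes)


# Two hypothetical trees; leaf values are predicted survival probabilities.
tree_a = {
    "feature": "Sex_female", "threshold": 0.5,
    "left": {"leaf": 0.2},
    "right": {"leaf": 0.7},
}
tree_b = {
    "feature": "Age", "threshold": 15,
    "left": {"leaf": 0.6},
    "right": {
        "feature": "Fare", "threshold": 50,
        "left": {"leaf": 0.3},
        "right": {"leaf": 0.5},
    },
}

passenger = {"Sex_female": 1, "Age": 20, "Fare": 80}

for tree_id, tree in enumerate([tree_a, tree_b]):
    print(tree_id, predict_tree(tree, passenger))
print(predict_forest([tree_a, tree_b], passenger))
```

Printing each tree's result next to the forest average mirrors what the tab offers: we can see which trees push the prediction up, which push it down, and how the final value comes from combining them.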