FLAML - The AutoML from Microsoft (Machine Learning Models in 3 Lines of Code)

**Getting Started with AutoML from FLAML**

In this tutorial, we explore how to quickly build a regression or classification model on Google Colab using FLAML, Microsoft's lightweight AutoML library for Python. Three lines of code train a model, and switching between regression and classification only requires changing the `task` argument. Standard scikit-learn functions handle the data split and the performance metrics.

**Loading the Iris Data Set**

To begin, we load the Iris data set with scikit-learn's `load_iris()` function, as FLAML's classification example does. Iris is a classic multi-class classification problem in which we predict one of three species: setosa, versicolor, or virginica. The example performs no train/test split, so the whole data set serves as training data.

```python
from flaml import AutoML
from sklearn.datasets import load_iris
import matplotlib.pyplot as plt
import seaborn as sns

sns.set_style("whitegrid")

# Load the Iris features and labels (no train/test split in this example)
X_train, y_train = load_iris(return_X_y=True)
```
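The integer labels returned by `load_iris()` map to species names through the data set's `target_names` attribute. A minimal sketch using only scikit-learn (the variable names here are illustrative):

```python
from sklearn.datasets import load_iris

# Load the full Bunch object to access metadata alongside the arrays
iris = load_iris()
X, y = iris.data, iris.target

print(X.shape)             # (n_samples, n_features)
print(iris.target_names)   # species names, indexed by class label

# Translate integer labels (0, 1, 2) into species names
species = iris.target_names[y]
print(species[:3])
```

The same lookup works on a vector of predicted labels, which makes misclassifications easier to read.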

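**Making Predictions**

Next, we train a model with FLAML's `AutoML` class and use it to make predictions. We create an `AutoML` instance, call `fit()` with the training data and `task="classification"`, and then call `predict()`, much as we would with a scikit-learn estimator.

```python
# Train a classifier with FLAML, then predict on the training data
automl = AutoML()
automl.fit(X_train, y_train, task="classification")
y_train_pred = automl.predict(X_train)
```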

**Calculating Model Performance Metrics**

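We calculate the model performance metrics using standard functions from scikit-learn. Here we compute the Matthews correlation coefficient (MCC), which summarizes the quality of a classification in a single value between -1 and +1, where +1 indicates perfect prediction. In the video walkthrough, the MCC on the Iris training data came out at about 0.93.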

```python
# Calculate MCC
from sklearn.metrics import matthews_corrcoef

mcc = matthews_corrcoef(y_train, y_train_pred)
print("MCC: ", mcc)
```
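For a binary problem, MCC can be computed by hand from the confusion-matrix counts as `(TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))`. The sketch below checks that formula against `matthews_corrcoef` on a small made-up label vector (the labels are illustrative, not model output):

```python
import math
from sklearn.metrics import matthews_corrcoef

# Made-up binary labels, purely for illustration
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]

# Count the four confusion-matrix cells
tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))

mcc_manual = (tp * tn - fp * fn) / math.sqrt(
    (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
)
mcc_sklearn = matthews_corrcoef(y_true, y_pred)

print(round(mcc_manual, 4), round(mcc_sklearn, 4))
```

For the three-class Iris problem, scikit-learn applies the multi-class generalization of the same idea, so the call in the tutorial stays unchanged.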

**Creating a Scatter Plot**

We create a scatter plot with `sns.scatterplot()` to visualize the predictions: the first feature on the x-axis, the predicted class on the y-axis, and the true class as the point color, so that misclassified points stand out.

```python
# Create scatter plot
plt.figure(figsize=(10, 6))
sns.scatterplot(x=X_train[:, 0], y=y_train_pred, hue=y_train)
plt.title("Iris Data Set Scatter Plot")
plt.xlabel("Feature 1")
plt.ylabel("Predicted Class")
plt.show()
```
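A confusion matrix is a compact complement to the plot for spotting which species get confused with each other. A minimal sketch with scikit-learn's `confusion_matrix`; the prediction vector below is made up for illustration and would be replaced by the model's output:

```python
import numpy as np
from sklearn.metrics import confusion_matrix

target_names = np.array(["setosa", "versicolor", "virginica"])

# Hypothetical labels containing two versicolor/virginica mix-ups
y_true = np.array([0, 0, 1, 1, 1, 2, 2, 2])
y_pred = np.array([0, 0, 1, 2, 1, 2, 1, 2])

# Rows are true classes, columns are predicted classes
cm = confusion_matrix(y_true, y_pred)
print(cm)

# List each misclassified sample with readable species names
wrong = np.flatnonzero(y_true != y_pred)
for i in wrong:
    print(f"sample {i}: true={target_names[y_true[i]]}, "
          f"predicted={target_names[y_pred[i]]}")
```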

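**Boston Housing Data Set**

Next, we turn to FLAML's regression example, which uses the Boston housing data set. The data comes from scikit-learn's `load_boston()` function. Note that `load_boston()` was removed in scikit-learn 1.2, so this step requires an older scikit-learn release.

```python
# Load Boston Housing Data Set (requires scikit-learn < 1.2)
from sklearn.datasets import load_boston

X_train, y_train = load_boston(return_X_y=True)
```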
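On scikit-learn 1.2 or later, `load_boston()` is no longer available. One workaround, which is a substitution and not part of the original tutorial, is to practice the same regression workflow on another bundled data set such as `load_diabetes()`:

```python
from sklearn.datasets import load_diabetes

# Stand-in regression data set bundled with scikit-learn (not Boston housing)
X_train, y_train = load_diabetes(return_X_y=True)

print(X_train.shape)  # (n_samples, n_features)
print(y_train[:5])    # continuous target values
```

The remaining steps work the same way with this data, although the scores will differ from those reported for Boston housing.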

**Making Predictions on Boston Housing Data Set**

We train a new `AutoML` instance, this time with `task="regression"`, and predict on the same training data.

```python
# Train a regressor with FLAML, then predict on the training data
automl = AutoML()
automl.fit(X_train, y_train, task="regression")
y_train_pred = automl.predict(X_train)
```

**Calculating Model Performance Metrics on Boston Housing Data Set**

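We calculate the model performance metrics using standard functions from scikit-learn. In this case, we calculate the R-squared value, which measures the proportion of variance in the target that the regression model explains. In the video walkthrough, R-squared on the Boston housing training data was 0.975.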

```python
# Calculate R-squared value
from sklearn.metrics import r2_score

r2 = r2_score(y_train, y_train_pred)
print("R-squared Value: ", r2)
```
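R-squared is defined as `1 - SS_res / SS_tot`, where `SS_res` is the sum of squared prediction errors and `SS_tot` is the sum of squared deviations of the true values from their mean. The sketch below verifies this against `r2_score` on a few made-up values and rounds the result to three decimals with `round()`:

```python
import numpy as np
from sklearn.metrics import r2_score

# Made-up values, purely for illustration
y_true = np.array([3.0, -0.5, 2.0, 7.0])
y_pred = np.array([2.5, 0.0, 2.0, 8.0])

ss_res = np.sum((y_true - y_pred) ** 2)          # residual sum of squares
ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total sum of squares
r2_manual = 1 - ss_res / ss_tot

print(round(r2_manual, 3), round(r2_score(y_true, y_pred), 3))
```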

**Creating a Scatter Plot on Boston Housing Data Set**

We create a scatter plot of actual versus predicted values with `sns.scatterplot()`. Points that lie close to the diagonal indicate accurate predictions.

```python
# Create scatter plot of actual vs. predicted values
plt.figure(figsize=(10, 6))
sns.scatterplot(x=y_train, y=y_train_pred)
plt.title("Boston Housing Data Set Scatter Plot")
plt.xlabel("Actual Value")
plt.ylabel("Predicted Value")
plt.show()
```
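The same kind of plot can also be drawn with plain matplotlib, which makes it easy to set a custom HTML color code for the points and to add a reference line. This is a sketch with made-up values; the function name `plot_actual_vs_predicted` and the color codes are illustrative choices:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; omit this line in a notebook
import matplotlib.pyplot as plt
import numpy as np


def plot_actual_vs_predicted(y_true, y_pred,
                             point_color="#7CAE00", line_color="#999999"):
    """Scatter actual vs. predicted values with a y = x reference line."""
    fig, ax = plt.subplots(figsize=(6, 6))
    ax.scatter(y_true, y_pred, color=point_color, alpha=0.6)
    # Points on this diagonal are predicted exactly
    lo = min(np.min(y_true), np.min(y_pred))
    hi = max(np.max(y_true), np.max(y_pred))
    ax.plot([lo, hi], [lo, hi], color=line_color)
    ax.set_xlabel("Actual Value")
    ax.set_ylabel("Predicted Value")
    return fig, ax


# Made-up values, purely for illustration
y_true = np.array([10.0, 20.0, 30.0, 40.0])
y_pred = np.array([12.0, 18.0, 33.0, 39.0])
fig, ax = plot_actual_vs_predicted(y_true, y_pred)
fig.savefig("actual_vs_predicted.png")
```

In a notebook, calling `plt.show()` instead of `fig.savefig(...)` displays the figure inline.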

**Conclusion**

In this tutorial, we demonstrated how to quickly build a regression or classification model on Google Colab using FLAML's AutoML. We loaded the Iris data set, trained a classifier with `AutoML.fit()`, made predictions with `predict()`, and calculated the Matthews correlation coefficient (MCC) to assess classification quality. We then repeated the workflow for regression on the Boston housing data set and calculated the R-squared value, which measures how well the regression model fits the data. In both cases, a scatter plot helped visualize the predictions. Because neither example splits the data, both scores describe performance on the training data.

So today we're going to talk about an AutoML library called FLAML, which is released by Microsoft. In a nutshell, FLAML is a lightweight Python library that will allow you to perform automated machine learning, and so in just a few lines of code you could build some pretty amazing machine learning models. So today we're going to have a quick video taking a look at some of the code that you could use to implement your very own AutoML using FLAML.

This is the GitHub of FLAML from Microsoft, and in order to get started you must first install it by pip install flaml. So why don't we do that on Google Colab. Let me split this in half, so we're going to run it side by side here. Let me document this Jupyter notebook, so: FLAML tutorial. And let's install it: install FLAML, and then we'll do pip install flaml. Oh, but then we're using it in a Jupyter notebook, so we should do this one as well, so let's do this: flaml[notebook]. All right, and here in the quick start it says that if you would like to build your model you just put in the following three lines of code and you could get a quick classification model using FLAML, so let's do that. Here we're installing the notebook version of FLAML, so it just provides support for the notebook. Let's just copy the code here, and let's say "building a classification model", and paste the code here.

But then we need to have some example data sets, so let me get one example data set from GitHub. Let me go here, GitHub, Data Professor, let's go to the data repository, and then let's go over to the Delaney solubility. All right, click on raw, click on this link, and so let's load in the data. We're going to use pandas, so import pandas as pd, and then we're going to create a data frame, pd.read_csv, we'll put in the link to the file, we'll read it in, and let's take a look at the data frame. Then we're going to split it into X and y
and then we're going to perform some data split. To get X we're going to select the first four variables, and then to get y we're just going to select the logS. So in order to get the X variables we're going to just drop that logS here: let's say df.drop, and then we have logS, and then we will specify axis equals 1 to mean columns. All right, now we have the four columns that we need, and then we're going to assign this to the X variable and print it out, and here we have X. Let's create the y: y equals df, and we could just say dot logS, and then we're going to get y. All right, good.

And now we're going to do some data split, so let me search for scikit-learn and then train_test_split, and we're going to copy this part, import the library. So we're gonna do the data split here. I'm building a model from scratch, so we're typing it here live. We're going to run train_test_split, so let's copy this code here, and then we're going to make the test size 20 percent, so it will be 0.2, and we specify the random state to be 42 so that it will be reproducible, and X and y are in the X and y variables here. All right, and then we run the data split, and to ensure that we have indeed performed it successfully, let's take a look at the shape of X and y. And why don't we do the same for X_train as well: let's copy this part and put shape afterwards. It looks all right, so it's 80 percent and 20, 80/20 for the X and for the y's.

All right, and so let's use the X_train and the y_train which we got from here to build a classification model. Oh, but then our data set here is regression, so task should be regression then. All right, and there you go, the model is being built. So it's nothing else than the example three lines of code that they provided: it's essentially importing the AutoML function, then instantiating the model to the automl variable, performing the dot fit in order to perform the model training, and then here we specify
that we're going to use the X_train and the y_train pair as the input data, and then the task here we specify to be regression because our y variable is a quantitative value. We can see here as a preview that it is currently building some XGBoost and CatBoost models here, and also an extra-trees learning model as well. So let's give it some time: sit back, relax, grab a cup of coffee, and wait until it's finished.

Here it says that if you would like to specify the learning algorithms you could put them in the estimator list, or even add your own custom learner. For this one they specify LightGBM, or you could even specify it to perform some hyperparameter training as well. And they provide you some example code in which you could just use the Iris data set, so instead of performing this calculation here you could just copy the entire code block here and then you'll be able to run an example classification model and also a regression model. We're going to try this in just a moment.

So let's hop on back to our model for the solubility data set. Let's see, right here: after the model is built you could apply it for making a prediction, okay, so it works pretty much similar to scikit-learn. And let's print out the best model, let's get this code here. Because we have already instantiated the model into the automl variable, we're just gonna print automl.model, and it appears that CatBoost provided the best performance. So why don't we apply the model on X_test. Let's see, right here: automl predict probability on the X_train... regression, okay, so we could do this. So why don't we apply the model, let's make some predictions: let's say automl.predict on the X_test, and then we'll put it in the predict variable, the X test predict variable, and we'll print it out. All right, so that's our predicted value.

All right, so let's have a look at the scatter plot. Let's do a quick search: matplotlib scatter plot. You can
see, we're just gonna build a simple scatter plot, an actual versus predicted plot, it's plt.scatter, and we're going to use the y_test and the y_test_pred, yeah, okay, it should be y here, we have y_test_pred. All right, and there you go, we have the scatter plot. But better yet, let's have a look at the sample code that we have previously used, search for prior code, because we have already stylized some of the scatter plot and regression. Okay, so pretty basic plots, see, solubility, code, solubility. Let's go to the repo, I think it's here in the what app, yeah, let's do that, the Jupyter notebook here. Okay, right here, so we could just, you know, copy the code here and we could just change the data inside. On the x-axis we're going to use y_test, and here we're going to use y_test_pred, so the x-axis is the experimental value and the y-axis is the predicted value. Let's try it. Not defined, okay, so it's predict, right, predict. All right, there you go, same plot, but then we stylized it a bit in matplotlib. This is the predicted value from the CatBoost algorithm, and the model looks pretty good here.

And let's compute the Matthews correlation coefficient. Let me search for scikit-learn, Matthews, so, let's copy here: performance. So we're gonna compute the MCC, and then here this is y_test and we have y_test_pred, predict, say MCC, y_test... oh, okay, I'm sorry, it's not Matthews, it should be R squared, sorry. And then the second one, it's the r2_score, so it's r2, and then r2_score, and then we just hit r2, and then we type it in again to print out the value. If we would like to make it a round number we just say round and then three decimal points, so right here we got 0.885 for the R squared.

All right, so let's try out another example. Let's have a look at the basic classification example here, let's copy that. Let me document the notebook here, so this is the classification example, we'll make it bold, we'll make it heading one, and this is the example
okay, so there's no log file, we'll just take it out. And this is the example for regression, let's copy it. All right, here we go, so CatBoost also provided the best performance here. Okay, so this is for classification, and let me put the example here; we have the regression sample. Regression is similar to the solubility example above, which is also a regression example, so we'll just, you know, comment it out first, and then we could probably create the file later if we would like to log it.

So, as mentioned here, we could calculate the MCC value. You could mix and match and use the predicted value from the AutoML of FLAML, and then you could treat it kind of like how you would when you use scikit-learn, and for the predicted results you could use the scikit-learn packages to compute other model performance metrics such as the R2 score or the Matthews correlation coefficient. So here we're going to compute the MCC. Let's see, so it's y_train, okay, so let's say that we use it to predict: automl.predict_proba. Okay, so let's say that we assign this to y... no, we'll just take the printout, we'll say y_test_pred equals, and X_test as the input. All right, so actually, why don't we just, you know, comment... let me undo it, I'll just comment this part out because I don't want to have to retrain the model again. Then we're going to generate the predicted value right here, y_test_pred, and we're going to apply the model, automl.predict_proba to X_test. But actually we don't need to predict the probability, we just make the prediction. Okay, so we'll just predict, and then let's print it out. Oh, okay, we don't have X_test here, we only have train. All right, okay, so let's just use X_train; in the example they didn't do any data split. So it's y_train. It doesn't look like the Iris data set here, X_train, y_train... okay, I have to rerun it because it's probably coming from the Boston
data set. All right, and then let's run it again. Okay, now we got the probability, and I'll just put the probability that I got here directly into the data label in order to see which of the classes they are, because number one, number zero, number two, they each represent a different color, they each represent a different flower species in the Iris data set. So we have load_iris, right, load_iris, so it's here in the, let's see, in the feature names, right here, target names. Let me find it: target names, okay, I'll call it feature names. And then let's see, if I say feature names and I put in zero it would be setosa. I need to read it for three of them, and if I do zero, one, two I'll get the three different flower species: setosa, versicolor, and virginica. So for the predicted values here, instead of having zero, one, or two, you could also print out whether they are setosa, versicolor, or virginica as well. Okay, so zero here would be setosa, one here would be versicolor, and virginica would be two. You can see here that some of them are mispredicted, right: instead of being two it's predicted to be one, and instead of being either zero or one it's predicted to be two, so there are some misclassifications here.

So let's see what the performance is: y_train and y_train_pred. All right, so the MCC value here is 0.93. We could also print it out, you know, put in the round function, we can make it four this time, and we get four digits here: 0.9316.

And let's have a look at the regression example for the Boston housing data set from FLAML here. You could pretty much reuse the code that I've shown you right here. Okay, so why don't I copy this part here, make prediction, right, predict. Did we do any data split? No, we only have X_train and y_train, so we say train here, so this becomes y_train_pred, and we'll copy the R2 score from above. Oh, wrong code, something wrong with the keyboard, let me do, what, you can
edit, copy, scroll down... oh, it's not working, let me try again. Right here, oh, it is correct, but I mislabeled it: r2, okay, r2, and it is train, train, train. Print it out, and we've got 0.975 for the Boston housing data set.

Now let's make some scatter plot here, so let's reuse the code here. All right, okay, it should be y_train, just for the other data set, so this is y_train and y_train_pred. So this is for the Boston housing data set. You could also pick another color here, and my favorite website to get some HTML colors is this one, HTML Color Codes, so you could select another color of your choice. Okay, let's just use this color then. Oh, just put the color code here and it becomes... oh, that's for the line. Let's put it here, and it becomes pink. And we could even make the line like that color... no, actually FFF is white; zero zero zero, so we have black color, or we can have it more gray. Or you could try, you know, selecting a color hue from here, so you have like a grayish color like this one right here, paste it here. All right, let me get it, you know, a bit lighter. All right, there you go.

Okay, so this is a quick tutorial on using FLAML, and as you can see, in just three lines of code, right here at the top, you just change it to regression or classification depending on which one it is, and then you could use the standard functions from scikit-learn in order to perform the data split and to calculate the model performance metrics, and also matplotlib or seaborn or any other of your favorite graphing and plotting libraries to illustrate the prediction results. So here we used the scatter plot and we also calculated the Matthews correlation coefficient. And as you can see, this is not a pre-recorded video, so everything is done live: I've just, you know, split the screen, looked at the documentation, and implemented it directly in Google Colab. And so you can see here how easy it is to use AutoML from the FLAML library to
quickly implement a regression or a classification model from scratch on Google Colab. If you're finding value in this video, please give it a like, subscribe if you haven't already, and also make sure to hit the notification bell so that you will be notified of the next video. And as always, the best way to learn data science is to do data science, and please enjoy the journey.