Building a web app in Python for analyzing YouTube channels

**Unpacking the Code: A Deep Dive into the Streamlit Web App**

The code for the web app is remarkably concise, with only 85 lines of code. This is a testament to the power and flexibility of Streamlit, a low-code web framework that enables data scientists to build interactive web applications quickly and easily.

**Importing Libraries and Setting Up the Framework**

At the beginning of the code, the necessary libraries are imported: `streamlit`, `pandas`, `altair`, `numpy`, and `json`. Each plays a crucial role in the app's functionality. Streamlit provides the web framework itself, `pandas` is used for data wrangling, and Altair is employed to create the scatter plot that dominates the app's user interface.
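As a rough sketch, the top of the app might look like the following (the aliases are conventional, not confirmed from the source; `altair` and `streamlit` are wrapped so the sketch also runs where they are not installed):

```python
# Import block as described in the walkthrough (aliases assumed).
import json

import numpy as np
import pandas as pd

try:
    import altair as alt
    import streamlit as st
except ModuleNotFoundError:
    # Required to run the real app; optional for this standalone sketch.
    alt = st = None
```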

**Rounding Up Values with NumPy**

The round-up helper is implemented using `numpy`, a library renowned for its numerical processing capabilities. It rounds values up to clean figures, which keeps the slider bounds tidy for data analysis and visualization.
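A minimal sketch of such a helper, assuming the goal is to pad a slider's upper bound to a round number (the name and signature here are hypothetical, not taken from the app's source):

```python
import numpy as np

def round_up(value, base=10):
    """Round `value` up to the nearest multiple of `base`.
    Hypothetical helper for padding a slider's upper bound."""
    return int(np.ceil(value / base) * base)

print(round_up(87, base=10))                 # 90
print(round_up(8_432_100, base=1_000_000))   # 9000000
```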

**Using Session State to Save User Input**

Session state is utilized to save the user's input from the select box and slider. When the user selects a new value, the app stores it and retrieves it on the next rerun, so the plot updates in real time.
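The pattern looks roughly like this. A plain dict stands in for `st.session_state` so the sketch runs outside a Streamlit session; in the real app these reads and writes would go through `st.session_state` directly, and the keys shown are hypothetical:

```python
# Plain-dict stand-in for st.session_state (runs without Streamlit).
session_state = {}

def remember(state, key, value):
    # Persist the widget's current value so later reruns can reuse it.
    state[key] = value

remember(session_state, "x_axis", "view_count")
remember(session_state, "y_axis", "subscriber_count")
print(session_state["x_axis"])  # view_count
```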

**Analyzing YouTube Channel Data with Pandas**

The app imports JSON data containing YouTube channel information: the matched keyword ("data science"), join date, channel ID, view count, number of videos on the channel, and number of subscribers. The `pandas` library is employed to perform data wrangling, filtering, and column selection.
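Loading and wrangling such records might look like this. The records, column names, and threshold below are hypothetical stand-ins mirroring the fields described above; the real JSON comes from the Bright Data scrape:

```python
import pandas as pd

# Hypothetical records mirroring the scraped fields (not real data).
records = [
    {"channel_name": "Channel A", "keyword": "data science",
     "joined_date": "2008-08-21", "channel_id": "UC_example_a",
     "view_count": 12_000_000, "video_count": 1500, "subscriber_count": 540_000},
    {"channel_name": "Channel B", "keyword": "data science",
     "joined_date": "2019-05-10", "channel_id": "UC_example_b",
     "view_count": 900_000, "video_count": 120, "subscriber_count": 45_000},
]

df = pd.DataFrame(records)  # in the app: pd.DataFrame(json.load(f))

# Typical wrangling: keep the plottable columns, then filter by size.
plot_cols = df[["channel_name", "view_count", "video_count", "subscriber_count"]]
large = plot_cols[plot_cols["subscriber_count"] >= 100_000]
print(large["channel_name"].tolist())  # ['Channel A']
```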

**Creating a Scatter Plot with Altair**

In lines 74 to 80, the app creates a scatter plot using Altair, a declarative visualization library for building interactive plots. The chart defines the point color (which you can easily swap for one you prefer), and line 82 displays the scatter plot on screen.

**Expanding Data Frames for Scatter Plots**

The final piece of code places the scatter plot's underlying data frame inside an expandable section, so users can inspect the exact rows behind each point on the chart.
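That closing step is roughly the following two Streamlit calls; the expander label and dummy data here are assumptions, and the Streamlit calls are guarded so the sketch also runs where the library is not installed:

```python
import pandas as pd

# Dummy stand-in for the chart's source DataFrame.
df = pd.DataFrame({
    "channel_name": ["Channel A", "Channel B"],
    "subscriber_count": [540_000, 45_000],
})

try:
    import streamlit as st
    # Collapsible section exposing the scatter plot's source data.
    with st.expander("See underlying DataFrame"):
        st.dataframe(df)
except ModuleNotFoundError:
    pass  # streamlit not installed; nothing to render in this sketch
```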

**A Flexible Template for Customization**

This web app template is remarkably flexible and can be extended or modified to suit various data science projects. With a basic understanding of Streamlit, pandas, Altair, NumPy, and the `json` module, users can build their own custom data-driven applications on top of this template.

**Beyond YouTube Data: Expanding the App's Capabilities**

The possibilities for this web app are vast, and it can be applied to various domains beyond YouTube data. Users can experiment with different data sources like Instagram, Glassdoor, LinkedIn, or even their own personal datasets to create unique applications that suit their needs.

**Conclusion: Building Upon This Template**

This Streamlit web app template is a shining example of how data science can be applied to real-world problems through interactive and engaging visualizations. We invite readers to experiment with this template, modify it to fit their specific use cases, and explore new projects that drive meaningful insights from data.

"WEBVTTKind: captionsLanguage: enso in several of the videos that I've made on the channel about how to get started in coding or in data science I usually recommend to choose a data problem that interests and resonate with you and probably one of the best way is to collect your own data and you could do that either manually or automatically using web scraping and for this I choose bright data for its web scraping IDE platform and also for its large Network infrastructure and so at first glance you might see that bright data might have only tools revolving around data collection but under the hood it also provides this large proxy Network that automatically performs and handle different IPS that are used during web scraping because often times when you're performing web scraping when you're doing it locally what happens is that your IP might get blocked and whether it's blocked you won't have access to the data because you can access the website and since for example I'm living here in Thailand and some websites could not be accessed because they blocked IPS outside of for example the United States so a platform like bright data which handles all of the proxy Network for you as well as providing a IDE framework for you to perform web scraping is probably a good choice to start with so I'm going to use that in this particular tutorial and so let's Dive In before proceeding further let's take a look at the conceptual overview of the data project so firstly what we're going to do is we're going to go to YouTube and then we're going to search for data science and then from the search results we're going to extract all of the YouTube channels and then for each of the YouTube channel we're going to go to the about page and then we're going to perform web scraping on that so you're going to notice that we're going to take the name of the YouTube channel the subscriber count the video count the view count and also the about section as well as the join date and also the 
channel ID.

Now let's take a closer look at the tech stack. At a high level, there are two main sections: in the first section we perform web scraping with Bright Data, and in the second we take the scraped data and perform data analysis, as well as coding the Streamlit web app. Let's go back to the first part. Before we can perform the actual web scraping, we code the JavaScript scripts, which essentially comprise the interaction component and the parser component. Once we have the JS scripts ready, we paste them into the Web Scraper IDE on the Bright Data platform, which allows us to perform the actual web scraping, where we extract information from the YouTube website and then output the scraped data in JSON format. In the second phase, we take the JSON data as input and code a Streamlit web app to perform the data processing, which is then used to create the data visualization using the Altair library. We also add input widgets to the web app to allow the user to customize the data visualization to their own preference. When we're happy with the app, we push it to the GitHub repo and then deploy it; in this project we're using the Streamlit Community Cloud, which allows us to host the Streamlit web app for free. The URLs to the GitHub repo and to the Streamlit web app are provided in the video description.

Before I show you how to actually build the project, let me show you first how it will look at the end. This is the project folder: we have the JSON data scraped from the Bright Data platform, and then we have streamlit_app.py, which contains the Streamlit app for visualizing the data inside the JSON file. You can see that we have about 85 lines of code that you could use to build this
particular web application. So let me run this first: let me activate the environment and make sure that I'm in the right directory. Okay, I'm in this particular YouTube data directory, and then I'm going to run the app with `streamlit run streamlit_app.py`, and in a few moments a web browser will load up.

All right, so this is the web app that we're going to build today. It is a YouTube channel data analytics app; to be more exact, it analyzes topics of your choice, and in this particular tutorial we're focusing on YouTube channels matching the keyword "data science". This simple app allows you to choose the variable that goes on the x-axis as we visualize the data, and likewise the variable that goes on the y-axis. Depending on what you have chosen, it updates that variable's minimum and maximum values on a range slider, so you can adjust the values and see the data update. You'll notice at the bottom that there's a corresponding data frame, the data underlying the visualization that you see, so each of the points is one row entry in that data frame. You can see that this is a very simple application, and I created it as a template that you could use for building on top of, so let me know in the comment section how you extended this particular application for your own project.

Now that we have taken a quick look at what the end outcome will look like, let's start from the beginning. The first thing you want to do is sign up for Bright Data, and I'll provide the link in the video description. All right, let's have a look at the Bright Data platform. I'll just head over to my dashboard, since I'm already logged in. You can see the proxy infrastructure here, and then you can see the datasets that I mentioned, and the web scraper. These are the data products, or data that has
already been web scraped. So if you have no coding background, no problem, because you can get access to pre-scraped data, and there's a lot of data for you to choose from, so you could check these out as well. You can see here that I've already played around with the Amazon book reviews. Now let's dive into the scraper. When I embarked upon this data project to perform web scraping, the first thought that came to mind was to scrape data from the YouTube platform, and when Bright Data reached out, I thought it would be a great opportunity to create a video showing the thought process and how I went about building this data project. So let's have a look here briefly. You can see that these are the scrapers that I've worked on; if you click on one, it expands and provides the details of the tasks that you have already performed. Let's take a look inside the scraper. You can see that there are two major code blocks: the interaction code and the parser code. The interaction code is where the scraper performs the actual task of hopping over to the website that you want to scrape and then performing the scraping, and after it scrapes the data, it has to parse the data. The instructions for these two tasks are provided in the two respective windows. You'll notice here that for this project I performed the web scraping in three successive steps, in stages one, two, and three, and at the bottom I specify the input for searching YouTube to be "data science". So all of the search results with YouTube channels matching the keyword "data science" will be retrieved and then scraped, and the preview window here allows us to see the web scraping while it happens for the initial run. After you're finished, you could click finish editing, or
you could also save it to development if you have already modified the code. Since I have already done that, I'll just skip over to the next phase, which is the actual app building. Let's have a look at the data that we have already scraped. You'll notice that this is a JSON file. In the About section of the YouTube channel from IBM there's the keyword "data science", because the query is for channels with data science keywords, and then we have their join date, the channel ID, the view count, the number of videos on the channel, and the number of subscribers on the channel. And this one, for example, is from the Data Professor YouTube channel: join date, about, channel ID. All of this data will be analyzed using the Streamlit app that we have just created.

All right, so let's have a look under the hood at the code. You can see that the entire code for the web app takes less than 100 lines, in fact only 85 lines of code. Let's have a look at the breakdown of the code, but before doing so, let's run the app. In the command line you type `streamlit run` and then the name of the app, which in our case is streamlit_app.py, hit enter, and the app will appear in the browser shown here. You can see that in the first couple of lines we import the necessary libraries: we're using Streamlit as the low-code web framework for this entire project; we're using pandas to perform data wrangling, because we have to do a lot of data preparation to select the particular columns that we'd like or to perform some additional data filtering; we're using Altair to display the scatter plot that you see here in the app; we're using NumPy to perform some numerical processing, particularly in the round-up helper that we've created, as well as the matplotlib library; and then we're using the json library to import the data that we have scraped. So
you'll see that in lines 9 to 11 we've created a custom function to perform the rounding up of the values that are shown here in the slider. In lines 13 to 21 we use session state to save the selected values from the select box drop-down menu here, so whenever we select a value, particularly from the slider, you'll see that it is saved into memory. Line 24 essentially just prints out a simple title for our web app, which is "YouTube Channel Data Analytics". In the lines that follow, we're importing the JSON data and then performing some additional data wrangling. In lines 39 to 63 we have this select box where the user can select a different value, and then they can select a different value from the slider, and the scatter plot will be updated in real time; under the hood, to do that, we have to perform some additional data wrangling so that when we select a different value from the slider it is reflected in the plot. Then in lines 74 to 80 we create this scatter plot using Altair. It's a simple scatter plot where we define the color, and you can select another color that you like, and finally we display the scatter plot on line 82. Then in lines 84 and 85 we create this expandable data frame, which holds the underlying data used for the scatter plot. And that's it: in less than 100 lines of code you can create this data-driven web application for your data project.

So as you can see, this particular data project can be extended and built upon. For example, if you are a product marketer, you could use this web app to drive your influencer marketing campaigns: you could identify potential YouTubers to create content for your products. Or if you are a content creator, you could potentially use the app to find potential collaborators for your
channel, or even to find potential viral topics that you could use to create your own videos. And so the possibilities are endless: you could change the dataset from YouTube data and use other data, like Instagram data, Glassdoor data, or LinkedIn data, and then build your own custom data app. I'd love to hear from you how you're building upon this particular web app template, so let me know in the comment section down below. Thanks for watching until the end of the video, and I hope that this video was helpful to you. Let me know in the comments down below which future project you would like me to work on, and while you're at it, please drop a star emoji so that I know that you're the real one. It would mean the world to me if you could smash the like button, share the video with your friends, post it on social media, and subscribe to the channel. And as always, the best way to learn data science is to scrape your own data and analyze it. Please enjoy the journey!