Welcome to the Data Professor: An Introduction to Data Science and Building Your First Prediction Model
Hello and welcome back to my channel, the Data Professor! I'm Chanin Nantasenamat, and today we're going to explore one of the most exciting fields out there - data science. As we all know, data is ubiquitous in our daily lives, and with the ever-increasing amount of big data available, it's essential to learn how to analyze, gain insights from, and make informed decisions using this information.
So, what exactly is data? Data pertains to information about entities of interest, such as health parameters of a human being, characteristics of cars, or properties of drugs. These examples illustrate the diverse range of data types that we can collect and work with in our daily lives. In essence, data science is a broad field that encompasses various smaller disciplines like statistics, mathematics, data visualization, programming, data mining, and machine learning.
Data mining is a subset of data science that refers to the specific process of making use of data to build prediction models and extract knowledge from the data. On the other hand, machine learning refers to the learning algorithms used to create these prediction models within the data mining process. So, as you can see, data science is a multidisciplinary field that offers a wealth of opportunities for individuals looking to explore and apply their analytical skills.
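To make the idea of a prediction model a little more concrete before we move on to WEKA, here is a minimal sketch in Python. It is not part of the WEKA workflow in this tutorial. It assumes a made-up toy data set of car characteristics (weight and horsepower) and uses one of the simplest learning algorithms, 1-nearest neighbor, which predicts the label of whichever known example is closest. The names TRAINING_DATA and predict, and all of the numbers, are hypothetical and exist only for illustration.

```python
import math

# Made-up toy data for illustration only: (weight in kg, horsepower) -> label.
TRAINING_DATA = [
    ((1000.0, 70.0), "economy"),
    ((1100.0, 85.0), "economy"),
    ((1500.0, 300.0), "sports"),
    ((1600.0, 350.0), "sports"),
]


def predict(features, training_data=TRAINING_DATA):
    """Return the label of the closest training example (1-nearest neighbor)."""
    nearest = min(training_data, key=lambda row: math.dist(features, row[0]))
    return nearest[1]


if __name__ == "__main__":
    # Ask the model about two cars it has not seen before.
    print(predict((1050.0, 80.0)))
    print(predict((1550.0, 320.0)))
```

Here the data set plays the role of the data being mined, and the nearest-neighbor rule is the learning algorithm. A tool like WEKA offers many such algorithms behind a graphical interface, so none of this has to be written by hand.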
Now that we've had a brief introduction to data science, let's get started with building our very first prediction model! To achieve this, we'll be using the WEKA program, which is an excellent tool for performing data mining. WEKA has an intuitive graphical user interface that allows us to pre-process, transform the data, and construct the prediction model using various machine learning algorithms.
WEKA was developed at the University of Waikato, with Ian Witten and Eibe Frank among the people behind it. It's a versatile program that can be used on multiple platforms, including Windows, Mac, and Linux. Before we begin, it's essential to select the correct version of WEKA for our operating system. We have several versions available, including stable and developer versions.
The first file we need to download is the WEKA program itself, which comes in a 64-bit version. We also have options that include or exclude the Java Runtime Environment (JRE), which WEKA needs in order to run, so the right choice depends on whether Java is already installed on our computer. If you're starting out with data science, it's recommended to use the stable version of WEKA.
Once we've selected the correct file, we need to check whether our computer is running a 64-bit or 32-bit operating system and whether Java is already installed. This information will help us determine whether we need the download that includes the Java Runtime Environment (JRE) along with WEKA. To do this, we can open the Properties window to see the system type and look for the Java version. If your computer doesn't have Java installed, you can find the download by searching for it on Google.
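If you prefer to check this from a script rather than clicking through windows, the following is a minimal sketch in Python, assuming Python is already installed on your computer. The function name describe_system is hypothetical. Note that python_bits reports whether the Python interpreter itself is 32-bit or 64-bit, which usually but not always matches the operating system, and java_path only tells us whether a java command is on the PATH, so treat the result as a hint rather than a definitive answer.

```python
import platform
import shutil
import struct


def describe_system():
    """Collect basic facts that help when picking a WEKA installer."""
    return {
        # Operating system name, e.g. "Windows", "Darwin" (Mac), or "Linux".
        "os": platform.system(),
        # Hardware architecture string as reported by the system.
        "machine": platform.machine(),
        # Pointer size of this Python interpreter in bits (32 or 64).
        "python_bits": struct.calcsize("P") * 8,
        # Full path to the java command, or None if it is not on the PATH.
        "java_path": shutil.which("java"),
    }


if __name__ == "__main__":
    info = describe_system()
    for key, value in info.items():
        print(f"{key}: {value}")
    if info["java_path"] is None:
        print("No java command found; the WEKA download that includes the JRE may be the easier choice.")
```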
After selecting our desired version of WEKA, it's time to start downloading the program. This process may take a few minutes, depending on your internet speed. Once the download is complete, we'll need to install the software. During this process, we'll be asked if we want to allow the program to make changes to our device.
To continue with the installation, we simply need to follow the prompts and click through the next steps. This will involve installing the Java Runtime Environment (JRE) as part of the WEKA software. Once this is complete, we can start using WEKA and begin building our first prediction model.
And that's it for today's tutorial! I hope you enjoyed learning about data science and how to build your very first prediction model using WEKA. If you haven't subscribed to my channel yet, please consider doing so, as well as clicking on the notification bell to stay informed about upcoming videos. Until next time, I'll see you in the next video!