# Creating an AI-Based Mouse Controller Using Python and OpenCV
## Introduction
In this comprehensive guide, we will walk through the process of creating an AI-based mouse controller using Python and OpenCV. This innovative project leverages hand tracking technology to control your computer’s mouse cursor simply by moving your hand in front of a webcam. Additionally, you can perform clicks by bringing your index and middle fingers together.
This article is designed for individuals with a basic understanding of Python programming and computer vision concepts. By the end of this guide, you will have a functional AI-based mouse controller that you can customize to suit your needs.
---
## Prerequisites
Before diving into the project, ensure you have the following:
1. **Python installed** on your system.
2. **OpenCV (cv2)** installed for image processing.
3. **MediaPipe** installed for hand detection and tracking.
4. **AutoPy** installed for mouse control.
You can install these packages using pip:
```bash
pip install opencv-python mediapipe autopy
```
---
## Setting Up the Project
### Step 1: Create a New Python File
Open your preferred IDE (e.g., PyCharm, VS Code) and create a new Python file. We’ll name it `ai_mouse_controller.py`.
### Step 2: Import Necessary Libraries
Add the following imports at the top of your file:
```python
import cv2
import numpy as np
import mediapipe as mp
import time
import autopy
```
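If you want to confirm the installation before writing any logic, a quick check like the following will tell you whether OpenCV can open the camera and whether AutoPy can see the screen. This is a minimal sketch that assumes your webcam is at index 0:

```python
# Quick sanity check for the installed packages.
# Assumes the default webcam is at index 0; change it if you have several cameras.
cap = cv2.VideoCapture(0)
print("Webcam opened:", cap.isOpened())
print("Screen size:", autopy.screen.size())
cap.release()
```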
---
## Creating the Hand Tracking Module
### Step 3: Define a Class for Hand Detection and Tracking
Create a class `HandTracker` with methods to detect hands, find landmarks, check which fingers are up, and calculate distances between points.
```python
class HandTracker:
    def __init__(self, max_hands=1, detection_confidence=0.7):
        # MediaPipe's hand model handles both detection and landmark tracking
        self.hands = mp.solutions.hands.Hands(
            max_num_hands=max_hands,
            min_detection_confidence=detection_confidence)
        self.draw_utils = mp.solutions.drawing_utils
        self.results = None
        self.lm_list = []

    def find_hands(self, img, draw=True):
        # MediaPipe expects RGB input, while OpenCV captures frames in BGR
        imgRGB = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
        self.results = self.hands.process(imgRGB)
        if draw and self.results.multi_hand_landmarks:
            for hand_lms in self.results.multi_hand_landmarks:
                self.draw_utils.draw_landmarks(
                    img, hand_lms, mp.solutions.hands.HAND_CONNECTIONS)
        return img

    def get_landmarks(self, img):
        self.lm_list = []
        if self.results and self.results.multi_hand_landmarks:
            hand = self.results.multi_hand_landmarks[0]
            h, w = img.shape[:2]
            # Landmarks are normalized to [0, 1]; scale them to pixel coordinates
            self.lm_list = [(lm.x * w, lm.y * h) for lm in hand.landmark]
        return self.lm_list

    def fingers_up(self):
        up_fingers = []
        if self.lm_list:
            # Image y grows downward, so an extended finger's tip sits above its base knuckle
            # Index finger: tip is landmark 8, base knuckle (MCP) is landmark 5
            up_fingers.append(1 if self.lm_list[8][1] < self.lm_list[5][1] else 0)
            # Middle finger: tip is landmark 12, base knuckle (MCP) is landmark 9
            up_fingers.append(1 if self.lm_list[12][1] < self.lm_list[9][1] else 0)
        return up_fingers

    def find_distance(self, p1, p2):
        # Euclidean distance between two landmarks, in pixels
        x1, y1 = self.lm_list[p1]
        x2, y2 = self.lm_list[p2]
        return np.hypot(x2 - x1, y2 - y1)
```
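Before wiring the tracker to the mouse, it helps to verify the module on its own. The loop below is a minimal sketch that draws the detected landmarks and prints the `[index, middle]` finger states; it assumes the class above is already defined in the same file:

```python
# Minimal standalone test of HandTracker: show the webcam feed with
# landmarks drawn and print which of the two tracked fingers are up.
tracker = HandTracker()
cap = cv2.VideoCapture(0)
while True:
    success, img = cap.read()
    if not success:
        break
    img = tracker.find_hands(img)
    if tracker.get_landmarks(img):
        print("Fingers up [index, middle]:", tracker.fingers_up())
    cv2.imshow("HandTracker test", img)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
cap.release()
cv2.destroyAllWindows()
```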
---
## Implementing the Mouse Controller
### Step 4: Initialize Variables and the Webcam
In your main function, initialize variables for mouse control and set up the webcam:
```python
def main():
    tracker = HandTracker()
    cap = cv2.VideoCapture(0)
    cam_w, cam_h = 640, 480
    cap.set(cv2.CAP_PROP_FRAME_WIDTH, cam_w)
    cap.set(cv2.CAP_PROP_FRAME_HEIGHT, cam_h)
    screen_w, screen_h = autopy.screen.size()
    frame_margin = 100           # inset of the active region, in camera pixels
    p_x, p_y = 0, 0              # cursor position from the previous frame
    current_x, current_y = 0, 0
    smoothing_value = 5
    prev_time = 0
    while True:
        success, img = cap.read()
        if not success:
            break
        img = tracker.find_hands(img)
        lm_list = tracker.get_landmarks(img)
        if lm_list:
            x1, y1 = lm_list[8]  # index fingertip
            # Map the active camera region onto the full screen
            target_x = np.interp(x1, (frame_margin, cam_w - frame_margin), (0, screen_w))
            target_y = np.interp(y1, (frame_margin, cam_h - frame_margin), (0, screen_h))
            # Smoothing the values to reduce jitter
            current_x = p_x + (target_x - p_x) / smoothing_value
            current_y = p_y + (target_y - p_y) / smoothing_value
            # Moving the mouse; flip x so the cursor mirrors your hand, and
            # clamp so autopy never receives an off-screen point
            move_x = max(0, min(screen_w - current_x, screen_w - 1))
            move_y = max(0, min(current_y, screen_h - 1))
            autopy.mouse.move(move_x, move_y)
            # Clicking mechanism: index and middle fingers up and close together
            up_fingers = tracker.fingers_up()
            if up_fingers == [1, 1]:
                if tracker.find_distance(8, 12) < 40:
                    autopy.mouse.click()
            # Drawing a circle on the index fingertip for better visualization
            cv2.circle(img, (int(x1), int(y1)), 15, (100, 100, 255), cv2.FILLED)
            p_x, p_y = current_x, current_y
        # Displaying the actual processing FPS and the active region
        cur_time = time.time()
        fps = 1 / (cur_time - prev_time) if prev_time else 0
        prev_time = cur_time
        cv2.putText(img, str(int(fps)), (10, 50),
                    cv2.FONT_HERSHEY_SIMPLEX, 1, (255, 0, 0), 2)
        cv2.rectangle(img, (frame_margin, frame_margin),
                      (cam_w - frame_margin, cam_h - frame_margin), (0, 255, 0), 2)
        cv2.imshow("AI Mouse Controller", img)
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break
    cap.release()
    cv2.destroyAllWindows()

if __name__ == "__main__":
    main()
```
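Save the file and run it from a terminal; press `q` in the preview window to quit:

```bash
python ai_mouse_controller.py
```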
---
## Explanation of the Code
### Hand Tracking Module (`HandTracker` Class)
- **Initialization**: The class initializes MediaPipe’s hand detection model.
- **Finding Hands**: Converts each frame to RGB (the format MediaPipe expects), runs the hand model on it, and stores the results for the other methods.
- **Getting Landmarks**: Scales the 21 normalized hand landmarks to pixel coordinates and stores them in `lm_list`.
- **Checking Fingers Up**: Determines whether the index and middle fingers are extended by comparing each fingertip's height against its base knuckle.
- **Finding Distance**: Calculates the Euclidean distance between two specified landmarks; a small worked example follows this list.
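For instance, here is the same calculation run on two hypothetical fingertip positions (the coordinates are invented for illustration):

```python
import numpy as np

index_tip = (320.0, 240.0)   # hypothetical landmark 8 position, in pixels
middle_tip = (350.0, 200.0)  # hypothetical landmark 12 position, in pixels

distance = np.hypot(middle_tip[0] - index_tip[0],
                    middle_tip[1] - index_tip[1])
print(distance)  # 50.0 -> above the 40-pixel click threshold, so no click fires
```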
### Main Function
- **Webcam Setup**: Captures video from the webcam and initializes the variables used for mouse control.
- **Mouse Movement**: Maps the index fingertip from the camera's active region to screen coordinates and moves the cursor there, with smoothing applied to reduce jitter (see the sketch after this list).
- **Click Mechanism**: Simulates a click when the index and middle fingers are both up and brought close together.
- **Visualization**: Draws a circle on the index fingertip, and displays the processing FPS and the active-region rectangle.
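To make the movement pipeline concrete, the sketch below traces one frame's fingertip position through the coordinate mapping and the smoothing step. All the numbers are hypothetical; the constants match the defaults used in `main()`:

```python
import numpy as np

cam_w, cam_h = 640, 480              # camera resolution
screen_w, screen_h = 1920.0, 1080.0  # hypothetical screen size
frame_margin, smoothing_value = 100, 5

x1, y1 = 400.0, 250.0    # hypothetical fingertip position, in camera pixels
p_x, p_y = 900.0, 500.0  # cursor position from the previous frame

# 1. Map the active camera region (margins removed) onto the full screen.
target_x = np.interp(x1, (frame_margin, cam_w - frame_margin), (0, screen_w))
target_y = np.interp(y1, (frame_margin, cam_h - frame_margin), (0, screen_h))

# 2. Move only 1/smoothing_value of the way toward the target each frame,
#    so sudden jumps in the detected fingertip are damped.
current_x = p_x + (target_x - p_x) / smoothing_value
current_y = p_y + (target_y - p_y) / smoothing_value
print(round(current_x, 1), round(current_y, 1))  # ~981.8 ~515.7
```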
---
## Conclusion
This project demonstrates how powerful computer vision libraries like OpenCV and MediaPipe can be used to create innovative applications. By following this guide, you’ve created an AI-based mouse controller that lets you control your computer with hand gestures. Feel free to experiment with the parameters (e.g., `smoothing_value`, the click distance threshold, `frame_margin`) to tune the system for your environment.
Remember to share your experience and modifications in the comments below!