Home
About me
Skills
Resume
Certifications
Projects
View all projects
Projects
View all projects
Contacts
Eye Tracking Mouse
Home
Progetti
Eye Tracking Mouse
Eye Tracking Mouse
Computer vision is a field of artificial intelligence that enables computers and systems to derive meaningful information from digital images, videos and other visual inputs — and take actions or make recommendations based on that information. If AI enables computers to think, computer vision enables them to see, observe and understand.
Computer vision trains machines to perform these functions, but it has to do it in much less time with cameras, data and algorithms rather than retinas, optic nerves and a visual cortex. Because a system trained to inspect products or watch a production asset can analyze thousands of products or processes a minute, noticing imperceptible defects or issues, it can quickly surpass human capabilities.
It runs analyses of data over and over until it discerns distinctions and ultimately recognize images. For example, to train a computer to recognize automobile tires, it needs to be fed vast quantities of tire images and tire-related items to learn the differences and recognize a tire, especially one with no defects.
Two essential technologies are used to accomplish this: a type of machine learning called deep learning and a convolutional neural network (CNN).
A Convolutional Neural Network, also known as CNN or ConvNet, is a class of neural networks that specializes in processing data that has a grid-like topology, such as an image. A digital image is a binary representation of visual data. It contains a series of pixels arranged in a grid-like fashion that contains pixel values to denote how bright and what color each pixel should be.
This script provides the ability to move the mouse through shifting the position of the eyes.
This script in Python uses the
'OpenCV'
library to capture and manage frames of a video from the computer webcam, the
'Mediapipe'
library to detect facial landmarks and eye position, and the
'PyAutoGUI'
library to control the mouse cursor and simulate clicks.
Mediapipe
's
'FaceMesh'
model is used to detect facial landmarks in the webcam frame by extracting points related to eye positions. Then, using the
PyAutoGUI
library, it calculates the cursor position based on the position of the detected eyes, and moves the mouse cursor based on this position. In addition, the program also detects if the user is blinking, that is, if the y-coordinates of the eyes are approaching, and if so, it performs a mouse click using the
PyAutoGUI.click()
function.
Finally, the program uses the
OpenCV
libraryto display the frame with the detected landmarks. The entire operation is performed within an infinite while loop that continues until it is manually interrupted.
Project Information
Category
: ML
Proeject Url
:
About
: Move your mouse using your eyes
Computer vision is a field of artificial intelligence that enables computers and systems to derive meaningful information from digital images, videos and other visual inputs — and take actions or make recommendations based on that information. If AI enables computers to think, computer vision enables them to see, observe and understand.
Computer vision trains machines to perform these functions, but it has to do it in much less time with cameras, data and algorithms rather than retinas, optic nerves and a visual cortex. Because a system trained to inspect products or watch a production asset can analyze thousands of products or processes a minute, noticing imperceptible defects or issues, it can quickly surpass human capabilities.
It runs analyses of data over and over until it discerns distinctions and ultimately recognize images. For example, to train a computer to recognize automobile tires, it needs to be fed vast quantities of tire images and tire-related items to learn the differences and recognize a tire, especially one with no defects.
Two essential technologies are used to accomplish this: a type of machine learning called deep learning and a convolutional neural network (CNN).
A Convolutional Neural Network, also known as CNN or ConvNet, is a class of neural networks that specializes in processing data that has a grid-like topology, such as an image. A digital image is a binary representation of visual data. It contains a series of pixels arranged in a grid-like fashion that contains pixel values to denote how bright and what color each pixel should be.
This script provides the ability to move the mouse through shifting the position of the eyes.
This script in Python uses the
'OpenCV'
library to capture and manage frames of a video from the computer webcam, the
'Mediapipe'
library to detect facial landmarks and eye position, and the
'PyAutoGUI'
library to control the mouse cursor and simulate clicks.
Mediapipe
's
'FaceMesh'
model is used to detect facial landmarks in the webcam frame by extracting points related to eye positions. Then, using the
PyAutoGUI
library, it calculates the cursor position based on the position of the detected eyes, and moves the mouse cursor based on this position. In addition, the program also detects if the user is blinking, that is, if the y-coordinates of the eyes are approaching, and if so, it performs a mouse click using the
PyAutoGUI.click()
function.
Finally, the program uses the
OpenCV
libraryto display the frame with the detected landmarks. The entire operation is performed within an infinite while loop that continues until it is manually interrupted.