6 TH
Abstract: This paper proposes a hands-free human-machine interaction (HMI) system that offers a better way for humans to
communicate with computers. Long hours at a PC with a mouse keep the user's hand in a poor posture for extended periods,
which can inflame the wrist and hand and lead to more serious problems. The main focus, however, is on people with
physical disabilities, for whom using a mouse is a hindrance and who stand to benefit most from modern technology. The
paper develops a way for differently abled people to operate the mouse with just their eyes: the proposed system uses the
movement of the eyes to move the mouse cursor. It also describes a machine learning and deep learning approach, with a
built-in dataset, that classifies each of the two eyes individually with accuracy high enough to ensure control over the
movement. Filters provided by the libraries used remove noise so that operation is smooth and accurate. The implementation
details and the performance evaluation show that the proposed HMI system gives differently abled people extensive control
over the cursor.
I. INTRODUCTION
Existing laptop and desktop user interfaces cannot be used by everyone. Differently abled people in particular often face
difficulty using these devices and must seek help from others, which makes them dependent on someone. This project aims to
connect them with the modern world and to make them feel independent in their day-to-day lives. It is also useful for anyone
who wants to use their system efficiently and with greater speed, and for those who spend hours at a PC with a mouse:
excessive mouse use is observed to put the wrist at risk of serious inflammation, which can lead to internal tissue damage.
The project shows how an electronic device can simplify tasks without excessive use of the hands. Using vision alone, one can
control the mouse without hindrance, which can also improve the user's speed. The paper focuses on reducing the hardware in
the system by using only software and a simple webcam to move the cursor and to click. The remainder of this paper is
organized as follows: Section II presents the Literature Survey; Section III, Methodology, covers Face and Eye Detection
along with the system's Block Diagram; Section IV contains the Implementation and Results with real-time output snapshots,
including Blink Detection and Cursor Movement; Section V discusses the project, its observations, how the system is better
and its future scope; Section VI gives the Conclusion; and Section VII lists the References.
the Histogram of the image and, with the addition of an SVM, the face is detected and the iris portion is cropped. Certain
points in the eye are then targeted, and the movements of those points are mapped to cursor movements. This is a
non-training-based algorithm that uses image processing for all of its functions.
The paper by S. R. Fahim et al. [6] uses a HOG system and motion vectors with Python programming, together with the Haar
Cascade algorithm, which is a training-based algorithm used mainly in machine learning. Multiple datasets of eye images are
supplied, eye data is then collected, and the system responds to eye movement, with clicking done by eye blinking.
III. METHODOLOGY
In our proposed work the cursor is controlled by eye movement using only the webcam. By using a webcam or external camera
we designed a system where people can move the cursor and perform left and right clicks using eye winks.
The detection of features such as the face and eyes should be accurate and run in real time. We work with the live feed
obtained through a webcam. Face and eye features are detected using facial landmark detection, which estimates the (x, y)
coordinates of the points that map a person's face. After capturing a live frame, we use the facial landmark detector to
locate these points. We create a black mask of the same size as the webcam frame, store the coordinates of the landmarks for
the left and right eyes, and draw the eye regions on the mask in white, so that the eye areas are white and the rest of the
mask is black. The white area is then expanded slightly using a morphological operation. We use the mask to segment out the
eyes, and then set the black pixels left by the masking to white (value 255) so that the eyeball is the only dark part
remaining within the white eye area. To create a binary mask, we convert the resulting image to grayscale and choose a
threshold value that separates the eyeballs from the rest of the eye. The two largest contours in the thresholded mask
should then be the eyeballs. This can fail when the eyes are not detected accurately, so we instead take the midpoint
between the two eye areas and divide the image there. In each half we then find the largest contour [7], which should be
the eyeball.
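The thresholding and midpoint-split steps can be illustrated with a minimal, dependency-free sketch. It is not the authors'
implementation: a real system would apply OpenCV thresholding and contour functions to the webcam frame, whereas here the
grayscale eye image is a plain list of rows, the centroid of the dark pixels stands in for the centre of the largest contour,
and the function names (`threshold_dark`, `centroid`, `eyeball_centers`) and the threshold value are hypothetical.

```python
def threshold_dark(gray, thresh):
    """Binary mask: 1 where a pixel is darker than thresh, else 0."""
    return [[1 if px < thresh else 0 for px in row] for row in gray]


def centroid(mask):
    """(x, y) centroid of the 1-pixels in mask, or None if there are none."""
    sum_x = sum_y = count = 0
    for y, row in enumerate(mask):
        for x, value in enumerate(row):
            if value:
                sum_x += x
                sum_y += y
                count += 1
    if count == 0:
        return None
    return sum_x / count, sum_y / count


def eyeball_centers(gray, thresh):
    """Split the image at its horizontal midpoint and locate the dark blob in each half."""
    mid = len(gray[0]) // 2
    mask = threshold_dark(gray, thresh)
    first = centroid([row[:mid] for row in mask])
    second = centroid([row[mid:] for row in mask])
    if second is not None:
        # Shift the second half back into full-image coordinates.
        second = (second[0] + mid, second[1])
    return first, second
```

Splitting at the midpoint guarantees at most one candidate per eye, which is why it is more robust than taking the two
largest blobs of the whole frame. A half with no dark pixels returns `None`, which a caller could treat as a missed
detection.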
To detect movement of the eyeball, we calculate a ratio from the eye landmark points that gives the position of the eyeball
within the eye, and then link this position to the cursor using a script that controls the mouse to automate interaction
with other applications. The cursor then moves according to the movement of the eyeballs.
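One way to turn such a ratio into cursor movement is sketched below. It is only an illustration under stated assumptions:
the ratio is taken as the pupil's position between two eye bounds, a dead zone around the centre suppresses jitter, and the
cursor moves by a fixed step. The function names, the dead-zone width, the step size and the screen size are all
hypothetical, and the paper does not name the mouse-control library; in practice the resulting position would be passed to
one (for example PyAutoGUI's `moveTo`).

```python
def gaze_ratio(pupil, start, end):
    """Position of the pupil between two eye bounds along one axis, clamped to [0, 1]."""
    span = end - start
    if span == 0:
        return 0.5  # degenerate eye region: treat the gaze as centred
    return min(1.0, max(0.0, (pupil - start) / span))


def cursor_step(h_ratio, v_ratio, dead_zone=0.15, step=20):
    """Convert gaze ratios into a (dx, dy) pixel step; ratios near 0.5 give no movement."""
    def axis(ratio):
        if ratio < 0.5 - dead_zone:
            return -step
        if ratio > 0.5 + dead_zone:
            return step
        return 0
    return axis(h_ratio), axis(v_ratio)


def move_cursor(pos, delta, screen=(1920, 1080)):
    """Apply delta to the cursor position and clamp the result to the screen."""
    x = min(screen[0] - 1, max(0, pos[0] + delta[0]))
    y = min(screen[1] - 1, max(0, pos[1] + delta[1]))
    return x, y
```

For example, a pupil at x = 15 between eye corners at x = 10 and x = 30 gives a horizontal ratio of 0.25, which lies
outside the dead zone and moves the cursor one step to the left. A webcam image is usually mirrored, so the sign of the
horizontal step may need to be flipped.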
We now move on to clicking, where eye blinks (winks) are used to trigger click events. First, we calculate the eye aspect
ratio (EAR). The EAR is approximately constant while the eye is open but falls rapidly toward 0 when the eye closes. We take
the average EAR [8] of each eye and calculate the absolute difference between the two values, and then compare this
difference with the threshold that we have set for the blink. If the difference is greater than the blink threshold, we go
on to compare the EARs of the right and left eyes. If the EAR of the right eye is greater than the EAR of the left eye, the
right eye is open but we cannot yet be sure of the left eye, and vice versa. To check whether the left eye is closed, we
compare its EAR with the eye threshold set previously: if the EAR is below the threshold, the left eye is closed. We link
this event to the left mouse button, so a left click is performed. The same procedure with a right-eye wink gives a right
click. We can thus move the cursor with our eyes and perform left and right clicks by winking the left and right eye
respectively. The process above is summarized in the block diagram, in which the webcam of the PC or laptop takes an image
of the eye as input [9].
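The EAR of [8] is computed from six landmarks p1..p6 placed around the eye, with p1 and p4 at the corners:
EAR = (|p2 - p6| + |p3 - p5|) / (2 |p1 - p4|). The sketch below implements this formula and the comparison order described
above. The threshold values and the names `classify_wink`, `wink_diff` and `closed_thresh` are hypothetical; they are not
the values used in the experiments.

```python
from math import dist


def eye_aspect_ratio(eye):
    """EAR for six (x, y) landmarks p1..p6 ordered around the eye (p1, p4 are the corners)."""
    p1, p2, p3, p4, p5, p6 = eye
    vertical = dist(p2, p6) + dist(p3, p5)
    horizontal = dist(p1, p4)
    return vertical / (2.0 * horizontal)


def classify_wink(left_ear, right_ear, wink_diff=0.04, closed_thresh=0.19):
    """Return 'left_click', 'right_click' or None from the two EAR values."""
    # Similar EARs mean both eyes are open or both are blinking: no click.
    if abs(left_ear - right_ear) <= wink_diff:
        return None
    # The eye with the smaller EAR is the candidate; confirm it is really closed.
    if left_ear < right_ear and left_ear < closed_thresh:
        return "left_click"
    if right_ear < left_ear and right_ear < closed_thresh:
        return "right_click"
    return None
```

Checking the difference first is what separates a deliberate wink from an ordinary blink, in which both EARs drop together
and their difference stays small.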
The first step in the proposed system is to detect faces in the input image, for which we use a deep learning based face
detector built on the single shot detector (SSD) framework with a ResNet base network [10]. The second step is the shape
predictor algorithm, or facial landmark detector [11]. This method uses a training set of images that are manually labeled
with facial landmarks, together with priors on the distance between pairs of input pixels.
Given this training data, an ensemble of regression trees is trained to estimate the landmark positions from pixel
intensities. The pre-trained model in the dlib library can detect 68 [12] (x, y) coordinates that map the facial structure.
These annotations are part of the 68-point iBUG 300-W dataset on which the dlib shape predictor was trained [13].
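As a small illustration of how the eye points are picked out of the 68 landmarks, the sketch below slices them from a list
of (x, y) tuples. The zero-based ranges 36-41 and 42-47 follow the common 68-point convention for the subject's right and
left eye; the constant and function names are hypothetical, and the landmark list is assumed to have already been extracted
from the predictor's output.

```python
# Zero-based index ranges of the eyes in the 68-point annotation scheme.
RIGHT_EYE = range(36, 42)  # subject's right eye
LEFT_EYE = range(42, 48)   # subject's left eye


def eye_points(landmarks, indices):
    """Select the six (x, y) eye points from a list of 68 landmark coordinates."""
    if len(landmarks) != 68:
        raise ValueError("expected 68 landmarks")
    return [landmarks[i] for i in indices]
```

The six points returned for each eye are ordered around the eye contour, which is the order an EAR computation expects.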
Ideally the accuracy of the system would be 100 percent, but external factors such as light intensity and an average-quality
camera lower it. After repeated trials, the accuracy of the system is about 70-85 percent: the maximum average is 85 percent
in broad daylight, whereas at night the average accuracy falls to about 75 percent. The outputs of this experiment show how
the cursor moves in a given direction: in Fig 4, 5 and 6 the cursor moves down, left and right respectively. The threshold
is determined beforehand and the cursor is then moved. The results were taken as follows.
Fig 7: clicking
Next is the clicking event in Fig 7, which is detected using the EAR. When the ratio becomes very low, the click is
registered and executed immediately with reasonably good accuracy. The output of the code can be seen approximately in the
following graphs: Fig 8 shows the accuracy of the system under different lighting conditions, whereas Fig 9 shows the
estimated threshold value for different light intensities. Fig 8 shows that with ample natural light the accuracy of the
system is much higher than in low light. A good webcam is also necessary for a better result. For optimum accuracy, a good
camera should be used, if possible in ample light.
The following graph shows the adjusted threshold of the system under different lighting conditions. The threshold itself has
little effect on accuracy or precision, but knowing the value typically obtained at a given time of day makes it quicker to
arrive at a suitable threshold. At the optimum threshold value the system performs most efficiently.
The setup removes the burden of holding a mouse, reduces wrist pain and ensures reliable communication between the computer
and differently abled people. It promises to perform better with a trained user and with continued use.
The system can be extended to implement a soft keyboard. With control through eye movement, video games can also be played
using eye gaze. It can also be used to detect whether the user is sleepy or drowsy, so that an alarm can be raised for safer
driving and to help prevent vehicle accidents. The system can be adapted to help physically challenged people control
appliances such as TV sets and tube lights, and it can be used by individuals suffering from paralysis to operate and
control a wheelchair. Eye gaze detection and tracking also have applications in gaming, streaming and virtual reality.
VI. CONCLUSION
This project presents a novel HMI system for controlling a pointer on the computer screen with the eyes. The proposed
technique offers physically disabled people another way to interact with the world more effectively and efficiently. The
system lets the user mimic the actions of an ordinary computer mouse with eye movements, helping them to scroll and click,
which greatly enhances productivity for physically challenged people. It opens a new era of controlling mouse cursor
movement with the human eyes. Various problems were identified, both for physically challenged users and in existing
commercial products for mouse cursor control, and the system manages to solve them. This is a user-friendly framework,
particularly for use in computer applications. Today, eye-tracking modules that produce such data streams are being
developed. Such setups offer more flexibility because they are easily accessible to anyone who knows how to install a few
libraries.
REFERENCES
[1] G. Norris and E. Wilson, "The Eye Mouse, an eye communication device," Proceedings of the IEEE 23rd Northeast
Bioengineering Conference, Durham, NH, USA, May 1997, pp. 66-67.
[2] A. Bulling, J. A. Ward, H. Gellersen and G. Tröster, "Eye movement analysis for activity recognition using
electrooculography," IEEE Trans. Pattern Anal. Mach. Intell., vol. 33, no. 4, pp. 741-753, Apr. 2011.
[3] V. Khare, S. G. Krishna and S. K. Sanisetty, "Cursor Control Using EyeBall Movement," 2019 Fifth International Conference
on Science Technology Engineering and Mathematics (ICONSTEM), Chennai, India, Aug. 2019, pp. 232-235.
[4] Mohamed Nasor, Mujeeb Rahman K K, Maryam Mohamed Zubair, Haya Ansari, Farida Mohamed, Department of Biomedical
Engineering, Ajman University, Ajman, UAE.
[5] S. Mathew, A. Sreeshma, T. A. Jaison, V. Pradeep and S. S. Jabarani, "Eye Movement Based Cursor Control and Home
Automation for Disabled People," 2019 International Conference on Communication and Electronics Systems (ICCES),
Coimbatore, India, 2019, pp. 1422-1426.
[6] S. R. Fahim et al., "A Visual Analytic in Deep Learning Approach to Eye Movement for Human-Machine Interaction Based on
Inertia Measurement," IEEE Access, vol. 8, pp. 45924-45937, Mar. 2020.
[7] https://www.pyimagesearch.com/2016/02/01/opencv-center-of-contour
[8] T. Soukupová and J. Čech, "Real-Time Eye Blink Detection using Facial Landmarks." Available at
https://vision.fe.uni-lj.si/cvww2016/proceedings/papers/05.pdf
[9] "Face detection with OpenCV and deep learning," PyImageSearch.
[10] https://www.hindawi.com/journals/acisc/2018/1439312/
[11] V. Kazemi and J. Sullivan, "One millisecond face alignment with an ensemble of regression trees," 2014 IEEE Conference
on Computer Vision and Pattern Recognition, Columbus, OH, 2014, pp. 1867-1874, doi: 10.1109/CVPR.2014.241.
[12] https://towardsdatascience.com/real-time-eye-tracking-using-opencv-and-dlib-b504ca724ac6
[13] https://ibug.doc.ic.ac.uk/resources/facial-point-annotations