Group D - IP Project Report
Course Project
Date: April 6 2024
Harinarayanan J B200741CS
Declaration
This report has been prepared on the basis of our group’s work. All the other published and
unpublished source materials have been cited in this report.
Aaron Joseph
Jackson Stephan
Harinarayanan J
Abhay Raj
Anuram Anil
Face Recognition: Attendance Tracking System Group D, 2024
Table of Contents
Abstract
2.1 Recognition Using Eigen Faces and Artificial Neural Network [1]
2.2 Face recognition and detection using Random Forest and combination of LBP and HOG [2]
References
Abstract
Image processing involves the application of various algorithms and techniques to manipulate and
analyse digital images. The process can enhance or extract useful information from images for
further analysis or decision-making. Image processing has different applications in the real world
ranging from image restoration, colour processing, compression to object recognition and pattern
recognition. Within pattern recognition, facial recognition technology has advanced rapidly in
recent years and is now used across many industries and applications. In this project, we explore
several approaches to facial recognition and examine the key components of a facial recognition
system. Drawing on previous research papers, the project uses two feature extraction methods,
Local Binary Pattern (LBP) and Histogram of Oriented Gradients (HOG), together with a Random
Forest classifier to recognize faces and build an attendance tracking system.
Project Specification
1. Project Definition: Utilizing image processing techniques to design a facial recognition model
that can efficiently differentiate and correctly identify different faces in an image to help solve
a real-life problem such as class attendance.
2. Deliverables:
c. Compare the recognized faces with the pre-existing database and create an attendance
sheet.
Face Recognition: The proposed system performs face detection and recognition using a Random
Forest classifier and a combination of LBP and HOG features. The LBP algorithm captures the
local structure of an image by comparing each pixel with its neighbouring pixels. We have to
decide the number of neighbours to be considered for LBP and the sampling method. Each image
is converted to grayscale before passing to LBP. LBP is applied to each sub-region and a
histogram of N bins is generated from the pixel labels. The individual histograms are
concatenated into a single higher dimensional histogram.
During the recognition phase, using Random Forest (RF) classification, the results from each
tree are collected for the input image and a majority vote gives the resulting class label. RF is
used to compare the feature vectors from the training and testing stages to match the
corresponding person.
Attendance Marking: The application should automatically create an attendance sheet for
recognized students based on the input image.
Scalability: The system should be able to handle a growing number of students added to the
facial database.
4. Technical Specifications:
Hardware: Processing power sufficient for image processing and model execution.
Software: Programming language with image processing libraries
Chapter 1: Introduction
1.1 Background
Facial Recognition involves the task of identification of a face from a video or image given a pre-
existing database of faces. The process mainly consists of the detection of human faces from the
image followed by the identification of the detected faces.
Facial recognition technology has advanced substantially thanks to recent progress in computer
vision and machine learning algorithms. Its applications span multiple domains such as
authentication, security and surveillance.
Image Processing methods play a crucial role in tasks like categorization, feature extraction and
face detection. Some of the most common methods used in facial identification include Local
Binary Patterns (LBP), Eigenfaces, Histogram of Oriented Gradients (HOG), and Convolutional
Neural Networks (CNNs).
The Eigenfaces method analyses the training images to extract the components that capture the
most variance and discards the less important components. These extracted components are
called principal components. Local Binary Pattern methods capture the local structure of the
image by comparing each pixel with its neighbouring pixels. Deep learning techniques such as
convolutional neural networks are trained on massive datasets to achieve very high recognition
rates.
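As a minimal sketch of the pixel comparison behind LBP, the snippet below computes the basic LBP code of the centre pixel of a 3x3 patch. The function name `lbp_code_3x3`, the clockwise neighbour order starting at the top-left corner, and the sample patch are assumptions made for illustration only.

```python
def lbp_code_3x3(patch):
    """Return the basic 8-neighbour LBP code of the centre pixel of a 3x3 patch."""
    center = patch[1][1]
    # Neighbours visited clockwise, starting at the top-left corner (an assumed order).
    neighbours = [
        patch[0][0], patch[0][1], patch[0][2],
        patch[1][2], patch[2][2], patch[2][1],
        patch[2][0], patch[1][0],
    ]
    code = 0
    for value in neighbours:
        # A neighbour at least as bright as the centre contributes a 1 bit.
        code = (code << 1) | (1 if value >= center else 0)
    return code


patch = [
    [90, 120, 60],
    [70, 100, 130],
    [40, 110, 80],
]
print(lbp_code_3x3(patch))  # bits 01010100 -> 84
```

Repeating this for every pixel gives an image of LBP codes, whose histogram can then serve as a texture descriptor.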
Automated attendance systems are useful in a variety of settings such as offices, schools and
public gatherings. Face recognition offers greater accuracy and efficiency than conventional
attendance methods.
1.2 Challenges
1. Accuracy and Reliability:
Achieving high accuracy and reliability is one of the primary challenges in facial recognition
systems. Factors such as illumination variations, pose variations and occlusions could hinder the
performance of the recognition system. Changes in lighting can significantly affect feature
extraction, leading to false recognition. There is also a possibility that faces captured from various
angles may not be recognized accurately. Occlusions like hats and scarves also make the process of
recognition difficult. Hence, building the system on a robust algorithm is essential.
2. Privacy Concerns:
Facial recognition systems raise significant privacy concerns regarding the collection, storage
and usage of data. Implementing appropriate security measures in compliance with privacy
regulations is crucial.
3. Dataset Management:
Managing a large dataset of enrolled students' facial images poses challenges with respect to the
scalability, security and integrity of the system. As the student count increases, efficient
storage and retrieval mechanisms are required.
2.1 Recognition Using Eigen Faces and Artificial Neural Network [1]
The paper presents an approach to face recognition in which features are extracted from face
images using principal component analysis (PCA) and recognition is performed by a feed-forward
back-propagation artificial neural network (ANN). PCA searches for a low-dimensional subspace
(face space) that has the maximum variance among a set of face images. The eigenvectors of the
face space are called the eigenfaces, and a face image can be approximated by a linear
combination of those eigenfaces. One ANN per person in the database was suggested, using the
face descriptors (weights of eigenfaces) as inputs for training. A new face image is first
projected into the face space and then given as an input to each ANN. The maximum output
determines the identity of the person. The proposed method was tested on the ORL face database.
The authors also compared their method with K-means and Fuzzy Ant with Fuzzy C-means
(F.A.F.C). F.A.F.C obtained a recognition rate of 94.75%, while the proposed method achieved an
improved recognition rate of 97.018%.
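The PCA part of this approach can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function names `fit_eigenfaces`, `project` and `reconstruct` are hypothetical, the "faces" are random synthetic vectors rather than ORL images, and the per-person ANN stage of the paper is not shown.

```python
import numpy as np


def fit_eigenfaces(faces, n_components):
    """faces: (n_samples, n_pixels) array of flattened face images."""
    mean_face = faces.mean(axis=0)
    centered = faces - mean_face
    # Rows of vt are the principal directions (eigenfaces) of the centred data.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean_face, vt[:n_components]


def project(face, mean_face, eigenfaces):
    """Weights of each eigenface; these act as the face descriptor."""
    return eigenfaces @ (face - mean_face)


def reconstruct(weights, mean_face, eigenfaces):
    """Approximate a face as the mean face plus a linear combination of eigenfaces."""
    return mean_face + weights @ eigenfaces


rng = np.random.default_rng(0)
faces = rng.random((6, 16))  # six synthetic 4x4 "faces", flattened
mean_face, eigenfaces = fit_eigenfaces(faces, n_components=5)
weights = project(faces[0], mean_face, eigenfaces)
error = np.linalg.norm(faces[0] - reconstruct(weights, mean_face, eigenfaces))
print(weights.shape, error)
```

With six training faces, five eigenfaces already span the centred training set, so the reconstruction error of a training face is close to zero. Keeping fewer eigenfaces increases that error, which is the trade-off behind the first drawback listed for this method.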
Drawbacks:
1. The Eigenface method is highly dependent on the number of eigenfaces used for feature
extraction, and selecting an inappropriate number of eigenfaces can degrade the recognition
performance.
3. A face image can be approximately reconstructed by using its feature vector and the
eigenfaces. However, the "rebuild error ratio", which measures the degree of fit, increases as
the training set members differ heavily from each other. This is because the average face
image that is added during reconstruction becomes messy.
4. Sensitivity to head orientations, as the model is prone to mismatches for images with large
head orientations.
2.2 Face recognition and detection using Random Forest and combination of LBP
and HOG [2]
This conference paper introduces a framework for face detection and recognition in video
broadcasts under uncontrolled environments. Detection is based on the Viola-Jones algorithm, a
fast and accurate face detector that locates faces in images and video sequences. For feature
extraction, LBP and HOG descriptors are combined, with a new formulation and new statistics,
into a robust, low-complexity feature vector that is invariant to illumination, pose, expression
and occlusion variations. An ensemble learning method called Random Forest is then used to
classify the face features; it offers high discrimination performance and low computational
cost, and it outperformed other classifiers such as SVM. In the voting stage, a new formula is
introduced to increase the accuracy of face verification: it compares the matching results from
different frames of the video and selects the person with the highest percentage.
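The idea of the voting stage can be illustrated with a plain percentage vote over per-frame predictions. This is a simplified sketch, not the formula proposed in the paper; the function name `vote_across_frames` and the sample labels are assumptions.

```python
from collections import Counter


def vote_across_frames(frame_predictions):
    """Return the most frequent label and its share of the frames."""
    counts = Counter(frame_predictions)
    label, hits = counts.most_common(1)[0]
    return label, hits / len(frame_predictions)


# Hypothetical per-frame predictions for one tracked face.
predictions = ["A", "A", "B", "A", "C"]
print(vote_across_frames(predictions))  # ('A', 0.6)
```

A single misclassified frame therefore does not change the final identity as long as most frames agree.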
Drawbacks:
1. The model may struggle to recognize faces accurately in scenarios with occlusion, pose
variation, and illumination changes.
2. The paper also mentions limitations in distance, algorithm maturity, and camera quality
that could affect the overall accuracy of the model.
1. Collection of Database: To create the training database, 100 photos per subject are
extracted from each training video, varying in illumination, pose, background, expressions,
and occlusions. After face detection, the tagged faces are cropped so that only the face
remains and most of the background is removed. In pre-processing, these faces are scaled,
normalized, and converted to grayscale. The testing data consists of three sets of images:
the first set, which consists of ten images, tests the system's performance when there is
only one person in the image; the second set tests the developed algorithm when there are
multiple people in the image; and the third set is used when the person is not present in
the training database.
2. Face detection and Preprocessing image: Face detection is a technique for locating faces
in images and video sequences. The Viola-Jones face detection algorithm is suggested. The
output of the face detection process is an image window containing only the face region.
After the face windows are detected and cropped, histogram equalization is used to
normalize the face image.
3. Feature extraction: Two local feature approaches, LBP and HOG, are combined to extract
the features used by the face classifier to identify or verify an unknown face.
4. Random Forest (RF) Classification module: To match the appropriate person, RF is used to
compare the extracted templates (feature vectors) from the training and testing stages. The
feature vectors generated from the testing data are compared with the vectors recorded from
the training data to produce the matching results.
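The histogram equalization mentioned in step 2 can be sketched as follows for an 8-bit grayscale image. This is an illustrative NumPy version written for this report, with a hypothetical function name `equalize_histogram` and a toy input; OpenCV offers `cv2.equalizeHist` for the same purpose.

```python
import numpy as np


def equalize_histogram(gray):
    """Spread the intensities of an 8-bit grayscale image over the full 0-255 range."""
    hist = np.bincount(gray.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]                # CDF value at the darkest intensity present
    denom = max(int(cdf[-1] - cdf_min), 1)   # avoid dividing by zero on a flat image
    # Look-up table that maps each old intensity to its equalized value.
    lut = np.clip(np.round((cdf - cdf_min) * 255.0 / denom), 0, 255).astype(np.uint8)
    return lut[gray]


# Toy low-contrast image whose intensities only cover 50..80.
gray = np.array([[50, 50, 60], [60, 70, 70], [80, 80, 80]], dtype=np.uint8)
equalized = equalize_histogram(gray)
print(equalized.min(), equalized.max())  # 0 255
```

The mapping stretches the narrow 50 to 80 range over the full intensity range, which reduces the effect of global lighting differences between face crops.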
The LBP algorithm captures the local structure of an image by comparing each pixel with its
neighbouring pixels. We have to decide the number of neighbours to be considered for LBP and
the sampling method. In our project, we use a neighbourhood of 24 points on a circle of radius 3
around the centre pixel. The method parameter is set to 'uniform', which means that only
patterns with at most 2 bitwise transitions from 0 to 1 or vice versa, when the bit pattern is
traversed circularly, receive their own label; all other patterns share a single non-uniform
label. Each image is converted to grayscale before passing to LBP. LBP is applied to each
sub-region and a histogram of N “bins” is generated from the pixel labels. With 24 points, the
“bins” parameter gives N = 26: one bin for each of the 25 uniform labels plus one bin for the
non-uniform patterns. The individual histograms are concatenated into a single higher
dimensional histogram after normalization.
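The 'uniform' rule can be made concrete with a short sketch. The helper names `circular_transitions` and `uniform_label` are hypothetical, and the labelling convention (number of 1 bits for uniform patterns, one shared label otherwise) is stated here as an assumption about how the 'uniform' method assigns labels.

```python
def circular_transitions(bits):
    """Count 0->1 and 1->0 changes when the bit pattern is traversed circularly."""
    return sum(bits[i] != bits[(i + 1) % len(bits)] for i in range(len(bits)))


def uniform_label(bits):
    """Number of 1 bits for a uniform pattern, else one shared label len(bits) + 1."""
    if circular_transitions(bits) <= 2:
        return sum(bits)
    return len(bits) + 1


print(uniform_label([0, 0, 1, 1, 1, 0, 0, 0]))  # 2 transitions -> uniform, label 3
print(uniform_label([0, 1, 0, 1, 0, 0, 1, 0]))  # 6 transitions -> non-uniform, label 9
```

Under this convention a 24-point neighbourhood produces labels 0 to 25, which is why the histogram needs 24 + 2 = 26 bins.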
The HOG feature's computational complexity is significantly lower than that of the original data,
and the HOG descriptor is resilient and insensitive to changes in geometry and illumination. The
image is divided into cells of 8x8 pixels each. The cells are grouped into blocks of 2x2 cells
that are normalized together, which improves performance by providing some invariance to
changes in lighting and shadowing.
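To get a feel for the size of the resulting descriptor, the arithmetic can be worked through in code. The sketch assumes 9 orientation bins (the value used in the code listing) and blocks that slide one cell at a time; the function name `hog_descriptor_length` is hypothetical.

```python
def hog_descriptor_length(height, width, cell=8, block=2, orientations=9):
    """Length of a HOG vector when blocks overlap with a stride of one cell (assumed)."""
    cells_y, cells_x = height // cell, width // cell
    blocks_y, blocks_x = cells_y - block + 1, cells_x - block + 1
    # Each block contributes block * block cell histograms of `orientations` bins.
    return blocks_y * blocks_x * block * block * orientations


# A 720x1280 image gives 90x160 cells and 89x159 overlapping blocks.
print(hog_descriptor_length(720, 1280))  # 89 * 159 * 36 = 509436
```

Under these assumptions the HOG part dominates the combined feature vector, since the LBP histogram contributes only 26 values.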
During the recognition phase, using Random Forest (RF) classification, the features from LBP and
HOG are combined and used to build the decision trees. The results from each tree are collected
for the input image and a majority vote gives the resulting class label. RF is used to compare
the feature vectors from the training and testing stages to match the corresponding person.
4.1 Evaluation
1. LBP Faces
The Local Binary Pattern (LBP) image of a digital image represents the texture information at
each pixel. It is generated by comparing the intensity of a pixel with its neighbouring pixels
and encoding these comparisons into binary patterns. Here are a few LBP images for the images
from the dataset:
2. Histogram
In contrast, the combination of LBP with HOG, referred to as LBP + HOG, offers a more
comprehensive approach to feature representation. By integrating the local texture information
from LBP with the global shape and edge information from HOG, this combined approach
provides a more robust representation of complex characteristics like facial features. While LBP
captures local texture details, HOG highlights global shape and edge features, resulting in a more
nuanced and accurate representation of objects like faces.
However, the integration of LBP and HOG increases computational complexity compared to using
either method alone. Despite this drawback, the combined approach tends to outperform LBPH
alone, especially in challenging conditions where variations in lighting, pose, and facial
expressions are common. The choice between LBPH and LBP + HOG depends on factors such as
the desired level of accuracy, speed, and available computational resources.
Observation
The LBPH model successfully identified one positive match but missed eight faces. It incorrectly
identified one face. In contrast, the LBP + HOG model achieved seven correct identifications but
made three false identifications.
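The counts above can be turned into simple rates. The sketch assumes that each count refers to one of the same 10 test faces and that the LBP + HOG model missed none (the text reports no misses for it); the variable names are illustrative.

```python
# Counts taken from the observation; "missed" for LBP + HOG is an assumption.
results = {
    "LBPH": {"correct": 1, "missed": 8, "incorrect": 1},
    "LBP + HOG": {"correct": 7, "missed": 0, "incorrect": 3},
}

shares = {}
for name, counts in results.items():
    total = counts["correct"] + counts["missed"] + counts["incorrect"]
    # Share of test faces that were identified correctly.
    shares[name] = counts["correct"] / total
    print(name, total, shares[name])
```

On these numbers, LBPH identifies 10% of the test faces correctly and LBP + HOG identifies 70%, although a test set of this size gives only a rough indication.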
Output
Time taken to test: 29.92 seconds (for 10 images in test_facedataset); 1.10 seconds (for 10
images in test_facedataset)
5.1 Documentation
File 1: lbp_hog_train.py
The script imports the libraries needed for image processing (OpenCV), feature extraction
(scikit-image), dataset handling (PyTorch), machine learning (scikit-learn), visualization
(Matplotlib), and saving/loading models (joblib).
Function Definitions
3. plot_image_with_prediction(image, predicted_label):
4. extract_lbp_features(image):
This function extracts Local Binary Pattern (LBP) features from the given grayscale image. The
parameters used in the feature extraction are explained under tools and techniques.
5. extract_hog_features(image):
This function extracts Histogram of Oriented Gradients (HOG) features from the given grayscale
image. The parameters used in the feature extraction are explained under tools and techniques.
Class Definition
1. class FaceDataset(Dataset):
The FaceDataset class handles a dataset of face images. Its constructor stores the directory
path where the face images are located. It then lists all the files in the directory and keeps
only the files with a '.jpg' extension. The filenames are sorted alphabetically to ensure a
consistent processing order.
The __len__ method returns the total number of images in the dataset. This method is called when
using the len() function on an instance of FaceDataset, providing a convenient way to know the
dataset's size.
The __getitem__ method retrieves an item from the dataset at a specified index. Given an index
idx, it extracts the filename of the corresponding image and its label. Then, it constructs the full
path to the image file and reads the image using OpenCV's cv2.imread() function in grayscale
mode. The image is resized to a fixed size of 720x1280 pixels. Finally, it returns a tuple containing
the grayscale image and its corresponding label.
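How the label is derived from the filename is not shown in the report. The sketch below illustrates one possible convention, purely as an assumption: files named `<student name>_<counter>.jpg`. The helper name `label_from_filename` and the sample paths are hypothetical.

```python
import os


def label_from_filename(filename):
    """Hypothetical convention: '<student name>_<counter>.jpg' -> '<student name>'."""
    stem = os.path.splitext(os.path.basename(filename))[0]
    # Split on the last underscore so names that contain underscores stay intact.
    name, _, _ = stem.rpartition("_")
    return name or stem


print(label_from_filename("facedataset/student1_12.jpg"))  # student1
```

Splitting on the last underscore keeps multi-part names together, and a filename without an underscore falls back to the whole stem.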
The provided code block trains a Random Forest classifier for the face recognition
(classification) task.
Initially, it shuffles the dataset and extracts features from each face image using Local Binary
Pattern (LBP) and Histogram of Oriented Gradients (HOG) techniques. This process ensures
diverse feature representation for each image, enhancing the model's ability to learn discriminative
patterns. The features are concatenated into a single feature vector for each image.
Subsequently, the dataset is split into training and testing sets using the train_test_split()
function from scikit-learn. This step is crucial for assessing the model's performance on unseen
data and for detecting overfitting.
A Random Forest classifier is then instantiated with 100 decision trees and trained on the extracted
features and corresponding labels using the fit() method. Random Forest is a powerful ensemble
learning method capable of handling high-dimensional data and capturing complex relationships
between features and labels.
After training, the model's accuracy is evaluated on the testing set using the accuracy_score()
function from scikit-learn. The accuracy represents the proportion of correctly classified samples in
the testing set, providing insight into the model's performance.
The trained classifier is saved to a file using the dump() function from the joblib library.
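The steps described above can be tried end to end on synthetic data. In this sketch the random vectors are only a stand-in for the concatenated LBP + HOG features, the labels `student_a` and `student_b` are hypothetical, and the model is written to a temporary directory rather than the project folder.

```python
import os
import tempfile

import numpy as np
from joblib import dump, load
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
# Synthetic stand-in for the feature vectors: two well-separated "students".
X_features = np.vstack([rng.normal(0.0, 0.1, (50, 8)), rng.normal(1.0, 0.1, (50, 8))])
y_labels = np.array(["student_a"] * 50 + ["student_b"] * 50)

# train_test_split shuffles by default before splitting 80/20.
X_train, X_test, y_train, y_test = train_test_split(
    X_features, y_labels, test_size=0.2, random_state=42)

rf_classifier = RandomForestClassifier(n_estimators=100, random_state=42)
rf_classifier.fit(X_train, y_train)
accuracy = accuracy_score(y_test, rf_classifier.predict(X_test))
print("Accuracy:", accuracy)

# Round-trip the model through joblib, as the training script does with dump().
model_path = os.path.join(tempfile.mkdtemp(), "random_forest_model.joblib")
dump(rf_classifier, model_path)
reloaded = load(model_path)
```

Fixing `random_state` in both the split and the classifier makes the run repeatable, which helps when comparing feature settings.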
File 2: lbp_hog_test.py
This code imports necessary libraries and modules for conducting face recognition, including
OpenCV for image processing, scikit-image for feature extraction, and scikit-learn for machine
learning functionalities. Additionally, it imports modules for handling datasets and data splitting,
visualization, and CSV file operations.
The take_picture() function allows users to capture images from a webcam and save them for
further processing. Upon execution, it initializes the webcam capture using OpenCV's
cv2.VideoCapture(0) function, setting the resolution to 1280x720 pixels. It creates a window
named "test" to display the live video feed.
We use student images as the dataset. Each student's images are labelled with the student's name
so that the model can extract the label. We train the model using 100 images of each student,
which we store in a folder named "facedataset". Training takes roughly 5-10 minutes to complete.
The images are required to have dimensions of 1280x720. We acquired the data using a Python
script that captures images in the required format. The code also assigns a label to the
captured images.
5. After capturing images or exiting the loop, release the webcam resources and close the
OpenCV windows.
Figure 2: The images being saved to the folder with a proper label
Inside the main loop, it continuously reads frames from the webcam using cam.read() and displays
them in the "test" window using cv2.imshow(). It waits for key events using cv2.waitKey(1). If the
'Esc' key (ASCII value 27) is pressed, indicating the user wants to exit, the loop breaks, and the
webcam capture is released using cam.release(), and the OpenCV windows are destroyed with
cv2.destroyAllWindows().
If the 'Space' key (ASCII value 32) is pressed, it saves the current frame as an image file with a
unique name based on the value of `img_counter`. The image is saved in the specified directory as
a JPEG file with the naming convention "face_{}.jpg", where `{}` is replaced by the current value
of `img_counter`. Finally, `img_counter` is incremented to ensure each image has a unique
filename.
Feature Extraction
The function extract_lbp_features(image) computes Local Binary Pattern (LBP) features from a
given grayscale image. It first defines parameters such as the radius (lbp_radius) and the number of
points (lbp_points) for the LBP calculation. Using the local_binary_pattern() function from the
skimage.feature module, it generates the LBP image representation of the input image. Then, it
constructs a histogram (hist) of the LBP image intensities, ensuring that the bins cover the range of
possible intensity values. The histogram is normalized by dividing each bin count by the total count
to obtain normalized LBP features. Finally, it returns the normalized histogram representing the
LBP features of the input image.
This segment of code loads the previously trained Random Forest model from the file named
'random_forest_model.joblib' using the load() function from the joblib module.
Loading test_facedataset
The FaceDataset class encapsulates a dataset of face images, facilitating data handling for machine
learning tasks. Upon instantiation with a directory path containing face images, it organizes the
data by listing and filtering image files based on their extensions (.jpg or .png).
With its __len__ method, the class provides the total number of images in the dataset. Moreover,
the __getitem__ method retrieves individual items from the dataset, providing access to images at
specific indices.
In the provided code segment, each selected image from the list `selected_images` is processed
iteratively. For each image, its full path is generated by combining the directory path (`data_dir`)
with the image name. Features are then extracted from the image using the
preprocess_and_extract_features() function.
The loaded Random Forest model is then used to predict the label corresponding to the extracted
features. The predicted label is printed to the console. The predicted label is also added to the
`attendance_list` set, so each recognized student is recorded only once.
A csv file titled "attendance.csv" is opened in write mode. The predicted labels are then written into
this file, facilitating the tracking of attendance.
Figure 5: attendance.csv
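The attendance sheet step can be sketched without touching the file system by writing to an in-memory buffer. The function name `write_attendance` and the sample labels are hypothetical, and sorting the labels is an addition made in this sketch so that the row order is stable.

```python
import csv
import io


def write_attendance(labels, file):
    """Write one numbered row per distinct recognized label."""
    writer = csv.writer(file)
    writer.writerow(["SlNo", "Name"])
    # A set has no fixed order, so the labels are sorted to keep the sheet stable.
    for serial, name in enumerate(sorted(set(labels)), start=1):
        writer.writerow([serial, name])


buffer = io.StringIO()
write_attendance(["student_b", "student_a", "student_b"], buffer)
print(buffer.getvalue())
```

Passing an open file object instead of the buffer writes the same rows to disk.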
LBPH
File 3: detect.py
Face detection
This function detects a face within an image. Since the face detection algorithm works on
grayscale images, the input image is first converted to grayscale. The Haar cascade classifier
for frontal face detection is loaded using OpenCV's CascadeClassifier class. The
detectMultiScale method detects the faces and returns a list of rectangles representing the
detected faces. The function selects the first face from the list and returns the ROI along with
the bounding box coordinates. If no faces are detected, it returns 'None, None'.
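The selection of the first face and its region of interest can be sketched independently of the detector. The helper name `first_face_roi` is hypothetical, and the rectangle list stands in for what a cascade detector would return as (x, y, w, h) tuples.

```python
import numpy as np


def first_face_roi(gray, faces):
    """faces: sequence of (x, y, w, h) rectangles, as a cascade detector would return."""
    if len(faces) == 0:
        return None, None
    x, y, w, h = faces[0]
    # NumPy images are indexed [row, column], i.e. [y, x].
    return gray[y:y + h, x:x + w], (x, y, w, h)


gray = np.arange(100, dtype=np.uint8).reshape(10, 10)  # toy 10x10 "image"
roi, box = first_face_roi(gray, [(2, 3, 4, 5)])
print(roi.shape, box)  # (5, 4) (2, 3, 4, 5)
```

Because only the first rectangle is used, any additional faces in the same image are ignored by this step.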
Training Data
This function is used to prepare the training data. Empty lists for faces and labels are initialized.
For each image in the dataset the detect_face function is called and the detected face is appended to
the faces list and corresponding label in the labels list. The function returns the list of detected
faces and corresponding labels.
Face Recognition
This function is used to predict the identity of an input image. The detect_face method is
invoked to detect the face in the image. The label of the detected face is then predicted using
face_recognizer. A rectangle is drawn around the detected face and the name of the subject is
retrieved using the predicted label. The modified image is finally returned.
The detected face can be visualized by the bounding box and the predicted label is displayed above
the box.
import os
import cv2
import numpy as np
from torch.utils.data import Dataset
from skimage.feature import local_binary_pattern, hog
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from random import Random, sample
import matplotlib.pyplot as plt
from joblib import dump

def extract_lbp_features(image):
    # 24 sampling points on a circle of radius 3 around each pixel
    lbp_radius = 3
    lbp_points = 24
    lbp_image = local_binary_pattern(image, lbp_points,
                                     lbp_radius, method='uniform')
    # One bin per LBP label (lbp_points + 2 bins in total)
    hist, _ = np.histogram(lbp_image.ravel(),
                           bins=np.arange(0, lbp_points + 3),
                           range=(0, lbp_points + 2))
    hist = hist.astype("float")
    # Normalize; the small constant guards against division by zero
    hist /= (hist.sum() + 1e-7)
    return hist
def extract_hog_features(image):
    hog_features = hog(image, orientations=9, pixels_per_cell=(8, 8),
                       cells_per_block=(2, 2),
                       transform_sqrt=True, block_norm="L2-Hys")
    return hog_features

class FaceDataset(Dataset):
    def __init__(self, data_dir):
        self.data_dir = data_dir
        contents = os.listdir(data_dir)
        # Keep only the .jpg files, in a fixed (alphabetical) order
        self.images = [f for f in contents if f.endswith('.jpg')]
        self.images.sort()

    def __len__(self):
        return len(self.images)

X_features = []
y_labels = []
data_dir = 'C:/Users/jayag/Desktop/ip_project/facedataset'
dataset = FaceDataset(data_dir)
X_features = np.array(X_features)
y_labels = np.array(y_labels)
X_train, X_test, y_train, y_test = train_test_split(
    X_features, y_labels, test_size=0.2, random_state=42)
rf_classifier = RandomForestClassifier(n_estimators=100,
                                       random_state=42)
rf_classifier.fit(X_train, y_train)
y_pred = rf_classifier.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print("Accuracy:", accuracy)
model_filename = 'random_forest_model.joblib'
dump(rf_classifier, model_filename)
lbp_hog_test.py
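import os
import csv
import cv2
import numpy as np
from random import sample
from skimage.feature import local_binary_pattern, hog
from torch.utils.data import Dataset

def take_picture():
    cam = cv2.VideoCapture(0)
    # Request a 1280x720 capture (property 3 is the frame width, 4 the frame height)
    cam.set(3, 1280)
    cam.set(4, 720)
    cv2.namedWindow("test")
    img_counter = 0
    while True:
        ret, frame = cam.read()
        if not ret:
            print("failed to grab frame")
            break
        cv2.imshow("test", frame)
        k = cv2.waitKey(1)
        if k % 256 == 27:
            # Esc pressed: stop capturing
            print("Escape hit, closing...")
            break
        elif k % 256 == 32:
            # Space pressed: save the current frame under a unique name
            img_name = "C:/Users/jayag/Desktop/ip_project/test_facedataset/face_{}.jpg".format(img_counter)
            cv2.imwrite(img_name, frame)
            print("{} written!".format(img_name))
            img_counter += 1
    cam.release()
    cv2.destroyAllWindows()

take_picture()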
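def extract_lbp_features(image):
    lbp_radius = 3
    lbp_points = 24
    lbp_image = local_binary_pattern(image, lbp_points,
                                     lbp_radius, method='uniform')
    hist, _ = np.histogram(lbp_image.ravel(),
                           bins=np.arange(0, lbp_points + 3),
                           range=(0, lbp_points + 2))
    hist = hist.astype("float")
    hist /= (hist.sum() + 1e-7)
    return hist

def extract_hog_features(image):
    # Same parameters as in lbp_hog_train.py so that test features match the trained model
    hog_features = hog(image, orientations=9, pixels_per_cell=(8, 8),
                       cells_per_block=(2, 2),
                       transform_sqrt=True, block_norm="L2-Hys")
    return hog_features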
def preprocess_and_extract_features(image_path):
    image = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    # cv2.resize takes (width, height); the result has 720 rows and 1280 columns.
    # ndarray.resize() would only reshape the buffer, not rescale the image.
    image = cv2.resize(image, (1280, 720))
    lbp_features = extract_lbp_features(image)
    hog_features = extract_hog_features(image)
    combined_features = np.concatenate((lbp_features,
                                        hog_features))
    return combined_features

class FaceDataset(Dataset):
    def __init__(self, data_dir):
        self.data_dir = data_dir
        contents = os.listdir(data_dir)
        self.images = [f for f in contents if f.endswith('.jpg')]
        print(self.images)
        self.images.sort()

    def __len__(self):
        return len(self.images)

# Load the Random Forest model saved by lbp_hog_train.py
from joblib import load
loaded_model = load('random_forest_model.joblib')

data_dir = 'C:/Users/jayag/Desktop/ip_project/test_facedataset'
dataset = FaceDataset(data_dir)
selected_images = sample(dataset.images, len(dataset))
attendance_list = set()
for image_name in selected_images:
    test_image_path = os.path.join(data_dir, image_name)
    test_features = preprocess_and_extract_features(test_image_path)
    # Use the loaded model for prediction
    predicted_label = loaded_model.predict([test_features])[0]
    print("Predicted Label:", predicted_label)
    attendance_list.add(predicted_label)
    # Still inside the loop over selected_images
    plot_image_with_prediction(cv2.imread(test_image_path,
                                          cv2.IMREAD_GRAYSCALE), predicted_label)

attendance_list = list(attendance_list)
with open('attendance.csv', mode='w', newline='') as file:
    writer = csv.writer(file)
    writer.writerow(["SlNo", "Name"])
    for i in range(len(attendance_list)):
        writer.writerow([i+1, attendance_list[i]])
References
[1] Agarwal, Mayank & Jain, Nikunj & Kumar, Manish & Agrawal, Himanshu. (2010). Face
Recognition Using Eigen Faces and Artificial Neural Network. International Journal of
Computer Theory and Engineering. 624-629. 10.7763/IJCTE.2010.V2.213.
[2] Mady, Huda & Hilles, Shadi M. (2018). Face recognition and detection using Random
Forest and combination of LBP and HOG features. 1-7. 10.1109/ICSCEE.2018.8538377.
[3] Farah Deeba, Hira Memon, Fayaz Ali Dharejo, Aftab Ahmed and Abdul Ghaffar,
“LBPH-based Enhanced Real-Time Face Recognition” International Journal of Advanced
Computer Science and Applications (IJACSA), 10(5), 2019.