MP IEEE Report

Framework For Analysis and Automatic
Summarization of Surveillance Footage

Kaustubh Vivek Khedkar, Dept. of Information Technology, NITK, Karnataka, India, 201IT128
Shlok Bhosale, Dept. of Information Technology, NITK, Karnataka, India, 201IT258
Abstract: The integration of dashcams in vehicles has revolutionized data collection on roadways, leading to an unprecedented accumulation of video footage. The aim of this project is to create a method for condensing large amounts of surveillance video into brief but informative summaries, which will increase the data’s usability and accessibility for a range of applications. We propose a system that readily incorporates dashcam video, increasing its usefulness for security procedures and urban development work. Leveraging automatic number plate detection and recognition, the system monitors vehicle activity in specific areas, aiding in the detection of unauthorized access. We implement Automatic Number Plate Detection and Recognition, Pothole Detection with Timestamping, Road Quality Assessment, and Road Object Recognition. By applying YOLOv5 models to object detection, we accurately locate and identify points of interest in processed videos, generating informative summaries for further analysis.

Keywords: Video Analysis, Surveillance Footage, License Plate Recognition, Road Object Detection, YOLOv5

I. INTRODUCTION

The incorporation of dashcams into automobiles has transformed the collection of data on public roads, resulting in an unprecedented quantity of recorded video. The increasing amount of data generated by dashcams calls for effective techniques for condensing it into concise and practical summaries. To meet this demand, the project presents a technique for analyzing vast quantities of surveillance footage and producing concise yet informative summaries, which improves the data’s usability and accessibility for a range of applications. A system that seamlessly integrates dashcam footage plays a crucial role in this strategy, expanding its influence to include security protocols and urban development projects.

YOLOv5, a state-of-the-art object detection model, offers significant capabilities for detecting various objects on the road, including license plates, potholes, road signs, and vehicles. YOLOv5’s robust architecture and training methods enable the identification and localization of these items with precision.

For applications like parking management, law enforcement, and traffic monitoring, license plate detection is essential. YOLOv5 can draw bounding boxes around the license plates it detects in recorded road footage. After additional processing, these regions of interest can be sent to a license plate recognition system, which extracts the alphanumeric characters for identification and study.

Detecting potholes is crucial to evaluating the condition of the road and guaranteeing driving safety. YOLOv5 can identify anomalies in the road surface, including gaps or depressions, and use that information to identify potholes. Authorities can prioritize repairs and enhance road conditions by timestamping and recording the locations of these potholes after they have been found.

YOLOv5 is also capable of detecting a wide range of other items on the road, such as cars and road signs. Road sign detection makes it possible to recognize directional signs, traffic signals, and speed limits, which is essential for traffic regulation and navigation systems. Vehicle detection, in turn, makes anomaly identification (e.g., parked cars in no-parking zones) and traffic flow analysis easier.

By utilizing YOLOv5 for object detection, pothole locations can be examined for road quality evaluation, license plate areas can be processed for recognition, and other items on the road can be identified for a variety of uses, including traffic control and urban planning. This comprehensive approach makes it possible to effectively use surveillance data to improve traffic control, road safety, and infrastructure management.

II. BACKGROUND

The emergence of dashcams in vehicles has revolutionized the collection of data on roadways, resulting in an abundance of video footage capturing various driving scenarios and conditions. This wealth of data presents both opportunities and challenges for enhancing road safety, traffic management, incident analysis, and urban planning. To harness the potential of this data effectively, automated systems for summarizing dashcam video footage have become a subject of extensive research and development.

A comprehensive literature survey reveals several key themes and advancements in the field of automatic summarization of dashcam video footage. One prominent area of focus is License Plate (LP) detection and recognition, which plays a crucial role in applications such as vehicle tracking, law enforcement, and toll collection. Researchers have explored diverse methodologies, ranging from traditional image processing techniques to state-of-the-art deep learning models.
These efforts have addressed challenges such as variations in LP formats between regions, adverse lighting conditions, and perspective distortion, leading to the development of accurate and robust LP recognition systems.

Z. Selmi et al. [Selmi et al., 2017] introduce an automatic deep learning-based system for detecting and recognizing vehicle License Plates (LPs). Their work emphasizes the challenges posed by LP variations between countries and the limitations of existing systems that operate under controlled conditions.

M. Swathi and K. V. Suresh [Swathi and Suresh, 2017] review traffic sign detection and recognition systems. For the traffic sign detection stage, they cover colour-based detection, shape-based detection, and methods combining colour and shape information.

M. Fei et al. [Fei, 2018] propose a novel compact yet rich key frame creation method for compressed video summarization. They first extract the DC coefficients of I-frames directly from a compressed video stream, then compute DC-based mutual information to segment the long video into shots.

S. Agrawal and K. D. Joshi [Agrawal and Joshi, 2022] add that licence plate recognition (LPR) on Indian commercial trucks poses distinct difficulties because there are no standardised databases, font styles vary, and numbers are occasionally hand-painted.

D. Arya et al. [Arya et al., 2022] helped create a dataset for road damage detection, a task that has traditionally been a laborious and subjective process dependent on visual inspections prone to human error. RDD2022 covers a wide range of road types, weather conditions, and damage manifestations, from India’s busy city streets to Japan’s immaculate motorways.

K. Zheng et al. [Zheng et al., 2012] proposed a method that combines the strengths of Haar-like features and Histograms of Oriented Gradients (HOG) to achieve robust and accurate plate detection. Haar-like features, known for their efficiency in identifying local image patterns, excel at locating potential plate regions.

Furthermore, studies have delved into the detection and assessment of road defects, including potholes, cracks, and bumps, using computer vision and machine learning approaches. By employing techniques such as image preprocessing, object detection algorithms, and transfer learning, researchers have achieved significant progress in accurately identifying road anomalies from dashcam footage. However, challenges remain in evaluating these systems across diverse road conditions, addressing speed limitations, and minimizing false negatives to ensure reliable performance in real-world scenarios.

Additionally, the literature highlights the importance of image preprocessing techniques to enhance the quality of input data for subsequent analysis. Techniques such as grayscale conversion, normalization, and noise reduction are essential for improving the accuracy and robustness of deep learning models trained on dashcam footage. Moreover, the utilization of multiple datasets, including those specific to different regions and road types, is emphasized for training and evaluating the effectiveness of automatic summarization systems.

Despite notable advancements, the literature also identifies several limitations and areas for future research. These include the need for broader evaluation datasets covering diverse road conditions, extending detection capabilities to multi-lane roads and complex intersections, considering hardware requirements for practical implementation, and addressing false negatives to ensure the reliability and safety of automated systems.

In summary, the background literature provides a comprehensive overview of the current state of the art in automatic summarization of dashcam video footage, highlighting both advancements and avenues for further exploration and development.

III. METHODOLOGY

A. Video Input and Preprocessing using OpenCV

Our approach uses OpenCV for video input and preprocessing, allowing each frame to be analysed in detail. We use a range of image preprocessing methods to improve the quality of the frames and spot possible subjects, like licence plates and billboards. We adjust the number of frames skipped during preprocessing to maintain real-time video processing capabilities while reducing processing delay. YOLOv5 object detection models are integrated to detect and identify objects of interest. Post-detection processing is then used to enhance the quality and accuracy of the images. Together, YOLOv5 detection and effective preprocessing support robust and accurate detection and analysis.

Fig. 1. System Flowchart

B. Object detection methods

We examined the performance of the machine learning-based object detection algorithm YOLOv5 (You Only Look Once version 5). Conventional approaches depend on manually designed features and rules, resulting in complex workflows and restricted flexibility. YOLOv5, powered by deep learning, analyses a complete image in a single pass and learns complex features straight from data, which makes it well suited to real-time processing. It offers robustness and versatility, detecting multiple objects with precise localization in one go, without requiring multiple iterations. The computational requirements of YOLOv5 during training and execution must be taken into account, as must the fact that the quality of the training data affects its performance. After comparing methods, we decided that YOLOv5 would be the most effective object recognition method for our project.
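
As a rough illustration of the frame-skipping step in Section III-A, the sketch below yields every (skip + 1)-th frame together with its index and timestamp. The function name `sample_frames`, its parameters, and the placeholder strings standing in for decoded images are assumptions made for this example; the report does not specify how frames are skipped. In a real pipeline the frames would come from an OpenCV video reader, and the selected frames would be passed on to the detector.

```python
from typing import Iterable, Iterator, Tuple, TypeVar

Frame = TypeVar("Frame")


def sample_frames(
    frames: Iterable[Frame], fps: float, skip: int = 0
) -> Iterator[Tuple[int, float, Frame]]:
    """Yield (frame_index, timestamp_seconds, frame) for every (skip + 1)-th frame.

    `skip` is the number of frames dropped between two processed frames
    (a hypothetical parameter; larger values trade detail for speed).
    """
    if fps <= 0:
        raise ValueError("fps must be positive")
    if skip < 0:
        raise ValueError("skip must be non-negative")
    step = skip + 1
    for index, frame in enumerate(frames):
        if index % step == 0:
            # The timestamp is derived from the frame index and the frame rate.
            yield index, index / fps, frame


if __name__ == "__main__":
    # Placeholder frames; real frames would be image arrays read from a video.
    fake_frames = [f"frame-{i}" for i in range(10)]
    for index, timestamp, frame in sample_frames(fake_frames, fps=5.0, skip=2):
        print(index, round(timestamp, 2), frame)
```

Raising `skip` lowers the per-second workload of the detector at the cost of possibly missing objects that are visible for only a few frames, which is the trade-off the methodology describes.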
C. Number Plate and Sign Board recognition

In our system, vehicles are marked as regions of interest (ROIs) for focused analysis of number plates, employing EasyOCR for character extraction. Character visibility inside ROIs is improved by preprocessing techniques like thresholding and noise reduction. EasyOCR, chosen for its accuracy, detects alphanumeric characters, which are stored with timestamps for accurate identification.

CNN-based detection is also integrated with a traffic sign identification module, which classifies signs according to their shape, colour, and symbolism. The database stores the classifications and timestamps made possible by sign localization. Repeated number plate detections are handled by de-duplication algorithms, which keep identification accurate over time.

D. Road Quality Assessment

The road quality detection methodology begins with comprehensive data collection, acquiring diverse video footage from dashcams and annotating frames to highlight specific features like potholes and road cracks. Data preprocessing then breaks down the video into separate frames, isolating regions of interest (ROIs) that include the road surface and identifying important features related to the detection of potholes and road cracks. By using deep learning models such as ResNet and other convolutional neural networks (CNNs), the methodology aims to extract characteristics that are indicative of road cracks and potholes.

Fig. 2. Pothole detection

E. Database functions

By adding functionality such as accurate frame extraction for focused analysis and thorough statistical tracking for daily, weekly, and monthly records, we have increased the scope of our license plate recognition database. These improvements enable thorough vehicle tracking, simplify entry and departure tracking, permit in-depth vehicle frequency analysis, and make it easier to examine specific vehicle activity in detail. Following these modifications, our system offers flexible tracking functions compatible with dashcam and CCTV applications. The database supports the efficient storage, retrieval, and analysis of licence plate data, improving the sophistication and efficacy of surveillance and security applications.

The database also makes vehicle frequency analysis possible, which helps determine how frequently vehicles appear over a given period of time. This analysis offers useful information for resource allocation and traffic management, helping to maximise road usage and guide infrastructure development.

Fig. 3. Database Functionality

Moreover, the comprehensive vehicle tracking function permits ongoing observation of individual cars, letting users follow the whereabouts and actions of certain automobiles across time. This feature is especially helpful in security applications where it is crucial to detect suspicious vehicle movements or make sure authorized vehicles are following approved routes.

Overall, these functionalities of our license plate recognition database enhance the system’s versatility, adaptability, and effectiveness in various surveillance scenarios, for both dashcam and surveillance footage.
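
The de-duplication and frequency analysis described for the plate database could look roughly like the sketch below. The report does not say how duplicates are identified, so the time-window rule, the normalisation of plate text, and the names `deduplicate` and `plate_frequency` are all assumptions chosen for illustration.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple

Detection = Tuple[float, str]  # (timestamp in seconds, plate text from OCR)


def deduplicate(detections: Iterable[Detection], window: float = 5.0) -> List[Detection]:
    """Keep one record per continuous sighting of a plate.

    A reading is dropped when the same plate was seen at most `window`
    seconds earlier (hypothetical rule, not taken from the report).
    """
    last_seen: Dict[str, float] = {}
    kept: List[Detection] = []
    for timestamp, plate in sorted(detections):
        key = plate.replace(" ", "").upper()  # normalise OCR spacing and case
        previous = last_seen.get(key)
        if previous is None or timestamp - previous > window:
            kept.append((timestamp, key))
        # Refresh on every sighting so a plate that stays in view counts once.
        last_seen[key] = timestamp
    return kept


def plate_frequency(detections: Iterable[Detection]) -> Counter:
    """Count how many distinct sightings each plate has."""
    return Counter(plate for _, plate in detections)


if __name__ == "__main__":
    raw = [
        (0.0, "KA19 AB1234"),
        (1.0, "ka19ab1234"),
        (2.0, "MH12CD5678"),
        (20.0, "KA19AB1234"),
    ]
    sightings = deduplicate(raw)
    print(sightings)
    print(plate_frequency(sightings))
```

Grouping the de-duplicated sightings by day, week, or month would give the periodic statistics mentioned above; that grouping is left out here to keep the example short.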
IV. RESULTS

In this section, we compare the results of the tested architecture with other benchmark methods.

A. Automatic Number Plate Detection

In our licence plate detection (LPD) system, we employ OpenCV functions to initially identify license plates, followed by character extraction using the EasyOCR Python library. YOLOv5, a variant of the YOLO (You Only Look Once) object detection algorithm, is instrumental in license plate recognition (LPR) due to its accuracy and efficiency, allowing real-time processing across diverse settings. YOLOv5’s unified architecture predicts bounding boxes and class probabilities simultaneously, facilitating efficient extraction of the license plate data needed for subsequent character recognition.

The LPD system leverages EasyOCR for robust Optical Character Recognition (OCR) on localized license plate regions, enhancing the capabilities of the Haar Cascade framework. By integrating EasyOCR, alphanumeric characters can be reliably extracted from detected license plates. This combination of EasyOCR’s character recognition accuracy and YOLOv5’s effective initial detection provides a comprehensive approach to license plate recognition, further supported by the Levenshtein distance metric, which is used to calculate character similarity and improve overall system reliability and accuracy.

Fig. 4. Automatic Plate Detection

YOLOv5’s adaptability to various license plate designs, sizes, and orientations, derived from training on diverse annotated datasets, makes it a potent tool for precise and rapid license plate detection. This capability, along with its efficient handling of multiple objects in a scene, enhances the overall efficacy of license plate recognition systems in applications such as access control, security, and traffic monitoring. Pure CV methods, such as Morphological Filters, achieved a detection rate of 95.5% with minimal false positives and a recognition rate of 90%. Models employing transfer learning and a YOLO architecture achieved a mean average precision (mAP) of 99.2%. Additionally, Haar cascades demonstrated a recall of 95.2% and an accuracy of 98.0%, with an average detection time of only 75 ms, underscoring their effectiveness in the LPD system.

TABLE I
PERFORMANCE METRICS OF MODEL

Metric                              Performance
Average Precision                   0.8310
Average Levenshtein Distance        3
Levenshtein Distance (Accuracy)     70%

B. Road Sign Detection and Classification

Initially, a Darknet model trained via the OpenCV dnn library detected Traffic Signs across four categories. Subsequently, a CNN model in Keras classified cut fragments of Traffic Signs into 43 classes. Trained weights were then loaded into the YOLO v3 network, with parameters configured for detection. Frames were extracted from video and processed with this model.

The classification model trained on the German Traffic Sign Recognition Benchmark (GTSRB) achieved an accuracy of 0.868, utilizing 66,000 RGB images with 19x19 convolutional layer filters. The initial model focusing on traffic sign location within four categories attained a high mAP accuracy of 97.22.

Fig. 5. Sign Detection from video frames

Introducing an additional convolutional layer for classification enhanced overall system efficiency. Video processing ran at between 36 and 61 frames per second (FPS), varying with the number of traffic signs detected per frame, which makes the system suitable for real-time applications.

When benchmarked against other systems on GTSRB, our model reached an accuracy of 97.04%, whereas some models, such as a CNN with 3 Spatial Transformers, reached an accuracy of 99.71% according to the benchmark metrics.

TABLE II
COMPARISON OF RESULTS ON GTSRB

Model                                                          Accuracy
YOLOv3 with CNN                                                97.04%
CNN with 3 Spatial Transformers [Arcos-García et al., 2018]    99.71%
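
For readers unfamiliar with the Levenshtein distance reported in Table I, the sketch below computes it with the standard dynamic-programming recurrence and derives a normalised similarity score from it. The normalisation by the longer string's length and the function names are assumptions for illustration; the report does not state how its accuracy percentage was obtained from the distance.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions, or
    substitutions needed to turn string `a` into string `b`."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(
                min(
                    previous[j] + 1,         # deletion
                    current[j - 1] + 1,      # insertion
                    previous[j - 1] + cost,  # substitution or match
                )
            )
        previous = current
    return previous[-1]


def similarity(predicted: str, truth: str) -> float:
    """Similarity in [0, 1]: one minus the distance divided by the longer
    length (an assumed normalisation, not taken from the report)."""
    longest = max(len(predicted), len(truth))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(predicted, truth) / longest


if __name__ == "__main__":
    # A typical OCR confusion: the letter B read as the digit 8.
    print(levenshtein("KA19AB1234", "KA19A81234"))
    print(similarity("KA19AB1234", "KA19A81234"))
```

A distance of zero means the OCR output matches the reference plate exactly, so lower average distances indicate better character recognition.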
C. Road Quality Assessment

Employing YOLOv5 for pothole detection is a compelling approach owing to its efficiency and accuracy in object detection tasks. Pothole features such as size, form, and orientation are included in the annotated datasets used to train the model, which allows YOLOv5 to learn to recognise potholes in real-world circumstances. Its ability to predict bounding boxes and class probabilities simultaneously improves the detection process by precisely localising potholes on road surfaces. Furthermore, YOLOv5’s capacity to adjust to various lighting and road conditions enhances its pothole detection efficacy. Integrating YOLOv5 into pothole detection systems may greatly streamline the identification and evaluation of road surface problems, which would enhance road safety and infrastructure maintenance operations.

YOLOv5 approaches object detection as a regression problem, simultaneously learning both bounding box coordinates and class labels from a given input image. A number of observations were made after a careful analysis of the detection results. Raindrops on car windscreens or reflections of objects on wet roadways were often mistaken for potholes. Similarly, potholes were occasionally confused with tiny fissures and dark patches on roads during low visibility. False detections also occurred in clear weather conditions, such as the reflection from a car hood being mistaken for a pothole.

TABLE III
COMPARISON OF RESULTS ON RDDC2022 [ARYA ET AL., 2022]

Model                                                     F1 Score
YOLO-series and FasterRCNN-series Ensemble model          0.716
YOLOv5x P5 and P6 Ensemble with Image patch               0.674
YOLOv7 with label smoothing and coordinate attention      0.663

D. Vehicle Detection and Counter

To develop a vehicle detection and counting system for highway traffic, we begin by collecting a diverse dataset of highway traffic videos capturing various vehicle types under different environmental conditions and traffic scenarios. The videos then undergo preprocessing, which includes splitting them into individual frames and resizing the frames to fit the object detection model’s required input size. Pixel value normalisation keeps processing consistent.

For object detection, we employ the YOLOv5 model, known for its efficiency and accuracy in real-time applications. After training on the gathered dataset, the model can be fine-tuned to detect specific vehicle types and to perform better in difficult conditions. The trained YOLOv5 model detects vehicles in every frame of the video feed.

We apply the Deep SORT method to follow detected objects across successive frames and enable counting. By linking detected bounding boxes with existing object tracks, this approach supports continuous vehicle counting inside a specified region of interest (ROI) and facilitates vehicle identity monitoring.

Fig. 6. Vehicle Detection and Counter

The vehicle detection and counting system’s web interface can be made more user-friendly by integrating with Streamlit, a Python-based web application framework. With Streamlit, the interface can offer real-time video streaming with easy-to-use controls for managing camera feeds, as well as overlays of detected vehicles and count statistics.

After development, performance parameters including accuracy, precision, recall, and processing speed are measured through extensive testing and evaluation using a separate test dataset. Iterative improvements are then made based on the evaluation outcomes to improve system accuracy and performance.
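
To make the counting step concrete, the sketch below counts distinct tracked vehicles whose bounding-box centre falls inside a rectangular ROI. It does not implement Deep SORT; it assumes a tracker has already assigned a stable integer ID to each vehicle, and the data layout (`TrackedFrame`), the rectangular ROI, and the function names are hypothetical choices for this example.

```python
from typing import Dict, Iterable, Set, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels
TrackedFrame = Dict[int, Box]            # hypothetical layout: track ID -> box


def box_center(box: Box) -> Tuple[float, float]:
    """Return the centre point of a bounding box."""
    x1, y1, x2, y2 = box
    return (x1 + x2) / 2.0, (y1 + y2) / 2.0


def inside(point: Tuple[float, float], roi: Box) -> bool:
    """True when the point lies within the rectangular ROI (edges included)."""
    x, y = point
    x1, y1, x2, y2 = roi
    return x1 <= x <= x2 and y1 <= y <= y2


def count_vehicles(frames: Iterable[TrackedFrame], roi: Box) -> int:
    """Count distinct track IDs whose box centre enters the ROI at least once."""
    seen: Set[int] = set()
    for tracks in frames:
        for track_id, box in tracks.items():
            if inside(box_center(box), roi):
                seen.add(track_id)  # the set keeps each vehicle counted once
    return len(seen)


if __name__ == "__main__":
    roi = (0.0, 0.0, 100.0, 100.0)
    frames = [
        {1: (0, 0, 10, 10)},
        {1: (5, 5, 15, 15), 2: (200, 200, 220, 220)},
        {2: (40, 40, 60, 60), 3: (150, 150, 170, 170)},
    ]
    print(count_vehicles(frames, roi))
```

Counting by track ID rather than by detection is what prevents a vehicle that stays in view for many frames from being counted repeatedly, which is the role the tracker plays in the pipeline described above.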
V. CONCLUSIONS

The integration of YOLO for object detection with OpenCV’s features has resulted in a system for real-time video analysis that segments video streams and accurately recognizes objects, including license plates. By carefully preparing regions of interest (ROIs) and timestamping each detection, our approach to licence plate identification makes it easier to build large databases for traffic analysis and law enforcement applications.

Our system’s versatility also extends to the analysis of surveillance footage, allowing for efficient object tracking and detection. Additionally, the database supports a number of auxiliary applications, such as incident recording and vehicle movement tracking, and employs de-duplication techniques to ensure data integrity. In short, our method provides a technically sound means of monitoring and assessing traffic behaviour, giving stakeholders useful information and tools to improve security, safety, and operational effectiveness in transportation networks.

REFERENCES

[Agrawal and Joshi, 2022] Agrawal, S. and Joshi, K. D. (2022). Indian commercial truck license plate detection and recognition for weighbridge automation.
[Arcos-García et al., 2018] Arcos-García, Á., Alvarez-Garcia, J., and Soria Morillo, L. (2018). Deep neural network for traffic sign recognition systems: An analysis of spatial transformers and stochastic optimisation methods. Neural Networks, 99.
[Arya et al., 2022] Arya, D., Maeda, H., Ghosh, S. K., Toshniwal, D., and Sekimoto, Y. (2022). RDD2022: A multi-national image dataset for automatic road damage detection.
[Fei, 2018] Fei, M., J. W., and M. W. (2018). A novel compact yet rich key frame creation method for compressed video summarization. In 2021 International Conference on Robotics and Automation in Industry (ICRAI).
[Selmi et al., 2017] Selmi, Z., Ben Halima, M., and Alimi, A. M. (2017). Deep learning system for automatic license plate detection and recognition. In 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), volume 01, pages 1132–1138.
[Swathi and Suresh, 2017] Swathi, M. and Suresh, K. V. (2017). Automatic traffic sign detection and recognition: A review. In 2017 International Conference on Algorithms, Methodology, Models and Applications in Emerging Technologies (ICAMMAET), pages 1–6.
[Zheng et al., 2012] Zheng, K., Zhao, Y., Gu, J., and Hu, Q. (2012). License plate detection using Haar-like features and histogram of oriented gradients. In 2012 IEEE International Symposium on Industrial Electronics, pages 1502–1505.
