Plagiarism Checker X - Report
Originality Assessment
Overall Similarity: 15%
Remarks: Moderate similarity detected; consider enhancing the document if necessary.
Date: May 25, 2024
Matches: 2278 / 15289 words
Sources: 45
Verify report: scan the QR code.

A Project Report On Bird Species Recognition Application Submitted in partial fulfilment for
the degree of Bachelor of Technology in Data Science Submitted by Chiranjeevi Hole
(2015003) Sakshi Kakade (2015031) Priti Mhatre (2015040) Under the guidance of Prof. Merrin Mary Soloman
Shreemati Nathibai Damodar Thackersey Women's University, Juhu Tara Road, Santacruz (West), Mumbai-400049 2023-2024
DECLARATION We, Chiranjeevi Hole, Sakshi Kakade, and Priti Mhatre, hereby declare that
the work presented in this project entitled "Bird Species Recognition Application" is entirely
our own. The content of this project has been generated through our independent efforts,
research, and scholarly contributions. We further declare that: 1. Originality: The ideas,
concepts, and contributions presented in this work are solely the result of our own
intellectual endeavours. 2. Authenticity: All data, figures, tables, and findings presented in
this project are genuine and have not been fabricated or manipulated. 3. No Use of AI
Tools: We have not used any AI-based tools to generate significant portions of this
project, including but not limited to content, research objectives, hypotheses, and analysis.
4. No Plagiarism: We have properly cited and referenced all external sources and works
consulted during the preparation of this project. There is no
instance of plagiarism or unauthorized use of others' intellectual property. 5. Independent
Work: This work has been conducted independently, without any collaboration or
assistance that would compromise the originality of the content. 6. Academic Integrity: We
have adhered to the principles of academic integrity and ethical research throughout
the entire process of producing this project. We understand that this declaration
reflects the nature and authenticity of our work. Date: 29th May, 2024 Signature:
Chiranjeevi Hole Sakshi Kakade Priti Mhatre
CERTIFICATE This is to certify that Chiranjeevi Hole, Sakshi Kakade and Priti Mhatre have
completed the Project II report on the topic "Bird Species Recognition Application" satisfactorily in Data Science under the
guidance of Prof. Merrin Mary Soloman, as prescribed by Shreemati Nathibai Damodar
Thackersey Women's University (SNDTWU). Guide: Prof. Merrin Mary
Soloman. Head of Department: Mr. Rajesh Kolte. Principal: Dr. Yogesh Nerkar. Examiner 1 Examiner 2
Contents
Abstract
List of Figures
Nomenclature
1 Introduction
  1.1 Background
  1.2 Research Motivation
  1.3 Research Objectives
  1.4 Scope of the Study
  1.5 Structure of the Document
2 Review of Literature
  2.1 Introduction
  2.2 Scope and Objectives
3 Methodology
  3.1 Introduction to Methodology
    3.1.1 Methodology for Model
    3.1.2 Application Development
    3.1.3 Bird Sound Recognition Model
  3.2 Research Design
    3.2.1 Justification
  3.3 Research Approach
  3.4 Sampling Strategy
  3.5 Data Collection Methods
  3.6 Data Analysis Techniques
  3.7 Research Ethics
  3.8 Limitations and Assumptions
  3.9 Timeline and Schedule
    3.9.1 System Architecture
    3.9.2 Design Details
  3.10 System Development
    3.10.1 Programming Languages and Tools
    3.10.2 Implementation Details
  3.11 System Testing
  3.12 Results and Evaluation
    3.12.1 Performance Metrics
    3.12.2 Comparison with Requirements
  3.13 Conclusion
  3.14 Result/Output of Project
    3.14.1 For Bird Species Recognition Through Image Model
    3.14.2 For Bird Species Recognition Through Application
    3.14.3 For Bird Species Recognition (Voice) Through Application
4 Conclusion and Future Scope
  4.1 Conclusion
  4.2 Future Scope
A Brief Bio-data of Each Student
B Plagiarism Report
C Research Paper Based on Project
D References
List of Figures
3.1 Project Roadmap
3.2 Predictions for Bird Recognition through Image: Kaggle model testing
3.3 Predictions for Bird Recognition through Application
3.4 Predictions for Bird Recognition (Voice) through Application
Nomenclature 1. GUI: Graphical User Interface 2. JDK: Java Development Kit 3. SDK:
Software Development Kit 4. RAM: Random Access Memory 5. CPU: Central Processing
Unit 6. GPU: Graphics Processing Unit 7. OS: Operating System 8. IDE: Integrated
Development Environment 9. API: Application Programming Interface 10. UI: User
Interface 11. ML: Machine Learning 12. CNN: Convolutional Neural Network
Chapter 1 Introduction There are few things that compare to the delight of viewing and
admiring the many bird species that adorn our skies and landscapes in a world full of life.
The beauty and variety of birds capture our attention and inspire awe in all of us, from the
majestic soaring of eagles to the delicate fluttering of hummingbirds. Being able to
recognize and identify these winged creatures brings a great sense of satisfaction to
individuals who have a passion for ornithology, a strong appreciation for nature, or are just
curious about the diverse array of birds that live around us. Our goal of introducing people to this world is one of the driving forces behind our Bird Species Recognition Application. Utilizing state-of-the-art
technology and the capabilities of Convolutional Neural Networks (CNNs), we have
developed an advanced tool that enables users to identify bird species with previously
unheard-of ease and precision. Our program is based on the EfficientNet B0
architecture, which is well known for its effectiveness and efficiency in image recognition
tasks. Through the utilization of CNNs, we present a smooth and accurate method for
identifying bird species in photos, giving users an engaging experience that encourages a
closer bond with the natural world. 1.1 Background Given the diversity of avian species
found globally, bird species recognition is essential for a number of professions, including
ornithology, biodiversity monitoring, and ecological study. Prior work has concentrated on automating recognition through
the use of highly accurate Convolutional Neural Networks (CNNs). The great diversity of
bird species and the need to distinguish subtle visual traits pose challenges. Model adaptation has
been enhanced by recent developments in transfer learning. Research examines citizen
science for dataset development, data augmentation, and optimizing pre-trained models.
Preprocessing images, extracting features, and evaluating models using metrics are
important areas. The purpose of this work is to advance techniques for identifying bird
species for use in research and conservation. 1.2 Research Motivation We are driven by
the goal of democratizing bird identification, making it available to everyone while
addressing issues that hobbyists and citizen scientists encounter. Particularly for rare
species, traditional techniques like field guides might be restrictive. Our goal is to close this
gap by using state-of-the-art technology and Convolutional Neural Networks (CNNs) to
enable people to get involved in conservation activities and strengthen their bond with
nature. In addition to promoting scientific literacy and inspiring a new generation of
environment lovers, our research aims to conserve biodiversity worldwide. 2
1.3 Research Objectives The following are the research's goals: 1. Developing Bird Species Recognition Application: Using convolutional neural networks (CNNs), develop
an intuitive application that can reliably identify bird species from photos. 2. Using CNNs:
Take advantage of CNNs to overcome obstacles in image recognition tasks and improve
efficiency and accuracy in bird species identification. 3. Integration of EfficientNet B0
Architecture: For optimal performance and accurate species classification, integrate the
EfficientNet B0 architecture. 4. User Accessibility: Give ease of use and user accessibility a
priority by creating an interface that is simple and intuitive to use. 5. Validation
and Testing: Make sure the application is accurate by putting it through a rigorous test
against databases of recognized bird species and actual situations. 6. Research
Questions: Discuss important issues pertaining to CNN accuracy, efficacy, contrast with
current techniques, and implications for conservation and birdwatching. The project hopes
to improve bird species recognition technologies and promote a better knowledge of avian
biodiversity by accomplishing these goals. 1.4 Scope of the Study The goal of this project
is to create and assess a Bird Species Recognition Application that uses CNNs—more
especially, the EfficientNet B0 architecture—to identify different species from photos. It
covers things like building applications, implementing CNN, acquiring datasets,
preprocessing, training models, evaluating metrics, and deploying mobile platforms.
Limitations: 1. Other modalities such as bird sound recognition are not included in the
study; it only deals with the identification of bird species from images. 2. It excludes other
CNN variations and limits the investigation of CNN architectures
to the EfficientNet B0 model. 3. Extensive CNN model fine-tuning and sophisticated
application functionalities are excluded due to resource limitations. 4. The assessment
concentrates on conventional metrics, possibly ignoring intricate assessments such as
robustness against adversarial assaults and transferability. 1.5 Structure of the Document
1. Introduction: Bird species recognition is crucial for various professions, including
ornithology, biodiversity monitoring, and ecological study. The beauty and variety of bird
species, from eagles to hummingbirds, capture our attention and inspire awe. The Bird Species Recognition Application aims to introduce people to the fascinating world of birds, born of
our mutual fascination with them. Utilizing state-of-the-art technology
and Convolutional Neural Networks (CNNs), the application provides an advanced tool
for users to identify bird species with unprecedented ease and precision. The program
is based on the EfficientNet B0 architecture, known for its effectiveness in image
recognition tasks. The application presents a smooth and accurate method for identifying
bird species in photos, encouraging a closer bond with the natural world. Recent
developments in transfer learning have enhanced model adaptation, and research examines
citizen science for dataset development, data augmentation, and optimizing pre-trained
models. The purpose of this work is to advance techniques for identifying bird species for
use in research and conservation. 2. Literature Review: 1. Bird Image Classification using
CNN Transfer Learning Architectures + Builds an application for bird species detection on
varied dataset using Convolutional Neural Network. - Accuracy of the built application is
75%. - Large and quality dataset built with citizen scientists' assistance. - The app can be
fine-tuned for better accuracy and bird identification. 2. Automatic Bird Species
Identification Using Deep Learning - Bird dataset built based on Asian bird species. -
Pretrained ResNet model achieved greater accuracy than the base model. - Final model
shows 97.98% accuracy.
+ Cutting-edge vision automation achieved fast results with zero development costs. 3.
Building a Bird Recognition App and Large-scale Dataset with Citizen Scientists. « Builds
an application for bird species detection on varied dataset. - The app can be fine-tuned for
better accuracy and bird identification. 3. Methodology: - Model Development Methodology
+ Importing Libraries: The project imports critical Python libraries, including TensorFlow, for
machine learning and deep neural networks. + Setting Up Kaggle Environment: The Kaggle
API configuration file is set up to provide access to Kaggle’s extensive dataset collection. +
Downloading and Unzipping Datasets: Two datasets, "525-western-bird-species" and
"25-indian-bird-species," are downloaded and organized. - Merging Datasets: The script merges datasets, transferring images from the "indian-bird-species" dataset into the
"western-bird-species" dataset, expanding the model's classification capabilities. - Setting
Random Seed: A random seed within TensorFlow is set to ensure reproducibility of
experiments and assess model performance stability. « Defining Constants: Essential
constants such as batch size, image size, image shape, number of classes, and training
epochs are explicitly defined. + Data Augmentation: Strategies are defined to augment
training data, generating diverse variations of training images. + Creating Data Generators
Two data generators are created to load and preprocess images from training and
validation directories. - Loading Pretrained Model: The EfficientNetB0 model is loaded with
weights from the 'imagenet’ dataset, fine-tuning it for the specific problem. « Custom Model
Head: A custom classification head is integrated onto the base model, including global
average pooling, dropout layers, and fully connected layers equipped with ReLU
activation functions. - Compiling the Model: The model is systematically compiled with an
Adam optimizer and categorical cross-entropy loss.
Model Training and Application Development: - Model Training: Utilizes 'EarlyStopping' callback to
halt training if validation accuracy plateaus. - 'ModelCheckpoint' callback ensures optimal
settings are maintained during training. - Training the Model: Utilizes 'fit' method to provide
model with training and validation data.
+ Stores training progress in the ‘history’ variable. Saving Class Names: + Stores class
names in a JSON file for interpreting model predictions. + Loads the best model
configuration from the checkpoint file. Loading Class Names: « Retrieves class names from
the JSON file for human-readable labels. Making Predictions: + Loads, preprocesses, and
passes new images through the model for classification. - Predictions are accepted only when
the model's confidence exceeds a threshold. Application Development: - Converts the model to
TensorFlow Lite format for efficient execution on resource-constrained devices. - Creates a
JSON file for bird class names. * Creates an Android app using Android Studio * Selects
Java as the primary programming language for Android app development. « IntegratesTensorFlow Lite, a framework developed by Google, into the Android app.
Sound Model Importation and Integration: - Importing essential libraries: Pandas (pd),
NumPy (np), Matplotlib.pyplot (plt), Scikit-learn for data manipulation, and Scikit-learn for
dataset splitting. - Training the Audio Model using YAMNet: YAMNet is a convolutional
neural network (CNN) that processes audio spectrograms. It uses convolutional filters to
extract features from the signal, learning hierarchical features of audio events.
YAMNet uses gradient descent optimisation and backpropagation to minimize discrepancy
between input audio sample true labels and predicted probabilities. - Integrating the
model with the Android bird recognition app: This involves selecting a suitable model,
adapting it to the application's needs, collecting and preprocessing data, integrating the
model into the app, designing a user-friendly interface, optimizing for real-time processing,
testing and evaluating performance, gathering user feedback, and continuously improving
the feature based on feedback and performance metrics. 4. Results and Conclusions: -The
project combines mobile app development, deep learning, and data science to create a
flexible bird species classification system. It involves library imports, dataset blending, and
machine learning model honing. The application showcases the potential of machine
learning in practical applications through expert training, post-training optimizations, and
TensorFlow Lite integration. The accuracy of the bird recognition application is around
97%. The
bird recognition app enriches users’ lives through a fun and educational experience,
contributing to biodiversity conservation and scientific research. As technology advances,
the app’s potential to impact our understanding and appreciation of the natural world is
promising. 5. Discussion: - The bird identification application demonstrated its
effectiveness in key areas, such as accurate bird identification, through extensive testing
and validation. The system's scalability and adaptability allowed it to handle varied image
processing requirements and growing user loads, indicating its ability to handle future
expansion and changing user needs. The application has potential for more extensive usesin research, teaching, and wildlife conservation, increasing public knowledge of bird
biodiversity and enabling citizen involvement in scientific studies. Future directions and
lessons learned include prioritizing iterative development cycles and continuous user
engagement to enhance application effectiveness and user satisfaction. Early and ongoing
user feedback should be prioritized in future iterations to spur advancements and guide
feature development. 6. Conclusion: - The process of this project combined mobile app
development, deep learning, and data science in a seamless way. Each stage built the
foundation for flexible bird species classification, from carefully chosen library imports to
the blending of various datasets and the honing of a strong machine learning model.
Expert training, post-training optimizations, and TensorFlow Lite’s incorporation into an
Android application demonstrated how complex algorithms can be combined with user-
friendly mobile interfaces. This convergence made sure that the user experience was the
main focus while showcasing the potential of machine learning in practical applications. In
essence, the bird recognition app not only enriches the lives of users by providing a fun
and educational experience but also contributes to the broader goals of biodiversity
conservation and scientific research. As technology continues to advance, the potential for
such applications to make a meaningful impact on our understanding and appreciation of
the natural world is truly promising. 7
Chapter 2 Review of Literature 2.1 Introduction The identification of bird species has
attracted a lot of interest lately because of its consequences for ecological study,
conservation initiatives, and biodiversity monitoring. Thanks to developments in deep
learning and computer vision algorithms, scientists have investigated a number of
approaches for precisely categorizing different bird species using image and audio data. In
this review of the literature, we examine four studies that use transfer learning
architectures and convolutional neural networks (CNNs) to identify different species of
birds. Every paper offers a distinct perspective on the evolution of bird recognition systems,
highlighting the significance of selecting appropriate model architectures, high-qualitydatasets, and future approaches for enhancing scalability and accuracy.We seek to get a
thorough grasp of the current state of the art in bird species recognition and identify
potential for additional research and development by carefully analyzing the techniques,
findings, benefits, and limits of these studies. 2.2 Scope and Objectives The primary goal
of this review of the literature is to investigate approaches for bird species detection that
make use of computer vision algorithms and convolutional neural networks (CNNs). It
looks into several methods and models used for picture and sound-based bird species
identification and classification. The review will
examine the benefits, drawbacks, and potential consequences of these approaches for
conservation initiatives, biodiversity monitoring, and technology development. Examining
previous studies, recognizing CNN architectures and transfer learning strategies,
assessing performance metrics, weighing the benefits and drawbacks of different
approaches, and investigating potential future developments like real-time application
integration and model fine-tuning are some of the specific goals. Review of Literature
Theme 1: PakhiChini: Automatic Bird Species Identification Using Deep Learning.
AUTHOR NAME: Kazi Md Ragib, Raisa Taraman Shithi, Shihab Ali Haq, Md Hasan, Kazi
Mohammed Sakib, Tanjila Farah DATASET OBTAINED FROM: Using different dataset
available for bird classification based on Western culture, plus collecting the uncommon
data of those bird species from various sources and merging them with western dataset.
ALGORITHM USED: The deep learning algorithm utilized a pretrained CNN model that
consisted of four different variants of ResNet, namely ResNet18, ResNet34, ResNet50,
and ResNet101. + The variants were employed to extract intricate features from input
images. Alongside ResNet, two fully connected layers were incorporated to further
process these features, enabling the network to capture high-dimensional representations
essential for the task at hand. To mitigate overfitting, a dropout layer was integrated,
randomly deactivating neurons during training. This comprehensive architecture enabled
the algorithm to effectively analyze and classify images with robustness and accuracy.WEB DEPLOYMENT: Web-based API service was developed using Flask micro-
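The dropout mechanism credited above with mitigating overfitting can be sketched independently of ResNet. This is a generic inverted-dropout illustration in NumPy, not code from the reviewed paper:

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout(x, rate=0.5, training=True):
    """Randomly deactivate units during training (inverted dropout)."""
    if not training:
        return x                          # at inference the layer is a no-op
    mask = rng.random(x.shape) >= rate    # keep each unit with prob (1 - rate)
    return x * mask / (1.0 - rate)        # rescale to preserve the expected activation

x = np.ones(1000)
y = dropout(x, rate=0.5)
assert np.isclose(y.mean(), 1.0, atol=0.2)   # expected activation preserved
assert 0.35 < (y == 0).mean() < 0.65         # roughly half the units were dropped
```

Because surviving units are rescaled by 1/(1 - rate), the layer can simply be skipped at inference time, which is how the technique avoids a train/test mismatch.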
framework. CONCLUSION: « The bird’s dataset was built based on Asian bird species. «
‘Two different models were proposed which showed that the proposed pretrained ResNet
model achieved greater accuracy in comparison to the based model. 9
+ Their final model shows 97.98% accuracy in identifying the bird species. ADVANTAGES:
‘System used deep learning algorithms with high accuracy and implementation of cutting-
edge vision automation was done to get fast results with zero development costs.
DISADVANTAGES: Due to the shortage of Asian based birds dataset, limited work was
done on this topic. Theme 2: FR] Building a bird recognition app and largescale dataset with
citizen scientists AUTHOR NAME: Grant Van Horn, Jessie Barry, Steve Branson, Panos
Ipeirotis,Ryan Farrell, Pietro Perona, Scott Haber, Serge Belongie DATASET OBTAINED
FROM: surveys using a combination of citizen scientists, experts, and Mechanical
Turk workers; ImageNet. ALGORITHM USED: - The research paper employed a specialized
algorithm known as the edge of CNN for Computer Vision tasks. This algorithm capitalized
on ff] the capabilities of Convolutional Neural Networks (CNNs) to extract features from
images, particularly focusing on detecting edges and patterns. o Through a series of
convolutional layers, the algorithm convolved leamed filters across input images,
effectively capturing local features and patterns. « Pooling layers were then used o to
downsample the feature maps, retaining essential information while reducing
dimensionality. This approach facilitated accurate object detection, H image classification,
and segmentation, making it a valuable contribution to the field of Computer Vision
research. CONCLUSION: This research builds an Application for Bird Species detection on
varied dataset o using Convolutional Neural Network. o The accuracy of the built
application was 75% ADVANTAGES: The proposed application's dataset was built using
surveying and with help of citizen scientists, hence a large and quality dataset was.
considered. DISADVANTAGES: The involvement of citizen scientists in data collection
increased the complexity leading to very low accuracy of 75%. 10Theme 3: Image based Bird Species Identification using Convolutional Neural Network
AUTHORS' NAME: Satyam Raj, Saiaditya Garyali, Sanu Kumar, Sushila Shidnal
DATASET OBTAINED FROM: Microsoft's Bing Image Search API v7 ALGORITHM USED:
- The research harnessed Convolutional Neural Networks (CNNs), powerful tools in
deep learning for Computer Vision. CNNs are adept at automatically learning and
extracting features from images, enabling tasks such as image classification, object
detection, and segmentation. + Leveraging their hierarchical structure, CNNs efficiently
capture both low-level and high-level features, facilitating accurate analysis of visual data.
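The convolution-plus-pooling pipeline that these papers rely on can be demonstrated with a toy NumPy example. The vertical-edge image and the 1x2 filter below are illustrative choices, not drawn from any of the reviewed systems:

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2-D convolution: slide a (learned) filter over the image."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool2x2(x):
    """Downsample a feature map while retaining the strongest responses."""
    h, w = x.shape[0] // 2 * 2, x.shape[1] // 2 * 2
    x = x[:h, :w]
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

# Toy image: dark left half, bright right half -> one vertical edge.
image = np.zeros((6, 6))
image[:, 3:] = 1.0
edge_kernel = np.array([[-1.0, 1.0]])   # responds to left-to-right brightness steps

features = conv2d(image, edge_kernel)   # strong response only at the edge column
pooled = max_pool2x2(features)          # reduced resolution, edge response kept
assert features.max() == 1.0 and pooled.shape == (3, 2)
```

Stacking many such filtered-and-pooled stages is what lets a CNN progress from low-level edges to the high-level, species-discriminating features discussed above.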
Their versatility and effectiveness make CNNs indispensable in a wide array of
applications, from medical imaging to autonomous driving. CONCLUSION: In this paper,
a method is proposed to predict the bird species from images using the most sought-after
algorithm of Deep Learning, the Convolutional Neural Network. The entire experimental
research was carried out on Windows 10 Operating System in Atom Editor with
TensorFlow library. The entire system is built in Python in Atom Editor and deployed in
the Django web framework. They developed the entire CNN Model from scratch, imparted
training to it and finally tested its efficacy. » The application developed is generating results,
with a high accuracy of 93.19% on the training set and 84.91% on the testing set.
ADVANTAGES: Used CNN as it is suitable for implementing advanced algorithms and
gives good numerical precision; achieved accuracy was 84%-94%.
DISADVANTAGES: The model faced a great loss of data while training. 11
Chapter 3 Methodology 3.1 Introduction to Methodology Birdwatching is a well-liked
hobby that builds a profound respect for the avian world and helps people connect with
nature. These species catch our hearts and arouse a sense of awe, whether it is through
the sweet song of a songbird, the magnificent flight of a raptor, or the brilliant plumage of a
tropical bird. Bird enthusiasts, scientists, and nature lovers from all around the world are
interested in identifying and learning more about these species, which will advance bothsci-entific understanding and personal enjoyment. However, even for experienced
birdwatchers, recognizing different bird species can frequently be a difficult and time-
consuming task. Since many
species have minute visual variations, carrying field
guides or reference books isn't always practical. Additionally, there are thousands of
different species of birds throughout the world, making proper identification difficult for
anyone. We seek to accelerate, improve upon, and broaden the accessibility of bird
species recognition by leveraging the power of artificial intelligence and deep learning.
3.1.1 Methodology for Model 1. Importing Libraries: The project initiation involves the
importation of a selection of critical Python libraries. At the forefront of these imports is
TensorFlow, a powerhouse in the realm of machine learning and deep neural networks.
Additionally, the script incorporates various utility libraries aimed at streamlining the handling of images and
datasets. The inclusion of these libraries signifies the foundational components on which
our entire project relies. 2. Setting Up Kaggle Environment: In an effort to streamline the
data acquisition process, we proactively set up the Kaggle environment. This preparatory
step involves specifying the directory that houses the Kaggle API configuration file. The
presence of this configuration file is instrumental as it grants us unfettered access to
Kaggle's extensive collection of datasets. This setup enables us to download datasets
directly from Kaggle, a pivotal component in the data collection phase of our project.
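The Kaggle environment setup described in item 2 can be sketched as follows. The config directory is the Kaggle CLI's documented default; the dataset owners are placeholders, since the report names the datasets but not their Kaggle namespaces:

```python
import os

# Point the kaggle CLI at the directory holding the kaggle.json API token.
# (~/.kaggle is the tool's default; any directory set here works.)
os.environ["KAGGLE_CONFIG_DIR"] = os.path.expanduser("~/.kaggle")

# Dataset slugs: names come from the report, "owner" is a hypothetical namespace.
DATASETS = ["owner/525-western-bird-species", "owner/25-indian-bird-species"]

def download_commands(slugs):
    """Build the `kaggle datasets download` CLI commands (not executed here)."""
    return [f"kaggle datasets download -d {slug} --unzip -p data" for slug in slugs]

for cmd in download_commands(DATASETS):
    print(cmd)
```

Running the printed commands (with valid credentials) downloads and unzips each dataset into `data/`, which is the state the following steps assume.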
3.Downloading and Unzipping Datasets: The subsequent stage of our project focuses on
the download and organization of two distinct datasets: "525-western-bird-species" and
"25-indian-bird-species.” These datasets, sourced directly from Kaggle, are essential to our
project's objectives. Following the download process, the script dutifully unzips the
datasets, meticulously organizing the contents into separate directories. This structured
approach simplifies the management of data and facilitates automated data
processing steps. 4.Merging Datasets: Recognizing the value of a comprehensive dataset
for bird species classification, the script takes a noteworthy step by merging datasets.Specifically, it orchestrates the transfer ofl images from the “indian-bird-species" dataset
into the "western-birdspecies" dataset. This amalgamation expands the breadth of species
our model can classify, thereby enhancing the project's classification capabilities. 5.Setting
Random Seed: We implement a key measure by setting a random seed within TensorFlow.
This specific step is vital in ensuring that our experiments are reliably reproducible in
subsequent runs. By establishing a fixed random seed, we can confidently compare results
and assess the stability of our model's performance across different iterations. 13,
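The seeding step above can be illustrated with a minimal, framework-agnostic sketch. The seed value 42 is an assumption (the report only states that a fixed seed is set); in the TensorFlow pipeline the analogous call would be `tf.random.set_seed`:

```python
import random
import numpy as np

SEED = 42  # assumed value; the report only says a fixed seed is used

def set_global_seed(seed):
    """Seed every RNG the pipeline touches so runs are reproducible."""
    random.seed(seed)
    np.random.seed(seed)
    # In the actual TensorFlow project one would additionally call:
    # tf.random.set_seed(seed)

set_global_seed(SEED)
a = np.random.rand(3)
set_global_seed(SEED)
b = np.random.rand(3)
assert np.allclose(a, b)  # identical draws -> experiments are repeatable
```

With shuffling, augmentation, and weight initialisation all tied to one seed, differences between runs can be attributed to code changes rather than randomness.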
6. Defining Constants: Explicitly defining essential constants. These constants include
batch size, image size, image shape, the number of classes, and the number of training
epochs. Each of these constants plays a pivotal role in guiding the configuration
parameters used throughout our project. This clear definition of constants serves as a
foundation for model training and data processing, contributing to the project's rigor. 7.Data
Augmentation: Leveraging TensorFlow’s 'ImageDataGenerator,’ we define a set of
strategies to augment our training data. This augmentation process generates an array of
diverse variations of our training images, which is instrumental in improving our model's
capacity to generalize across a range of data scenarios. This approach enhances the
model's robustness and its ability to adapt to variations in real-world data. 8. Creating
Data Generators: To facilitate the data flow into our model during training, we thoughtfully
create two pivotal data generators using TensorFlow's ‘ImageDataGenerator.’ These data
generators are meticulously designed to load and preprocess images from our training and
validation directories. This crucial step streamlines the data supply process, ensuring that
the model is fed with appropriately processed data during the training phase. 9. Loading
Pretrained Model: Harnessing the power of transfer learning, we load the EfficientNetB0 model. This model is preloaded with weights from the 'imagenet' dataset, a
EfficientNetB0 model. This model is preloaded with weights from the ‘imagenet’ dataset, a
rich source of pre-existing knowledge. Functioning BB a tho foundation fo} our
classification task, this pre-trained model allows us to build upon a solid base, fine-tuning it
for our specific problem. 10. Custom Model Head: A custom classification head isthoughtfully integrated onto the base model. This head comprises global average pooling,
dropout o layers, and fully connected layers equipped with ReLU activation functions. The
final output layer adopts a softmax activation function, aligning the model for precise
classification. 14
11, Compiling the Model: The model is systematically compiled. This process involves
configuring the model with an Adam optimizer, categorical cross-entropy
Lind or metric. This compilation prepares the model for efficient training while
enabling real-time monitoring of its performance. 12. Defining Callbacks: To enhance o
the effectiveness of our model training, we introduce two pivotal callbacks. The
‘EarlyStopping’ callback is designed to halt training should validation accuracy plateau,
thus preventing unnecessary computational overhead. Simultaneously, the
"ModelCheckpoint’ callback automates the preservation of the best model configuration
during training, ensuring that optimal settings are maintained. 13. Training Bee Model:
With the model architecture, data generators, and callbacks in place, we commence the
training phase. We employ the "fit’ method, providing the model with training and validation
data via the data generators. This procedure systematically tracks and stores training
\cluding loss and accuracy metrics, in the ‘history’ variable. 14. Saving Class
ation of interpreting model predictions, we store class names within a
JSON file. This file functions as a valuable reference, enabling us to establish meaningful
associations between numerical predictions and the actual bird species they represent.
15,Loading the Saved Model: Subsequent to the training phase, we seamlessly load the
best model configuration from the checkpoint file. This configuration encapsulates the
culmination of our training efforts and is poised for making predictions on new data.
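The checkpoint-and-restore idea behind steps 12 and 15 can be sketched in plain Python; the per-epoch accuracy values below are hypothetical, not results from our training run:

```python
# Hypothetical per-epoch validation accuracies
history = [0.62, 0.71, 0.78, 0.77, 0.79, 0.78, 0.78]
patience = 2                     # epochs to wait for improvement before stopping

best_acc, best_epoch, wait = 0.0, -1, 0
for epoch, acc in enumerate(history):
    if acc > best_acc:
        # "ModelCheckpoint": remember the best configuration seen so far
        best_acc, best_epoch, wait = acc, epoch, 0
    else:
        wait += 1
        if wait >= patience:     # "EarlyStopping": accuracy has plateaued
            break

print(best_epoch, best_acc)      # the configuration we would reload afterwards
```

Keras performs the equivalent bookkeeping internally, writing the best weights to the checkpoint file that step 15 reloads.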
16. Loading Class Names: To facilitate the interpretation of our model's predictions, we
meticulously retrieve class names from the previously stored JSON file. This step is vital
for providing human-readable labels to our model's outputs. 17. Making Predictions: Using new images, we proceed to load, preprocess, and pass
these images through our model for classification. If the model's prediction confidence
surpasses a predefined threshold, typically set at 0.85, we confidently print the predicted
class. Conversely, if the model's confidence falls below this threshold, we explicitly indicate
the model's inability to make a confident prediction, ensuring transparency and precision in
our results.
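The thresholding logic of step 17 can be sketched in plain Python; the class labels are hypothetical, and only the 0.85 cut-off comes from the procedure above:

```python
CONFIDENCE_THRESHOLD = 0.85      # the predefined cut-off described above

def interpret(probabilities, class_names, threshold=CONFIDENCE_THRESHOLD):
    """Return the predicted class name, or None when the model is not confident."""
    best = max(range(len(probabilities)), key=probabilities.__getitem__)
    if probabilities[best] >= threshold:
        return class_names[best]
    return None                  # caller reports "no confident prediction"

names = ["sparrow", "kingfisher", "peacock"]
print(interpret([0.05, 0.92, 0.03], names))   # confident -> kingfisher
print(interpret([0.40, 0.35, 0.25], names))   # below 0.85 -> None
```

Returning None rather than a low-confidence label is what lets the application state explicitly that no confident prediction was made.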
3.1.2 Application Development 1. Deep Learning Model Conversion and Data Preparation:
+ TensorFlow Lite Conversion: To make the model suitable for mobile devices, it was
converted to TensorFlow Lite format (.tflite). TensorFlow Lite is designed for efficient
execution on resource-constrained devices. This conversion reduces the model's size and
ensures faster inference. + Bird Class Names: A JSON file was created to store the names
of all the bird classes used for classification. Each class name corresponds to a specific
bird species and is associated with a unique index used in model predictions. 2. Android
App Development Using Android Studio: « IDE Selection: Android Studio, the official
Integrated Development Environment for Android app development, was chosen for the
project. This IDE provides tools and resources for building Android applications efficiently. »
Project Setup: A new Android project was created in Android Studio. This involved
configuring the target Android version, defining the project's structure, and setting up
essential dependencies, such as the Android Gradle Plugin. 3. Programming Language
and Framework: + Programming Language:Java was selected as the primary programming
language for Android app development. Java has been a traditional choice for Android
development, offering stability and a rich ecosystem of libraries. « Framework: TensorFlow
Lite, a framework developed by Google, was integrated into the Android app. TensorFlow
Lite is optimized for mobile and edge devices, providing a seamless way to execute
machine learning models. 4. Integration of TFLite Model and JSON File: + TFLite Model
Integration: The TensorFlow Lite model (.tflite) was integrated into the Android project. This
involved adding the model file to the app's assets folder and creating code to load and run inference using the TensorFlow Lite Interpreter. Proper error handling and resource
management were ensured to prevent app crashes and optimize resource usage.
+ JSON File Integration: The JSON file containing bird class names was also integrated into
the Android project. The app read and parsed this file to create a mapping between class
indices and human-readable class names. This mapping was crucial for presenting
classification results to the user in an understandable manner. 5. App Testing and
Debugging: + Testing: The app underwent extensive testing on various Android devices and
emulators to ensure its functionality and performance across a range of hardware
configurations. Different test cases were executed to validate the accuracy and reliability
of the bird species classification results. + Debugging: Android Studio's built-in
debugging tools were utilized to identify and fix issues. This iterative process helped in
ensuring a seamless user experience and resolving any runtime errors or unexpected
behavior. 6. User Interface Design and User Experience: * XML Layout:The user interface
(Ul) was designed using XML layouts, which define the arrangement of UI elements such
as buttons, image views, and text views. These elements were placed and styled to create
an intuitive and user-friendly interface. » UX Considerations:User experience (UX)
principles were taken into account during UI design. Feedback mechanisms, such as
progress indicators or informative messages, were implemented to keep users informed
about the image classification process. This attention to UX enhances user satisfaction
and engagement with the app. 7. Building the Android App: + Build Process:The Android
Studio build system compiled the source code, integrated resources (including the
TensorFlow Lite model and JSON file), and bundled them into an Android application
package (APK) file. + APK Generation:The project was built, and the final APK file was
generated, containing the app itself, the integrated TensorFlow Lite model, and the bird
class names JSON file. This APK file is ready for deployment. 8. Deployment and
Distribution: + Deployment: The APK file was deployed onto Android devices for further testing,
validation, and user feedback. It might have been distributed to a group of beta testers to
collect feedback and identify potential issues. « Distribution:Depending on the deployment
strategy, the app could be distributed through various channels. This may include
publishing the app on official app stores like Google Play or making it available for direct
installation from a website, allowing users to download and install the app on their Android
devices. 3.1.3 Bird Sound Recognition Model 1. Importing Required Libraries: For our bird
sound recognition model, we imported the following essential libraries: Pandas (pd):
Used for data manipulation, file handling, and managing DataFrames containing features
and labels. Numpy (np): Used for numerical computations, array operations, and
mathematical functions in feature processing. Matplotlib.pyplot (plt): For data visualization
tasks such as plotting spectrograms and audio waveforms for analysis. Scikit-learn: For
splitting the dataset into training and testing sets to evaluate model performance. 2.
Training the Audio Model Using YAMNet: YAMNet is a convolutional neural network (CNN)
that processes audio spectrograms. It takes a spectrogram as input and applies convolutional
filters to extract features from the signal. The convolutional layers extract abstract
representations of audio features, which are then downsampled to lower their
dimensionality while keeping crucial data intact. This hierarchical processing allows
YAMNet to learn hierarchical features of audio events. The network's top layers are
connected and followed by softmax activation for classification. The output classes,
representing various audio events or categories, are mapped to the learned features. The
softmax function normalizes the output scores across all classes to create a probability
distribution. YAMNet uses gradient descent optimisation and
backpropagation to modify the weights of its layers, aiming to minimize the discrepancy
between the true labels of input audio samples and predicted probabilities. This training
process on the AudioSet dataset makes YAMNet proficient in classifying various audio
events, making it useful for various audio classification tasks. 3. Integrating Model with Application: Integrating a voice recognition model into the Android bird recognition app
involves selecting or developing a suitable model, adapting it to your application's needs,
collecting and preprocessing data, integrating the model into your app, designing a
userfriendly interface, optimizing for real-time processing, testing and evaluating
performance, gathering user feedback, and continuously improving the feature based
on feedback and performance metrics. Many people find satisfaction in being able to
recognize and appreciate the beauty of our feathered friends in a world full of
fascinating and unique avian species. Our Bird Species Recognition Application is
made to get you closer to the fascinating world of birds, whether you are an ornithology
enthusiast, a nature lover, or simply inquisitive about the rich tapestry of birdlife that
surrounds us. We have harnessed the potential of Convolutional Neural Networks (CNNs)
through the effectiveness of the EfficientNetB0 architecture and the power of cutting-edge
technology. 3.2 Research Design The overall research design adopted for the Bird
recognition application is: 1. Data Collection and Model Testing: The main research task is
compiling large datasets for bird image recognition. This dataset is
used to systematically test a variety of machine learning models, such as convolutional
neural networks (CNNs) and deep learning architectures. By means of thorough
experimentation, the optimal model is determined by performance metrics like processing
speed and accuracy. After the best model is chosen, it is converted to the TensorFlow Lite
(TFLite) format so that the application can be integrated with it effectively. 2. Integration
of Image Recognition Model: Carefully considered integration with the application is carried
out after the model has been chosen and converted. The TFLite model is smoothly
integrated into the application's codebase during this process, guaranteeing compatibility
and top performance in the Android Studio environment. 3. Creation of an Audio Detection Model: The creation of
an audio detection model for bird species is a secondary research focus that runs
concurrently with image recognition. An audio detection model is painstakingly constructed and refined through the use of methods like spectrogram analysis and machine learning
algorithms. The audio detection model is successfully developed and then smoothly
incorporated into the application's architecture. 3.2.1 Justification This design was selected
because it offers a systematic framework for testing and refining various components of the
bird recognition application. Variables like machine learning models and image processing
algorithms can be adjusted through controlled experiments to determine how well they
identify different species of birds. Through the collection of quantitative data, like
processing times and accuracy rates, the experimental design makes it possible to
evaluate the application's performance objectively. Additionally, by using an experimental
approach, possible problems can be found and solutions can be refined to improve
the functionality of the app. All things considered, this research design offers a
systematic and organised way to use Android Studio and Java to create a trustworthy bird
identification application. 3.3 Research Approach This approach was chosen based on its
compatibility with the research objectives and the nature of the available data sources. For
image data, a quantitative approach was employed, utilizing datasets sourced from Kaggle.
These datasets were meticulously curated and merged into a single dataset suitable for
training image recognition models. On the other hand, for audio
data, a qualitative approach was adopted, leveraging competition data also obtained from
Kaggle. This qualitative data source provides valuable insights into the acoustic
characteristics of bird species, essential for developing an audio detection model. By
combining both quantitative and 21
qualitative data sources, this mixed-methods approach ensures a holistic understanding
of bird recognition, encompassing both visual and auditory cues. 3.4 Sampling Strategy
The act of choosing a subset of people or things from a larger population to represent that
population for statistical analysis or research purposes is known as sampling. For bird
recognition model (Image) the sampling size was approximately 10,000-15,000. For the bird
recognition model (Audio) the sampling size was approximately 15,000-25,000. 3.5 Data Collection Methods The selection of this methodology was predicated on its alignment with
the research goals and the characteristics of the accessible data sources. A
quantitative method was used for the image data, with datasets obtained from Kaggle.
Carefully selected and combined, these datasets produced an extensive dataset that
was used to train image recognition models. However, a qualitative strategy was used
for the audio data, making use of competition information that was also acquired from
Kaggle. This qualitative data source offers insightful information about the acoustic
properties of different bird species, which is crucial for creating an audio detection model.
This mixed-methods approach guarantees a comprehensive understanding of bird
recognition, encompassing both visual and auditory cues, by combining both quantitative
and qualitative data sources. 3.6 Data Analysis Techniques We used tools like Matplotlib
and SciPy to visualise and analyse the dataset as part of our research on bird data
analysis. We were able to find patterns in bird characteristics like size, colour, and
distribution thanks to Matplotlib's array of plot types, which included box plots, histograms,
and scatter plots. By computing metrics like mean, median, and variance, SciPy's
statistical functions allowed us to perform more in-depth analysis in addition to
visualisation. In the end, this method of combining statistical analysis and visual
aids helped us make well-informed decisions and conduct additional
research for our project by providing insights into the behaviour and distribution of bird
data. 3.7 Research Ethics All procedures related to data collection, analysis, and reporting
are carried out with integrity and transparency because the research closely complies with
ethical guidelines and regulations. This entails getting the required authorizations in order
to use datasets and photos, protecting the privacy and confidentiality of any sensitive data,
and, if necessary, getting informed consent. Furthermore, the study upholds academic
honesty and disapproves of any plagiarism or unethical behaviour. Every source is
correctly referenced, and any usage of previously published work is given due credit.
Additionally, rather than just repeating previous findings, an attempt is made to ensure the work makes an original contribution. The research also conducts its business with professionalism and regard for all parties involved,
including participants, collaborators, and coworkers. 3.8 Limitations and Assumptions
While our bird recognition application excels in identifying bird species based on visual
cues, there are some limitations to consider. Firstly, the accuracy of bird voice recognition
within the app may have room for improvement. While the voice recognition model
performs reasonably well, there is a chance for enhancement through further refinement
and training with a larger and more diverse dataset of bird vocalizations. Additionally, due
to the limited availability of labeled data, we were unable to achieve high accuracy in
voice recognition. Furthermore, while the user interface (UI) of the app provides basic
functionality, there is potential for improvement to enhance user experience and
engagement. Future iterations of the app will focus on addressing these limitations to
provide users with a more comprehensive and intuitive bird recognition tool.
3.9 Timeline and Schedule Figure 3.1: Project Roadmap
3.9.1 System Architecture Overview of the System Architecture 1. Android Application
Layer: Handles user interface and interaction. Captures user inputs and displays
classification results. 2. Deep Learning Model Layer: Contains the TensorFlow Lite model for
bird sound recognition. Executes inference on audio recordings to predict bird species.
3. Data Preprocessing Layer: Preprocesses audio data before feeding it into the model.
Includes modules for feature extraction and data normalization. 4. External Dependencies
Layer: Manages external resources like the JSON file for class names. Integrates with
system-level functionalities for audio capture. Modules and Their Interactions: - User
Interface Module: Receives user inputs and displays information. Interacts with other
modules to initiate recording and present results. - Audio Processing Module: Captures
audio from the microphone or files. Interfaces with the Data Preprocessing Module for feature
extraction. - Data Preprocessing Module: Converts audio to spectrograms and normalizes data. Prepares audio data for model inference. - Deep Learning Model
Interface Module: Loads the TensorFlow Lite model and executes inference. Receives
predictions and communicates with the User Interface Module. - External Resources Module:
Handles loading and parsing of the class names JSON file. Provides information for
interpreting classification results.
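The way these modules hand data to one another can be sketched with hypothetical Python stubs; the real application implements them in Java with TensorFlow Lite, so every value below is a stand-in:

```python
def capture_audio():
    """Audio Processing Module: stand-in for a recorded clip."""
    return [0.0, 0.5, -0.5, 0.25]

def preprocess(waveform):
    """Data Preprocessing Module: normalize samples to zero mean."""
    mean = sum(waveform) / len(waveform)
    return [s - mean for s in waveform]

def run_inference(features):
    """Model Interface Module: stand-in for TensorFlow Lite inference."""
    return 1                                 # pretend the model predicts class index 1

def class_name(index, names):
    """External Resources Module: map the index to a readable label."""
    return names[index]

# The UI Module drives the whole flow:
names = ["sparrow", "kingfisher"]            # hypothetical class list
prediction = class_name(run_inference(preprocess(capture_audio())), names)
print(prediction)
```

Each function corresponds to one module above, so the single chained call mirrors the interaction flow end to end.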
Interaction Flow: 1. User interacts with UI Module, initiating audio recording or selection. 2.
Audio Processing Module captures audio and passes it for preprocessing. 3. Preprocessed
data is sent to the Model Interface Module for inference. 4. The model predicts the bird species, and
results are communicated to the UI Module. 5. The UI Module retrieves class names from the External
Resources Module and displays results to the user. 3.9.2 Design Details Android
Application Layer 1. Design Considerations: To guarantee aesthetic appeal and user-
friendliness across a range of device sizes and orientations, the user interface design
conforms to Android's Material Design standards. Because of its flexible design, layouts
can adjust to a variety of screen sizes and aspect ratios. 2. Technologies Used: To
define the user interface elements and their properties, XML layouts are used. The
application logic is implemented and user interactions are managed using Java. Features of
the Android SDK are used to access device functionalities including file storage and
microphone input. Deep Learning Model Layer 1. Design Considerations: To balance
computing efficiency and efficacy in bird sound detection, the deep learning
model architecture was carefully chosen. To guarantee seamless deployment and
operation on mobile devices, model optimization is a top concern. 2. Technologies
Employed: The foundation for model conversion and integration into the Android
application is TensorFlow Lite. Comprehensive performance evaluations and benchmarks
are used to guide the selection of the model's architecture and parameters. Data
Preprocessing Layer 1. Design Considerations: Preprocessing methods are carefully selected to
improve the model's capacity to manage differences in audio recordings. To
preserve efficacy, compatibility between the preprocessing steps and the selected architecture of the deep learning model is guaranteed. 2. Technologies
Employed: Spectrograms are produced from unprocessed audio data using the Short-Time
Fourier Transform (STFT). Libraries such as NumPy are used to implement advanced
normalization techniques, such as mean subtraction and standard deviation division, for
processing efficiency. External Dependencies Layer 1. Design Considerations: To simplify
application functioning, efforts are focused on the effective administration of external
resources, such as the JSON file holding class names. Prioritizing good integration with
system-level operations allows for easy access to device functionality. 2. Technologies
Used: The class names file is loaded and parsed using JSON parsing libraries. Features of
the Android SDK make it easier to integrate devices with features like file system access
and microphone input. Interaction Flow 1. Design Considerations: From audio input to
categorization results, the user experience is made seamless by the well-designed
interaction flow. At every step of the process, feedback channels are put in place to keep
people informed and involved. 2. Technologies Used: Java's event-driven programming
paradigms are used to manage user input and coordinate program operations. Responsiveness is
guaranteed in resource-intensive operations like audio processing and model inference by
using asynchronous programming approaches. 3.10 System Development 1. Introduction
The goal of the system development project is to use deep learning techniques to provide
a reliable solution for the classification of bird species. Our
main goals are to create an accurate deep learning model, build an effective system
architecture, and incorporate it into an intuitive application. Developers, data scientists,
subject matter experts, and possible end users are among the project's stakeholders. 2.
Phase of Planning 1. Requirement Gathering: Involved holding discussions with stakeholders
to ascertain the data sources, model specifications, and functionality of the
system. 2. Design of System Architecture: Created a scalable system architecture with
application interfaces, deep learning model components, and modules for datapreprocessing.Decided to use TensorFlow as the main framework for creating and
deploying models. 3. Gathering and Organizing Data 1. Dataset Selection: To ensure
diversity and relevance to the categorization challenge, two extensive datasets on o bird
species were identified and chosen from Kaggle. 2. Data preprocessing involves.
organizing and cleaning datasets to get rid of noise and irregularities so that the models
are ready to be trained. 4. Developing the Model 1. Bringing in Libraries: Import necessary
Python libraries for deep learning and data manipulation, such as pandas, numpy, and
TensorFlow. 2. Setting Up the Environment: Made sure the development environment is
compatible with TensorFlow and other libraries by configuring it with the required tools and
dependencies. 3. Downloading and Preprocessing Datasets: To improve model
performance, a selection of datasets were downloaded and preprocessed, including data
augmentation and normalization procedures. 4. Model Architecture Design: Using the
EfficientNetB0 model, which was pretrained on the ImageNet dataset, a convolutional
neural network (CNN) architecture was designed using transfer learning principles.
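Metrics like accuracy and precision, used later in the model evaluation step, can be computed in plain Python; the labels below are hypothetical examples, not our validation data:

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def precision(y_true, y_pred, positive):
    """Of the samples predicted as `positive`, the fraction that truly are."""
    predicted = [t for t, p in zip(y_true, y_pred) if p == positive]
    return sum(t == positive for t in predicted) / len(predicted)

y_true = ["sparrow", "peacock", "sparrow", "peacock"]   # hypothetical labels
y_pred = ["sparrow", "sparrow", "sparrow", "peacock"]
print(accuracy(y_true, y_pred))                # 0.75
print(precision(y_true, y_pred, "sparrow"))    # 2/3, roughly 0.667
```

In practice these values come from scikit-learn or Keras metrics; the sketch only makes the definitions explicit.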
5. Model Training: Using preprocessed datasets, the CNN model was trained, and its
accuracy and efficiency were maximized by adjusting its hyperparameters. 6. Model
Evaluation: Using validation datasets, the trained model's performance was assessed by
calculating metrics like accuracy and precision. 7. Model Optimization: Enhanced the
model's effectiveness and efficiency, fine-tuning architecture and hyperparameters as
needed. 5. System Integration 1. Model-Application Integration: Created a seamless
communication channel between the frontend and backend components by integrating the
trained model with the application backend. 2. User Interface Design: With an emphasis on
usability and simple navigation, user-friendly interfaces were created for the application
frontend. 6. Quality Control and Testing: 1. Unit Testing: To make sure every unit operates
as intended, unit tests were carried out for separate modules and components. 2.
Integration Testing: Tested the integrated system to confirm data flow and communication
routes, as well as interactions between components. 3. System Testing: To assess the system's overall functionality, responsiveness, and dependability, end-to-end testing was
carried out in cooperation with stakeholders, obtaining input for incremental changes. 7. Implementation
and Upkeep 1. Deployment Planning: Developed deployment settings and tactics to
guarantee a seamless deployment with the least amount of downtime. 2. Deployment
Execution: Set up the system in staging or production environments, keeping an eye on the
deployment procedure, and resolving any problems that may arise. 3. Post-Deployment
Support: Offered continuing assistance and upkeep for the system that was put into use,
quickly responding to user comments and
bug complaints. 4. Performance Monitoring and Optimization: Over time, components were
optimized for increased scalability and efficiency by keeping an eye on system
performance and resource utilization. 8. Documentation and Training 1. System
Documentation: User manuals and technical documentation were created for stakeholders,
and system architecture, design choices, and implementation specifics were documented.
2. Training and Knowledge Transfer: Taught administrators and users how to operate the
system and made sure they understood its features. 9. Project Evaluation and Final
Thoughts 1. Project Review: Evaluating success criteria and performance measures, the
project goals and objectives were reviewed in relation to the delivered system. 2. Lessons
Learned: Recorded accomplishments, difficulties, and opportunities for development,
together with important insights and lessons discovered throughout the system
development process. 3.10.1 Programming Languages and Tools 1. Programming
Languages 1. Python: Because of its broad support for libraries and frameworks, especially
in the fields of machine learning and deep learning, Python was the dominant
programming language used for system development. It made jobs like developing models,
preparing data, and implementing backend application logic easier. 2. Java: Because of its
stability, platform neutrality, and robust library ecosystem, Java was chosen as the
programming language for creating Android apps. By using it, the creation of reliable and expandable mobile applications was guaranteed.
2. Frameworks 1. TensorFlow: The fundamental foundation for creating and refining deep
learning models was TensorFlow. Its robust community support and extensive APIs made
model building, optimization, and deployment incredibly efficient. 2. TensorFlow Lite: The
Android app's incorporation of TensorFlow Lite enabled the effective operation of machine
learning models on mobile devices. This framework made the most use of available
resources and made it easier to integrate deep learning features into the mobile
application. 3. Libraries and Tools 1. Kaggle API: By providing direct access to and
downloads of bird species datasets from the Kaggle platform, the Kaggle API expedited the
data collecting process. This made it easier to include various datasets into the system. 2.
Android Studio: Providing a full range of tools and resources for creating Android
applications, from project setup to deployment, Android Studio is the official Integrated
Development Environment (IDE) for Android app development. 3. Pandas: Data
cleaning,organizing, and feature extraction were among the preprocessing and
manipulation tasks that made use of the Pandas library. Its user-friendly operations and
data structures increased efficiency when preparing data. 4. NumPy: NumPy was essential
for performing mathematical operations, array computations, and data processing during
model training. Its quick array operations expedited a range of data manipulation tasks. 4.
Additional Instruments 1. Git: By providing effective code management, version tracking,
and team collaboration, the Git version control system promoted collaborative
development. 2. GitHub: To enable code sharing, teamwork, and version control among
team members, GitHub functioned as the central repository for project code,
documentation, and resources.
3. Android Emulator: Using the Android Emulator, one could test an Android application on
a virtual device to make sure it worked and was compatible with various Android versions
and setups. 3.10.2 Implementation Details 1. Library Importation: TensorFlow, a Python library well-known for its capabilities in deep neural networks and machine learning, is
one of the most important libraries to include when starting the project. To expedite the
handling of images and datasets, auxiliary utility libraries are also added, providing the
framework for the project's advancement. 2. Setting up the Kaggle environment proactively
entails identifying the directory that contains the Kaggle API configuration file. This
preliminary phase provides unrestricted access to Kaggle’s vast dataset collection,
expediting the data acquisition procedure for a smooth system integration. 3. Downloading
and Unzipping the Datasets: The next step involves downloading and carefully organizing
two different datasets that were obtained from Kaggle: "525-western-bird-species” and "25-
indian-bird-species.” After the download, our script works diligently to unzip the datasets
and partition the contents into different directories, making data management and
later processing stages easier. 4. Dataset Merging: Our project arranges for the
combining of datasets since it recognizes the importance of having a complete dataset for
the classification of bird species. It specifically moves photos from the "indian-bird-
species” dataset into the "western-bird-species” dataset, enhancing and broadening the
range of species that our algorithm is capable of classifying. 5. Setting Random Seed: To
guarantee that experiments are repeatable across several runs, a crucial step is performed:
fixing TensorFlow's random seed. By setting up a fixed random seed in this stage,
it becomes possible to compare findings with confidence and evaluate how stable the
model's performance will be in later iterations. 6. Constant Definition: Explicit definitions are provided
for critical constants, which include batch size, picture size, image shape, number of
classes, and number of training epochs. Project rigor is increased by each constant's
crucial role in directing configuration parameters throughout the undertaking and offering a
strong basis for data processing and model training. 7. Data Augmentation: Our project
provides ways to enhance training data by utilizing TensorFlow's 'ImageDataGenerator’.
By producing a wide variety of training image variations, this augmentation procedure helps the model become more robust and adaptable to differences in real-world data. It
also improves the model's ability to generalize across a variety of data circumstances.
3.11 System Testing 1. Unit Testing: In the unit testing phase, rigorous evaluation of the
bird recognition model (Image) was conducted using diverse datasets encompassing
various angles of bird images. The testing protocol encompassed scenarios that involved
distant and zoomed-in bird images in order to thoroughly evaluate the model's
performance at various spatial scales. The goal was to confirm that the model could
reliably and robustly identify birds from a variety of viewing angles in real-world
deployment scenarios. 2. Integration Testing: During the integration testing phase, we
seamlessly integrated the Indian and Western bird recognition models, leveraging their
complementary strengths. Subsequently, the unified model underwent comprehensive
evaluation using a diverse range of datasets containing images of birds from both
regions. This collaborative approach allowed us to validate the model's effectiveness in
accurately identifying birds from different geographical areas. Moreover, the testing
protocol encompassed scenarios involving zoomed-in and distant bird images to assess
the model's performance across varying spatial scales. By subjecting the integrated model
to such diverse conditions, we ensured its robustness and reliability in accurately recognizing birds irrespective of their
proximity to the camera. 3. System Testing: During the system testing phase, we deployed
the integrated bird recognition model on the Android Studio platform and rigorously
evaluated its performance within the context of the Android application. This involved
thorough testing of the Android app's functionality, user interface responsiveness, and
overall integration with the deployed model. 4. User Interface Testing: Interface
Intuitiveness: Users were presented with the application interface and asked to perform
specific tasks, such as uploading a bird image, viewing identification results, or navigating
through different sections of the app. Observers note any difficulties or confusion
encountered by users during these tasks, which may indicate areas of the interface thatneed improvement for better intuitiveness. 3.12 Results and Evaluation 1. Accuracy and
Robustness: In identifying bird species from a variety of datasets and viewing angles, the
system showed excellent accuracy and robustness. It was confirmed through unit testing
that the bird recognition model functioned consistently, even in difficult situations like
photos of far-off or zoomed-in birds. Integration testing demonstrated the model's
adaptability in identifying birds from various regions and further validated its efficacy,
particularly when working with Indian and Western models. 2. Scalability and Adaptability:
The integrated model's deployment on the Android platform demonstrated scalability and
adaptability during system testing, guaranteeing smooth operation across a range of
Android devices and screen sizes. The system's ability to manage higher user loads
without sacrificing performance was demonstrated by how well it handled picture uploads
and produced identification results quickly. 3. User Experience and Usability: The
application interface’s intuitiveness and usability were greatly enhanced by user
experience testing. User feedback indicated areas that needed work, like expediting the
uploading of images, making the results of identification more readable, and enhancing the
app's navigation. By taking these recommendations into consideration, the application's
user experience would be improved overall, increasing user satisfaction and
engagement. 34
3.12.1 Performance Metrics Response Time: After users upload bird photos, the
application reliably satisfies the predetermined response time requirements, giving them
identification results in a matter of seconds. Accuracy: The application outperforms the
predetermined standards for accurate bird identifications, achieving a high accuracy rate.
The application can be trusted to provide accurate identification results, which will
improve users' research and birdwatching experiences. Scalability: Performance testing has
shown that the application is robustly scalable, able to handle higher user loads without
experiencing appreciable performance degradation. It guarantees continuous service
during periods of high usage by meeting or surpassing predetermined standards for supporting concurrent users and image processing requests. 3.12.2 Comparison with
Requirements The bird recognition application has effectively met its initial project
requirements and objectives, delivering accurate bird identifications and a user-friendly
interface. However, the addition of features, such as regional model integration and UI
enhancements, extended the project timeline slightly. Despite this, these additions have
significantly enhanced the application's effectiveness and user satisfaction. Continuous
improvement based on user feedback underscores the project's commitment to delivering
value and meeting evolving user needs. 35
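The response-time requirement described in 3.12.1 can be checked with a small timing harness. The sketch below is illustrative only: the function names are ours, and the classifier is a stand-in for the real TensorFlow Lite model.

```python
import time

def average_latency(classify_fn, image, runs=5):
    """Call the classifier repeatedly and return the mean wall-clock latency in seconds."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        classify_fn(image)
        timings.append(time.perf_counter() - start)
    return sum(timings) / len(timings)

# Stand-in for the real TFLite classifier, used only to exercise the harness.
def dummy_classifier(image):
    return {"species": "Sarus Crane", "confidence": 0.990}

print(f"average latency: {average_latency(dummy_classifier, image=None):.6f}s")
```

A real check would invoke the deployed TensorFlow Lite interpreter in place of the stand-in and compare the mean against the predetermined response-time requirement.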
3.13 Conclusion The system realization and implementation process of the bird recognition
application exemplifies the careful process of developing a sophisticated yet
accessible tool for bird identification. By prioritizing accuracy and user experience, the
application has successfully garnered praise from users who find it to be both reliable and
intuitive. Despite the challenges posed by the integration of additional features, the
commitment to ongoing improvement ensures that the application keeps pace with
technological advancements in bird recognition. This adaptability and
responsiveness to user needs underscore the project's dedication to delivering a solution
that not only meets but exceeds expectations. Moreover, the collaborative efforts of the
development team, alongside valuable input from users, have resulted in a truly valuable
asset for bird enthusiasts and researchers alike. Moving forward, the application stands
poised to continue its legacy of innovation, serving as a cornerstone in the pursuit of
avian biodiversity conservation and research. 36
3.14 Result/Output of Project 3.14.1 For Bird Species Recognition Through Image
Model: Figure 3.2: Predictions for Bird Recognition through Image: Kaggle model testing
Model Predictions (Kaggle Model Testing) Source: Kaggle Model Testing Interface
Description: The image shows three different bird species as part of the model's testing
phase, each accompanied by its classification result and confidence score. Image 1: Bird Species: Indian Grey Hornbill Confidence Score: 0.99955136 Description: The model
identified the Indian Grey Hornbill with extremely high confidence. Image 2: Bird Species:
Alexandrine Parakeet Confidence Score: 0.910354 Description: The model identified the
Alexandrine Parakeet with high confidence. Image 3: Bird Species: Jungle Babbler
Confidence Score: 0.998193 Description: The model identified the Jungle Babbler with
very high confidence. 37
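The confidence scores listed above are interpreted against a threshold, as the methodology later describes (predictions below a cutoff are flagged rather than asserted). A minimal sketch of that decision logic, with a function name and threshold value of our choosing:

```python
def interpret_prediction(species, confidence, threshold=0.90):
    """Report the class when confidence clears the threshold; flag it as uncertain otherwise."""
    if confidence >= threshold:
        return f"{species} (confidence {confidence:.3f})"
    return f"Uncertain: best guess {species} (confidence {confidence:.3f})"

# Scores from the Kaggle model testing shown in Figure 3.2.
print(interpret_prediction("Indian Grey Hornbill", 0.99955136))
print(interpret_prediction("Alexandrine Parakeet", 0.910354))
print(interpret_prediction("Jungle Babbler", 0.998193))
```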
3.14.2 For Bird Species Recognition Through Application: Figure 3.3: Predictions for
Bird Recognition through Application BeakBook Classification (Image) Application Name:
BeakBook Interface Description: The interface displays a group of birds with a black
background and the app name "BeakBook" at the top. Classification Result: The app has
classified the birds in the image as "Sarus Crane”. Accuracy: The classification accuracy is
displayed as 0.990, indicating very high confidence in the result. User Options: Record
Audio: Allows the user to record audio for bird sound identification. Take Picture:
Enables the user to take a picture for visual bird identification. Upload File: Provides an
option to upload an existing file for classification. 38
3.14.3 For Bird Species Recognition(Voice) Through Application: Figure 3.4: Predictions
for Bird Recognition(Voice) through Application BeakBook Classification (Audio)
Application Name: BeakBook Interface Description: The user interface consists of a black
background with the app name "BeakBook" displayed at the top. Classification Result: The
app has classified an input as "Eurasian Hoopoe". Accuracy: The classification accuracy
is displayed as 0.919, indicating high confidence in the result. User Options: Record Audio:
Allows the user to record audio for bird sound identification. Take Picture: Enables the
user to take a picture for visual bird identification. Upload File: Provides an option to upload
an existing file for classification. 39
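Behind a displayed score such as 0.919 is a probability distribution over classes: the model's raw output scores are normalized with a softmax and the top class is shown (as the YAMNet description later explains). A minimal, library-free sketch of that step; the raw scores and candidate list here are hypothetical:

```python
import math

def softmax(scores):
    """Normalize raw scores into probabilities (subtracting the max for numerical stability)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def top_prediction(class_names, scores):
    """Return the most probable class name and its probability."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return class_names[best], probs[best]

names = ["Eurasian Hoopoe", "Jungle Babbler", "Sarus Crane"]
species, prob = top_prediction(names, [4.0, 1.5, 0.5])
print(species, round(prob, 3))
```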
Chapter 4 Conclusion and Future Scope 4.1 Conclusion The process of this project combined mobile app development, deep learning, and data science in a seamless way.
Each stage built the foundation for flexible bird species classification, from carefully chosen
library imports to the blending of various datasets and the honing of a strong machine
learning model. Expert training, post-training optimizations, and TensorFlow Lite's
incorporation into an Android application demonstrated how complex algorithms can be
combined with user-friendly mobile interfaces. This convergence made sure that the user
experience was the main focus while showcasing the potential of machine learning in
practical applications. In essence, the bird recognition app not only enriches the lives of
users by providing a fun and educational experience but also contributes to the broader
goals of biodiversity conservation and scientific research. As technology continues to
advance, the potential for such applications to make a meaningful impact on our
understanding and appreciation of the natural world is truly promising. 40
4.2 Future Scope 1. User-Centric Enhancements: Constantly improving the user interface
of the mobile application, taking user comments into account, and adding additional user-
friendly features to captivate and inform users about different bird species.
2. Fusion with Environmental Data: Adding environmental
data to the model, such as location, climate, or habitat details, could improve its overall
grasp of the distribution and behavior of bird species. 3. Extension to Other Domains: Using
the established infrastructure for a variety of applications, extending the application beyond
the classification of bird species to include broader domains such as ecological
monitoring. 4. Fine-tuning and Expansion: By continuously adding new data to the current
model, it is possible to improve its accuracy and increase its capacity to classify a
broader range of bird species. Classification accuracy may be further improved by
combining transfer learning with more recent pre-trained models or architectural designs.
41
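Expanding the model to more species, as proposed in point 4, also means extending the saved class-name file that maps prediction indices to labels (the report notes class names are stored in JSON for interpreting predictions). A hedged sketch, assuming the file simply holds a list of names; the file layout and function name are ours, not taken from the project code:

```python
import json
import tempfile
from pathlib import Path

def extend_class_names(path, new_species):
    """Append newly supported species to the saved class-name list, skipping duplicates."""
    path = Path(path)
    names = json.loads(path.read_text()) if path.exists() else []
    for species in new_species:
        if species not in names:
            names.append(species)
    path.write_text(json.dumps(names, indent=2))
    return names

# Hypothetical usage: register two species after fine-tuning on new data.
demo_path = Path(tempfile.mkdtemp()) / "class_names.json"
result = extend_class_names(demo_path, ["Indian Pitta", "Sarus Crane"])
print(result)
```

In practice the fine-tuned model's output layer would be resized to match the extended list, so the two must be kept in sync.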
Appendix A Brief Bio-data of each student Name: Chiranjivi Hole - Educational Background and
Passion: * Pursuing Bachelor's degree in Data Science at Usha Mittal Institute of
Technology. * Passionate about technology, particularly drawn to machine learning, data
analytics, and software development. - Skills and Competencies: * Proficient in data
analytics, database management systems (DBMS), and Linux. * Skilled in using
databases and presenting insights for informed decision-making. - Current Endeavor:
Immersed in pursuing the AWS Cloud Practitioner Certificate, learning about cloud
fundamentals, AWS services, security protocols, and pricing structures to proficiently
leverage AWS resources for various projects and initiatives. 42
Name: Sakshi Kakade - Educational Background and Interests: * Completing Bachelor of
Technology degree at Usha Mittal Institute of Technology. * Profound interest in Machine
Learning, Data Analytics, and Data Engineering. - Skills: * Python, SQL, ETL (Extract,
Transform, Load) * Data Visualization: Power BI, Tableau * Data Analytics, Data Analysis,
NLP (Natural Language Processing) - Competencies: * Analyzing and processing complex
datasets * Developing and optimizing data pipelines * Creating insightful visualizations for
informed decision-making. - Certifications: * Python Masterclass (Udemy) * Machine
Learning Internship (Suven) 43
Name: Priti Mhatre - Educational Background and Internship: * Currently pursuing a
Bachelor's degree in Data Science at Usha Mittal Institute of Technology. * Engaged as an
intern at Colgate Global Business Services, contributing to HR applications. -
Specialization and Achievements: * Expertise in data analysis, with a strong focus on
extracting insights from large datasets. * Completed the Google Data Analytics
Professional Certificate, enhancing analytical skills. - Skills and Strengths: * Proficient in
Python, SQL, ETL, data visualization (Power BI, Tableau), NLP, data analytics, and data
analysis. * Strong communication skills in spoken and written English, and analytical thinking.
Appendix B Plagiarism Report 45
Appendix C Research paper based on project 46
International Jounal of Research Publication and Reviews, Vol 5, no 4, pp 4482-4487 April
2024 International Journal of Research Publication and Reviews Journal homepage:
www.ijrpr.com ISSN 2582-7421 BeakBook: Bird Species Recognition Mobile
Applications Chiranjivi Hole, Sakshi Kakade, Priti Mhatre Department of Data Science,
with grace and significance, serving as vital indicators of environ- mental changes. Given
the imminent threat of extinction that many bird species face, it is critical to implement
effective conservation techniques. With the use of cutting-edge deep learning
algorithms, visual categorization based on bird photos becomes a viable method for
species identification. The goal of this research is to create an effective model for rapid
identification of bird species by utilising deep learning and neural networks. The neural
network receives intense training to get excellent classification accuracy by using large
datasets that include photos and sounds of birds. For bird voice recognition, libraries like
librosa are used for classification. Advanced methodologies of neural networks are used
to extract complex information from bird photos, improving the model's capacity to identify
minute species distinctions. This project intends to make a substantial contribution to
scientific study and conservation efforts by automating the process of bird species
identification. This will enable a deeper knowledge of avian biodiversity and assist in the preservation of our natural environment. In addition to speeding up the identification
procedure, this paradigm shift towards automated bird species recognition promotes a
greater comprehension of avian biodiversity. Effective deep learning models can help
focus conservation efforts more precisely and effectively, reducing the threats that
endangered bird populations must contend with. Through the democratisation of access
to cutting-edge technology, this project enables people all around the world to take part in
scientific research and bird conservation projects. The success of these initiatives marks
the beginning of a new chapter in animal conservation, one in which technology acts as a
catalyst for favourable alterations in the environment. We work together and with
creativity to create a future where all bird species coexist peacefully with their natural
environments. Index Terms—Neural Networks, Deep Learning, EfficientNet, YAMNet, Bird
species recognition, Bird voice recognition. I. Introduction Many people find satisfaction in
being able to recognize and appreciate the beauty of our feathered friends in a world full of
fascinating and unique avian species. Our Bird Species Recognition Application is made to
get you closer to the fascinating world of birds, whether you are an ornithology enthusiast,
a nature lover, or simply inquisitive about the rich tapestry of birdlife that surrounds us. We
have utilized the potential of Convolutional Neural Networks (CNNs) by utilizing the
effectiveness of the EfficientNet B0 architecture and the power of cutting-edge technology.
We can provide you with a seamless and precise tool for recognizing bird species from
photographs thanks to this novel approach. For both professionals and bird enthusiasts,
our application offers a convenient and accessible platform. 1) Birdwatching is a popular
hobby promoting appreciation for nature and birds. Because there are so many different
kinds of birds, with minute visual distinctions, it can be difficult to identify them, and
carrying field guides or reference books is frequently impracticable for birdwatchers. 2)
Identifying and learning about bird species is a global interest for enthusiasts, scientists,
and nature lovers. 3) Recognizing different bird species can be challenging due to visual
variations and the vast number of bird species. 4) The aim is to improve bird species
recognition using neural networks and deep learning. II. Problem Definition: Birdwatching is
connect with nature. Whether it's the dazzling plumage of a tropical bird, the amazing flight
of a raptor, or the lovely singing of a songbird, these species captivate our hearts and
inspire wonder. The identification and study of these species will benefit scientific
knowledge and human delight, and bird enthusiasts, scientists, and nature lovers
worldwide will be interested. 49
But distinguishing between various bird species
can sometimes prove to be a challenging and time-consuming endeavour, even for
seasoned birdwatchers. It's not always practicable to bring field guides or reference
books since many bird species have subtle visual differences. III. Survey of Literature In the
process of developing this project, an extensive review of various papers and journals was
conducted. In [1], four different transfer learning models, namely
InceptionV3, ResNet152V2, DenseNet201, and MobileNetV2, were implemented on the
identified dataset. Of these, ResNet152V2 provided the maximum accuracy of 95.45%. Thus, in
this research paper a model was built to identify bird species using a Convolutional
Neural Network. In [2], the project builds an application for bird species
detection on a varied dataset using a Convolutional Neural Network. The accuracy of the
built application was 75%. In [3], the birds dataset was built based on Asian bird species. Two
different models were proposed which showed that the proposed pretrained ResNet model
achieved greater accuracy in comparison to the base model. Their final model shows
97.98% accuracy in identifying the bird species. Research paper [4] served as our primary
reference throughout the project. In [4], a CNN-based architecture was developed and
evaluated for 525 different bird species categorization. By combining strong data
augmentation tactics with transfer learning approaches, this method produced impressive
accuracy rates. When it came to classifying bird species, the model performed well and
showed characteristics that could be applied to other contexts without experiencing
overfitting. Interestingly, the test set showed a similar accuracy of 86.7%, but the validation
set indicated an accuracy of 87%. The extension of current databases has been
identified as one path for future refinement, with a particular focus on addressing
imbalances among underrepresented species and gender groups. To improve model
performance, more research into different approaches to data augmentation is necessary.
The work in [5] investigates the use of the Naïve Bayes algorithm for the recognition of bird
species using acoustic characteristics, with a 91.58% accuracy rate. The research
highlights how the vocal tracts of different birds produce different sounds, as well as the
difficulties that come with memory control and signal-to-noise ratio optimisation. An
iterative procedure for non-real-time bird sound recognition is outlined, including steps like
choosing an audio clip and classifying the data using the Naïve Bayes method. In [6], the
paper details an experiment that used Human Factor Cepstral Coefficients (HFCCs) and
Hidden Markov Models (HMMs) to automatically classify the vocalisations of eighteen
different bird species. An interspecific success rate of 81.2% and a classification success
rate of 90.45% for families were attained in the experiment; data from other models might
be taken into account for possible improvement. The quality and processing of the input
data are critical to the experiment's success; possible enhancements include more precise
recording segmentation and the use of deep neural networks for the identification of
bird sounds.
TABLE I: Features extracted for bird recognition
No. | Species | Color | Shape Features
1 | Fairy Tern | White, Black, Grey | Small, slim body, long wings
2 | Zebra Dove | Tan, White, Black | Small, plump body, short beak
3 | Harpy Eagle | Dark brown, Gray | Large, robust body, huge wings
4 | Canary | Yellow, Gray | Small, slender body, short beak
IV. Established Operational Infrastructure 1. Merlin Bird ID: The Merlin Bird ID from the
Cornell Lab of Ornithology is renowned for its extensive database and user-friendly
interface. Using machine learning algorithms to identify bird species based on user-
submitted images or descriptions, it assists birdwatchers in rapidly and correctly identifying
different bird species. Key Features: Offers a large database of bird species, simple photo-
based identification, the capacity to identify bird cries, and comprehensive speciesinformation, making it a priceless tool for omithologists and enthusiasts alike. 2. eBird: 50
Recording bird sightings as part of a citizen
science programme is the goal of eBird, a robust platform. Users can contribute
photos, audio recordings, or descriptions of their observations of birds to a worldwide
dataset used for research on bird populations and conservation efforts. Key Features:
Provides tools for species identification, allows users to submit photos and audio
recordings, encourages community participation in birdwatching and conservation, and
provides insights on bird populations. 3. iBird: For bird species, iBird is a comprehensive
digital field guide. It has a vast collection of bird information, including detailed descriptions
of characteristics, habitats, and behaviours along with images and sounds. V. Algorithmic
framework design: A. Birds Image Recognition EfficientNet, a convolutional neural network
architecture, has emerged as a prominent framework for image classification tasks owing
to its balance between performance and computational efficiency. The EfficientNet-B0
variant, chosen as the foundation for our bird species classification endeavor, is
characterized by a compound scaling method that simultaneously adjusts network depth,
width, and resolution. Let X represent the input image with dimensions W×H×C, where W
is width, H is height, and C denotes the number of channels. The architecture
comprises the following key elements: Input Stem: Initial processing layers, including
convolutional and pooling operations, extract fundamental features from the input image.
Efficient Blocks: These blocks, iteratively applied throughout the network, feature
convolutional layers with varying kernel sizes and expansion ratios. Each block's output
undergoes recalibration via a squeeze-and-excitation (SE) mechanism to enhance feature
responsiveness. Global Average Pooling: Following the last block, global average pooling
aggregates feature maps into a fixed-size representation. Fully Connected Layer: A fully
connected layer, followed by a softmax activation function, facilitates classification. B. Birds
Voice Recognition YAMNet is a convolutional neural network (CNN) that processes audio
spectrograms. It uses a spectrogram as input and uses convolutional filters to
extract features from the signal. The convolutional layers extract abstract
representations of audio features, which are then downsampled to lower their
dimensionality while keeping crucial data intact. This hierarchical processing allows
YAMNet to learn hierarchical features of audio events. The network's top layers are
connected and followed by softmax activation for classification. The output classes,
representing various audio events or categories, are mapped to the learned features. The
softmax function normalizes the output scores across all classes to create a probability
distribution. YAMNet uses gradient descent optimisation and backpropagation to modify
the weights of its layers, aiming to minimize the discrepancy between input audio sample
true labels and predicted probabilities. This training process on the AudioSet dataset
makes YAMNet proficient in classifying various audio events, making it useful for various
audio classification tasks. VI. Experiments A. Methodology for Bird Image Recognition
Model 1) Data Source: The subsequent stages for the project involve downloading and
organizing two datasets, "525-western-bird-species" and "25-indian-bird-species," sourced
from Kaggle. The script unzips the datasets and organizes them into separate
directories, simplifying data management and setting the stage for data processing. The
script also merges datasets, transferring images from the "indian-bird-species" dataset into
the "western-bird-species" dataset, expanding the scope of species the model can
classify, enhancing the project's classification capabilities. A preview of the dataset used for
the project: Fig. 1. A preview of birds in dataset 51
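The merge step described above, copying species folders from the "25-indian-bird-species" tree into the "525-western-bird-species" tree, can be sketched with the standard library. The directory layout is assumed; the project's actual script may differ:

```python
import shutil
from pathlib import Path

def merge_bird_datasets(indian_dir, western_dir):
    """Copy every species folder from the Indian dataset into the Western dataset tree."""
    western = Path(western_dir)
    copied = 0
    for species_dir in Path(indian_dir).iterdir():
        if not species_dir.is_dir():
            continue  # skip stray files such as CSV metadata
        target = western / species_dir.name
        target.mkdir(parents=True, exist_ok=True)
        for image in species_dir.glob("*.jpg"):
            shutil.copy2(image, target / image.name)
            copied += 1
    return copied
```

Running this over the two unzipped Kaggle trees would leave a single directory hierarchy covering both regions, ready for the data generators described below.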
2) Data Preprocessing: A. Data Augmentation:
Leveraging TensorFlow’s 'ImageDataGenerator,’ we define a set of strategies to augment
our training data. This augmentation process generates an array of diverse variations
of our training images, which is instrumental in improving our model's capacity to
generalize across a range of data scenarios. This approach enhances the model's
robustness and its ability to adapt to variations in real-world data. B. Creating Data
Generators: To facilitate the data flow into our model during training, we thoughtfully create
two pivotal data generators using TensorFlow’s ‘ImageDataGenerator.’ These data
generators are meticulously designed to load and preprocess images from our training and
validation directories. This crucial step streamlines the data supply process, ensuring that
the model is fed with appropriately processed data during the training phase. 3) Training:
The training phase involves fitting the model using the 'fit' method, which is supplied
training and validation data through data generators. The model's progress is tracked in
the ‘history’ variable. Class names are saved in a JSON file for interpreting predictions.
The saved model configuration is then loaded from the checkpoint file, encapsulating the
training efforts and ready for predictions on new data. Class names are then retrieved from
the JSON file to provide human-readable labels to the model's outputs. This process
ensures meaningful associations between numerical predictions and the actual bird
species they represent. 4) Results:: The process involves loading, preprocessing, and
passing new images through a model for classification. If the model's prediction confidence
exceeds a predefined threshold, the class is printed; if it falls below, the uncertainty is
explicitly indicated, ensuring transparency and precision in the results. Fig. 2. Predictions
for Bird Recognition through Image B. Methodology for Bird Voice Recognition: Fig. 3.
Visualizing bird audio data 1) Data Source: Explore the data, noting that the files
are divided into train and test folders, with each split containing folders named after bird
codes. All audios are mono and have a sample rate of 16kHz. 2) Data Preprocessing:
Import Model Maker, TensorFlow, and any additional libraries required for manipulating
audio and creating visualisations. 3) Training: To train the model using Model
Maker for audio, begin with a model spec, which serves as the base model for extracting
information and learning about the new classes. It also determines how the dataset will
be transformed to comply with the model's specifications, such as sample rate and number
of channels. YAMNet, an audio event classifier trained on the AudioSet dataset, is used
for predicting audio events from the AudioSet ontology. The create method in the audio