Plagiarism Checker X — Report: Originality Assessment
Overall Similarity: 15% (2278 / 15289 words matched, 45 sources)
Remarks: Moderate similarity detected; consider enhancing the document if necessary.
Date: May 25, 2024

A Project Report On
Bird Species Recognition Application
Submitted in partial fulfilment for the degree of Bachelor of Technology in Data Science
Submitted by
Chiranjeevi Hole (2015003)
Sakshi Kakade (2015031)
Priti Mhatre (2015040)
Under the guidance of Prof. Merrin Mary Soloman
Juhu Tara Road, Santacruz (West), Mumbai-400049
2023-2024

DECLARATION

We, Chiranjeevi Hole, Sakshi Kakade and Priti Mhatre, hereby declare that the work presented in this project entitled "Bird Species Recognition Application" is entirely our own. The content of this project has been generated through our independent efforts, research, and scholarly contributions. We further declare that:
1. Originality: The ideas, concepts, and contributions presented in this work are solely the result of our own intellectual endeavours.
2. Authenticity: All data, figures, tables, and findings presented in this project are genuine and have not been fabricated or manipulated.
3. No Use of AI Tools: We have not used any AI-based tools to generate significant portions of this project, including but not limited to content, research objectives, hypotheses, and analysis.
4. No Plagiarism: We have properly cited and referenced all external sources and works consulted during the preparation of this project. There is no instance of plagiarism or unauthorized use of others' intellectual property.
5. Independent Work: This work has been conducted independently, without any collaboration or assistance that would compromise the originality of the content.
6. Academic Integrity: We have adhered to the principles of academic integrity and ethical research throughout the entire process of producing this project.
We understand that this declaration reflects the nature and authenticity of our work.
Date: 29th May, 2024
Signature:
Chiranjeevi Hole
Sakshi Kakade
Priti Mhatre

CERTIFICATE

This is to certify that Chiranjeevi Hole, Sakshi Kakade and Priti Mhatre have completed the Project II report on the topic "Bird Species Recognition Application" satisfactorily in Data Science under the guidance of Prof. Merrin Mary Soloman, as prescribed by Shreemati Nathibai Damodar Thackersey Women's University (SNDTWU).
Guide: Prof. Merrin Mary Soloman
Head of Department: Mr. Rajesh Kolte
Principal: Dr. Yogesh Nerkar
Examiner 1
Examiner 2

Contents

Abstract
List of Figures
Nomenclature
1 Introduction
1.1 Background
1.2 Research Motivation
1.3 Research Objectives
1.4 Scope of the Study
1.5 Structure of the Document
2 Review of Literature
2.1 Introduction
2.2 Scope and Objectives
3 Methodology
3.1 Introduction to Methodology
3.1.1 Methodology for Model
3.1.2 Application Development
3.1.3 Bird Sound Recognition Model
3.2 Research Design
3.2.1 Justification
3.3 Research Approach
3.4 Sampling Strategy
3.5 Data Collection Methods
3.6 Data Analysis Techniques
3.7 Research Ethics
3.8 Limitations and Assumptions
3.9 Timeline and Schedule
3.9.1 System Architecture
3.9.2 Design Details
3.10 System Development
3.10.1 Programming Languages and Tools
3.10.2 Implementation Details
3.11 System Testing
3.12 Results and Evaluation
3.12.1 Performance Metrics
3.12.2 Comparison with Requirements
3.13 Conclusion
3.14 Result/Output of Project
3.14.1 For Bird Species Recognition Through Image Model
3.14.2 For Bird Species Recognition Through Application
3.14.3 For Bird Species Recognition (Voice) Through Application
4 Conclusion and Future Scope
4.1 Conclusion
4.2 Future Scope
A Brief Bio-data of Each Student
B Plagiarism Report
C Research Paper Based on Project
D References

List of Figures

3.1 Project Roadmap
3.2 Predictions for Bird Recognition through Image: Kaggle model testing
3.3 Predictions for Bird Recognition through Application
3.4 Predictions for Bird Recognition (Voice) through Application

Nomenclature

1. GUI: Graphical User Interface
2. JDK: Java Development Kit
3. SDK: Software Development Kit
4. RAM: Random Access Memory
5. CPU: Central Processing Unit
6. GPU: Graphics Processing Unit
7. OS: Operating System
8. IDE: Integrated Development Environment
9. API: Application Programming Interface
10. UI: User Interface
11. ML: Machine Learning
12. CNN: Convolutional Neural Network

Chapter 1
Introduction

There are few things that compare to the delight of viewing and admiring the many bird species that adorn our skies and landscapes in a world full of life. The beauty and variety of birds capture our attention and inspire awe in all of us, from the majestic soaring of eagles to the delicate fluttering of hummingbirds. Being able to recognize and identify these winged creatures brings a great sense of satisfaction to individuals who have a passion for ornithology, a strong appreciation for nature, or are just curious about the diverse array of birds that live around us. Our goal of introducing people to this fascinating world is the driving force behind our Bird Species Recognition Application. Utilizing state-of-the-art technology and the capabilities of Convolutional Neural Networks (CNNs), we have developed an advanced tool that enables users to identify bird species with previously unheard-of ease and precision. Our program is based on the EfficientNet B0 architecture, which is well known for its effectiveness and efficiency in image recognition tasks. Through the utilization of CNNs, we present a smooth and accurate method for identifying bird species in photos, giving users an engaging experience that encourages a closer bond with the natural world.

1.1 Background

Given the diversity of avian species found globally, bird species recognition is essential for a number of professions, including ornithology, biodiversity monitoring, and ecological study. Prior work has concentrated on automating recognition through the use of highly accurate Convolutional Neural Networks (CNNs). The great diversity of bird species and the requirement to capture subtle traits pose challenges. Model adaptation has been enhanced by recent developments in transfer learning. Research examines citizen science for dataset development, data augmentation, and optimizing pre-trained models. Preprocessing images, extracting features, and evaluating models using metrics are important areas of focus. The purpose of this work is to advance techniques for identifying bird species for use in research and conservation.

1.2 Research Motivation

We are driven by the goal of democratizing bird identification, making it available to everyone while addressing issues that hobbyists and citizen scientists encounter. Particularly for rare species, traditional techniques like field guides can be restrictive.
Our goal is to close this gap by using state-of-the-art technology and Convolutional Neural Networks (CNNs) to enable people to get involved in conservation activities and strengthen their bond with nature. In addition to promoting scientific literacy and inspiring a new generation of nature lovers, our research aims to support the conservation of biodiversity worldwide.

1.3 Research Objectives

The research's goals are the following:
1. Developing a Bird Species Recognition Application: Using convolutional neural networks (CNNs), develop an intuitive application that can reliably identify bird species from photos.
2. Using CNNs: Take advantage of CNNs to overcome obstacles in image recognition tasks and improve efficiency and accuracy in bird species identification.
3. Integration of EfficientNet B0 Architecture: For optimal performance and accurate species classification, integrate the EfficientNet B0 architecture.
4. User Accessibility: Give ease of use and user accessibility priority by creating an interface that is simple to use and easy to navigate.
5. Validation and Testing: Make sure the application is accurate by putting it through rigorous tests against databases of recognized bird species and real-world situations.
6. Research Questions: Address important issues pertaining to CNN accuracy and efficacy, contrast with current techniques, and implications for conservation and birdwatching.
By accomplishing these goals, the project hopes to improve bird species recognition technologies and promote a better knowledge of avian biodiversity.

1.4 Scope of the Study

The goal of this project is to create and assess a Bird Species Recognition Application that uses CNNs, more specifically the EfficientNet B0 architecture, to identify different species from photos. It covers building the application, implementing the CNN, acquiring datasets, preprocessing, training models, evaluating metrics, and deploying to mobile platforms.
Limitations:
1. Other modalities such as bird sound recognition are not the primary focus of the study; it mainly deals with the identification of bird species from images.
2. It excludes other CNN variations and limits the investigation of CNN architectures to the EfficientNet B0 model.
3. Extensive CNN model fine-tuning and sophisticated application functionalities are excluded due to resource limitations.
4. The assessment concentrates on conventional metrics, possibly ignoring intricate assessments such as robustness against adversarial attacks and transferability.

1.5 Structure of the Document

1. Introduction: Bird species recognition is crucial for various professions, including ornithology, biodiversity monitoring, and ecological study. The beauty and variety of bird species, from eagles to hummingbirds, capture our attention and inspire awe. The Bird Species Recognition Application aims to introduce people to the fascinating world of birds and our mutual fascination with them. Utilizing state-of-the-art technology and Convolutional Neural Networks (CNNs), the application provides an advanced tool for users to identify bird species with unprecedented ease and precision. The program is based on the EfficientNet B0 architecture, known for its effectiveness in image recognition tasks. The application presents a smooth and accurate method for identifying bird species in photos, encouraging a closer bond with the natural world.
Recent developments in transfer learning have enhanced model adaptation, and research examines citizen science for dataset development, data augmentation, and optimizing pre-trained models. The purpose of this work is to advance techniques for identifying bird species for use in research and conservation.
2. Literature Review:
1. Bird Image Classification using CNN Transfer Learning Architectures
- Builds an application for bird species detection on a varied dataset using Convolutional Neural Networks.
- Accuracy of the built application is 75%.
- Large and quality dataset built with citizen scientists' assistance.
- The app can be fine-tuned for better accuracy and bird identification.
2. Automatic Bird Species Identification Using Deep Learning
- Bird dataset built based on Asian bird species.
- The pretrained ResNet model achieved greater accuracy than the base model; the final model shows 97.98% accuracy.
- Cutting-edge vision automation achieved fast results with zero development costs.
3. Building a Bird Recognition App and Large-scale Dataset with Citizen Scientists
- Builds an application for bird species detection on a varied dataset.
- The app can be fine-tuned for better accuracy and bird identification.
3. Methodology:
Model Development Methodology
- Importing Libraries: The project imports critical Python libraries, including TensorFlow, for machine learning and deep neural networks.
- Setting Up Kaggle Environment: The Kaggle API configuration file is set up to provide access to Kaggle's extensive dataset collection.
- Downloading and Unzipping Datasets: Two datasets, "525-western-bird-species" and "25-indian-bird-species", are downloaded and organized.
- Merging Datasets: The script merges datasets, transferring images from the "indian-bird-species" dataset into the "western-bird-species" dataset, expanding the model's classification capabilities.
- Setting Random Seed: A random seed within TensorFlow is set to ensure reproducibility of experiments and assess model performance stability.
- Defining Constants: Essential constants such as batch size, image size, image shape, number of classes, and training epochs are explicitly defined.
- Data Augmentation: Strategies are defined to augment training data, generating diverse variations of training images.
- Creating Data Generators: Two data generators are created to load and preprocess images from training and validation directories.
- Loading Pretrained Model: The EfficientNetB0 model is loaded with weights from the 'imagenet' dataset and fine-tuned for the specific problem.
- Custom Model Head: A custom classification head is integrated onto the base model, including global average pooling, dropout layers, and fully connected layers equipped with ReLU activation functions.
- Compiling the Model: The model is systematically compiled with an Adam optimizer and categorical cross-entropy loss.
Training and Application Development
Model Training:
- Utilizes the 'EarlyStopping' callback to halt training if validation accuracy plateaus.
- The 'ModelCheckpoint' callback ensures optimal settings are maintained during training.
Training the Model:
- Utilizes the 'fit' method to provide the model with training and validation data.
- Stores training progress in the 'history' variable.
Saving Class Names:
- Stores class names in a JSON file for interpreting model predictions.
- Loads the best model configuration from the checkpoint file.
Loading Class Names:
- Retrieves class names from the JSON file for human-readable labels.
Making Predictions:
- Loads, preprocesses, and passes new images through the model for classification.
- A prediction is accepted or rejected based on the model's confidence.
Application Development:
- Converts the model to TensorFlow Lite format for efficient execution on resource-constrained devices.
- Creates a JSON file for bird class names.
- Creates an Android app using Android Studio.
- Selects Java as the primary programming language for Android app development.
- Integrates TensorFlow Lite, a framework developed by Google, into the Android app.
Audio Model Importation and Integration:
- Importing essential libraries: Pandas (pd), NumPy (np), Matplotlib.pyplot (plt) for data manipulation and visualization, and Scikit-learn for dataset splitting.
- Training the audio model using YAMNet: YAMNet is a convolutional neural network (CNN) that processes audio spectrograms. It uses convolutional filters to extract features from the signal, learning hierarchical features of audio events. YAMNet uses gradient descent optimisation and backpropagation to minimize the discrepancy between input audio samples' true labels and predicted probabilities.
- Integrating the model with the Android bird recognition app: This involves selecting a suitable model, adapting it to the application's needs, collecting and preprocessing data, integrating the model into the app, designing a user-friendly interface, optimizing for real-time processing, testing and evaluating performance, gathering user feedback, and continuously improving the feature based on feedback and performance metrics.
4. Results and Conclusions: The project combines mobile app development, deep learning, and data science to create a flexible bird species classification system. It involves library imports, dataset blending, and machine learning model honing. The application showcases the potential of machine learning in practical applications through expert training, post-training optimizations, and TensorFlow Lite integration. The accuracy of the bird recognition application is around 97%. The bird recognition app enriches users' lives through a fun and educational experience, contributing to biodiversity conservation and scientific research. As technology advances, the app's potential to impact our understanding and appreciation of the natural world is promising.
5. Discussion: The bird identification application demonstrated its effectiveness in key areas, such as accurate bird identification, through extensive testing and validation. The system's scalability and adaptability allowed it to handle varied image processing requirements and growing user loads, indicating its ability to handle future expansion and changing user needs. The application has potential for more extensive uses in research, teaching, and wildlife conservation, increasing public knowledge of bird biodiversity and enabling citizen involvement in scientific studies. Future directions and lessons learned include prioritizing iterative development cycles and continuous user engagement to enhance application effectiveness and user satisfaction. Early and ongoing user feedback should be prioritized in future iterations to spur advancements and guide feature development.
6. Conclusion: The process of this project combined mobile app development, deep learning, and data science in a seamless way.
Each stage built the foundation for flexible bird species classification, from carefully chosen library imports to the blending of various datasets and the honing of a strong machine learning model. Expert training, post-training optimizations, and TensorFlow Lite's incorporation into an Android application demonstrated how complex algorithms can be combined with user-friendly mobile interfaces. This convergence made sure that the user experience was the main focus while showcasing the potential of machine learning in practical applications. In essence, the bird recognition app not only enriches the lives of users by providing a fun and educational experience but also contributes to the broader goals of biodiversity conservation and scientific research. As technology continues to advance, the potential for such applications to make a meaningful impact on our understanding and appreciation of the natural world is truly promising.

Chapter 2
Review of Literature

2.1 Introduction

The identification of bird species has attracted a lot of interest lately because of its consequences for ecological study, conservation initiatives, and biodiversity monitoring. Thanks to developments in deep learning and computer vision algorithms, scientists have investigated a number of approaches for precisely categorizing different bird species using image and audio data. In this review of the literature, we examine studies that use transfer learning architectures and convolutional neural networks (CNNs) to identify different species of birds. Every paper offers a distinct perspective on the evolution of bird recognition systems, highlighting the significance of selecting appropriate model architectures, high-quality datasets, and future approaches for enhancing scalability and accuracy. We seek to get a thorough grasp of the current state of the art in bird species recognition and identify potential for additional research and development by carefully analyzing the techniques, findings, benefits, and limits of these studies.

2.2 Scope and Objectives

The primary goal of this review of the literature is to investigate approaches for bird species detection that make use of computer vision algorithms and convolutional neural networks (CNNs). It looks into several methods and models used for picture- and sound-based bird species identification and classification. The review will examine the benefits, drawbacks, and potential consequences of these approaches for conservation initiatives, biodiversity monitoring, and technology development. Examining previous studies, recognizing CNN architectures and transfer learning strategies, assessing performance metrics, weighing the benefits and drawbacks of different approaches, and investigating potential future developments like real-time application integration and model fine-tuning are some of the specific goals.

Review of Literature

Theme 1: PakhiChini: Automatic Bird Species Identification Using Deep Learning
AUTHOR NAMES: Kazi Md Ragib, Raisa Taraman Shithi, Shihab Ali Haq, Md Hasan, Kazi Mohammed Sakib, Tanjila Farah
DATASET OBTAINED FROM: Different datasets available for bird classification based on Western species, plus uncommon data on those bird species collected from various sources and merged with the Western dataset.
ALGORITHM USED: The deep learning algorithm utilized a pretrained CNN model that consisted of four different variants of ResNet, namely ResNet18, ResNet34, ResNet50, and ResNet101.
- The variants were employed to extract intricate features from input images. Alongside ResNet, two fully connected layers were incorporated to further process these features, enabling the network to capture high-dimensional representations essential for the task at hand. To mitigate overfitting, a dropout layer was integrated, randomly deactivating neurons during training. This comprehensive architecture enabled the algorithm to effectively analyze and classify images with robustness and accuracy.
WEB DEPLOYMENT: A web-based API service was developed using the Flask micro-framework.
CONCLUSION:
- The bird dataset was built based on Asian bird species.
- Two different models were proposed, showing that the proposed pretrained ResNet model achieved greater accuracy in comparison to the base model.
- Their final model shows 97.98% accuracy in identifying bird species.
ADVANTAGES: The system used deep learning algorithms with high accuracy, and the implementation of cutting-edge vision automation delivered fast results with zero development costs.
DISADVANTAGES: Due to the shortage of datasets of Asian birds, limited work had been done on this topic.

Theme 2: Building a Bird Recognition App and Large-scale Dataset with Citizen Scientists
AUTHOR NAMES: Grant Van Horn, Jessie Barry, Steve Branson, Panos Ipeirotis, Ryan Farrell, Pietro Perona, Scott Haber, Serge Belongie
DATASET OBTAINED FROM: Surveys using a combination of citizen scientists, experts, and Mechanical Turk workers; ImageNet.
ALGORITHM USED:
- The research paper employed Convolutional Neural Networks (CNNs) for computer vision tasks, capitalizing on their capability to extract features from images, particularly focusing on detecting edges and patterns. Through a series of convolutional layers, the algorithm convolved learned filters across input images, effectively capturing local features and patterns.
- Pooling layers were then used to downsample the feature maps, retaining essential information while reducing dimensionality. This approach facilitated accurate object detection, image classification, and segmentation, making it a valuable contribution to the field of Computer Vision research.
CONCLUSION: This research builds an application for bird species detection on a varied dataset using Convolutional Neural Networks. The accuracy of the built application was 75%.
ADVANTAGES: The proposed application's dataset was built using surveys and with the help of citizen scientists, hence a large and quality dataset was considered.
DISADVANTAGES: The involvement of citizen scientists in data collection increased the complexity, leading to a relatively low accuracy of 75%.

Theme 3: Image-based Bird Species Identification using Convolutional Neural Network
AUTHOR NAMES: Satyam Raj, Saiaditya Garyali, Sanu Kumar, Sushila Shidnal
DATASET OBTAINED FROM: Microsoft's Bing Image Search API v7
ALGORITHM USED:
- The research harnessed Convolutional Neural Networks (CNNs), powerful tools in deep learning for Computer Vision. CNNs are adept at automatically learning and extracting features from images, enabling tasks such as image classification, object detection, and segmentation.
- Leveraging their hierarchical structure, CNNs efficiently capture both low-level and high-level features, facilitating accurate analysis of visual data. Their versatility and effectiveness make CNNs indispensable in a wide array of applications, from medical imaging to autonomous driving.
CONCLUSION: In this paper, a method is proposed to predict bird species from images using the most sought-after algorithm of Deep Learning, the Convolutional Neural Network. The entire experimental research was carried out on the Windows 10 Operating System in the Atom editor with the TensorFlow library. The entire system is built in Python in the Atom editor and deployed in the Django web framework. The authors developed the entire CNN model from scratch, trained it, and finally tested its efficacy. The application developed is generating results with a high accuracy of 93.19% on the training set and 84.91% on the testing set.
ADVANTAGES: Used CNNs, as they are suitable for implementing advanced algorithms and give good numerical precision; the achieved accuracy was 84%-94%.
DISADVANTAGES: The model faced a great loss of data while training.

Chapter 3
Methodology

3.1 Introduction to Methodology

Birdwatching is a well-liked hobby that builds a profound respect for the avian world and helps people connect with nature. These species catch our hearts and arouse a sense of awe, whether it is through the sweet song of a songbird, the magnificent flight of a raptor, or the brilliant plumage of a tropical bird. Bird enthusiasts, scientists, and nature lovers from all around the world are interested in identifying and learning more about these species, which will advance both scientific understanding and personal enjoyment. However, even for experienced birdwatchers, recognizing different bird species can frequently be a difficult and time-consuming task. Since many species have minute visual variations, carrying field guides or reference books isn't always practical. Additionally, there are thousands of different species of birds throughout the world, making proper identification difficult for anyone. We seek to accelerate, improve upon, and broaden the accessibility of bird species recognition by leveraging the power of artificial intelligence and deep learning.

3.1.1 Methodology for Model

1. Importing Libraries: The project initiation involves the importation of a selection of critical Python libraries. At the forefront of these imports is TensorFlow, a powerhouse in the realm of machine learning and deep neural networks. Additionally, the script incorporates various utility libraries aimed at streamlining the handling of images and datasets. The inclusion of these libraries signifies the foundational components on which our entire project relies.
2. Setting Up Kaggle Environment: In an effort to streamline the data acquisition process, we proactively set up the Kaggle environment. This preparatory step involves specifying the directory that houses the Kaggle API configuration file. The presence of this configuration file is instrumental, as it grants us access to Kaggle's extensive collection of datasets. This setup enables us to download datasets directly from Kaggle, a pivotal component in the data collection phase of our project.
3. Downloading and Unzipping Datasets: The subsequent stage of our project focuses on the download and organization of two distinct datasets: "525-western-bird-species" and "25-indian-bird-species". These datasets, sourced directly from Kaggle, are essential to our project's objectives. Following the download process, the script unzips the datasets, organizing the contents into separate directories. This structured approach simplifies the management of data and facilitates subsequent data processing steps. A sketch of steps 2-4, including the dataset merging described next, appears below.
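The following is a minimal Python sketch of steps 2-4, assuming the `kaggle` package is installed and a `kaggle.json` credentials file exists; the dataset owner slugs and the local directory names are hypothetical, since the report names the datasets but not their exact Kaggle identifiers.

```python
import os
import shutil

# Step 2: point the Kaggle client at the directory holding kaggle.json
# (the "/content" location is a hypothetical example).
os.environ["KAGGLE_CONFIG_DIR"] = "/content"

from kaggle.api.kaggle_api_extended import KaggleApi

api = KaggleApi()
api.authenticate()

# Step 3: download and unzip the two datasets.  The owner slugs
# ("some-user/...") are placeholders, not the report's actual sources.
api.dataset_download_files("some-user/525-western-bird-species",
                           path="western", unzip=True)
api.dataset_download_files("some-user/25-indian-bird-species",
                           path="indian", unzip=True)

# Step 4: merge the Indian species folders into the Western dataset's
# training directory so that a single model covers both regions.
for species_dir in os.listdir("indian/train"):
    src = os.path.join("indian/train", species_dir)
    dst = os.path.join("western/train", species_dir)
    if not os.path.exists(dst):
        shutil.move(src, dst)
```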
4. Merging Datasets: Recognizing the value of a comprehensive dataset for bird species classification, the script takes a noteworthy step by merging datasets. Specifically, it orchestrates the transfer of images from the "indian-bird-species" dataset into the "western-bird-species" dataset. This amalgamation expands the breadth of species our model can classify, thereby enhancing the project's classification capabilities.
5. Setting Random Seed: We implement a key measure by setting a random seed within TensorFlow. This specific step is vital in ensuring that our experiments are reliably reproducible in subsequent runs. By establishing a fixed random seed, we can confidently compare results and assess the stability of our model's performance across different iterations.
6. Defining Constants: Essential constants are explicitly defined, including batch size, image size, image shape, the number of classes, and the number of training epochs. Each of these constants plays a pivotal role in guiding the configuration parameters used throughout our project. This clear definition of constants serves as a foundation for model training and data processing, contributing to the project's rigor.
7. Data Augmentation: Leveraging TensorFlow's 'ImageDataGenerator', we define a set of strategies to augment our training data. This augmentation process generates an array of diverse variations of our training images, which is instrumental in improving our model's capacity to generalize across a range of data scenarios. This approach enhances the model's robustness and its ability to adapt to variations in real-world data.
8. Creating Data Generators: To facilitate the data flow into our model during training, we create two pivotal data generators using TensorFlow's 'ImageDataGenerator'. These data generators are designed to load and preprocess images from our training and validation directories. This crucial step streamlines the data supply process, ensuring that the model is fed appropriately processed data during the training phase.
9. Loading Pretrained Model: We harness the power of transfer learning by loading the EfficientNetB0 model. This model is preloaded with weights from the 'imagenet' dataset, a rich source of pre-existing knowledge. Functioning as the foundation for our classification task, this pre-trained model allows us to build upon a solid base, fine-tuning it for our specific problem.
10. Custom Model Head: A custom classification head is integrated onto the base model. This head comprises global average pooling, dropout layers, and fully connected layers equipped with ReLU activation functions. The final output layer adopts a softmax activation function, aligning the model for precise classification.
11. Compiling the Model: The model is systematically compiled. This process involves configuring the model with an Adam optimizer, categorical cross-entropy loss, and an accuracy metric. This compilation prepares the model for efficient training while enabling real-time monitoring of its performance.
12. Defining Callbacks: To enhance the effectiveness of our model training, we introduce two pivotal callbacks. The 'EarlyStopping' callback is designed to halt training should validation accuracy plateau, thus preventing unnecessary computational overhead. Simultaneously, the 'ModelCheckpoint' callback automates the preservation of the best model configuration during training, ensuring that optimal settings are maintained. Steps 5-12 are sketched in code below.
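A compact sketch of steps 5-12, assuming TensorFlow 2.x and the directory layout from the previous sketch; the seed value, augmentation parameters, head sizes, and the 550-class count (525 Western plus 25 Indian species) are illustrative assumptions, not the report's exact settings.

```python
import tensorflow as tf

tf.random.set_seed(42)  # step 5: fixed seed (value assumed)

# Step 6: constants (values assumed for illustration)
BATCH_SIZE, IMG_SIZE, NUM_CLASSES, EPOCHS = 32, (224, 224), 550, 30

# Step 7: augmentation strategies.  Note that Keras's EfficientNetB0
# normalizes pixel values internally, so no rescaling is applied here.
train_gen = tf.keras.preprocessing.image.ImageDataGenerator(
    rotation_range=20, zoom_range=0.2, horizontal_flip=True)
val_gen = tf.keras.preprocessing.image.ImageDataGenerator()

# Step 8: data generators over the merged directories
train_data = train_gen.flow_from_directory(
    "western/train", target_size=IMG_SIZE,
    batch_size=BATCH_SIZE, class_mode="categorical")
val_data = val_gen.flow_from_directory(
    "western/valid", target_size=IMG_SIZE,
    batch_size=BATCH_SIZE, class_mode="categorical")

# Steps 9-10: EfficientNetB0 base plus a custom classification head
base = tf.keras.applications.EfficientNetB0(
    include_top=False, weights="imagenet", input_shape=IMG_SIZE + (3,))
base.trainable = False  # freeze the pretrained backbone initially

model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dropout(0.3),
    tf.keras.layers.Dense(256, activation="relu"),
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
])

# Step 11: compile with Adam and categorical cross-entropy
model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])

# Step 12: callbacks for early stopping and checkpointing
callbacks = [
    tf.keras.callbacks.EarlyStopping(monitor="val_accuracy", patience=5),
    tf.keras.callbacks.ModelCheckpoint("best_model.h5",
                                       monitor="val_accuracy",
                                       save_best_only=True),
]
```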
13. Training the Model: With the model architecture, data generators, and callbacks in place, we commence the training phase. We employ the 'fit' method, providing the model with training and validation data via the data generators. This procedure systematically tracks and stores training progress, including loss and accuracy metrics, in the 'history' variable.
14. Saving Class Names: In anticipation of interpreting model predictions, we store class names within a JSON file. This file functions as a valuable reference, enabling us to establish meaningful associations between numerical predictions and the actual bird species they represent.
15. Loading the Saved Model: Subsequent to the training phase, we load the best model configuration from the checkpoint file. This configuration encapsulates the culmination of our training efforts and is poised for making predictions on new data.
16. Loading Class Names: To facilitate the interpretation of our model's predictions, we retrieve class names from the previously stored JSON file. This step is vital for providing human-readable labels for our model's outputs.
17. Making Predictions: Using new images, we proceed to load, preprocess, and pass these images through our model for classification. If the model's prediction confidence surpasses a predefined threshold, typically set at 0.85, we confidently print the predicted class. Conversely, if the model's confidence falls below this threshold, we explicitly indicate the model's inability to make a confident prediction, ensuring transparency and precision in our results. Steps 13-17 are sketched below.
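Steps 13-17 might look like the following, continuing the previous sketch; the test image path and checkpoint filename are hypothetical, while the 0.85 confidence threshold comes from the report.

```python
import json
import numpy as np
import tensorflow as tf

# Step 13: train, tracking progress in `history`
history = model.fit(train_data, validation_data=val_data,
                    epochs=EPOCHS, callbacks=callbacks)

# Step 14: persist class names (in index order) for interpreting predictions
class_names = list(train_data.class_indices.keys())
with open("class_names.json", "w") as f:
    json.dump(class_names, f)

# Steps 15-16: reload the best checkpoint and the class names
model = tf.keras.models.load_model("best_model.h5")
with open("class_names.json") as f:
    class_names = json.load(f)

# Step 17: thresholded prediction on a new image (path hypothetical)
img = tf.keras.preprocessing.image.load_img("test_bird.jpg",
                                            target_size=IMG_SIZE)
x = tf.keras.preprocessing.image.img_to_array(img)[np.newaxis, ...]
probs = model.predict(x)[0]
if probs.max() >= 0.85:  # confidence threshold from the report
    print("Predicted species:", class_names[int(probs.argmax())])
else:
    print("Model is not confident enough to make a prediction.")
```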
3.1.2 Application Development

1. Deep Learning Model Conversion and Data Preparation:
- TensorFlow Lite Conversion: To make the model suitable for mobile devices, it was converted to TensorFlow Lite format (.tflite). TensorFlow Lite is designed for efficient execution on resource-constrained devices; this conversion reduces the model's size and ensures faster inference (see the Python sketch after this section).
- Bird Class Names: A JSON file was created to store the names of all the bird classes used for classification. Each class name corresponds to a specific bird species and is associated with a unique index used in model predictions.
2. Android App Development Using Android Studio:
- IDE Selection: Android Studio, the official Integrated Development Environment for Android app development, was chosen for the project. This IDE provides tools and resources for building Android applications efficiently.
- Project Setup: A new Android project was created in Android Studio. This involved configuring the target Android version, defining the project's structure, and setting up essential dependencies, such as the Android Gradle Plugin.
3. Programming Language and Framework:
- Programming Language: Java was selected as the primary programming language for Android app development. Java has been a traditional choice for Android development, offering stability and a rich ecosystem of libraries.
- Framework: TensorFlow Lite, a framework developed by Google, was integrated into the Android app. TensorFlow Lite is optimized for mobile and edge devices, providing a seamless way to execute machine learning models.
4. Integration of TFLite Model and JSON File:
- TFLite Model Integration: The TensorFlow Lite model (.tflite) was integrated into the Android project. This involved adding the model file to the app's assets folder and creating code to load and run inference using the TensorFlow Lite Interpreter. Proper error handling and resource management were ensured to prevent app crashes and optimize resource usage.
- JSON File Integration: The JSON file containing bird class names was also integrated into the Android project. The app read and parsed this file to create a mapping between class indices and human-readable class names. This mapping was crucial for presenting classification results to the user in an understandable manner.
5. App Testing and Debugging:
- Testing: The app underwent extensive testing on various Android devices and emulators to ensure its functionality and performance across a range of hardware configurations. Different test cases were executed to validate the accuracy and reliability of the bird species classification results.
- Debugging: Android Studio's built-in debugging tools were utilized to identify and fix issues. This iterative process helped in ensuring a seamless user experience and resolving any runtime errors or unexpected behavior.
6. User Interface Design and User Experience:
- XML Layout: The user interface (UI) was designed using XML layouts, which define the arrangement of UI elements such as buttons, image views, and text views. These elements were placed and styled to create an intuitive and user-friendly interface.
- UX Considerations: User experience (UX) principles were taken into account during UI design. Feedback mechanisms, such as progress indicators or informative messages, were implemented to keep users informed about the image classification process. This attention to UX enhances user satisfaction and engagement with the app.
7. Building the Android App:
- Build Process: The Android Studio build system compiled the source code, integrated resources (including the TensorFlow Lite model and JSON file), and bundled them into an Android application package (APK) file.
- APK Generation: The project was built, and the final APK file was generated, containing the app itself, the integrated TensorFlow Lite model, and the bird class names JSON file. This APK file is ready for deployment.
8. Deployment and Distribution:
- Deployment: The APK file was deployed onto Android devices for further testing, validation, and user feedback. It might have been distributed to a group of beta testers to collect feedback and identify potential issues.
- Distribution: Depending on the deployment strategy, the app could be distributed through various channels. This may include publishing the app on official app stores like Google Play or making it available for direct installation from a website, allowing users to download and install the app on their Android devices.
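The report performs the TFLite conversion in Python before bundling the model into the app's assets. Below is a minimal sketch of that conversion, plus a Python-side sanity check with `tf.lite.Interpreter` before shipping; the `.tflite` filename is an assumption.

```python
import numpy as np
import tensorflow as tf

# Convert the trained Keras model to TensorFlow Lite (item 1 above)
converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_bytes = converter.convert()
with open("bird_classifier.tflite", "wb") as f:  # filename assumed
    f.write(tflite_bytes)

# Sanity-check the converted model in Python before placing it in the
# Android app's assets folder.
interpreter = tf.lite.Interpreter(model_path="bird_classifier.tflite")
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]

dummy = np.zeros(inp["shape"], dtype=inp["dtype"])  # e.g. (1, 224, 224, 3)
interpreter.set_tensor(inp["index"], dummy)
interpreter.invoke()
probs = interpreter.get_tensor(out["index"])
print("Output shape:", probs.shape)  # should be (1, NUM_CLASSES)
```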
3.1.3 Bird Sound Recognition Model

1. Importing Required Libraries: For our bird sound recognition model, we imported the following essential libraries: Pandas (pd), used for data manipulation, file handling, and managing DataFrames containing features and labels; NumPy (np), used for numerical computations, array operations, and mathematical functions in feature processing; Matplotlib.pyplot (plt), for data visualization tasks such as plotting spectrograms and audio waveforms for analysis; and Scikit-learn, for splitting the dataset into training and testing sets to evaluate model performance.
2. Training the Audio Model Using YAMNet: YAMNet is a convolutional neural network (CNN) that processes audio spectrograms. It takes a spectrogram as input and uses convolutional filters to extract features from the signal. The convolutional layers extract abstract representations of audio features, which are then downsampled to lower their dimensionality while keeping crucial data intact. This hierarchical processing allows YAMNet to learn hierarchical features of audio events. The network's top layers are fully connected and followed by softmax activation for classification. The output classes, representing various audio events or categories, are mapped to the learned features. The softmax function normalizes the output scores across all classes to create a probability distribution. YAMNet uses gradient descent optimisation and backpropagation to modify the weights of its layers, aiming to minimize the discrepancy between input audio samples' true labels and predicted probabilities. This training process on the AudioSet dataset makes YAMNet proficient in classifying various audio events, making it useful for a variety of audio classification tasks.
3. Integrating the Model with the Application: Integrating a voice recognition model into our Android bird recognition app involves selecting or developing a suitable model, adapting it to the application's needs, collecting and preprocessing data, integrating the model into the app, designing a user-friendly interface, optimizing for real-time processing, testing and evaluating performance, gathering user feedback, and continuously improving the feature based on feedback and performance metrics.
Many people find satisfaction in being able to recognize and appreciate the beauty of our feathered friends in a world full of fascinating and unique avian species. Our Bird Species Recognition Application is made to get you closer to the fascinating world of birds, whether you are an ornithology enthusiast, a nature lover, or simply inquisitive about the rich tapestry of birdlife that surrounds us. We have utilized the potential of Convolutional Neural Networks (CNNs) by leveraging the effectiveness of the EfficientNet B0 architecture and the power of cutting-edge technology.
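The report trains its audio model "using YAMNet". One standard recipe consistent with that description is to extract YAMNet's 1024-dimensional frame embeddings via TensorFlow Hub and train a small dense head on them; the sketch below shows that recipe, not necessarily the authors' exact training setup, and the class count and training arrays are assumptions.

```python
import numpy as np
import tensorflow as tf
import tensorflow_hub as hub

# Load the pretrained YAMNet model from TensorFlow Hub.
yamnet = hub.load("https://tfhub.dev/google/yamnet/1")

def embed_clip(waveform_16k: np.ndarray) -> np.ndarray:
    """Run YAMNet on a mono 16 kHz float32 waveform and average its
    per-frame 1024-d embeddings into one clip-level feature vector."""
    scores, embeddings, log_mel = yamnet(waveform_16k)
    return tf.reduce_mean(embeddings, axis=0).numpy()

# A small classifier head trained on the embeddings (transfer learning).
# NUM_BIRD_CLASSES and the training arrays below are assumed to come
# from the Kaggle competition data mentioned in the report.
NUM_BIRD_CLASSES = 25  # assumed
head = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(1024,)),
    tf.keras.layers.Dense(256, activation="relu"),
    tf.keras.layers.Dense(NUM_BIRD_CLASSES, activation="softmax"),
])
head.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
             metrics=["accuracy"])
# head.fit(X_train_embeddings, y_train, validation_split=0.2, epochs=20)
```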
3.2 Research Design

The overall research design adopted for the bird recognition application is as follows:
1. Data Collection and Model Testing: Compiling large datasets for the purpose of bird image recognition is the main research task. This dataset is used to systematically test a variety of machine learning models, such as convolutional neural networks (CNNs) and deep learning architectures. By means of thorough experimentation, the optimal model is determined by performance metrics like processing speed and accuracy. After the best model is chosen, it is converted to the TensorFlow Lite (TFLite) format so that the application can integrate it effectively.
2. Integration of Image Recognition Model: Carefully considered integration with the application is carried out after the model has been chosen and converted. The TFLite model is smoothly integrated into the application's codebase during this process, guaranteeing compatibility and top performance in the Android Studio environment.
3. Creation of an Audio Detection Model: The creation of an audio detection model for bird species is a secondary research focus that runs concurrently with image recognition. The audio detection model is painstakingly constructed and refined through the use of methods like spectrogram analysis and machine learning algorithms. Once successfully developed, it is smoothly incorporated into the application's architecture.

3.2.1 Justification

This design was selected because it offers a systematic framework for testing and refining the various components of the bird recognition application. Variables like machine learning models and image processing algorithms can be adjusted through controlled experiments to determine how well they identify different species of birds. Through the collection of quantitative data, like processing times and accuracy rates, the experimental design makes it possible to evaluate the application's performance objectively. Additionally, by using an experimental approach, possible problems can be found and solutions can be refined to improve the functionality of the app. All things considered, this research design offers a systematic and organised way to use Android Studio and Java to create a trustworthy bird identification application.

3.3 Research Approach

This approach was chosen based on its compatibility with the research objectives and the nature of the available data sources. For image data, a quantitative approach was employed, utilizing datasets sourced from Kaggle. These datasets were meticulously curated and merged into a dataset suitable for training image recognition models. For audio data, on the other hand, a qualitative approach was adopted, leveraging competition data also obtained from Kaggle. This qualitative data source provides valuable insights into the acoustic characteristics of bird species, essential for developing an audio detection model. By combining both quantitative and qualitative data sources, this mixed-methods approach ensures a holistic understanding of bird recognition, encompassing both visual and auditory cues.

3.4 Sampling Strategy

Sampling is the act of choosing a subset of people or things from a larger population to represent that population for statistical analysis or research purposes. For the bird recognition model (image), the sample size was approximately 10,000-15,000; for the bird recognition model (audio), the sample size was approximately 15,000-25,000.

3.5 Data Collection Methods

The selection of this methodology was predicated on its alignment with the research goals and the characteristics of the accessible data sources. A quantitative method was used for the image data, with datasets obtained from Kaggle. Carefully selected and combined, these datasets produced an extensive set of data that was then used to train image recognition models. However, a qualitative strategy was used for the audio data, making use of competition data that was also acquired from Kaggle. This qualitative data source offers insightful information about the acoustic properties of different bird species, which is crucial for creating an audio detection model. This mixed-methods approach guarantees a comprehensive understanding of bird recognition, encompassing both visual and auditory cues, by combining both quantitative and qualitative data sources.

3.6 Data Analysis Techniques

We used tools like Matplotlib and SciPy to visualise and analyse the dataset as part of our research on bird data analysis. We were able to find patterns in bird characteristics like size, colour, and distribution thanks to Matplotlib's array of plot types, which included box plots, histograms, and scatter plots. By computing metrics like mean, median, and variance, SciPy's statistical functions allowed us to perform more in-depth analysis in addition to visualisation. In the end, this method of combining statistical analysis and visual aids with visualisation helped us make well-informed decisions and conduct additional research for our project by providing insights into the behaviour and distribution of bird data. A small illustration follows.
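As an illustration of the Matplotlib/SciPy workflow described in this section: the measurement column and its values below are synthetic stand-ins, since the report does not publish its raw analysis tables.

```python
import matplotlib.pyplot as plt
import numpy as np
from scipy import stats

# Hypothetical per-bird measurement; synthetic data for illustration only.
rng = np.random.default_rng(0)
wing_spans_cm = rng.normal(loc=25.0, scale=6.0, size=500)

# Descriptive statistics of the kind computed with NumPy/SciPy.
print("mean:", np.mean(wing_spans_cm))
print("median:", np.median(wing_spans_cm))
print("variance:", np.var(wing_spans_cm))
print("skewness:", stats.skew(wing_spans_cm))

# Visual summaries: histogram and box plot, as described in Section 3.6.
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8, 3))
ax1.hist(wing_spans_cm, bins=30)
ax1.set_title("Distribution")
ax2.boxplot(wing_spans_cm)
ax2.set_title("Box plot")
plt.tight_layout()
plt.show()
```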
3.7 Research Ethics

All procedures related to data collection, analysis, and reporting are carried out with integrity and transparency, and the research closely complies with ethical guidelines and regulations. This entails getting the required authorizations to use datasets and photos, protecting the privacy and confidentiality of any sensitive data, and, if necessary, getting informed consent. Furthermore, the study upholds academic honesty and disapproves of any plagiarism or unethical behaviour. Every source is correctly referenced, and any usage of previously published work is given due credit. Additionally, rather than just repeating previous findings, an attempt is made to ensure the study contributes original work, and it conducts its business with professionalism and regard for all parties involved, including participants, collaborators, and coworkers.

3.8 Limitations and Assumptions

While our bird recognition application excels in identifying bird species based on visual cues, there are some limitations to consider. Firstly, the accuracy of bird voice recognition within the app has room for improvement. While the voice recognition model performs reasonably well, there is a chance for enhancement through further refinement and training with a larger and more diverse dataset of bird vocalizations. Additionally, due to the limited availability of labeled data, we were unable to achieve high accuracy in voice recognition. Furthermore, while the user interface (UI) of the app provides basic functionality, there is potential for improvement to enhance user experience and engagement. Future iterations of the app will focus on addressing these limitations to provide users with a more comprehensive and intuitive bird recognition tool.

3.9 Timeline and Schedule

Figure 3.1: Project Roadmap

3.9.1 System Architecture

Overview of the System Architecture:
1. Android Application Layer: Handles user interface and interaction. Captures user inputs and displays classification results.
2. Deep Learning Model Layer: Contains the TensorFlow Lite model for bird sound recognition. Executes inference on audio recordings to predict bird species.
3. Data Preprocessing Layer: Preprocesses audio data before feeding it into the model. Includes modules for feature extraction and data normalization.
4. External Dependencies Layer: Manages external resources like the JSON file for class names. Integrates with system-level functionalities for audio capture.
Modules and Their Interactions:
- User Interface Module: Receives user inputs and displays information. Interacts with other modules to initiate recording and present results.
- Audio Processing Module: Captures audio from the microphone or files. Interfaces with the Data Preprocessing Module for feature extraction.
- Data Preprocessing Module: Converts audio to spectrograms and normalizes data. Prepares audio data for model inference.
- Deep Learning Model Interface Module: Loads the TensorFlow Lite model and executes inference. Receives predictions and communicates with the User Interface Module.
- External Resources Module: Handles loading and parsing of the class names JSON file. Provides information for interpreting classification results.
Interaction Flow:
1. The user interacts with the UI Module, initiating audio recording or selection.
2. The Audio Processing Module captures audio and passes it for preprocessing.
3. Preprocessed data is sent to the Model Interface Module for inference.
4. The model predicts the bird species, and results are communicated to the UI Module.
5. The UI Module retrieves class names from the External Resources Module and displays results to the user.

3.9.2 Design Details

Android Application Layer
1. Design Considerations: To guarantee aesthetic appeal and user-friendliness across a range of device sizes and orientations, the user interface design conforms to Android's Material Design standards. Because of its flexible design, layouts can adjust to a variety of screen sizes and aspect ratios.
2. Technologies Used: XML layouts are used to define the user interface elements and their properties. The application logic is implemented and user interactions are managed using Java. Features of the Android SDK are used to access device functionalities, including file storage and microphone input.
Deep Learning Model Layer
1. Design Considerations: In order to balance computing efficiency and efficacy in bird sound detection, the deep learning model architecture was carefully chosen. In order to guarantee seamless deployment and operation on mobile devices, model optimization is a top concern.
2. Technologies Employed: TensorFlow Lite is the foundation for model conversion and integration into the Android application. Comprehensive performance evaluations and benchmarks are used to guide the selection of the model's architecture and parameters.
Data Preprocessing Layer
1. Design Considerations: Preprocessing methods are carefully selected to improve the model's capacity to manage differences in audio recordings. To preserve efficacy, compatibility between the preprocessing steps and the selected architecture of the deep learning model is guaranteed.
2. Technologies Employed: Spectrograms are produced from unprocessed audio data using the Short-Time Fourier Transform (STFT). Libraries such as NumPy are used to implement normalization techniques, such as mean subtraction and standard-deviation division, for processing efficiency.
External Dependencies Layer
1. Design Considerations: To simplify application functioning, efforts are focused on the effective administration of external resources, such as the JSON file holding class names. Prioritizing good integration with system-level operations allows for easy access to device functionality.
2. Technologies Used: The class names file is loaded and parsed using JSON parsing libraries. Features of the Android SDK make it easier to integrate with device features like file system access and microphone input.
Flow of Interaction
1. Design Considerations: From audio input to categorization results, the user experience is made seamless by the well-designed interaction flow. At every step of the process, feedback channels are put in place to keep users informed and involved.
2. Technologies Used: Java's event-driven programming paradigms are used to manage user input and coordinate program operations. Responsiveness is guaranteed in resource-intensive operations like audio processing and model inference by using asynchronous programming approaches. A sketch of the preprocessing described above follows.
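A minimal sketch of the Data Preprocessing Layer's STFT-plus-normalization pipeline, using SciPy and NumPy as this section indicates; the window and hop sizes and the 16 kHz sample rate are assumed values.

```python
import numpy as np
from scipy.signal import stft

def audio_to_normalized_spectrogram(waveform: np.ndarray,
                                    sample_rate: int = 16000) -> np.ndarray:
    """Turn raw audio into a log-magnitude STFT spectrogram, then apply
    mean subtraction and standard-deviation division (Section 3.9.2)."""
    # Short-Time Fourier Transform; window/hop sizes are assumed values.
    _, _, zxx = stft(waveform, fs=sample_rate, nperseg=512, noverlap=256)
    spec = np.log(np.abs(zxx) + 1e-6)          # log-magnitude, avoid log(0)
    return (spec - spec.mean()) / (spec.std() + 1e-6)  # normalize

# Example with a synthetic one-second clip (a 2 kHz tone).
t = np.linspace(0, 1, 16000, endpoint=False)
clip = np.sin(2 * np.pi * 2000 * t).astype(np.float32)
print(audio_to_normalized_spectrogram(clip).shape)
```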
Our 27 main goals are to create an accurate deep learning model, build an effective system architecture, and incorporate it into an intuitive application. Developers, data scientists, subject matter experts, and possible end users are among the project's stakeholders. 2. Phase of Planning 1. Requirement gathering involved holding Hioeoen feecene ascertain the data sources, model specifications, and functionality of the system. 2. Design of System Architecture: created a scalable system architecture with application interfaces, deep learning model components, and modules for data preprocessing.Decided to use TensorFlow as the main framework for creating and deploying models. 3. Gathering and Organizing Data 1. Dataset Selection: To ensure diversity and relevance to the categorization challenge, two extensive datasets on o bird species were identified and chosen from Kaggle. 2. Data preprocessing involves. organizing and cleaning datasets to get rid of noise and irregularities so that the models are ready to be trained. 4. Developing the Model 1. Bringing in Libraries: Import necessary Python libraries for deep learning and data manipulation, such as pandas, numpy, and TensorFlow. 2. Setting Up the Environment: Made sure the development environment is compatible with TensorFlow and other libraries by configuring it with the required tools and dependencies. 3. Downloading and Preprocessing Datasets: To improve model performance, a selection of datasets were downloaded and preprocessed, including data augmentation and normalization procedures. 4. Model Architecture Design: Using the EfficientNetB0 model, which was pretrained on the ImageNet dataset, H a convolutional neural network (CNN) architecture was designed using transfer learning principles. 28 5, Model Training: Using preprocessed datasets, the CNN model was trained, and its accuracy and efficiency were maximized by adjusting its hyperparameters. 6. Model Evaluation: Using validation datasets, the trained model's performance was assessed by calculating metrics like accuracy and precision. 7. Model Optimization: Enhanced the model's effectiveness and efficiency,finetuning architecture and hyperparameters as needed. 5. System Integration 1. Model-Application Integration: Created a seamless ‘communication channel between the frontend and backend components by integrating the trained model with the application backend. 2. User Interface Design: With an emphasis on usability and simple navigation, user-friendly interfaces were created for the application frontend. 6. Quality Control and Testing: 1. Unit Testing: To make sure every unit operates as intended, unit tests were carried out for separate modules and components. 2. Integration Testing: Tested the integrated system to confirm data flow and communication routes, as well as interactions between components. 3. System Testing: To assess the system's overall functionality, responsiveness, and dependability, end-to-end testing was cooperation with stakeholders, obtaining input for incremental changes. 7. Implementation and Upkeep 1. Deployment Planning: Developed deployment settings and tactics to guarantee a seamless deployment with the least amount of downtime. 2. Deployment Execution: Set up the system in staging or production environments, keeping an eye on the deployment procedure, and resolving any problems that may arise. 3. 
3. Post-Deployment Support: Offered continuing assistance and upkeep for the deployed system, quickly responding to user comments and bug reports. 4. Performance Monitoring and Optimization: Over time, components were optimized for increased scalability and efficiency by keeping an eye on system performance and resource utilization.
8. Documentation and Training: 1. System Documentation: User manuals and technical documentation were created for stakeholders, and the system architecture, design choices, and implementation specifics were documented. 2. Training and Knowledge Transfer: Taught administrators and end users how to use the system and made sure they understood its features.
9. Project Evaluation and Final Thoughts: 1. Project Review: Evaluating success criteria and performance measures, the project goals and objectives were reviewed in relation to the delivered system. 2. Lessons Learned: Recorded accomplishments, difficulties, and opportunities for development, together with important insights and lessons discovered throughout the system development process.

3.10.1 Programming Languages and Tools

1. Programming Languages
1. Python: Because of its broad support for libraries and frameworks, especially in the fields of machine learning and deep learning, Python was the dominant programming language used for system development. It made jobs like developing models, preparing data, and implementing backend application logic easier.
2. Java: Because of its stability, platform neutrality, and robust library ecosystem, Java was chosen as the programming language for creating the Android app. Using it guaranteed the creation of a reliable and expandable mobile application.
2. Frameworks
1. TensorFlow: TensorFlow was the fundamental foundation for creating and refining deep learning models. Its robust community support and extensive APIs made model building, optimization, and deployment incredibly efficient.
2. TensorFlow Lite: The Android app's incorporation of TensorFlow Lite enabled the effective operation of machine learning models on mobile devices. This framework made the most of available resources and made it easier to integrate deep learning features into the mobile application.
3. Libraries and Tools
1. Kaggle API: By providing direct access to and downloads of bird species datasets from the Kaggle platform, the Kaggle API expedited the data collection process. This made it easier to include diverse datasets in the system.
2. Android Studio: Android Studio is the official Integrated Development Environment (IDE) for Android app development, providing a full range of tools and resources for creating Android applications, from project setup to deployment.
3. Pandas: Data cleaning, organizing, and feature extraction were among the preprocessing and manipulation tasks that made use of the Pandas library. Its user-friendly operations and data structures increased efficiency when preparing data.
4. NumPy: NumPy was essential for performing mathematical operations and array computations during data processing and model training. Its fast array operations expedited a range of data manipulation tasks.
4. Additional Tools
1. Git: By providing effective code management, version tracking, and team collaboration, the Git version control system promoted collaborative development.
2. GitHub: To enable code sharing, teamwork, and version control among team members, GitHub functioned as the central repository for project code, documentation, and resources.
3.10.2 Implementation Details

1. Library Importation: TensorFlow, a Python library well known for its deep neural network and machine learning capabilities, is one of the most important libraries imported when starting the project. Auxiliary utility libraries are also added to expedite the handling of images and datasets, providing the framework for the project's advancement.
2. Setting Up the Kaggle Environment: This entails identifying the directory that contains the Kaggle API configuration file. This preliminary phase provides unrestricted access to Kaggle's vast dataset collection, expediting the data acquisition procedure for smooth system integration.
3. Downloading and Unzipping the Datasets: The next step involves downloading and carefully organizing two datasets obtained from Kaggle: "525-western-bird-species" and "25-indian-bird-species". After the download, the script unzips the datasets and partitions the contents into separate directories, making data management and the later processing stages easier.
4. Dataset Merging: Recognizing the importance of a complete dataset for bird species classification, the project combines the two datasets: it moves the photos from the "indian-bird-species" dataset into the "western-bird-species" dataset, enhancing and broadening the range of species the algorithm is capable of classifying.
5. Setting the Random Seed: To guarantee that experiments are repeatable across runs, a fixed random seed is set in TensorFlow. This makes it possible to compare findings with confidence and to evaluate how stable the model's performance remains across later iterations.
6. Constant Definition: Explicit definitions are provided for critical constants, including batch size, image size, image shape, number of classes, and number of training epochs. Each constant plays a crucial role in directing configuration parameters throughout the project, increasing rigor and offering a strong basis for data processing and model training.
7. Data Augmentation: The training data is enhanced using TensorFlow's ImageDataGenerator. By producing a wide variety of training image variations, this augmentation procedure helps the model become more robust and adaptable to differences in real-world data, and it improves the model's ability to generalize across a variety of data conditions (a consolidated sketch of steps 2-7 follows this list).
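The following is a consolidated sketch of steps 2-7 above, assuming the standard Kaggle CLI and the Keras ImageDataGenerator API; the dataset owner slugs, directory names, and constant values are illustrative placeholders, not the project's exact ones.

    import os, random, zipfile, shutil
    import numpy as np
    import tensorflow as tf
    from tensorflow.keras.preprocessing.image import ImageDataGenerator

    # Step 2: point the Kaggle API at the directory holding kaggle.json.
    os.environ["KAGGLE_CONFIG_DIR"] = os.path.expanduser("~/.kaggle")

    # Step 3: download and unzip both datasets (<owner> slugs are placeholders).
    for slug, folder in [("<owner>/525-western-bird-species", "western-bird-species"),
                         ("<owner>/25-indian-bird-species", "indian-bird-species")]:
        os.system(f"kaggle datasets download -d {slug}")
        with zipfile.ZipFile(slug.split("/")[1] + ".zip") as zf:
            zf.extractall(folder)

    # Step 4: merge Indian species folders into the western dataset
    # (species directories are assumed to be disjoint between the two).
    for species in os.listdir("indian-bird-species/train"):
        shutil.move(os.path.join("indian-bird-species/train", species),
                    os.path.join("western-bird-species/train", species))

    # Step 5: fix random seeds so experiments are repeatable across runs.
    SEED = 42
    random.seed(SEED); np.random.seed(SEED); tf.random.set_seed(SEED)

    # Step 6: project-wide constants (values are illustrative).
    BATCH_SIZE = 32
    IMG_SIZE = (224, 224)
    NUM_CLASSES = 550
    EPOCHS = 10

    # Step 7: augmentation via ImageDataGenerator.
    train_gen = ImageDataGenerator(rescale=1.0 / 255, rotation_range=20,
                                   zoom_range=0.2, horizontal_flip=True)
    train_data = train_gen.flow_from_directory("western-bird-species/train",
                                               target_size=IMG_SIZE,
                                               batch_size=BATCH_SIZE,
                                               class_mode="categorical")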
3.11 System Testing

1. Unit Testing: In the unit testing phase, the bird recognition model (image) was rigorously evaluated using diverse datasets covering various angles of bird images. The testing protocol included scenarios involving distant and zoomed-in bird images in order to thoroughly evaluate the model's performance at various spatial scales. The goal was to confirm that the model could reliably and robustly identify birds from a variety of viewing angles in real-world deployment scenarios (a sketch of such a test appears after this list).
2. Integration Testing: During the integration testing phase, we integrated the Indian and Western bird recognition models, leveraging their complementary strengths. Subsequently, the unified model underwent comprehensive evaluation using a diverse range of datasets containing images of birds from both regions. This approach allowed us to validate the model's effectiveness in accurately identifying birds from different geographical areas. Moreover, the testing protocol included scenarios involving zoomed-in and distant bird images to assess the model's performance across varying spatial scales. By subjecting the integrated model to such diverse conditions, we ensured its robustness and reliability in recognizing birds irrespective of their proximity to the camera.
3. System Testing: During the system testing phase, we deployed the integrated bird recognition model on the Android Studio platform and rigorously evaluated its performance within the context of the Android application. This involved thorough testing of the Android app's functionality, user interface responsiveness, and overall integration with the deployed model.
4. User Interface Testing: To assess interface intuitiveness, users were presented with the application interface and asked to perform specific tasks, such as uploading a bird image, viewing identification results, or navigating through different sections of the app. Observers noted any difficulties or confusion encountered during these tasks, which may indicate areas of the interface that need improvement.
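As a sketch of the unit-testing approach in point 1, the pytest example below exercises distant and zoomed-in test images; the beakbook.model module, the classify_image wrapper, the fixture paths, and the return format are all assumptions made for illustration.

    import numpy as np
    import pytest
    import tensorflow as tf

    # Hypothetical wrapper under test: returns (species_label, confidence).
    from beakbook.model import classify_image  # assumed module layout

    @pytest.mark.parametrize("image_path", [
        "tests/fixtures/distant_bird.jpg",   # far-away subject
        "tests/fixtures/zoomed_bird.jpg",    # close-up subject
    ])
    def test_classification_returns_valid_output(image_path):
        img = tf.keras.utils.load_img(image_path, target_size=(224, 224))
        label, confidence = classify_image(np.asarray(img))
        assert isinstance(label, str) and label    # non-empty species name
        assert 0.0 <= confidence <= 1.0            # probability-like score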
3.12 Results and Evaluation

1. Accuracy and Robustness: The system showed excellent accuracy and robustness in identifying bird species across a variety of datasets and viewing angles. Unit testing confirmed that the bird recognition model functioned consistently, even in difficult situations such as photos of distant or zoomed-in birds. Integration testing demonstrated the model's adaptability in identifying birds from various regions and further validated its efficacy, particularly when combining the Indian and Western models.
2. Scalability and Adaptability: Deploying the integrated model on the Android platform demonstrated scalability and adaptability during system testing, guaranteeing smooth operation across a range of Android devices and screen sizes. The system's ability to manage higher user loads without sacrificing performance was demonstrated by how well it handled picture uploads and produced identification results quickly.
3. User Experience and Usability: User experience testing greatly informed the intuitiveness and usability of the application interface. User feedback indicated areas that needed work, such as expediting image uploads, making identification results more readable, and improving the app's navigation. Taking these recommendations into consideration would improve the application's overall user experience, increasing user satisfaction and engagement.

3.12.1 Performance Metrics

Response Time: After users upload bird photos, the application reliably satisfies the predetermined response time requirements, giving them identification results in a matter of seconds.
Accuracy: The application outperforms the predetermined standards for accurate bird identifications, achieving a high accuracy rate. It can be trusted to provide accurate identification results, which improves users' research and birdwatching experiences.
Scalability: Performance testing has shown that the application is robustly scalable, able to handle higher user loads without appreciable performance degradation. It guarantees continuous service during periods of high usage by meeting or surpassing predetermined standards for supporting concurrent users and image processing requests.

3.12.2 Comparison with Requirements

The bird recognition application has effectively met its initial project requirements and objectives, delivering accurate bird identifications and a user-friendly interface. However, the addition of features such as regional model integration and UI enhancements extended the project timeline slightly. Despite this, these additions have significantly enhanced the application's effectiveness and user satisfaction. Continuous improvement based on user feedback underscores the project's commitment to delivering value and meeting evolving user needs.

3.13 Conclusion

The system realization and implementation process of the bird recognition application exemplifies a determined effort toward developing a sophisticated yet accessible tool for bird identification. By prioritizing accuracy and user experience, the application has garnered praise from users who find it both reliable and intuitive. Despite the challenges posed by the integration of additional features, the commitment to ongoing improvement ensures that the application keeps pace with technological advancements in bird recognition. This adaptability and responsiveness to user needs underscore the project's dedication to delivering a solution that not only meets but exceeds expectations. Moreover, the collaborative efforts of the development team, alongside valuable input from users, have resulted in a truly valuable asset for bird enthusiasts and researchers alike. Moving forward, the application stands poised to continue its legacy of innovation, serving as a cornerstone in the pursuit of avian biodiversity conservation and research.

3.14 Result/Output of Project

3.14.1 For Bird Species Recognition Through Image Model

Figure 3.2: Predictions for Bird Recognition through Image (Kaggle model testing)
Source: Kaggle model testing interface.
Description: The figure shows three different bird species from the model's testing phase, each accompanied by its classification result and confidence score.
Image 1: Indian Grey Hornbill, confidence score 0.99955136 (identified with extremely high confidence).
Image 2: Alexandrine Parakeet, confidence score 0.910354 (identified with high confidence).
Image 3: Jungle Babbler, confidence score 0.998193 (identified with very high confidence).

3.14.2 For Bird Species Recognition Through Application

Figure 3.3: Predictions for Bird Recognition through Application (BeakBook image classification)
Application Name: BeakBook.
Interface Description: The interface displays a group of birds on a black background with the app name "BeakBook" at the top.
Classification Result: The app has classified the birds in the image as "Sarus Crane".
Accuracy: The classification accuracy is displayed as 0.990, indicating very high confidence in the result (a sketch of the underlying TensorFlow Lite inference follows below).
User Options:
Record Audio: Allows the user to record audio for bird sound identification.
Take Picture: Enables the user to take a picture for visual bird identification.
Upload File: Provides an option to upload an existing file for classification.
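On the device, the app runs the converted model through TensorFlow Lite; the Python sketch below shows the equivalent interpreter flow that produces a label and confidence score such as the 0.990 shown above. The model, image, and label file names are illustrative assumptions.

    import json
    import numpy as np
    import tensorflow as tf

    interpreter = tf.lite.Interpreter(model_path="bird_classifier.tflite")
    interpreter.allocate_tensors()
    inp = interpreter.get_input_details()[0]
    out = interpreter.get_output_details()[0]

    # Preprocess one image to the model's expected shape and dtype.
    img = tf.keras.utils.load_img("sarus_crane.jpg", target_size=(224, 224))
    x = np.expand_dims(np.asarray(img, dtype=np.float32) / 255.0, axis=0)

    interpreter.set_tensor(inp["index"], x)
    interpreter.invoke()
    probs = interpreter.get_tensor(out["index"])[0]

    class_names = json.load(open("class_names.json"))  # saved during training
    best = int(np.argmax(probs))
    print(class_names[best], round(float(probs[best]), 3))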
3.14.3 For Bird Species Recognition (Voice) Through Application

Figure 3.4: Predictions for Bird Recognition (Voice) through Application (BeakBook audio classification)
Application Name: BeakBook.
Interface Description: The user interface consists of a black background with the app name "BeakBook" displayed at the top.
Classification Result: The app has classified an audio input as "Eurasian Hoopoe".
Accuracy: The classification accuracy is displayed as 0.919, indicating high confidence in the result.
User Options:
Record Audio: Allows the user to record audio for bird sound identification.
Take Picture: Enables the user to take a picture for visual bird identification.
Upload File: Provides an option to upload an existing file for classification.

Chapter 4: Conclusion and Future Scope

4.1 Conclusion

This project combined mobile app development, deep learning, and data science in a seamless way. Each stage built the foundation for flexible bird species classification, from carefully chosen library imports to the blending of diverse datasets and the honing of a strong machine learning model. Expert training, post-training optimizations, and TensorFlow Lite's incorporation into an Android application demonstrated how complex algorithms can be combined with user-friendly mobile interfaces. This convergence kept the user experience at the center while showcasing the potential of machine learning in practical applications. In essence, the bird recognition app not only enriches the lives of users by providing a fun and educational experience but also contributes to the broader goals of biodiversity conservation and scientific research. As technology continues to advance, the potential for such applications to make a meaningful impact on our understanding and appreciation of the natural world is truly promising.

4.2 Future Scope

1. User-Centric Enhancements: Constantly improving the mobile application's user interface, taking user comments into account, and adding additional user-friendly features to captivate and inform users about different bird species.
2. Fusion with Environmental Data: Adding environmental data to the model, such as location, climate, or habitat details, could improve its overall grasp of the distribution and behavior of bird species.
3. Extension to Other Domains: Using the established infrastructure for a variety of applications, extending the application beyond the classification of bird species to broader domains such as ecological monitoring.
4. Fine-Tuning and Expansion: By continuously adding new data to the current model, it is possible to improve its accuracy and increase its capacity to classify a broader range of bird species. Classification accuracy may be further improved by combining transfer learning with more recent pretrained models or architectural designs (see the fine-tuning sketch below).
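A minimal sketch of the fine-tuning direction in point 4, assuming the saved Keras model from the earlier sketches and an illustrative directory of images that includes the newly added species:

    import tensorflow as tf
    from tensorflow.keras.preprocessing.image import ImageDataGenerator

    model = tf.keras.models.load_model("bird_classifier.h5")  # illustrative name
    base = model.layers[0]  # the EfficientNetB0 backbone

    # Unfreeze only the last few backbone layers and retrain at a low learning
    # rate, so new data refines rather than overwrites the learned features.
    base.trainable = True
    for layer in base.layers[:-20]:
        layer.trainable = False

    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-5),
                  loss="categorical_crossentropy", metrics=["accuracy"])

    new_data = ImageDataGenerator(rescale=1.0 / 255).flow_from_directory(
        "birds_with_new_species/train", target_size=(224, 224),
        batch_size=32, class_mode="categorical")
    model.fit(new_data, epochs=5)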
Appendix A: Brief Bio-data of Each Student

Name: Chiranjivi Hole
- Educational Background and Passion:
* Pursuing a Bachelor's degree in Data Science at Usha Mittal Institute of Technology.
* Passionate about technology, particularly drawn to machine learning, data analytics, and software development.
- Skills and Competencies:
* Proficient in data analytics, database management systems (DBMS), and Linux.
* Skilled in querying databases and presenting insights for informed decision-making.
- Current Endeavor: Immersed in pursuing the AWS Cloud Practitioner Certificate, learning about cloud fundamentals, AWS services, security protocols, and pricing structures to proficiently leverage AWS resources for various projects and initiatives.

Name: Sakshi Kakade
- Educational Background and Interests:
* Completing a Bachelor of Technology degree at Usha Mittal Institute of Technology.
* Profound interest in Machine Learning, Data Analytics, and Data Engineering.
- Skills:
* Python, SQL, ETL (Extract, Transform, Load)
* Data Visualization: Power BI, Tableau
* Data Analytics, Data Analysis, NLP (Natural Language Processing)
- Competencies:
* Analyzing and processing complex datasets
* Developing and optimizing data pipelines
* Creating insightful visualizations for informed decision-making
- Certifications:
* Python Masterclass (Udemy)
* Machine Learning Internship (Suven)

Name: Priti Mhatre
- Educational Background and Internship:
* Currently pursuing a Bachelor's degree in Data Science at Usha Mittal Institute of Technology.
* Engaged as an intern at Colgate Global Business Services, contributing to HR applications.
- Specialization and Achievements:
* Expertise in data analysis, with a strong focus on extracting insights from large datasets.
* Completed the Google Data Analytics Professional Certificate, enhancing analytical skills.
- Skills and Strengths:
* Proficient in Python, SQL, ETL, data visualization (Power BI, Tableau), NLP, data analytics, and data analysis.
* Strong communication skills in spoken and written English; analytical thinking.

Appendix B: Plagiarism Report

Appendix C: Research Paper Based on Project

International Journal of Research Publication and Reviews, Vol 5, No 4, pp 4482-4487, April 2024. Journal homepage: www.ijrpr.com. ISSN 2582-7421.

BeakBook: Bird Species Recognition Mobile Applications
Chiranjivi Hole, Sakshi Kakade, Priti Mhatre
Department of Data Science

Abstract: Birds, with their grace and significance, serve as vital indicators of environmental changes. Given the imminent threat of extinction that many bird species face, it is critical to implement effective conservation techniques. With the use of cutting-edge deep learning algorithms, visual categorization based on bird photos becomes a viable method for species identification. The goal of this research is to create an effective model for rapid identification of bird species by utilising deep learning and neural networks. The neural network receives intensive training to achieve excellent classification accuracy, using large datasets that include photos and sounds of birds. For bird voice recognition, libraries such as librosa are used for classification. Advanced neural network methodologies are used to extract complex information from bird photos, improving the model's capacity to identify minute species distinctions. This project intends to make a substantial contribution to scientific study and conservation efforts by automating the process of bird species identification. This will enable a deeper knowledge of avian biodiversity and assist in the preservation of our natural environment. In addition to speeding up the identification procedure, this paradigm shift towards automated bird species recognition promotes a greater comprehension of avian biodiversity.
Effective deep learning models can help focus conservation efforts more precisely and effectively, reducing the threats that endangered bird populations must contend with. Through the democratisation of access to cutting-edge technology, this project enables people all around the world to take part in scientific research and bird conservation projects. The success of these initiatives marks the beginning of a new chapter in animal conservation, one in which technology acts as a catalyst for favourable alterations in the environment. Working together and with creativity, we aim to create a future where all bird species coexist peacefully with their natural environments.

Index Terms: Neural Networks, Deep Learning, EfficientNet, YAMNet, Bird species recognition, Bird voice recognition.

I. Introduction

Many people find satisfaction in being able to recognize and appreciate the beauty of our feathered friends in a world full of fascinating and unique avian species. Our Bird Species Recognition Application is made to get you closer to the fascinating world of birds, whether you are an ornithology enthusiast, a nature lover, or simply inquisitive about the rich tapestry of birdlife that surrounds us. We have utilized the potential of Convolutional Neural Networks (CNNs), combining the effectiveness of the EfficientNetB0 architecture with the power of cutting-edge technology. Thanks to this approach, we can provide a seamless and precise tool for recognizing bird species from photographs. For both professionals and bird enthusiasts, our application offers a convenient and accessible platform.

1) Birdwatching is a popular hobby promoting appreciation for nature and birds. Because there are so many different kinds of birds with minute visual distinctions, it can be difficult to identify them, and carrying field guides or reference books is frequently impracticable for birdwatchers.
2) Identifying and learning about bird species is a global interest for enthusiasts, scientists, and nature lovers.
3) Recognizing different bird species can be challenging due to visual variations and the vast number of bird species.
4) The aim is to improve bird species recognition using neural networks and deep learning.

II. Problem Definition

Birdwatching is a well-liked hobby that builds a profound respect for the avian world and helps people connect with nature. Whether it's the dazzling plumage of a tropical bird, the amazing flight of a raptor, or the lovely singing of a songbird, these species captivate our hearts and inspire wonder. The identification and study of these species will benefit scientific knowledge and human delight, and bird enthusiasts, scientists, and nature lovers worldwide will be interested. But distinguishing between various bird species can sometimes prove to be a challenging and time-consuming endeavour, even for seasoned birdwatchers. It is not always practicable to bring field guides or reference books, since many bird species have only subtle visual differences.

III. Survey of Literature

In the process of developing this project, an extensive review of various papers and journals was conducted. In [1], four different transfer learning models, namely InceptionV3, ResNet152V2, DenseNet201, and MobileNetV2, were implemented on the identified dataset. Of these, ResNet152V2 provided the maximum accuracy of 95.45%.
Thus, in that paper, a model was built to identify bird species using a Convolutional Neural Network. In [2], an application for bird species detection was built on a varied dataset using a Convolutional Neural Network; the accuracy of the built application was 75%. In [3], a bird dataset was built based on Asian bird species. Two different models were proposed, and the proposed pretrained ResNet model achieved greater accuracy than the base model; their final model shows 97.98% accuracy in identifying bird species. Research paper [4] served as our primary reference throughout the project. In [4], a CNN-based architecture was developed and evaluated for categorizing 525 different bird species. By combining strong data augmentation tactics with transfer learning approaches, this method produced impressive accuracy rates. When it came to classifying bird species, the model performed well and showed characteristics that could be applied to other contexts without overfitting. The test set showed an accuracy of 86.7%, while the validation set indicated a similar accuracy of 87%. The extension of current databases has been identified as one path for future refinement, with a particular focus on addressing imbalances among underrepresented species and gender groups. To improve model performance, more research into different approaches to data augmentation is necessary. The work in [5] investigates the application of the Naïve Bayes algorithm to the recognition of bird species using acoustic characteristics, with a 91.58% accuracy rate. The research highlights how the vocal tracts of different birds produce different sounds, as well as the difficulties that come with memory control and signal-to-noise ratio optimisation. An iterative procedure for non-real-time bird sound recognition is outlined, including steps such as choosing an audio clip and classifying the data using the Naïve Bayes method. In [6], the paper details an experiment that used Human Factor Cepstral Coefficients (HFCCs) and Hidden Markov Models (HMMs) to automatically classify the vocalisations of eighteen different bird species. An interspecific success rate of 81.2% and a classification success rate of 90.45% for families were attained in the experiment; data from other models might be taken into account for possible improvement. The quality and processing of the input data are critical to the experiment's success; possible enhancements include more precise recording segmentation and the use of deep neural networks for the identification of bird sounds.

TABLE I: Features extracted for bird recognition

No.  Species       Color                Shape Features
1    Fairy Tern    White, Black, Grey   Small, slim body, long wings
2    Zebra Dove    Tan, White, Black    Small, plump body, short beak
3    Harpy Eagle   Dark brown, Gray     Large, robust body, huge wings
4    Canary        Yellow, Gray         Small, slender body, short beak

IV. Established Operational Infrastructure

1. Merlin Bird ID: Merlin Bird ID, from the Cornell Lab of Ornithology, is renowned for its extensive database and user-friendly interface. Using machine learning algorithms to identify bird species based on user-submitted images or descriptions, it assists birdwatchers in rapidly and correctly identifying different bird species.
Key Features: Offers a large database of bird species, simple photo-based identification, the capacity to identify bird calls, and comprehensive species information, making it a priceless tool for ornithologists and enthusiasts alike.

2. eBird: Recording bird sightings as part of a citizen science programme is the goal of eBird, a robust platform. Users can contribute photos, audio recordings, or descriptions of their observations of birds to a worldwide dataset used for research on bird populations and conservation efforts.
Key Features: Provides tools for species identification, allows users to submit photos and audio recordings, encourages community participation in birdwatching and conservation, and provides insights on bird populations.

3. iBird: iBird is a comprehensive digital field guide for bird species. It has a vast collection of bird information, including detailed descriptions of characteristics, habitats, and behaviours, along with images and sounds.

V. Algorithmic Framework Design

A. Bird Image Recognition

EfficientNet, a convolutional neural network architecture, has emerged as a prominent framework for image classification tasks owing to its balance between performance and computational efficiency. The EfficientNet-B0 variant, chosen as the foundation for our bird species classification endeavor, is characterized by a compound scaling method that simultaneously adjusts network depth, width, and resolution. Let X represent the input image with dimensions W x H x C, where W is width, H is height, and C denotes the number of channels. The architecture comprises the following key elements:
Input Stem: Initial processing layers, including convolutional and pooling operations, extract fundamental features from the input image.
Efficient Blocks: These blocks, applied iteratively throughout the network, feature convolutional layers with varying kernel sizes and expansion ratios. Each block's output undergoes recalibration via a squeeze-and-excitation (SE) mechanism to enhance feature responsiveness.
Global Average Pooling: Following the last block, global average pooling aggregates the feature maps into a fixed-size representation.
Fully Connected Layer: A fully connected layer, followed by a softmax activation function, performs the classification.

B. Bird Voice Recognition

YAMNet is a convolutional neural network (CNN) that processes audio spectrograms. It takes a spectrogram as input and uses convolutional filters to extract features from the signal. The convolutional layers extract abstract representations of audio features, which are then downsampled to lower their dimensionality while keeping crucial data intact. This hierarchical processing allows YAMNet to learn hierarchical features of audio events. The network's top layers are fully connected and followed by a softmax activation for classification. The output classes, representing various audio events or categories, are mapped to the learned features. The softmax function normalizes the output scores across all classes to create a probability distribution. YAMNet uses gradient descent optimisation and backpropagation to modify the weights of its layers, aiming to minimize the discrepancy between the true labels of input audio samples and the predicted probabilities. This training on the AudioSet dataset makes YAMNet proficient at classifying a wide range of audio events, making it useful for many audio classification tasks.
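A minimal sketch of running the pretrained YAMNet from TensorFlow Hub on a mono 16 kHz waveform, as described above; the audio file name is illustrative.

    import numpy as np
    import tensorflow as tf
    import tensorflow_hub as hub

    # Load the pretrained YAMNet audio event classifier from TensorFlow Hub.
    yamnet = hub.load("https://tfhub.dev/google/yamnet/1")

    # YAMNet expects a 1-D float32 waveform sampled at 16 kHz.
    waveform, _ = tf.audio.decode_wav(tf.io.read_file("bird_call.wav"),
                                      desired_channels=1)
    waveform = tf.squeeze(waveform, axis=-1)

    # scores: per-frame probabilities over the 521 AudioSet classes;
    # embeddings: per-frame features; spectrogram: the internal log-mel input.
    scores, embeddings, spectrogram = yamnet(waveform)
    top_class = int(np.argmax(scores.numpy().mean(axis=0)))
    print("Most likely AudioSet class index:", top_class)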
VI. Experiments

A. Methodology for the Bird Image Recognition Model

1) Data Source: The subsequent stages of the project involve downloading and organizing two datasets, "525-western-bird-species" and "25-indian-bird-species," sourced from Kaggle. The script unzips the datasets and organizes them into separate directories, simplifying data management and setting the stage for data processing. The script also merges the datasets, transferring images from the "indian-bird-species" dataset into the "western-bird-species" dataset, expanding the scope of species the model can classify and enhancing the project's classification capabilities.

Fig. 1. A preview of birds in the dataset

2) Data Preprocessing:
A. Data Augmentation: Leveraging TensorFlow's ImageDataGenerator, we define a set of strategies to augment our training data. This augmentation process generates an array of diverse variations of our training images, which is instrumental in improving our model's capacity to generalize across a range of data scenarios. This approach enhances the model's robustness and its ability to adapt to variations in real-world data.
B. Creating Data Generators: To facilitate the data flow into our model during training, we create two pivotal data generators using TensorFlow's ImageDataGenerator. These data generators are designed to load and preprocess images from our training and validation directories. This crucial step streamlines the data supply process, ensuring that the model is fed appropriately processed data during the training phase.

3) Training: The training phase involves fitting the model using the 'fit' method, which receives training and validation data through the data generators. The model's progress is tracked in the 'history' variable. Class names are saved in a JSON file for interpreting predictions. The saved model configuration is then loaded from the checkpoint file, encapsulating the training effort and ready for predictions on new data. Class names are then retrieved from the JSON file to provide human-readable labels for the model's outputs. This process ensures meaningful associations between numerical predictions and the actual bird species they represent.

4) Results: The process involves loading, preprocessing, and passing new images through the model for classification. If the model's prediction confidence exceeds a predefined threshold, the class is printed; if it falls below, this is explicitly indicated, ensuring transparency and precision in the results.

Fig. 2. Predictions for Bird Recognition through Image
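A sketch of the thresholded prediction step just described; the threshold value, file paths, and preprocessing are illustrative assumptions.

    import json
    import numpy as np
    import tensorflow as tf

    model = tf.keras.models.load_model("checkpoints/bird_model.h5")  # illustrative
    class_names = json.load(open("class_names.json"))  # saved during training

    CONFIDENCE_THRESHOLD = 0.6  # assumed cutoff

    img = tf.keras.utils.load_img("new_bird.jpg", target_size=(224, 224))
    x = np.expand_dims(np.asarray(img, dtype=np.float32) / 255.0, axis=0)

    probs = model.predict(x)[0]
    best = int(np.argmax(probs))
    if probs[best] >= CONFIDENCE_THRESHOLD:
        print(f"Predicted: {class_names[best]} ({probs[best]:.3f})")
    else:
        print("Prediction below confidence threshold; species not identified.")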
B. Methodology for Bird Voice Recognition

Fig. 3. Visualizing bird audio data

1) Data Source: Explore the data, noting that the audio files are divided into train and test folders, with each split containing folders named after bird codes. All audio files are mono with a sample rate of 16 kHz.

2) Data Preprocessing: Import Model Maker, TensorFlow, and any additional libraries required for manipulating audio and creating visualisations.

3) Training: To train the model using Model Maker for audio, begin with a model spec, which serves as the base model for extracting information and learning about the new classes. It also determines how the dataset will be transformed to comply with the model's specifications, such as sample rate and number of channels. YAMNet, an audio event classifier trained on the AudioSet dataset, is used for predicting audio events from the AudioSet ontology. The create method in the audio...
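A minimal sketch of this Model Maker training flow, assuming the tflite_model_maker audio_classifier API and an illustrative dataset path:

    from tflite_model_maker import audio_classifier

    # The model spec fixes the base model (YAMNet) and how input audio is
    # transformed (sample rate, framing) to match its requirements.
    spec = audio_classifier.YamNetSpec(keep_yamnet_and_custom_heads=True)

    # Assumed layout: one subfolder per bird code containing 16 kHz mono WAVs.
    data = audio_classifier.DataLoader.from_folder(spec, "birds_audio/train",
                                                   cache=True)
    train_data, validation_data = data.split(0.8)

    # create() builds a classification head on top of YAMNet and trains it.
    model = audio_classifier.create(train_data, spec,
                                    validation_data=validation_data,
                                    batch_size=32, epochs=10)
    model.export(export_dir="exported_model")  # produces the .tflite for the app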
