e-ISSN: 2582-5208

International Research Journal of Modernization in Engineering Technology and Science

( Peer-Reviewed, Open Access, Fully Refereed International Journal )
Volume:06/Issue:04/April-2024 Impact Factor- 7.868 www.irjmets.com
VIDEO TO TEXT CONVERTER
Ajay N. Tembhare*1, Aastha P. Godange*2, Mohit S. Tondre*3, Pritam J. Satpute*4,
Sameer S. Selokar*5, Vaibhav Tembhurkar*6
*1,2,3,4,5,6 Students, Department of Computer Science and Engineering, Guru Nanak Institute of Technology,
Nagpur, Maharashtra, India.
DOI : https://www.doi.org/10.56726/IRJMETS54631
ABSTRACT
The 'Video to Text Converter' project aims to develop an automated system capable of converting spoken
words in video content into textual transcripts efficiently. Leveraging advanced speech recognition and natural
language processing technologies, the system processes video content to extract audio tracks, which are then
transcribed into text using deep learning models. Post-processing techniques, including punctuation insertion
and spell checking, enhance transcription accuracy.
The system supports English transcription and finds applications in education, law enforcement, and content
creation industries. Overall, the project addresses the growing demand for tools that make video content more
accessible and searchable, offering valuable benefits across various domains.
Keywords: Video processing, Speech recognition, Natural language processing, Automated transcription, Deep
learning models.
I. INTRODUCTION
A video to text converter is a software tool or system that automatically transcribes spoken audio content from
a video file into written text. This technology utilizes speech recognition algorithms to analyze the audio track
of the video and convert it into a textual format.
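As a minimal sketch of the first stage described above (isolating the audio track before speech recognition), the example below shells out to the ffmpeg command-line tool. The paper does not say which tool the authors used, so ffmpeg, the 16 kHz mono WAV target, and the function names are all assumptions made for illustration.

```python
import shutil
import subprocess
from pathlib import Path
from typing import List, Optional


def build_ffmpeg_command(video_path: str, audio_path: str,
                         sample_rate: int = 16000) -> List[str]:
    """Build an ffmpeg command that drops the video stream and writes mono PCM WAV."""
    return [
        "ffmpeg",
        "-y",                     # overwrite the output file if it exists
        "-i", video_path,         # input video file
        "-vn",                    # discard the video stream
        "-acodec", "pcm_s16le",   # 16-bit PCM audio
        "-ar", str(sample_rate),  # resample to the requested rate (assumed 16 kHz)
        "-ac", "1",               # downmix to a single channel
        audio_path,
    ]


def extract_audio(video_path: str, audio_path: Optional[str] = None) -> str:
    """Extract the audio track of a video into a WAV file and return its path."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg was not found on PATH")
    if audio_path is None:
        # Hypothetical naming rule: same name as the video, with a .wav suffix.
        audio_path = str(Path(video_path).with_suffix(".wav"))
    subprocess.run(build_ffmpeg_command(video_path, audio_path), check=True)
    return audio_path
```

Calling `extract_audio("lecture.mp4")` would write `lecture.wav` next to the video, which can then be handed to whichever speech recognizer the system uses.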
The resulting text can then be edited, searched, indexed, or used for various purposes such as creating subtitles,
generating transcripts for accessibility purposes, or extracting information from video content for analysis or
documentation. The importance of video to text converters extends across various domains and applications. In
educational settings, these converters facilitate the creation of transcripts for instructional videos and lectures,
enhancing learning outcomes by providing searchable and indexed textual content.
In the realm of digital marketing, they play a vital role in improving search engine optimization (SEO) efforts by
making video content more discoverable through indexed transcripts. Moreover, video to text converters are
invaluable tools for content analysis, allowing researchers, marketers, and content creators to extract valuable
insights from video content. Techniques such as sentiment analysis, keyword extraction, and topic modeling
can be applied to video transcripts to glean actionable information and trends. In addition to their utility in
accessibility and content analysis, video to text converters also have significant implications in legal and
compliance contexts. Transcripts generated by these converters serve as official records in legal proceedings,
compliance audits, and regulatory requirements, ensuring accuracy and accountability.
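To make the keyword extraction mentioned above concrete, here is a small frequency-based sketch that uses only the Python standard library. It is not the authors' method; the stop-word list and the function name `top_keywords` are illustrative assumptions, and a production system would more likely use a dedicated NLP library.

```python
import re
from collections import Counter
from typing import List, Tuple

# A deliberately tiny stop-word list, assumed for this example only.
STOP_WORDS = {
    "a", "an", "and", "are", "as", "for", "has", "in", "is", "it",
    "of", "on", "that", "the", "this", "to", "was", "with",
}


def top_keywords(transcript: str, k: int = 5) -> List[Tuple[str, int]]:
    """Return the k most frequent non-stop-words in a transcript with their counts."""
    words = re.findall(r"[a-z']+", transcript.lower())
    counts = Counter(
        word for word in words
        if word not in STOP_WORDS and len(word) > 2
    )
    return counts.most_common(k)
```

Counting raw word frequency is the simplest possible baseline; it ignores phrases and word importance across documents, which is why it is shown here only as a starting point.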
II. METHODOLOGY
For a 'Video to Text Converter' project, a combination of tools and platforms is needed for tasks such
as data preprocessing, model development, training, evaluation, and deployment. Commonly used tools and
platforms for each stage of the project are listed below:
1. Data Collection and Preprocessing: LibriSpeech, a popular dataset for training speech recognition
models, containing read speech from audiobooks.
2. Model Development and Training: TensorFlow, an open-source machine learning framework developed by
Google, widely used for building and training deep learning models, including speech recognition models.
3. Model Evaluation and Testing: Various open-source tools are available for calculating Word Error Rate
(WER) and other evaluation metrics for assessing the performance of speech recognition models.
4. Deployment and Integration: TensorFlow Serving, a system for serving machine learning models in
production, including speech recognition models trained using TensorFlow.

5. Development Environments: An interactive computing environment, such as Jupyter Notebook, for
developing and prototyping machine learning models, including speech recognition models.
6. Version Control and Collaboration: A distributed version control system, such as Git, for tracking
changes to code and collaborating with team members on the development of speech recognition models.
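The Word Error Rate named in the evaluation stage above is the word-level edit distance between a reference transcript and the recognizer output, divided by the number of reference words. The sketch below is one self-contained way to compute it; the function name and the lowercase, whitespace-based tokenization are simplifying assumptions, not the tooling the authors used.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Compute WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # previous[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            deletion = previous[j] + 1      # reference word missing from hypothesis
            insertion = current[j - 1] + 1  # extra word in hypothesis
            current[j] = min(substitution, deletion, insertion)
        previous = current
    return previous[-1] / len(ref)
```

For the reference "the cat sat on the mat" and the hypothesis "the cat sit on mat", there is one substitution and one deletion over six reference words, so the WER is 2/6. Note that WER can exceed 1.0 when the hypothesis contains many inserted words.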
III. MODELING AND ANALYSIS
Flow-Chart: Video to Text Converter.

IV. RESULTS AND DISCUSSION

1. First, we run the Python file, which launches a window showing the UI we created.
2. When the user clicks the menu item, a file dialog box opens so that the user can select the appropriate file.
3. Once the video has been converted into audio format, transcription can start. The output filename is taken
from the text the user entered in the output file name text box.
4. The image below shows the final version. The progress bar updates to show the transcription progress.
When transcription is complete, the text file contents are loaded into the text area, which automatically adds
scroll bars if needed.

Figure 1: Video To Text Converter
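The progress reporting described in step 4 can be sketched independently of any GUI toolkit. The example below assumes the audio has already been split into chunks and that a speech recognizer is available as a callable; the paper does not confirm either detail, so `transcribe_chunks`, `recognize`, and `on_progress` are hypothetical names introduced for illustration.

```python
from typing import Callable, Iterable, List


def transcribe_chunks(
    chunks: Iterable[bytes],
    recognize: Callable[[bytes], str],
    on_progress: Callable[[float], None],
) -> str:
    """Transcribe audio chunks in order and report progress as a percentage.

    `recognize` stands in for whatever speech-to-text call the application
    uses; `on_progress` would typically update a progress bar widget.
    """
    chunk_list: List[bytes] = list(chunks)
    pieces: List[str] = []
    for index, chunk in enumerate(chunk_list, start=1):
        pieces.append(recognize(chunk).strip())
        on_progress(100.0 * index / len(chunk_list))
    # Join non-empty pieces so silent chunks do not leave double spaces.
    return " ".join(piece for piece in pieces if piece)
```

Passing the recognizer and the progress callback in as arguments keeps the loop testable without a display: in the application, `on_progress` could set the value of the progress bar, and the returned text could be written to the output file and loaded into the text area.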

V. CONCLUSION
In conclusion, this research presents a comprehensive study on the development and evaluation of a
video-to-text converter using the Python programming language. The implemented converter leverages advanced
machine learning and natural language processing techniques to accurately transcribe spoken dialogue and
extract meaningful textual representations of visual content from videos.
The experimental results obtained from evaluating the converter demonstrate its effectiveness, reliability, and
real-world applicability in accurately converting videos to textual format. The achieved transcription accuracy,
object detection performance, text summarization quality, and processing efficiency validate the suitability of
the converter for various practical applications in digital media analysis, multimedia content management,
educational technology, and accessibility enhancement.
Looking ahead, future work can target further improvements and extensions of the converter. By advancing the
state of the art in video-to-text conversion technology, we can further enhance the accessibility, usability,
and utility of digital video content for diverse user populations and applications.
ACKNOWLEDGEMENTS
We would like to take this opportunity to thank all the people who were part of this seminar in numerous ways,
people who gave unending support right from the initial stage.
We wish to thank Prof. Trupti Ghate, our internal project guide, for timely co-operation and precious
guidance, without which this project would not have been a success. We thank them for painstakingly reviewing
the entire project and, even more, for their uncanny ability to spot mistakes. We would like to
thank Prof. Jagruti Ghatole (HoD) for her continuous encouragement, support, and guidance at every stage
of the project.
Finally, we would like to thank all our friends who were associated with us and helped us in preparing our
project. The project named “Video to Text Converter” would not have been possible without the extensive
support of the people who were directly or indirectly involved in its successful execution.
