Video is the most popular type of data on the Internet and carries a large amount of information. Video structure analysis, through content-based video indexing and retrieval, aims to automate the management, indexing, and retrieval of video material. Recognizing shot changes is essential for video analysis, since shot boundary detection is a preliminary stage in indexing, browsing, and retrieving video content. In this context, a three-stage shot boundary detection (SBD) system is proposed. In the first stage, frames are read in temporal order and converted to grayscale, and redundant frames within the same shot are discarded based on a correlation comparison, which reduces both processing time and computational complexity. In the second stage, candidate transitions are identified by comparing the objects of successive frames and analyzing the differences between them with a standard deviation metric. In the final stage, a cut transition is confirmed by matching key points with the scale-invariant feature transform (SIFT). The proposed system achieved an F-score of 0.97 while keeping time consumption low.
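As a rough illustration of the pipeline, the sketch below shows stage one (correlation-based pruning of redundant frames) and the stage-three SIFT key-point check, assuming OpenCV; the threshold values and helper names are placeholders rather than the paper's actual parameters.

```python
# Illustrative sketch of stages 1 and 3 of the described pipeline (not the authors' code).
# Assumes OpenCV with SIFT available; thresholds are placeholders.
import cv2
import numpy as np

def prune_redundant_frames(frames, corr_thresh=0.98):
    """Stage 1: keep a frame only if its grayscale correlation with the
    last kept frame drops below corr_thresh (i.e. it adds new content)."""
    kept = [frames[0]]
    last_gray = cv2.cvtColor(frames[0], cv2.COLOR_BGR2GRAY)
    for frame in frames[1:]:
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        corr = np.corrcoef(last_gray.ravel(), gray.ravel())[0, 1]
        if corr < corr_thresh:
            kept.append(frame)
            last_gray = gray
    return kept

def is_cut(frame_a, frame_b, min_match_ratio=0.15):
    """Stage 3: declare a cut when few SIFT key points match across the
    candidate boundary (Lowe's ratio test on brute-force matches)."""
    sift = cv2.SIFT_create()
    kp_a, des_a = sift.detectAndCompute(cv2.cvtColor(frame_a, cv2.COLOR_BGR2GRAY), None)
    kp_b, des_b = sift.detectAndCompute(cv2.cvtColor(frame_b, cv2.COLOR_BGR2GRAY), None)
    if des_a is None or des_b is None or len(kp_a) == 0:
        return True
    matches = cv2.BFMatcher().knnMatch(des_a, des_b, k=2)
    good = [m for m in matches if len(m) == 2 and m[0].distance < 0.75 * m[1].distance]
    return len(good) / len(kp_a) < min_match_ratio
```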
Comparative Study of Different Video Shot Boundary Detection Techniques (ijtsrd)
Video shot boundary detection is a crucial step in video processing research, enabling content-based video retrieval, indexing, and browsing. This paper reviews the different techniques and methods that have been implemented for SBD, together with the key performance measurement parameters. It covers preprocessing techniques, feature extraction methodologies, similarity computation techniques, and more. The outcomes of the different approaches are compared in terms of accuracy and computational speed, along with precision, recall, and F1 score. Swati Hadke | Ravi Mishra "Comparative Study of Different Video Shot Boundary Detection Techniques" Published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-6 | Issue-5, August 2022, URL: https://www.ijtsrd.com/papers/ijtsrd50630.pdf Paper URL: https://www.ijtsrd.com/engineering/electronics-and-communication-engineering/50630/comparative-study-of-different-video-shot-boundary-detection-techniques/swati-hadke
Efficient video indexing for fast motion video (ijcga)
Due to advances in recent multimedia technologies, various digital video contents have become available from different multimedia sources. Efficient management, storage, coding, and indexing of video are required because video contains a great deal of visual information and occupies a large amount of memory. This paper proposes an efficient video indexing method for video with rapid motion or fast illumination change, in which motion information and feature points of specific objects are used. For accurate shot boundary detection, two steps are applied: a block matching algorithm to obtain accurate motion information and a modified displaced frame difference to compensate for the error in existing methods. An object matching algorithm based on the scale invariant feature transform is also proposed, which uses feature points to group shots semantically. Computer simulations with five fast-motion videos show the effectiveness of the proposed video indexing method.
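For the block matching step mentioned above, a minimal full-search sketch using the sum of absolute differences (SAD) is shown below; the block size and search range are illustrative, and the paper's modified displaced frame difference and object matching stages are not reproduced.

```python
# Minimal full-search block matching (SAD criterion) between two grayscale frames.
# Illustrative only; block size and search range are placeholders.
import numpy as np

def block_match(prev, curr, block=16, search=8):
    """Return one (dy, dx) motion vector per block of `curr`, found by
    exhaustive SAD search in `prev` within +/- `search` pixels."""
    h, w = curr.shape
    vectors = np.zeros((h // block, w // block, 2), dtype=int)
    for by in range(0, h - block + 1, block):
        for bx in range(0, w - block + 1, block):
            target = curr[by:by + block, bx:bx + block].astype(np.int32)
            best, best_v = None, (0, 0)
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    y, x = by + dy, bx + dx
                    if y < 0 or x < 0 or y + block > h or x + block > w:
                        continue
                    cand = prev[y:y + block, x:x + block].astype(np.int32)
                    sad = np.abs(target - cand).sum()
                    if best is None or sad < best:
                        best, best_v = sad, (dy, dx)
            vectors[by // block, bx // block] = best_v
    return vectors
```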
Video Shot Boundary Detection Using The Scale Invariant Feature Transform and... (IJECEIAES)
Segmenting a video sequence by detecting shot changes is essential for video analysis, indexing, and retrieval. In this context, a shot boundary detection algorithm based on the scale invariant feature transform (SIFT) is proposed in this paper. The first step of the method is a top-down search scheme that locates transitions by comparing the ratio of matched features extracted via SIFT for every RGB channel of the video frames; this overview step provides the locations of the boundaries. Secondly, a moving average calculation is performed to determine the type of transition. The proposed method can detect both gradual transitions and abrupt changes without requiring any prior training on the video content. Experiments conducted on a multi-type video database show that the algorithm performs well.
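The per-channel matched-feature ratio can be sketched as follows, assuming OpenCV's SIFT and a Lowe ratio test; the decision threshold on the resulting ratios is left to the caller and is not the paper's value.

```python
# Sketch of a per-channel SIFT matched-feature ratio (OpenCV assumed).
# A low minimum ratio across the B, G and R channels suggests a shot transition.
import cv2

def channel_match_ratios(frame_a, frame_b):
    sift = cv2.SIFT_create()
    bf = cv2.BFMatcher()
    ratios = []
    for ch_a, ch_b in zip(cv2.split(frame_a), cv2.split(frame_b)):  # B, G, R channels
        kp_a, des_a = sift.detectAndCompute(ch_a, None)
        _, des_b = sift.detectAndCompute(ch_b, None)
        if des_a is None or des_b is None or len(kp_a) == 0:
            ratios.append(0.0)
            continue
        matches = bf.knnMatch(des_a, des_b, k=2)
        good = [m for m in matches if len(m) == 2 and m[0].distance < 0.75 * m[1].distance]
        ratios.append(len(good) / len(kp_a))
    return ratios  # a boundary is suggested when min(ratios) falls below a chosen threshold
```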
Propose shot boundary detection methods by using visual hybrid features (IJECEIAES)
Shot boundary detection is a fundamental technique that plays an important role in a variety of video processing tasks such as summarization, retrieval, and object tracking. It involves segmenting a video sequence into shots, each of which is a sequence of interrelated temporal frames. This paper introduces two methods for detecting cut shot boundaries using visual hybrid features and compares them, improving detection performance by selecting the strongest features. The first method uses hybrid features consisting of a hue-saturation-value color-space statistics histogram and the grey level co-occurrence matrix. The second method uses hybrid features consisting of the discrete wavelet transform and the grey level co-occurrence matrix. The frame size was reduced, which lowered the computation time, and local adaptive thresholds were used, which further improved performance. The tested videos were obtained from the BBC archive, including BBC Learning English and BBC News. Experimental results indicate that the second method achieved 97.618% accuracy, higher than the first method and other methods under the evaluation metrics.
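A minimal sketch of the first hybrid feature (HSV statistics histogram concatenated with GLCM texture measures) is shown below, assuming OpenCV and scikit-image; the bin counts and GLCM settings are placeholders, not the paper's configuration.

```python
# Illustrative hybrid frame feature: an HSV colour histogram concatenated with
# GLCM texture statistics. Bin counts and GLCM settings are placeholders.
import cv2
import numpy as np
from skimage.feature import graycomatrix, graycoprops

def hybrid_feature(frame_bgr):
    hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
    hist = cv2.calcHist([hsv], [0, 1, 2], None, [8, 4, 4], [0, 180, 0, 256, 0, 256])
    hist = cv2.normalize(hist, None).flatten()

    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    glcm = graycomatrix(gray, distances=[1], angles=[0], levels=256,
                        symmetric=True, normed=True)
    texture = np.array([graycoprops(glcm, p)[0, 0]
                        for p in ("contrast", "homogeneity", "energy", "correlation")])
    return np.concatenate([hist, texture])  # compare consecutive features to flag a cut
```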
The document summarizes a research paper that proposes a method to summarize parking surveillance footage. The method first pre-processes the raw footage to extract only frames containing vehicles. These frames are then classified using a CNN model to detect vehicles and recognize license plates. The classified objects and license plate numbers are used to generate a textual summary of the vehicles in the footage, making it easier for users to review large amounts of surveillance video. The paper discusses related work on video summarization techniques and provides details of the proposed methodology, which includes preprocessing footage, extracting features from frames containing vehicles, using CNNs for object detection and license plate recognition, and generating a summarized video and text report.
This document proposes a method for video copy detection using segmentation, MPEG-7 descriptors, and graph-based sequence matching. It extracts key frames from videos, extracts features from the frames using descriptors like CEDD, FCTH, SCD, EHD and CLD, and stores them in a database. When a query video is input, its features are extracted and compared to the database to detect if it matches any videos already in the database. Graph-based sequence matching is also used to find the optimal matching between video sequences despite transformations like changed frame rates or ordering. The method is shown to perform better than previous techniques at detecting copied videos through transformations.
Video Content Identification using Video Signature: Survey (IRJET Journal)
This document summarizes previous research on video content identification using video signatures. It discusses three types of video signatures (spatial, temporal, and spatio-temporal) that have been used to generate unique descriptors to identify identical video scenes. The document then reviews several existing methods for video signature extraction and matching, including techniques based on ordinal signatures, motion signatures, color histograms, local descriptors using interest points, and compressed video shot matching using dominant color profiles. It concludes by proposing a new temporal signature-based method that aims to accurately detect a video segment embedded in a longer unrelated video by extracting frame-level features, generating fine and coarse signatures, and performing frame-by-frame signature matching.
This document is a project report for video shot boundary detection using HOG (Histogram of Oriented Gradients) submitted by Anveshkumar Kolluri to the Department of Information Technology at GITAM University in India. It introduces the motivation and challenges of shot boundary detection and provides an overview of the literature reviewed, system design, modules, software used, and implementation of the project to detect shot boundaries in videos using HOG features.
In this work, we define a new method for indexing and retrieving non-geotagged video sequences based on visual content alone, using the Local Binary Pattern (LBP) and Singular Value Decomposition (SVD) techniques. The central question of our system is: is it possible to determine the geographic location of a video on the GIS map from nothing but the pixels of its frames? The proposed system is introduced to answer such questions. The GIS database was constructed by storing reference images at the intersections between road segments on the map. LBP is used to extract features from the images, and SVD is used to compress the feature vectors and index the images in the database. The input to the system is a video taken from a forward-facing camera mounted on a vehicle. The output is the geolocation of the video key frames that correspond to the geo-tagged images retrieved from the GIS database.
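A compact sketch of the LBP-plus-SVD idea is given below, assuming scikit-image and NumPy; the LBP parameters and the retained SVD rank are illustrative, not the values used in the work.

```python
# Sketch of LBP feature extraction and SVD-based descriptor compression
# (not the authors' implementation); parameters are placeholders.
import numpy as np
from skimage.feature import local_binary_pattern

def lbp_histogram(gray, points=8, radius=1):
    """Uniform LBP histogram of one grayscale image."""
    lbp = local_binary_pattern(gray, points, radius, method="uniform")
    n_bins = points + 2
    hist, _ = np.histogram(lbp, bins=n_bins, range=(0, n_bins), density=True)
    return hist

def compress_with_svd(feature_matrix, rank=5):
    """Project rows (one LBP histogram per reference image) onto the
    top `rank` right singular vectors to shorten the descriptors."""
    _, _, vt = np.linalg.svd(feature_matrix, full_matrices=False)
    basis = vt[:rank].T                   # (n_features, rank)
    return feature_matrix @ basis, basis  # compressed descriptors + projection basis
```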
Jiri ece-01-03 adaptive temporal averaging and frame prediction based surveil... (Ijripublishers Ijri)
This document discusses techniques for improving video compression efficiency for surveillance videos. It proposes modifying the architecture of scalable video coding to make it surveillance-centric by allowing adaptive rate-distortion optimization at the GOP level based on whether events of interest are present. Experimental results show foreground detection and updating of background adaptively over time to improve compression. Future work includes further enhancing selective motion estimation techniques to improve processing efficiency without degrading video quality.
IRJET-Feature Extraction from Video Data for Indexing and Retrieval (IRJET Journal)
This document summarizes techniques for feature extraction from video data to enable effective indexing and retrieval of video content. It discusses common approaches for segmenting video into shots and scenes, extracting key frames, and determining various visual features like color, texture, objects and motion. Feature extraction is an important but time-consuming step in content-based video retrieval. The document also reviews methods for video representation, mining patterns from video data, classifying video content, and generating semantic annotations to support search and retrieval of relevant videos.
Key Frame Extraction in Video Stream using Two Stage Method with Colour and S... (ijtsrd)
Key frame extraction, the summarization of videos for applications such as video object recognition and classification, video retrieval and archival, and surveillance, is an active research area in computer vision. This paper describes a new criterion for well-representative key frames and, correspondingly, a key frame selection algorithm based on a two-stage method that extracts accurate key frames to cover the content of the whole video sequence. Firstly, an alternative sequence is obtained from the original sequence based on the color characteristic difference between adjacent frames. Secondly, by analyzing the structural characteristic difference between adjacent frames of the alternative sequence, the final key frame sequence is obtained. An optimization step based on the number of final key frames is then added to ensure the effectiveness of the extraction. Khaing Thazin Min | Wit Yee Swe | Yi Yi Aung | Khin Chan Myae Zin "Key Frame Extraction in Video Stream using Two-Stage Method with Colour and Structure" Published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-3 | Issue-5, August 2019, URL: https://www.ijtsrd.com/papers/ijtsrd27971.pdf Paper URL: https://www.ijtsrd.com/computer-science/data-processing/27971/key-frame-extraction-in-video-stream-using-two-stage-method-with-colour-and-structure/khaing-thazin-min
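The two-stage selection can be sketched as a colour-difference pass followed by a structural-difference pass; the snippet below uses an HSV histogram distance and SSIM as stand-ins, assuming OpenCV and scikit-image, with illustrative thresholds rather than the paper's values.

```python
# Two-stage key-frame selection sketch: stage 1 filters on colour-histogram
# difference, stage 2 on structural difference. Thresholds are illustrative.
import cv2
from skimage.metrics import structural_similarity

def color_diff(a, b):
    ha = cv2.calcHist([cv2.cvtColor(a, cv2.COLOR_BGR2HSV)], [0], None, [32], [0, 180])
    hb = cv2.calcHist([cv2.cvtColor(b, cv2.COLOR_BGR2HSV)], [0], None, [32], [0, 180])
    return cv2.compareHist(ha, hb, cv2.HISTCMP_BHATTACHARYYA)

def select_key_frames(frames, color_thresh=0.3, ssim_thresh=0.7):
    # Stage 1: build the "alternative" sequence from colour change.
    candidates = [frames[0]]
    for f in frames[1:]:
        if color_diff(candidates[-1], f) > color_thresh:
            candidates.append(f)
    # Stage 2: keep candidates whose structure also differs from the last key frame.
    keys = [candidates[0]]
    for f in candidates[1:]:
        ga = cv2.cvtColor(keys[-1], cv2.COLOR_BGR2GRAY)
        gb = cv2.cvtColor(f, cv2.COLOR_BGR2GRAY)
        if structural_similarity(ga, gb) < ssim_thresh:
            keys.append(f)
    return keys
```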
Jiri ece-01-03 adaptive temporal averaging and frame prediction based surveil... (Ijripublishers Ijri)
Global interconnect planning becomes a challenge as semiconductor technology continuously scales. Because of the increasing wire resistance and higher capacitive coupling in smaller features, the delay of global interconnects becomes large compared with the delay of a logic gate, introducing a huge performance gap that needs to be resolved. A novel equalized global link architecture and driver-receiver co-design flow are proposed for high-speed and low-energy on-chip communication by utilizing a continuous-time linear equalizer (CTLE). The proposed global link is analyzed using a linear system method, and the formula for the CTLE eye opening is derived to provide high-level design guidelines and insights. Compared with the separate driver-receiver design flow, over 50% energy reduction is observed.
IRJET - A Research on Video Forgery Detection using Machine Learning (IRJET Journal)
The document presents research on detecting video forgery using machine learning. It proposes a novel approach that uses optical flow and a coarse-to-fine detection strategy to detect copy-move forgery in videos. The approach first divides video frames into overlapping blocks and then extracts GLCM features from the blocks. It identifies duplicate blocks using k-means clustering and Euclidean distance calculation, and finally detects forged regions in frames by highlighting the duplicate blocks. The approach was implemented, and experiments showed it could successfully detect forged regions in videos.
Content based video retrieval using discrete cosine transform (nooriasukmaningtyas)
A content based video retrieval (CBVR) framework is built in this paper. One of the essential features in the video retrieval process and CBVR is color value. The discrete cosine transform (DCT) is used to extract the features of a query video and compare them with the video features stored in our database. An average result of 0.6475 was obtained using the DCT after applying it to the database we created and collected, across all categories. The technique was tested on our video database of 100 videos, with 5 videos in each category.
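A minimal DCT frame descriptor of the kind described above can be sketched with OpenCV; the resize target, the 8x8 low-frequency block, and the Euclidean distance are assumptions for illustration, not the framework's settings.

```python
# Illustrative DCT descriptor for content-based video retrieval.
# Frames are resized to a fixed even size so descriptors are comparable.
import cv2
import numpy as np

def dct_descriptor(frame_bgr, size=256, keep=8):
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    gray = cv2.resize(gray, (size, size)).astype(np.float32)
    coeffs = cv2.dct(gray)
    return coeffs[:keep, :keep].flatten()  # low-frequency coefficients summarise the frame

def frame_distance(desc_a, desc_b):
    return float(np.linalg.norm(desc_a - desc_b))  # smaller means more similar
```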
IRJET - Applications of Image and Video Deduplication: A Survey (IRJET Journal)
This document discusses applications of image and video deduplication techniques. It begins by providing background on the growth of multimedia data and need for deduplication to reduce redundant data. It then describes key aspects of image and video deduplication, including extracting fingerprints from images and frames to identify duplicates. The document reviews several studies on image and video deduplication applications, such as identifying near-duplicate images on social media, detecting spoofed face images, verifying image copy detection, and eliminating near-duplicates from visual sensor networks. Overall, the document surveys various real-world implementations of image and video deduplication.
Video saliency has a profound effect on our lives through its compression efficiency and precision. There has been considerable research on image saliency but much less on video saliency. This paper proposes a modified high efficiency video coding (HEVC) algorithm with background modelling and classification into coding blocks. The solution first employs the G-picture in the fourth frame as a long-term reference, which is then quantized by an algorithm that segregates regions using the background features of the image. Coding blocks are then introduced to decrease the complexity of the HEVC code, reduce time consumption, and speed up the overall saliency process. The solution is evaluated on the dynamic human fixation 1K (DHF1K) dataset and compared with several other state-of-the-art saliency methods to demonstrate its reliability and efficiency.
Secure IoT Systems Monitor Framework using Probabilistic Image Encryption (IJAEMSJORNAL)
In recent years, the modeling of human behaviors and activity patterns for recognition or detection of special events has attracted considerable research interest. Various methods abound for building intelligent vision systems aimed at understanding the scene and making correct semantic inferences from the observed dynamics of moving targets. Many systems include detection, storage of video information, and human-computer interfaces. Here we present not only an update that expands previous similar surveys but also an emphasis on contextual abnormal human activity detection, especially in video surveillance applications. The main purpose of this survey is to identify existing methods extensively and to characterize the literature in a manner that brings key challenges to attention.
Video Key-Frame Extraction using Unsupervised Clustering and Mutual Comparison (CSCJournals)
The document presents a novel method for extracting key frames from videos using unsupervised clustering and mutual comparison. It assigns weights of 70% to color (HSV histogram) and 30% to texture (GLCM) when computing frame similarity for clustering. It then performs mutual comparison of extracted key frames to remove near duplicates, improving accuracy. The algorithm is computationally simple and able to detect unique key frames, improving concept detection performance as validated on open databases.
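As a quick illustration of the weighting scheme mentioned above, the sketch below combines precomputed colour and texture similarities with the 70/30 weights from the summary; the helper inputs and how the clusters are formed are assumptions, not the paper's implementation.

```python
# Minimal sketch of the 70% colour / 30% texture weighting used for frame similarity.
# `colour_sim` and `texture_sim` are assumed to be similarity scores already scaled
# to [0, 1] (e.g. from an HSV-histogram comparison and a GLCM-statistics comparison).
def weighted_similarity(colour_sim: float, texture_sim: float) -> float:
    return 0.7 * colour_sim + 0.3 * texture_sim

# Frames are then grouped by this similarity, and near-duplicate key frames are
# removed by mutually comparing the selected frames with the same measure.
```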
The block matching algorithm (BMA) for motion estimation is very commonly used in current video coding standards such as H.26x and MPEG-x because of its simplicity and performance, and it is an important component of video compression; yet motion estimation remains a problem in many video applications because of the cost of estimating object motion. A homography exists between two frames of video sequences captured by pan-tilt (PT) cameras during their movements, and this geometric relationship can be used to reduce spatial redundancy in the video. In this paper, I present a homography-based motion estimation algorithm and a comparative study of different algorithms, and I introduce a unique homography-based approach to block motion estimation. The study illustrates the important trade-off between computational complexity, result quality, and the demands of various applications. The algorithm can be implemented in Matlab.
Video Compression Using Block By Block Basis Salience Detection (IRJET Journal)
This document presents a method for video compression using block-by-block salience detection. It aims to reduce noticeable coding artifacts in the non-region-of-interest (non-ROI) parts of video frames by optimizing the saliency-related Lagrange parameter on a block-by-block basis. The proposed method detects the ROI using a visual saliency model and encodes ROI blocks with higher quality than non-ROI blocks. It then separates each frame into blocks and uses a conjugate gradient algorithm to iteratively update weight coefficients and minimize a cost function, compressing each block according to its saliency. An experiment found the proposed method improved visual quality over other perceptual video coding methods according to metrics such as eye-tracking weighted PSNR.
A Segmentation Based Sequential Pattern Matching for Efficient Video Copy Det... (Best Jobs)
This document discusses a video copy detection system that uses segmentation based sequential pattern matching of SIFT features for efficient detection. It divides videos into homogeneous segments and extracts SIFT features from keyframes of each segment. The SIFT features are then quantized into visual words for optimized matching between video segments. By performing visual word matching at the cluster level followed by feature level similarity measures, the system is able to detect copied video segments in a time-efficient manner while achieving improved accuracy over other methods.
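The quantization of SIFT features into visual words can be sketched with k-means over pooled descriptors. The snippet below is an illustrative bag-of-visual-words construction assuming OpenCV and scikit-learn; the vocabulary size and helper names are placeholders, and the system's cluster-level matching step is not reproduced.

```python
# Sketch of quantising SIFT descriptors into visual words for segment matching.
# Vocabulary size is illustrative; not the system's actual configuration.
import cv2
import numpy as np
from sklearn.cluster import MiniBatchKMeans

def build_vocabulary(keyframes, n_words=200):
    sift = cv2.SIFT_create()
    descriptors = []
    for frame in keyframes:
        _, des = sift.detectAndCompute(cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY), None)
        if des is not None:
            descriptors.append(des)
    vocab = MiniBatchKMeans(n_clusters=n_words, random_state=0)
    vocab.fit(np.vstack(descriptors))
    return vocab

def bag_of_words(frame, vocab):
    """Normalised histogram of visual-word occurrences for one key frame."""
    sift = cv2.SIFT_create()
    _, des = sift.detectAndCompute(cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY), None)
    words = vocab.predict(des) if des is not None else np.array([], dtype=int)
    hist = np.bincount(words, minlength=vocab.n_clusters).astype(float)
    return hist / (hist.sum() + 1e-9)
```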
Multi-View Video Coding Algorithms/Techniques: A Comprehensive Study (IJERA Editor)
This document summarizes recent developments in multi-view video coding techniques. It begins with an introduction to multi-view video and multi-view video coding. It then discusses exploiting temporal and inter-view similarities for efficient compression. Several existing multi-view video coding methods and algorithms are reviewed, including predictive coding, subband coding, motion and disparity compensation, and wavelet-based approaches. The benefits and requirements of multi-view video compression are also outlined.
This document presents a method for extracting key frames from videos using discrete wavelet transform (DWT) statistics. It begins with background on video frames, scene changes, and DWT wavelet frequency components. The proposed method extracts key frames in 4 steps: 1) applying DWT to consecutive frames and calculating differences between detail coefficients, 2) computing mean and standard deviation of differences, 3) estimating thresholds using mean and standard deviation, 4) comparing differences to thresholds to identify key frames where differences exceed thresholds. Experimental results on test videos demonstrate the method can detect key frames to represent scene changes for video summarization.
This document presents a method for extracting key frames from videos using discrete wavelet transform (DWT) statistics. It begins with background on video frames, scene changes, and DWT wavelet frequency components. It then describes the proposed key frame extraction algorithm, which involves: 1) applying DWT to consecutive video frames, 2) calculating differences between detail coefficients, 3) computing mean and standard deviation of differences, 4) setting thresholds based on mean and standard deviation, and 5) identifying frames where two difference values exceed thresholds as key frames. The method is applied to sample videos and parameters are tuned to effectively detect key frames for video summarization.
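Both summaries above describe the same mean-plus-standard-deviation thresholding of DWT detail-coefficient differences; a minimal sketch of that criterion, assuming PyWavelets and OpenCV with illustrative parameter choices, is given below.

```python
# Sketch of a DWT-statistics key-frame criterion: frames whose detail-coefficient
# difference exceeds mean + k*std are taken as key frames. Wavelet and k are placeholders.
import cv2
import numpy as np
import pywt

def detail_difference(frame_a, frame_b, wavelet="haar"):
    """Sum of absolute differences between the DWT detail coefficients of two frames."""
    _, (h_a, v_a, d_a) = pywt.dwt2(cv2.cvtColor(frame_a, cv2.COLOR_BGR2GRAY).astype(float), wavelet)
    _, (h_b, v_b, d_b) = pywt.dwt2(cv2.cvtColor(frame_b, cv2.COLOR_BGR2GRAY).astype(float), wavelet)
    return sum(np.abs(x - y).sum() for x, y in ((h_a, h_b), (v_a, v_b), (d_a, d_b)))

def key_frame_indices(frames, k=1.0):
    d = np.array([detail_difference(frames[i], frames[i + 1]) for i in range(len(frames) - 1)])
    threshold = d.mean() + k * d.std()   # threshold estimated from mean and standard deviation
    return [i + 1 for i in range(len(d)) if d[i] > threshold]
```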
Design and Analysis of Quantization Based Low Bit Rate Encoding System (ijtsrd)
This document summarizes research on developing a low bit rate encoding system for video compression using vector quantization. It first discusses how vector quantization can achieve high compression ratios and has been used widely in image and speech coding. It then describes the methodology used, which involves taking video frames as input, downsampling the frames to extract pixels, applying vector quantization, and detecting edges on the compressed frames to check compression quality. Finally, it discusses the results of testing the approach on MATLAB and presents conclusions on the advantages of the proposed algorithm for very low bit rate video coding applications.
Key frame extraction is an essential technique in the computer vision field. The extracted key frames should summarize the salient events with excellent feasibility, great efficiency, and a high level of robustness. This is not an easy problem to solve because it depends on many visual features. This paper addresses the problem by investigating the relationship between the detection of these features and the accuracy of key frame extraction techniques using TRIZ. An improved key frame extraction algorithm is then proposed, based on accumulative optical flow with a self-adaptive threshold (AOF_ST) as recommended by TRIZ inventive principles. Several video shots, including original and forged videos with complex conditions, are used to verify the experimental results. Comparison with state-of-the-art algorithms shows that the proposed extraction algorithm accurately summarizes the videos and generates a meaningfully compact number of key frames. On top of that, in terms of the compression rate of the extracted key frames, the proposed algorithm achieves 124.4 and 31.4 in the best and worst cases on the KTH dataset, while the state-of-the-art algorithms achieved 8.90 in the best case.
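A rough sketch of the accumulative-optical-flow idea is shown below, using OpenCV's Farneback dense flow and a threshold derived from the flow statistics; the parameters and the exact thresholding rule are placeholders rather than the AOF_ST algorithm itself.

```python
# Sketch of key-frame selection from accumulated optical-flow magnitude with a
# data-derived ("self-adaptive") threshold. Farneback parameters are placeholders.
import cv2
import numpy as np

def flow_magnitudes(gray_frames):
    mags = []
    for prev, curr in zip(gray_frames, gray_frames[1:]):
        flow = cv2.calcOpticalFlowFarneback(prev, curr, None,
                                            0.5, 3, 15, 3, 5, 1.2, 0)
        mags.append(np.linalg.norm(flow, axis=2).mean())
    return np.array(mags)

def key_frames_from_flow(gray_frames):
    mags = flow_magnitudes(gray_frames)
    accumulated = np.cumsum(mags)
    threshold = mags.mean() + mags.std()   # adaptive: derived from the flow statistics
    keys, last = [0], 0.0
    for i, acc in enumerate(accumulated, start=1):
        if acc - last > threshold:         # enough accumulated motion since the last key frame
            keys.append(i)
            last = acc
    return keys
```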
Vector space model, term frequency-inverse document frequency with linear sea... (CSITiaesprime)
For Muslims, the Hadith ranks as the secondary legal authority following the Quran. This research leverages hadith data to streamline the search process within the nine imams' compendium using the vector space model (VSM) approach. The primary objective is to enhance the efficiency and effectiveness of searching Hadith collections by implementing pre-filtering techniques. The study aims to demonstrate the potential of linear search and Django object-relational mapping (ORM) filters in reducing search times and improving retrieval performance, thereby facilitating quicker and more accurate access to relevant Hadiths. Prior studies have indicated that VSM is inefficient for large data sets because it assigns weights to every term across all documents, regardless of whether they include the search keywords; consequently, the more documents there are, the more protracted the weighting phase becomes. To address this, the current research pre-filters documents prior to weighting, utilizing linear search and Django ORM as filters. Testing on 62,169 hadiths with 20 keywords revealed that the average VSM search duration was 51 seconds; with the linear and Django ORM filters, the times were reduced to 7.93 and 8.41 seconds, respectively. The recall@10 rates were 79% and 78.5%, with MAP scores of 0.819 and 0.814, respectively.
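The pre-filter-then-weight idea can be sketched briefly with scikit-learn; in the snippet below the linear pre-filter is a plain substring scan and the Django ORM variant is not reproduced, so treat it as an illustration of the approach rather than the authors' code.

```python
# Sketch of VSM retrieval with a linear pre-filter before TF-IDF weighting.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def search(documents, query, top_k=10):
    # Linear pre-filter: keep only documents containing at least one query keyword,
    # so the TF-IDF weighting runs over a much smaller collection.
    keywords = query.lower().split()
    candidates = [(i, d) for i, d in enumerate(documents)
                  if any(k in d.lower() for k in keywords)]
    if not candidates:
        return []
    ids, texts = zip(*candidates)

    vectorizer = TfidfVectorizer()
    doc_matrix = vectorizer.fit_transform(texts)
    query_vec = vectorizer.transform([query])
    scores = cosine_similarity(query_vec, doc_matrix).ravel()
    ranked = sorted(zip(ids, scores), key=lambda p: p[1], reverse=True)
    return ranked[:top_k]   # (document index, similarity) pairs
```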
Electro-capacitive cancer therapy using wearable electric field detector: a r... (CSITiaesprime)
Electro-capacitive cancer therapy (ECCT), a less invasive and more targeted approach using wearable electric field detectors, is revolutionizing cancer therapy, a complex process involving traditional methods like surgery, chemotherapy, and radiation. The review aims to investigate the safety and efficacy of electric field exposure in vital organs, particularly in cancer therapy, to improve medical advancements. It will investigate the impact on cytokines and insulation integrity, as well as contribute to improving diagnostic techniques and safety measures in medical and engineering fields. Wearable electric field detectors have revolutionized cancer therapy by offering a non-invasive and personalized approach to treatment. These devices, such as smart caps or patches, measure changes in electric fields by detecting capacitance alterations. Their lightweight, comfortable, and easy-to-wear nature allows for real-time monitoring, providing valuable data for personalized treatment plans. The portability of wearable detectors allows for long-term surveillance outside clinical settings, increasing therapy efficacy. The ability to collect data over extended periods provides a comprehensive view of electric field dynamics, aiding researchers in understanding tumor growth and progression. Technology advancements in electro-capacitive therapy, including wearable devices, have revolutionized cancer treatment by adjusting electric field intensity in real-time, enhancing personalized medicine, and improving treatment outcomes and patient quality of life.
Technology adoption model for smart urban farming-a proposed conceptual model (CSITiaesprime)
Technological advancements have made their way into the heart of human civilization across numerous fields, notably healthcare, logistics, and agriculture. Amid the issues and challenges sprouting in the agriculture sector, the trend of integrating agriculture and technology is growing rapidly. The public and private sectors work hand in hand to address these complex issues and challenges, aiming for efficient and sustainable solutions. This study is a continuation of a previous systematic literature review; hence, the main objective is to deliver a proposed conceptual model for technology adoption specifically for smart urban farming. Innovation diffusion theory (IDT) is used as the main foundation of the proposed conceptual model, supplemented with additional factors drawn from other existing technology adoption models, both original and extended versions. The outcome of the study is expected to reveal valuable insights into the components affecting technology adoption in smart urban farming, which will be elaborated in an upcoming study, offering a robust framework for future research and applications in smart urban farming.
Optimizing development and operations from the project success perspective us... (CSITiaesprime)
By merging development and operation disciplines, the approach known as development and operations (DevOps) can significantly improve the efficiency and effectiveness of software development. Despite its potential benefits, successfully implementing DevOps within traditional project management frameworks presents significant challenges. This study explores the critical factors influencing the implementation of DevOps practices from the project management perspective, specifically focusing on software development projects in the Ministry of Finance. This study utilizes the analytic hierarchy process (AHP) to prioritize the critical elements of project success criteria and DevOps factors necessary for effective implementation. The findings indicate that stakeholder satisfaction, quality, and value creation are the primary criteria for project success. Moreover, knowledge and skills, collaboration and communication, and robust infrastructure are pivotal factors for facilitating DevOps within project management. The study provides actionable insights for organizations aiming to improve their project outcomes by incorporating DevOps and offers a systematic approach to decision-making using AHP. This study recognizes limitations due to its focus on specific contexts and emphasizes the need for future research in diverse organizational environments to validate and expand these findings.
Unraveling Indonesian heritage through pattern recognition using YOLOv5 (CSITiaesprime)
This research focuses on three iconic Indonesian batik patterns, Kawung, Mega Mendung, and Parang, chosen for their cultural significance and recognition. Kawung symbolizes harmony, Mega Mendung represents power, and Parang signifies protection and spiritual power. Using the YOLOv5 deep learning model, the study aimed to accurately identify these patterns. Results showed mean average precision (mAP) scores of 77% for Kawung, 80% for Parang, and an impressive 99% for Mega Mendung. The highest precision results were 91% for Kawung, 88% for Parang, and 77% for Mega Mendung. These findings highlight the potential of pattern recognition in preserving cultural heritage, and understanding these designs contributes to the appreciation of Indonesia's culture. The research suggests applications in cultural studies, digital archiving, and the textile industry, ensuring the legacy of these patterns endures.
Capabilities of cellebrite universal forensics extraction device in mobile de... (CSITiaesprime)
The powerful digital forensics tool cellebrite universal forensics extraction device (UFED) extracts and analyzes mobile device data, helping investigators solve criminal and cybersecurity cases. Advanced methods and algorithms allow Cellebrite UFED to recover data from erased or obscured devices. Cellebrite UFED can pull data from call logs, texts, emails, and social media, providing valuable evidence for investigations. The use of smartphones and tablets in personal and professional settings has spurred the development of mobile device forensics. The intuitive user interface speeds up data extraction and analysis, revealing crucial information. It can decrypt encrypted data, recover deleted files, and extract data from multiple devices. The sector's best data extraction functionality, Cellebrite UFED, helps forensic analysts gather crucial evidence for investigations. Legal and ethical considerations are crucial in mobile device forensics. Legal considerations include allowing access to data, protecting privacy, and adhering to chain of custody protocols. Ethics include transparency, defamation, and information exploitation protection. Using Cellebrite UFED, researchers can navigate complex data on mobile devices more efficiently and precisely. Artificial intelligence (AI) and machine learning (ML) algorithms may automate data extraction in future tools. Examiners must train, maintain, and establish clear protocols for using Cellebrite UFED in forensic investigations.
Company clustering based on financial report data using k-means (CSITiaesprime)
Stock investment is the act of committing funds or assets in the expectation of future returns. In practice, novice investors often make mistakes, one of which is not knowing the financial health of the company they are targeting. By applying a machine learning clustering method to company financial report data, two clusters were found. The clustering can reveal the current condition of a company and serve as a consideration for investors, for example distinguishing companies with a consistently stable and increasing profit trend from companies that are still developing their business or carrying large amounts of debt from year to year.
Securing DNS over HTTPS traffic: a real-time analysis tool (CSITiaesprime)
DNS over HTTPS (DoH) is a developing protocol that uses encryption to secure domain name system (DNS) queries within hypertext transfer protocol secure (HTTPS) connections, thereby improving privacy and security while browsing the web. This study involved the development of a live tool that captures and analyzes DoH traffic in order to classify it as either benign or malicious. We employed machine learning (ML) algorithms such as K-nearest neighbors (K-NN), random forest (RF), decision tree (DT), deep neural network (DNN), and support vector machine (SVM) to categorize the data. KNN, RF, and DT achieved exceptional performance, with precision, recall, and F1 scores at or near 1.0, while the SVM and DNN also achieved very high scores, with only slight differences in accuracy. The tool employs a voting mechanism to arrive at a definitive classification decision. By integrating with the Mallory tool, it becomes possible to resolve DNS locally, which allows for more accurate simulation of DoH queries. The evaluation results indicate outstanding performance, confirming the tool's effectiveness in analyzing DoH traffic for network security and threat detection purposes.
Adversarial attacks in signature verification: a deep learning approach (CSITiaesprime)
Handwritten signature recognition in forensic science is crucial for identity and document authentication. While serving as a legal representation of a person's agreement or consent to the contents of a document, handwritten signatures determine the authenticity of a document, identify forgeries, pinpoint suspects, and support other pieces of evidence like ink or document analysis. This work focuses on developing and evaluating a handwritten signature verification system using a convolutional neural network (CNN) and assessing the model's robustness using hand-crafted adversarial attacks. Initially, handwritten signatures were collected from sixteen volunteers, each contributing ten samples, followed by image normalization and augmentation to generate synthetic data samples and overcome data scarcity. The proposed model achieved a testing accuracy of 91.35% using an 80:20 train-test split. Additionally, using five-fold cross-validation, the model achieved a robust validation accuracy of nearly 98%. Finally, the introduction of manually constructed adversarial attacks on the signature images undermines the model's accuracy, bringing it down to nearly 80%. This highlights the need to consider adversarial resilience when designing deep learning models for classification tasks. Exposing the model to realistic look-alike fake samples is critical when testing its robustness and refining the model through trial and error.
Optimizing classification models for medical image diagnosis: a comparative a...CSITiaesprime
The surge in machine learning (ML) and artificial intelligence has revolutionized medical diagnosis, utilizing data from chest ct-scans, COVID-19, lung cancer, brain tumor, and alzheimer parkinson diseases. However, the intricate nature of medical data necessitates robust classification models. This study compares support vector machine (SVM), naïve Bayes, k-nearest neighbors (K-NN), artificial neural networks (ANN), and stochastic gradient descent on multi-class medical datasets, employing data collection, Canny image segmentation, hu moment feature extraction, and oversampling/under-sampling for data balancing. Classification algorithms are assessed via 5-fold cross-validation for accuracy, precision, recall, and F-measure. Results indicate variable model performance depending on datasets and sampling strategies. SVM, K-NN, ANN, and SGD demonstrate superior performance on specific datasets, achieving accuracies between 0.49 to 0.57. Conversely, naïve Bayes exhibits limitations, achieving precision levels of 0.46 to 0.47 on certain datasets. The efficacy of oversampling and under-sampling techniques in improving classification accuracy varies inconsistently. These findings aid medical practitioners and researchers in selecting suitable models for diagnostic applications.
Acoustic echo cancellation system based on Laguerre method and neural networkCSITiaesprime
Acoustic echo cancellation (AEC) is a fundamental requirement of signal processing to increase the quality of teleconferences. In this paper, a system that combines the Laguerre method with neural networks is proposed for AEC. In particular, the signal is processed using the Laguerre method to effectively handle nonlinear transmission line system. The results after applying the Laguerre method are then fed into a neural network for training and acoustic echo cancellation. The proposed system is tested on both linear and nonlinear transmission lines. Simulation results show that combining the Laguerre method with neural networks is highly effective for AEC in both linear and nonlinear transmission lines system. The AEC results obtained by the proposed method achieves a significant improvement in nonlinear transmission lines and it is the basis for building a practical echo cancellation system.
Clustering man in the middle attack on chain and graph-based blockchain in in...CSITiaesprime
Network security on internet of things (IoT) devices in the IoT development process may open rooms for hackers and other problems if not properly protected, particularly in the addition of internet connectivity to computing device systems that are interrelated in transferring data automatically over the network. This study implements network detection on IoT network security resembles security systems from man in the middle (MITM) attacks on blockchains. Security systems that exist on blockchains are decentralized and have peer to peer characteristics which are categorized into several parts based on the type of architecture that suits their use cases such as blockchain chain based and graph based. This study uses the principal component analysis (PCA) to extract features from the transaction data processing on the blockchain process and produces 9 features before the k-means algorithm with the elbow technique was used for classifying the types of MITM attacks on IoT networks and comparing the types of blockchain chain-based and graph-based architectures in the form of visualizations as well. Experimental results show 97.16% of normal data and 2.84% of MITM attack data were observed.
Smart irrigation system using node microcontroller unit ESP8266 and Ubidots c...CSITiaesprime
The agricultural irrigation system is extremely important. For optimal harvest yields, farmers must manage rice plant quality by monitoring water, soil, and temperature on agricultural fields. If market demand rises, traditional rice field irrigation in Indonesia will make things harder for farmers. This modern era requires a system that lets farmers monitor and regulate agricultural fields anywhere, anytime. We need a solution that can control the irrigation system remotely using an internet of things (IoT) device and a smartphone. This study employed the Ubidots IoT cloud platform. In addition, the study uses soil moisture and temperature sensors to monitor conditions in agricultural regions, while pumps function as irrigation systems. The test results indicate the proper design of the system. Each trial collected data. The pump will turn on and off automatically based on soil moisture criteria, with the pump active while the soil moisture is less than 20% and deactivated when the soil moisture exceeds 20%. In simulation mode, the pump operates for an average of 0–5 seconds of watering. The monitoring system shows the current soil temperature and moisture levels. Temperature sensors respond in 1-3 seconds, whereas soil moisture sensors respond in 0–4 seconds.
Development of learning videos for natural science subjects in junior high sc...CSITiaesprime
The purpose of this study was to determine the development procedure and the feasibility of learning media for whiteboard animation in Natural Sciences subjects at SMP Padindi, Tangerang Regency. This study uses a research and development (R&D) approach. The development model in this study is the analysis design development implementation evaluation (ADDIE) model. The feasibility test is carried out by means of individual testing (one to one) on 3 experts, namely material experts, learning experts, and media experts, as well as 3 students. In addition, a small group test was also carried out on 9 students. The results showed that: i) the material expert test was 87.5%, the learning expert was 85%, the media expert was 84.44%, 3 students were 88.84%, and the small group was 90%; and ii) this whiteboard animation learning media is suitable for use based on the results of media trials by experts and students.
Clustering of uninhabitable houses using the optimized apriori algorithmCSITiaesprime
Clustering is one of the roles in data mining which is very popularly used for data problems in solving everyday problems. Various algorithms and methods can support clustering such as Apriori. The Apriori algorithm is an algorithm that applies unsupervised learning in completing association and clustering tasks so that the Apriori algorithm is able to complete clustering analysis in Uninhabitable Houses and gain new knowledge about associations. Where the results show that the combination of 2 itemsets with a tendency value for Gas Stove fuel of 3 kg and the installed power meter for the attribute item criteria results in a minimum support value of 77% and a minimum confidence value of 87%. This proves that a priori is capable of clustering Uninhabitable Houses to help government work programs.
Improving support vector machine and backpropagation performance for diabetes...CSITiaesprime
Diabetes mellitus is a glucose disorder disease in the human body that contributes significantly to the high mortality rate. Various studies on early detection and classification have been conducted as a diabetes mellitus prevention effort by applying a machine learning model. The problems that may occur are weak model performance and misclassification caused by imbalanced data. The existence of dominating (majority) data causes poor model performance in identifying minority data. This paper proposed handling the problem of imbalanced data by performing the synthetic minority oversampling technique (SMOTE) and observing its effect on the classification performance of the support vector machine (SVM) and Backpropagation artificial neural network (ANN) methods. The experiment showed that the SVM method and imbalanced data achieved 94.31% accuracy, and the Backpropagation ANN achieved 91.56% accuracy. At the same time, the SVM method and balanced data produced an accuracy of 98.85%, while the Backpropagation ANN method and balanced data produced an accuracy of 94.90%. The results show that oversampling techniques can improve the performance of the classification model for each data class.
Machine learning-based anomaly detection for smart home networks under advers...CSITiaesprime
As smart home networks become more widespread and complex, they are capable of providing users with a wide range of applications and services. At the same time, the networks are also vulnerable to attack from malicious adversaries who can take advantage of the weaknesses in the network's devices and protocols. Detection of anomalies is an effective way to identify and mitigate these attacks; however, it requires a high degree of accuracy and reliability. This paper proposes an anomaly detection method based on machine learning (ML) that can provide a robust and reliable solution for the detection of anomalies in smart home networks under adversarial attack. The proposed method uses network traffic data of the UNSW-NB15 and IoT-23 datasets to extract relevant features and trains a supervised classifier to differentiate between normal and abnormal behaviors. To assess the performance and reliability of the proposed method, four types of adversarial attack methods: evasion, poisoning, exploration, and exploitation are implemented. The results of extensive experiments demonstrate that the proposed method is highly accurate and reliable in detecting anomalies, as well as being resilient to a variety of types of attacks with average accuracy of 97.5% and recall of 96%.
Transfer learning: classifying balanced and imbalanced fungus images using in...CSITiaesprime
Identifying the genus of fungi is known to facilitate the discovery of new medicinal compounds. Currently, the isolation and identification process is predominantly conducted in the laboratory using molecular samples. However, mastering this process requires specific skills, making it a challenging task. Apart from that, the rapid and highly accurate identification of fungus microbes remains a persistent challenge. Here, we employ a deep learning technique to classify fungus images for both balanced and imbalanced datasets. This research used transfer learning to classify fungus from the genera Aspergillus, Cladosporium, and Fusarium using InceptionV3 model. Two experiments were run using the balanced dataset and the imbalanced dataset, respectively. Thorough experiments were conducted and model effectiveness was evaluated with standard metrics such as accuracy, precision, recall, and F1 score. Using the trendline of deviation knew the optimum result of the epoch in each experimental model. The evaluation results show that both experiments have good accuracy, precision, recall, and F1 score. A range of epochs in the accuracy and loss trendline curve can be found through the experiment with the balanced, even though the imbalanced dataset experiment could not. However, the validation results are still quite accurate even close to the balanced dataset accuracy.
Implementation of automation configuration of enterprise networks as software...CSITiaesprime
Software defined network (SDN) is a new computer network configuration concept in which the data plane and control plane are separated. In Cisco system, the SDN concept is implemented in Cisco Application Centric Infrastructure (Cisco ACI), which by default can be configured through the main controller, namely the Application Policy Infrastructure Controller (APIC). Conventional configuration on Cisco ACI creates problems, i.e.: the large number of required configurations causes the increase of time required for configuration and the risk of misconfiguration due to repetitive works. This problem reduces the productivity of network engineers in managing Cisco system. In overcoming these problems, this research work proposes an automation tool for Cisco ACI configuration using Ansible and Python as an SDN implementation for optimizing enterprise network configuration. The SDN is implemented and experimented at PT. NTT Indonesia Technology network, as a case study. The experimental result shows the proposed SDN successfully performs multiple routers configurations accurately and automatically. Observations on manual configuration takes 50 minutes and automatic configuration takes 6 minutes, thus, the proposed SDN achieves 833.33% improvement.
Hybrid model for detection of brain tumor using convolution neural networksCSITiaesprime
The development of aberrant brain cells, some of which may turn cancerous, is known as a brain tumor. Magnetic resonance imaging (MRI) scans are the most common technique for finding brain tumors. Information about the aberrant tissue growth in the brain is discernible from the MRI scans. In numerous research papers, machine learning, and deep learning algorithms are used to detect brain tumors. It takes extremely little time to forecast a brain tumor when these algorithms are applied to MRI pictures, and better accuracy makes it easier to treat patients. The radiologist can make speedy decisions because of this forecast. The proposed work creates a hybrid convolution neural networks (CNN) model using CNN for feature extraction and logistic regression (LR). The pre-trained model visual geometry group 16 (VGG16) is used for the extraction of features. To reduce the complexity and parameters to train we eliminated the last eight layers of VGG16. From this transformed model the features are extracted in the form of a vector array. These features fed into different machine learning classifiers like support vector machine (SVM), naïve bayes (NB), LR, extreme gradient boosting (XGBoost), AdaBoost, and random forest for training and testing. The performance of different classifiers is compared. The CNN-LR hybrid combination outperformed the remaining classifiers. The evaluation measures such as recall, precision, F1-score, and accuracy of the proposed CNN-LR model are 94%, 94%, 94%, and 91% respectively.
Video shot boundary detection based on frames objects comparison and scale-invariant feature transform technique
Computer Science and Information Technologies
Vol. 5, No. 2, July 2024, pp. 130~139
ISSN: 2722-3221, DOI: 10.11591/csit.v5i2.pp130-139
Journal homepage: http://iaesprime.com/index.php/csit
Video shot boundary detection based on frames objects
comparison and scale-invariant feature transform technique
Noor Khalid Ibrahim, Zinah Sadeq Abduljabbar
Department of Computer Science, College of Science, Mustansiriyah University, Baghdad, Iraq
Article history:
Received Dec 12, 2023
Revised Feb 24, 2024
Accepted Mar 4, 2024
ABSTRACT
The most popular source of data on the Internet is video which has a lot of
information. Automating the administration, indexing, and retrieval of movies
is the goal of video structure analysis, which uses content-based video
indexing and retrieval. Video analysis requires the ability to recognize shot
changes since video shot boundary recognition is a preliminary stage in the
indexing, browsing, and retrieval of video material. A method for shot
boundary detection (SBD) is suggested in this situation. This work proposes
a shot boundary detection system with three stages. In the first stage, multiple
images are read in temporal sequence and transformed into grayscale images.
Based on correlation value comparison, the number of redundant frames in
the same shots is decreased, from this point on, the amount of time and
computational complexity is reduced. Then, in the second stage, a candidate
transition is identified by comparing the objects of successive frames and
analyzing the differences between the objects using the standard deviation
metric. In the last stage, the cut transition is decided upon by matching key
points using a scale-invariant feature transform (SIFT). The proposed system
achieved an accuracy of 0.97 according to the F-score while minimizing time
consumption.
Keywords:
Frames correlation
Object comparison
Shot boundary
Video analysis
Video segmentation
This is an open access article under the CC BY-SA license.
Corresponding Author:
Noor Khalid Ibrahim
Department of Computer Science, College of Science, Mustansiriyah University
Baghdad, Iraq
Email: noor.kh20@uomustansiriyah.edu.iq
1. INTRODUCTION
The vast amount of video content on the internet makes it challenging to develop effective indexing
and search strategies for managing video data. Content-based video retrieval is emerging as a trend in video
retrieval systems, while conventional methods like video compression and summarizing aim for minimal
storage requirements and maximum visual and semantic accuracy [1]. Given that video is the most
sophisticated sort of multimedia data, it includes information about the target's mobility within the scene as
well as information about the objective world changing with time [2].
Video segmentation can be roughly divided into two modules: video object
(foreground/background) segmentation and video semantic segmentation [3]. Video segmentation, also known
as shot boundary detection (SBD), involves breaking the video up into meaningful scenes so that the essential
feature(s) may be found in each scene through analysis [4]. A cut is a sudden change in the shot that takes place
inside a single frame. A fade is a gradual alteration in brightness that often begins or ends with a completely
dark frame. Frames inside the transition show one image overlaid on the other during a dissolve, which happens
as the images of the first shot go darker and the images of the second shot get brighter [1]. The primary
difficulties in shot boundary recognition are movements of the camera and objects since these can significantly
change the video content, producing an effect akin to transition effects and leading to inaccurate shot transition
detection [5].
Numerous studies have addressed video segmentation. Hong Shao et al. [6] utilized a combination of
a hue saturation value (HSV) color histogram and histogram of oriented gradients (HOG) features to effectively
detect abrupt shot changes in videos. The work in [3] proposes a shot boundary detection approach based on
the scale-invariant feature transform (SIFT). Using a top-down search strategy, the initial phase of this
approach compares the ratio of matched features derived by SIFT for each RGB channel of video frames to
locate transitions. The boundaries' locations are shown in the overview stage. Second, to ascertain the kind of
transition, a moving average computation is made.
In [7], the research used a multi-modal visual features-based SBD framework; the behaviors
of the visual representation are analyzed with respect to the discontinuity signal. The framework uses a candidate segment
selection strategy that does not compute the threshold; instead, it utilizes the discontinuity signal's cumulative
moving average to determine the shot boundary locations while disregarding the non-boundary video frames.
To differentiate between a candidate segment that is a cut transition and one that is a gradual transition,
including fade in/out and logo occurrence, the transition detection is carried out structurally.
In [8], the proposed temporal video segment representation formalizes video scenes as temporal
motion change data, determining motion modifications and cuts between scenes through changes in optical flow
characteristics. This reduces the issue to an optical flow-based cut detection problem, enhancing a pixel-based
representation. The proposed video segment representation divides temporal video segment points into cuts
and non-cuts.
In [9], the bag of visual words (BoVW) model, which splits the video into shots and keyframes, is the
basis for the suggested video segmentation model. The BoVW model is employed in two
variants: the traditional BoVW and an expansion known as the vector of linearly aggregated descriptors
(VLAD). Keyframe feature vectors inside a sliding window of length L are used to calculate similarity. In [10],
the study presents a method for feature fusion and clustering technique (FFCT)-based video shot boundary
detection, which involves converting interval frames into grayscale images, extracting fingerprint and speed-
up robust features, fusion, and clustering them using a K-means algorithm. Linear discriminant analysis (LDA)
is introduced for cluster mapping, and features are chosen using density computation based on frame
correlation.
In [2], a novel algorithm for camera boundary detection based on SIFT features was introduced.
The proposed method involves the analysis of multiple frames of images in a sequential manner. Initially, the
images are converted into grayscale and divided into blocks. Subsequently, the dynamic texture of the film is
computed, and the correlation between the dynamic texture of adjacent frames and the matching degree of
SIFT features is determined. Based on these matching results, pre-detection outcomes are obtained.
Idan et al. [11] proposed a fast video processing method for SBD. To reduce computing costs and
disturbances, the proposed SBD framework makes use of candidate segment selection with frame active area
and separable moments. Inequality criteria and adaptive threshold are used to exclude non-transition frames
and maintain candidate segments. Cut transition detection is done using machine learning statistics.
In [12], a practical SBD method was presented, which uses average edge information for
gradual transition detection and gradient and color information for abrupt transition detection. Processing only
transition regions yields an average edge frame and reduces computational complexity. In [5], the proposed
method comprises two distinct stages. In the initial stage, projection features were employed to differentiate
between non-boundary transitions and candidate transitions that potentially encompass abrupt boundaries.
Consequently, only the candidate transitions were retained for further analysis in the subsequent stage. This
approach effectively enhances the speed of shot detection by minimizing the detection scope. In [13], an
effective SBD approach with several invariant properties was presented. With the right mix of
invariant features, such as edge change ratio (ECR), color layout descriptor (CLD), and scale-invariant feature
transform (SIFT) key point descriptors, the accuracy level of SBD was increased.
According to the literature, many applications have been created to address the issue of shot boundary
detection in videos. These applications rely on various techniques to handle the challenges of
SBD. The proposed SBD system is organized in three stages to improve its performance and to mitigate
the problem of object and camera motion. In the first stage, the redundant frames within the same
shot are removed based on correlation value comparison, which reduces time consumption and
computational complexity. In the second stage, candidate transitions are determined by comparing the objects
of sequential frames. In the final stage, the decision on the cut transition is made based on key point matching with
the SIFT method. The proposed method aims to accurately find the boundary frame of a shot with a cut transition between
consecutive shots. The rest of the paper is organized as follows: section 2 explains the proposed
method, the experimental results and analysis are demonstrated in section 3, followed by a conclusion in
section 4.
2. SBD PROPOSED METHOD
This proposed SBD system has been achieved in three stages, in the first stage, multiple images are
read in temporal sequence and transformed into grayscale images. Based on correlation value comparison, the
number of redundant frames in the same shots is decreased, and then, in the second stage, a candidate transition
is identified by comparing the objects of successive frames using the proposed method to extract frame image
objects. In the last stage, the cut transition is decided upon by matching key points using the SIFT approach.
The details of these stages are explained as follows:
2.1. Redundancy reduction stage
The multiple frames of the input video are extracted as the first step, then converted into grayscale and
resized to 256×256. Pre-processing operations are applied to these frames to improve their quality: noise
is removed by a Wiener filter [14], and contrast is enhanced by histogram equalization [15].
The resulting frames are normalized to the range [0, 1].
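As a rough illustration of this pre-processing stage, the following Python sketch (assuming OpenCV, SciPy, and NumPy; the paper does not specify its implementation, and the 3×3 Wiener window is an assumption) converts a frame to grayscale, resizes it, denoises it, equalizes the contrast, and normalizes the result:

```python
import cv2
import numpy as np
from scipy.signal import wiener

def preprocess_frame(frame_bgr):
    """Grayscale, resize to 256x256, Wiener denoising, histogram equalization, [0, 1] normalization."""
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    gray = cv2.resize(gray, (256, 256))
    denoised = wiener(gray.astype(np.float64), mysize=3)   # Wiener filter (window size assumed)
    denoised = np.clip(denoised, 0, 255).astype(np.uint8)
    equalized = cv2.equalizeHist(denoised)                 # contrast enhancement
    return equalized.astype(np.float32) / 255.0            # normalize to [0, 1]
```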
In one shot the consecutive frames have a very high similarity, and achieving the SBD process on
each pair of frames will be very time-consuming and computationally complex. So, to minimize this time and
complexity, the redundant frames in one shot are reduced based on their correlation
value. The correlation value (r) of frames Fr(i) and Fr(i+1) is compared with a threshold value (Th) identified
experimentally: if r is below Th, frame Fr(i) is passed to the next stage; otherwise, frame Fr(i) is discarded, as demonstrated
in (1). The correlation value is calculated as explained in (2) [16].
Fr(i) = \begin{cases} \text{passed to the next stage}, & r < Th \\ \text{discarded}, & \text{otherwise} \end{cases}    (1)
r = \frac{\sum_i (x_i - x_m)(y_i - y_m)}{\sqrt{\sum_i (x_i - x_m)^2} \, \sqrt{\sum_i (y_i - y_m)^2}}    (2)
where x_i denotes the ith pixel intensity of the first image, y_i denotes the ith pixel intensity of the second
image, and x_m and y_m are the mean intensities of the first and second images, respectively.
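A minimal sketch of this redundancy-reduction step, assuming NumPy and a hypothetical threshold value th (the paper determines Th experimentally):

```python
import numpy as np

def reduce_redundancy(frames, th=0.9):
    """Keep a frame only when its correlation with the next frame drops below th, Eq. (1)."""
    kept = []
    for cur, nxt in zip(frames, frames[1:]):
        r = np.corrcoef(cur.ravel(), nxt.ravel())[0, 1]  # Pearson correlation, Eq. (2)
        if r < th:           # low correlation: keep the frame for the next stage
            kept.append(cur)
        # otherwise the frame is redundant within the shot and is discarded
    return kept
```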
2.2. Selection of candidate transition stage
Candidate transition selection is performed based on a comparison of consecutive frame objects,
that is, of frame image content. In this stage, image content extraction is achieved with the proposed extraction
method explained in Figure 1. As seen in the figure, the objects of a frame are extracted
in two steps, namely the generation of the feature template and the extraction of the objects; these steps are detailed as
follows:
Figure 1. Frame objects extraction flowchart
2.2.1. First step (generate features template)
For each consecutive frame passed to this stage, a template of features is generated by extracting
multiple features from the frame image and combining them. The selected features must be
able to extract the objects of a frame image accurately. In the proposed extraction method of this
SBD algorithm, the first feature is the texture characteristic, which yields information about
the local variability of the pixel intensity values and is recovered using a standard deviation (SD) filter [17] over
the 3-by-3 neighborhood around the corresponding pixel. The grayscale luminance of these processed frames
is represented by the L* channel of the L*a*b* color space [18] and is used as the second feature. The L*a*b*
space closely matches how colors are perceived by human vision. Additionally, because the RGB representation includes
a transition color between blue and green, the L*a*b* color representation compensates for the diversity of the
color distribution in the RGB color model [19]. For this reason, L*a*b* is taken into account through its L*
channel. These two feature matrices are then merged with the frame edges detected by a Canny operator,
which is able to recognize object boundaries in an image, to create the feature
template. The SD filter is calculated as follows [20]:
\mu_j = \frac{1}{N} \sum_{i=1}^{N} x_{ji}    (3)
\sigma_j = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (x_{ji} - \mu_j)^2}    (4)
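A possible sketch of the feature-template construction, assuming OpenCV and SciPy (the channel scaling, the Canny thresholds, and the way the three feature maps are combined are assumptions, since the paper only states that they are merged):

```python
import cv2
import numpy as np
from scipy.ndimage import generic_filter

def feature_template(gray01):
    """Build the per-pixel feature template from a normalized 256x256 grayscale frame:
    local 3x3 standard deviation (texture), L* channel of CIELAB, and Canny edges."""
    gray8 = (gray01 * 255).astype(np.uint8)
    sd = generic_filter(gray01.astype(np.float64), np.std, size=3)          # Eqs. (3)-(4)
    lab = cv2.cvtColor(cv2.cvtColor(gray8, cv2.COLOR_GRAY2BGR), cv2.COLOR_BGR2LAB)
    l_star = lab[:, :, 0].astype(np.float64) / 255.0                        # L* channel
    edges = cv2.Canny(gray8, 100, 200).astype(np.float64) / 255.0           # edge map
    # one row of features per pixel, ready for clustering in the next step
    return np.stack([sd.ravel(), l_star.ravel(), edges.ravel()], axis=1)
```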
2.2.2. Second step (object extraction)
The k-means [21] algorithm is applied to the created feature template to extract the objects from the
successive frames. K-means groups the data into k clusters and consists of
two stages: in the first, the centroids are initialized, and in the second, the distance to the closest centroid is used
to identify which cluster each data point belongs to. Because of its ease of use and quick calculation, the k-means
clustering approach is widely utilized in clustering processes [22], which is the reason it was chosen for
this phase. Consequently, the frame image objects are identified based on the proposed extraction method,
combining the generated feature template and the k-means technique.
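Continuing the sketch above, the per-pixel feature template can be clustered with k-means to obtain an object label map; the use of scikit-learn and the choice of k = 2 (object versus background) are assumptions, as the paper does not state them:

```python
from sklearn.cluster import KMeans

def extract_objects(features, shape=(256, 256), k=2):
    """Cluster the per-pixel feature template with k-means and return a label map
    in which each pixel is assigned to an object/background cluster."""
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(features)
    return labels.reshape(shape)
```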
Frame similarity is measured by comparing objects: the object images of related sequential frames
are divided into an 8×8 grid of blocks, and the entropy value of each block is calculated. In
turn, these entropy values are arranged into vectors of length 64, which represent the similarity measurement
vectors explained in Figure 2. The standard deviation of the differences between the two
entropy vectors of the object images of consecutive frames is then calculated; when this standard deviation is close to
zero, a normal transition is distinguished. According to a threshold (Thr) value determined experimentally,
an abrupt transition is declared a candidate; otherwise, a normal transition is detected, as expressed in (5).
The entropy value is determined as in (6) [23]. These candidate frames are passed to the third stage to make
the decision on the abrupt transition.
Figure 2. Construct similarity measurement vector of object image
Fr_i = \begin{cases} \text{abrupt transition candidate}, & sd > Thr \\ \text{normal transition}, & \text{otherwise} \end{cases}    (5)
where Fr_i represents the video frame with index i.
5. ISSN: 2722-3221
Comput Sci Inf Technol, Vol. 5, No. 2, July 2024: 130-139
134
H_r = -\sum_k g_r^k \log_2 (g_r^k)    (6)
where g_r^k denotes the distribution of the assumed color space.
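A minimal sketch of the block-entropy comparison, assuming NumPy, that the object image is the 256×256 label map produced by the clustering step, and a hypothetical threshold thr (the paper sets Thr experimentally):

```python
import numpy as np

def block_entropy(block, bins=256):
    """Shannon entropy of one block, Eq. (6), estimated from a histogram."""
    hist, _ = np.histogram(block, bins=bins)
    p = hist[hist > 0] / block.size
    return -np.sum(p * np.log2(p))

def entropy_vector(obj_img, grid=8):
    """Split a 256x256 object image into an 8x8 grid of 32x32 blocks and return
    the 64-dimensional entropy vector of Figure 2."""
    h, w = obj_img.shape
    bh, bw = h // grid, w // grid
    return np.array([block_entropy(obj_img[r*bh:(r+1)*bh, c*bw:(c+1)*bw])
                     for r in range(grid) for c in range(grid)])

def is_candidate_cut(obj_i, obj_j, thr=0.5):
    """Eq. (5): flag a candidate abrupt transition when the standard deviation of the
    entropy-vector differences exceeds the (hypothetical) threshold thr."""
    return np.std(entropy_vector(obj_i) - entropy_vector(obj_j)) > thr
```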
2.3. Transition decision stage
Making the right choice when dividing a video sequence into shots depends largely
on selecting the right method. The scale-invariant feature transform (SIFT) was proposed by David Lowe [24]. The SIFT feature
is used in this stage to determine the frame transition and its boundary because, given an image as input,
the SIFT descriptor generates a wide range of local feature vectors that are independent of image scaling and
rotation. SIFT is capable of precisely correlating two images [13]. In the case of an abrupt transition, the
matching degree of the SIFT features between the frames is low, and the neighboring frames are recognized as
belonging to different shots; this also allows moving objects in successive frames to be better discerned.
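A sketch of SIFT key-point matching between two candidate frames with OpenCV; the Lowe ratio test and the way the matching degree is normalized are assumptions, since the paper only states that matches are obtained from a distance computation:

```python
import cv2

def sift_match_ratio(frame_a, frame_b, ratio=0.75):
    """Return the fraction of SIFT key points of frame_a that find a good match in frame_b.
    A low value suggests the two frames belong to different shots (a cut)."""
    sift = cv2.SIFT_create()
    kp_a, des_a = sift.detectAndCompute(frame_a, None)  # frames as 8-bit grayscale images
    kp_b, des_b = sift.detectAndCompute(frame_b, None)
    if des_a is None or des_b is None:
        return 0.0
    matches = cv2.BFMatcher(cv2.NORM_L2).knnMatch(des_a, des_b, k=2)
    good = [m for m in matches if len(m) == 2 and m[0].distance < ratio * m[1].distance]
    return len(good) / max(len(kp_a), 1)
```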
3. EXPERIMENTAL RESULTS AND ANALYSIS
Eight distinct videos from the standard TRECVid 2001 test dataset, made available through the Open
Video Project and accessible at https://open-video.org, are used to assess the suggested method in this research.
These videos are referred to as Vid1 through Vid8. A comprehensive description of these input videos is
provided in Table 1. The ground truth is determined by manually observing the abrupt changes. The
chosen videos contain a variety of aberrations, including lighting variations, viewpoint shifts, scaling, zooming,
rotation, and more.
Table 1. Description of input videos
Input video   Video name                                           Time duration (sec.)   Frames number   Abrupt transitions
Vid1          Free-for-all race at Charter Oak Park (Historical)   26                     853             3
Vid2          New Indians, Segment 101 (Documentary)               131                    3953            14
Vid3          New Indians, Segment 01 (Documentary)                56                     1687            15
Vid4          Winning Aerospace, Segment 02 (Documentary)          65                     1970            11
Vid5          Hidden Fury, segment 10 (Documentary)                33                     1002            1
Vid6          Hurricanes, Segment 05 (Documentary)                 115                    3448            32
Vid7          The Miracle of Water, segment 05 (Documentary)       83                     2314            1
Vid8          Winning Aerospace, Segment 04 (Documentary)          110                    3318            18
3.1. Redundancy reduction stage
Once the frames of the input video have been extracted, the frames within the same shot
have a high degree of similarity, and extracting objects from every frame image
is time-consuming and computationally complex, so reducing the redundant frames minimizes the
execution time, as seen in Table 2 and Figure 3. For instance, the execution time was
111.42 seconds when the second stage was applied to all of Vid1's frames, that is, without
redundancy reduction, as opposed to 44.41 seconds when Vid1 first passed through the
redundancy reduction stage; the same holds for the other videos, as shown in the table, which reports how much time
each video takes.
3.2. Selection of candidate transition stage
Based on the motion of the object and/or the camera, shots may be categorized into four types: static
objects with static cameras, dynamic objects with static cameras, static objects with dynamic cameras, and
dynamic objects with dynamic cameras [25]. Candidate transition selection is performed by comparing
the objects of consecutive frames. This stage compares the objects extracted from the
frame images using the created feature template, which combines the texture, edge, and L*
features of the L*a*b* color space, applied to the k-means technique.
After frames Fri and Fri+1 pass the first stage based on their correlation value, this stage selects potential
cut transition frames by examining the standard deviation of the differences between the vectors created from the
frame object blocks for similarity comparison. Table 3 and Figure 4 describe how the block
size of the frame image object is determined empirically, where Figure 4(a) shows the effect of block size on
execution time and Figure 4(b) shows the effect of block size on F-score. Vid3, Vid4, and Vid8 are
taken as examples to demonstrate that block size affects execution time and accuracy. For this
investigation, an 8×8 grid of blocks with a 32×32 block size is the most appropriate.
Table 2. Time consumption comparison
Videos   Execution time in sec. (with reduction)   Execution time in sec. (without reduction)
Vid1     44.41                                     111.42
Vid2     224.94                                    785.83
Vid3     99.75                                     314.75
Vid4     129.07                                    363.44
Vid5     70.178                                    177.52
Vid6     320.34                                    566.40
Vid7     111.47                                    421.15
Vid8     201.23                                    679.02
Figure 3. Comparison in execution time
Figure 4. Block size effect: (a) on execution time and (b) on F-score
Table 3. The effect of block size within a 256×256 frame size
          4×4 blocks (64×64 block size)        8×8 blocks (32×32 block size)
Videos    Execution time (sec.)   F-score      Execution time (sec.)   F-score
Vid3      183.58                  0.95         99.75                   0.96
Vid4      148.92                  0.90         129.07                  1.00
Vid8      311.36                  0.90         201.23                  0.97
To illustrate the frame object extraction, sample frames are shown in
Figure 5, and the steps of the frame object extraction method are demonstrated in Figure 6. The recovered combined features
(Texture, frame edge, and L* value of L*a*b* color space) from frames i and i+1 create the template features
for each one. The frame objects are then extracted for the frame similarity comparison using the k-means
approach. If identical objects are found in two consecutive frames, they are likely associated with the same
shot; if not, a cut shot transition is a possibility. The significant problem of object and camera movements can
be addressed by similarity discovery based on object comparison because the frame object is recognized where
it should be in the image of succeeding frames.
Figure 5. Transition examples
Figure 6. Example of consecutive frames objects extracted
The proposed object extraction method was assessed before being adopted in the proposed SBD
algorithm. Table 4 and Figure 7 describe the information content, as measured by the entropy value, of the
objects extracted by the proposed extraction method; in this table, frames from several of the used videos, on
which object extraction was applied, are selected as samples for the evaluation. Based on the analysis of this
evaluation, the proposed object extraction operation was adopted in this stage of the proposed SBD algorithm.
Figure 7. Object extraction accuracy using entropy
Table 4. Object extraction evaluation using entropy measure (Ent)
Vid2 Vid3 Vid5 Vid6 Vid7 Vid8
Fr. 397 398 262 263 432 432 82 83 568 569 1375 1376 517
Ent 0.926 0.634 0.890 0.985 0.660 0.969 0.998 0.989 0.968 0.992 0.979 0.956 0.884
3.3. Transition decision stage
The SIFT features are adopted in this stage for shot transition decision-making because they
remain unaffected by rotation and zoom, reflect the local variation of
moving objects efficiently, and can be used to impartially characterize the image [2]. SIFT key points are
detected and features are extracted from the candidate frames produced by the previous stage, and then feature
matching is performed. In feature matching, the two feature matrices of frame i and frame i+1 are matched
using a distance calculation that results in a p-by-1 vector, where p is the number of detected key points.
The shot transition decision is made from the matched features: when the matching degree of the SIFT features
between the frames is low, the neighboring frames are recognized as belonging to different shots. Figure 8(a)
demonstrates feature key point matching for frames in the same shot, and Figure 8(b) for frames from different shots.
Figure 8. Frame feature key point matching: (a) frames in the same shot and (b) frames in different shots
As seen in the figure, due to comparable visual features, the similarity matching between two frames
in the same shot is typically high. Frames from diverse shots, however, lack visual uniformity. They therefore
have either little or no similarity matching.
Recall and precision are the key performance metrics of the suggested system that are typically
employed in the SBD process. The F1 score, which is the harmonic mean of precision and recall, is used in
this paper's evaluation along with these metrics [2]. The following formula can be used to compute these
metrics [5]:
R = \frac{true}{true + miss}    (7)
P = \frac{true}{true + false}    (8)
F\text{-}score = \frac{2 \times P \times R}{P + R}    (9)
where True denotes accurate transition detection, False denotes inaccurate transition detection, and Miss
denotes missed transition detection. Table 5 reports the accuracy of the proposed SBD algorithm with these metrics.
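For reference, a minimal sketch of how these metrics can be computed from detection counts (the example numbers are purely illustrative and not taken from the paper):

```python
def sbd_metrics(true_det, false_det, missed):
    """Recall, precision and F-score from detection counts, Eqs. (7)-(9)."""
    recall = true_det / (true_det + missed)
    precision = true_det / (true_det + false_det)
    f_score = 2 * precision * recall / (precision + recall)
    return recall, precision, f_score

# illustrative usage with made-up counts
print(sbd_metrics(true_det=28, false_det=1, missed=4))
```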
Table 5. Efficiency of the proposed method
Video Recall Precision F-score
Vid1 1.00 1.00 1.00
Vid2 1.00 1.00 1.00
Vid3 0.93 1.00 0.96
Vid4 1.00 1.00 1.00
Vid5 1.00 1.00 1.00
Vid6 0.87 0.96 0.91
Vid7 1.00 1.00 1.00
Vid8 0.94 0.94 0.94
Average 0.96 0.98 0.97
4. CONCLUSION
The suggested SBD approach has been realized by comparing frame image objects and using the scale-invariant
feature transform (SIFT), with the redundant frames of the same shot discarded. Three
stages are involved in implementing the proposed system: first, the redundant frames are reduced based on
their correlation value, which reduces computational complexity and time consumption; second, the candidate shot
transition and boundary are identified based on object comparison using the proposed extraction method, a stage
that can locate objects where they should be in the image of subsequent frames. The last stage then uses the SIFT
features to decide which of these candidate frames mark a cut transition. The research demonstrates that this approach
minimizes false positives by utilizing SIFT key point matching, which is independent of the scale and
rotation of the image. Our method yields a 97% F1 score, a high result, while requiring less
time and complexity.
ACKNOWLEDGEMENTS
The authors thank the Department of Computer Science, College of Science, Mustansiriyah University, Baghdad, Iraq, for supporting this work.
REFERENCES
[1] Z. El Khattabi, Y. Tabii, and A. Benkaddour, “Video shot boundary detection using the scale invariant feature transform and RGB color channels,” International Journal of Electrical & Computer Engineering (2088-8708), vol. 7, no. 5, 2017.
[2] L. Kong, “SIFT feature-based video camera boundary detection algorithm,” Complexity, vol. 2021, pp. 1–11, 2021.
[3] T. Zhou, F. Porikli, D. J. Crandall, L. Van Gool, and W. Wang, “A survey on deep learning technique for video segmentation,”
IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 45, no. 6, pp. 7099–7122, 2022.
[4] D. M. Thounaojam, T. Khelchandra, K. M. Singh, and S. Roy, “A genetic algorithm and fuzzy logic approach for video shot
boundary detection,” Computational intelligence and neuroscience, vol. 2016, 2016.
[5] E. Hato, “Temporal video segmentation using optical flow estimation,” Iraqi Journal of Science, pp. 4181–4194, 2021.
[6] H. Shao, Y. Qu, and W. Cui, “Shot boundary detection algorithm based on HSV histogram and HOG feature,” in 2015 International
Conference on Advanced Engineering Materials and Technology, Atlantis Press, pp. 951–957, 2015.
[7] S. Tippaya, S. Sitjongsataporn, T. Tan, M. M. Khan, and K. Chamnongthai, “Multi-modal visual features-based video shot boundary
detection,” IEEE Access, vol. 5, pp. 12563–12575, 2017, doi: 10.1109/ACCESS.2017.2717998.
[8] S. Akpinar and F. Alpaslan, “A novel optical flow-based representation for temporal video segmentation,” Turkish Journal of
Electrical Engineering and Computer Sciences, vol. 25, no. 5, pp. 3983–3993, 2017.
[9] M. Haroon, J. Baber, I. Ullah, S. M. Daudpota, M. Bakhtyar, and V. Devi, “Video scene detection using compact bag of visual word
models,” Advances in Multimedia, vol. 2018, pp. 1–9, 2018.
[10] F.-F. Duan and F. Meng, “Video shot boundary detection based on feature fusion and clustering technique,” IEEE Access, vol. 8,
pp. 214633–214645, 2020.
[11] Z. N. Idan, S. H. Abdulhussain, B. M. Mahmmod, K. A. Al-Utaibi, S. A. R. Al-Hadad, and S. M. Sait, “Fast shot boundary detection
based on separable moments and support vector machine,” IEEE Access, vol. 9, pp. 106412–106427, 2021.
[12] N. Kumar, “Shot boundary detection framework for video editing via adaptive thresholds and gradual curve point,” Turkish Journal
of Computer and Mathematics Education (TURCOMAT), vol. 12, no. 11, pp. 3820–3828, 2021.
[13] J. T. Jose, S. Rajkumar, M. R. Ghalib, A. Shankar, P. Sharma, and M. R. Khosravi, “Efficient shot boundary detection with multiple
visual representations,” Mobile Information Systems, vol. 2022, 2022.
[14] K. A. Akintoye, N. A. F. B. Ismial, N. Z. S. B. Othman, M. S. M. Rahim, and A. H. Abdullah, “Composite median Wiener filter based technique for image enhancement,” Journal of Theoretical & Applied Information Technology, vol. 96, no. 15, 2018.
[15] S. H. Majeed and N. A. M. Isa, “Adaptive entropy index histogram equalization for poor contrast images,” IEEE Access, vol. 9, pp.
6402–6437, 2020, doi: 10.1109/ACCESS.2020.3048148.
[16] A. M. Neto, A. C. Victorino, I. Fantoni, D. E. Zampieri, J. V. Ferreira, and D. A. Lima, “Image processing using Pearson’s correlation coefficient: Applications on autonomous robotics,” in 2013 13th International Conference on Autonomous Robot Systems, IEEE, pp. 1–6, 2013.
[17] N. K. Ibrahim, A. H. Al-Saleh, and A. S. A. Jabar, “Texture and pixel intensity characterization-based image segmentation with
morphology and watershed techniques,” Indonesian Journal of Electrical Engineering and Computer Science, vol. 31, no. 3, pp.
1464–1477, 2023. doi: 10.11591/ijeecs.v31.i3.
[18] N. Khalid, “Hybrid features of mask generated with Gabor filter for texture analysis and Sobel operator for image regions segmentation using K-Means technique,” Journal La Multiapp, vol. 3, no. 5, pp. 250–258, 2022, doi: 10.37899/journallamultiapp.v3i5.743.
[19] X. Zheng, Q. Lei, R. Yao, Y. Gong, and Q. Yin, “Image segmentation based on adaptive K-means algorithm,” EURASIP Journal
on Image and Video Processing, vol. 2018, no. 1, pp. 1–10, 2018.
[20] U. Petronas, “Mean and standard deviation features of color histogram using laplacian filter for content-based image retrieval,”
Journal of Theoretical and Applied Information Technology, vol. 34, no. 1, pp. 1–7, 2011.
[21] R. Sammouda and A. El-Zaart, “An optimized approach for prostate image segmentation using K-means clustering algorithm with
elbow method,” Computational Intelligence and Neuroscience, vol. 2021, 2021.
[22] N. Dhanachandra and Y. J. Chanu, “A new approach of image segmentation method using K-means and kernel based subtractive
clustering methods,” International Journal of Applied Engineering Research, vol. 12, no. 20, pp. 10458–10464, 2017.
[23] N. M. Kwok, Q. P. Ha, and G. Fang, “Effect of color space on color image segmentation,” in 2009 2nd International Congress on
Image and Signal Processing, IEEE, pp. 1–5, 2009.
[24] D. G. Lowe, “Distinctive image features from scale-invariant keypoints,” International Journal of Computer Vision, vol. 60, pp. 91–110, 2004.
[25] S. H. Abdulhussain, A. R. Ramli, M. I. Saripan, B. M. Mahmmod, S. A. R. Al-Haddad, and W. A. Jassim, “Methods and challenges
in shot boundary detection: a review,” Entropy, vol. 20, no. 4, p. 214, 2018.
BIOGRAPHIES OF AUTHORS
Noor Khalid Ibrahim is a lecturer at the College of Science, Mustansiriyah University, Iraq. She received the B.Sc. degree in computer science from the Department of Computer Science, College of Science, Mustansiriyah University, Iraq. She holds a master's degree in computer science (2015), with a specialization in multimedia. Her research area is image processing. She can be contacted at email: noor.kh20@uomustansiriyah.edu.iq.
Zinah Sadeq Abduljabbar is a lecturer at the College of Science, Mustansiriyah University, Iraq. She received the B.Sc. degree in computer science from the Department of Computer Science, College of Science, Mustansiriyah University, Iraq. She holds a master's degree in computer science (2014), with a specialization in multimedia. She can be contacted at email: zinahsadeq@uomustansiriyah.edu.iq.