video scene
Recently Published Documents


TOTAL DOCUMENTS: 265 (FIVE YEARS 32)

H-INDEX: 19 (FIVE YEARS 2)

Author(s):  
Badri Narayan Subudhi ◽  
Manoj Kumar Panda ◽  
T. Veerakumar ◽  
Vinit Jakhetiya ◽  
S. Esakkirajan

2021 ◽  
Author(s):  
Yongkang Huang ◽  
Meiyu Liang

Inspired by the wide application of transformers in computer vision and their strong ability in temporal feature learning, this paper proposes a novel and efficient spatio-temporal residual attention network for student action recognition in classroom teaching video. The network first fuses 2D spatial convolution and 1D temporal convolution to learn spatio-temporal features, then incorporates the Reformer to learn the deeper, visually significant spatio-temporal characteristics of student classroom actions. Based on the spatio-temporal residual attention network, a single-person action recognition model for classroom teaching video is proposed. Because a classroom video scene often contains multiple students, the single-person model is combined with object detection and tracking to associate the temporal and spatial characteristics of the same student target, which enables multi-student action recognition in classroom video scenes. Experimental results on a classroom teaching video dataset and a public video dataset show that the proposed model achieves higher action recognition performance than existing strong models and methods.
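
The fusion of 2D spatial and 1D temporal convolution mentioned in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' network: the function name `conv2plus1d`, the single-channel input, and the fixed averaging kernels are assumptions made for illustration, and a real model would learn its kernels and add residual and attention layers.

```python
import numpy as np

def conv2plus1d(video, spatial_kernel, temporal_kernel):
    """Factorized spatio-temporal filtering (hypothetical helper).

    video: array of shape (T, H, W), one channel.
    spatial_kernel: array of shape (kh, kw), applied to every frame.
    temporal_kernel: array of shape (kt,), applied along the time axis.
    Uses 'valid' cross-correlation, so each axis shrinks by kernel size - 1.
    """
    T, H, W = video.shape
    kh, kw = spatial_kernel.shape
    kt = temporal_kernel.shape[0]

    # Step 1: 2D spatial filtering, frame by frame.
    out_h, out_w = H - kh + 1, W - kw + 1
    spatial = np.zeros((T, out_h, out_w))
    for i in range(kh):
        for j in range(kw):
            spatial += spatial_kernel[i, j] * video[:, i:i + out_h, j:j + out_w]

    # Step 2: 1D temporal filtering, pixel by pixel.
    out_t = T - kt + 1
    out = np.zeros((out_t, out_h, out_w))
    for k in range(kt):
        out += temporal_kernel[k] * spatial[k:k + out_t]
    return out

# Usage: a constant video filtered with averaging kernels stays constant.
video = np.ones((5, 4, 4))
features = conv2plus1d(video, np.ones((3, 3)) / 9.0, np.ones(3) / 3.0)
```

Splitting one 3D filter into a spatial step and a temporal step reduces the number of kernel weights from kt·kh·kw to kh·kw + kt per filter pair, which is one common motivation for this kind of factorization.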


2021 ◽  
Vol 2021 ◽  
pp. 1-10
Author(s):  
Hui Qian ◽  
Mengxuan Dai ◽  
Yong Ma ◽  
Jiale Zhao ◽  
Qinghua Liu ◽  
...  

Video situational information detection is widely used in video query, character anomaly detection, surveillance analysis, and related fields. However, most existing research focuses on the subject or the video background and pays little attention to the recognition of situational information. Moreover, because there is no strong relation between the pixel information and the scene information of video data, it is difficult for computers to derive high-level scene information from low-level pixel information. Video scene information detection mainly detects and analyzes multiple features in the video and marks the scenes in it. It aims to automatically extract video scene information from all kinds of original video data and to recognize scene information through “comprehensive consideration of pixel information and spatiotemporal continuity.” To solve the problem of transforming pixel information into scene information, this paper proposes a video scene information detection method based on entity recognition. The model integrates the spatiotemporal relationship between the video subject and object on the basis of entity recognition, and recognizes scene information by establishing a mapping relation. The effectiveness and accuracy of the model are verified by simulation experiments that use TV series as experimental data. The accuracy of the model in the simulation experiments exceeds 85%.


2021 ◽  
Vol 12 (5) ◽  
pp. 1-19
Author(s):  
Yuan Cheng ◽  
Yuchao Yang ◽  
Hai-Bao Chen ◽  
Ngai Wong ◽  
Hao Yu

Real-time segmentation and understanding of driving scenes are crucial in autonomous driving. Traditional pixel-wise approaches extract scene information by segmenting all pixels in a frame and are therefore inefficient and slow. Proposal-wise approaches learn only from the proposed object candidates but still require multiple steps with expensive proposal methods. Instead, this work presents a fast single-shot segmentation strategy for video scene understanding. The proposed network, called S3-Net, quickly locates and segments target sub-scenes and at the same time extracts attention-aware time-series sub-scene features (ats-features) as inputs to an attention-aware spatio-temporal model (ASM). Using tensorization and quantization techniques, S3-Net is intended to be lightweight for edge computing. Experimental results on the CityScapes, UCF11, HMDB51, and MOMENTS datasets demonstrate that the proposed S3-Net achieves an accuracy improvement of 8.1% over the 3D-CNN based approach on UCF11, a storage reduction of 6.9×, and an inference speed of 22.8 FPS on CityScapes with a GTX1080Ti GPU.
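
To illustrate why quantization reduces storage, here is a minimal sketch of generic 8-bit affine quantization of a weight array. It is not S3-Net's scheme, which the abstract does not describe: the helper names `quantize_uint8` and `dequantize` are hypothetical, and the 4× saving shown here comes only from replacing 32-bit floats with 8-bit integers, not from the tensorization that contributes to the reported 6.9×.

```python
import numpy as np

def quantize_uint8(weights):
    """Map float32 weights onto 256 evenly spaced levels (hypothetical helper).

    Returns the uint8 codes plus the scale and offset needed to decode them.
    """
    w_min, w_max = float(weights.min()), float(weights.max())
    # Fall back to a scale of 1.0 when all weights are equal.
    scale = (w_max - w_min) / 255.0 or 1.0
    codes = np.round((weights - w_min) / scale).astype(np.uint8)
    return codes, scale, w_min

def dequantize(codes, scale, w_min):
    """Recover approximate float32 weights from the uint8 codes."""
    return codes.astype(np.float32) * scale + w_min

# Usage: storage shrinks 4x; the error per weight is at most about scale / 2.
weights = np.linspace(-1.0, 1.0, 1000).astype(np.float32)
codes, scale, w_min = quantize_uint8(weights)
restored = dequantize(codes, scale, w_min)
```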


2021 ◽  
Author(s):  
Rosiana Natalie ◽  
Jolene Loh ◽  
Huei Suen Tan ◽  
Joshua Tseng ◽  
Ian Luke Yi-Ren Chan ◽  
...  

2021 ◽  
Author(s):  
Toshal Patel ◽  
Alvin Yan Hong Yao ◽  
Yu Qiang ◽  
Wei Tsang Ooi ◽  
Roger Zimmermann

Author(s):  
Sajjan Kiran ◽  
Umesh Patil ◽  
P Siddarth Shankar ◽  
Poonam Ghuli

2021 ◽  
Author(s):  
Jiaxu Miao ◽  
Yunchao Wei ◽  
Yu Wu ◽  
Chen Liang ◽  
Guangrui Li ◽  
...  

Complexity ◽  
2021 ◽  
Vol 2021 ◽  
pp. 1-11
Author(s):  
Qichang Xu

Traditional moving-target detection methods perform poorly in complex scenes: they have low detection accuracy and high complexity, and they ignore the overall structural information of the video frame. To address these shortcomings, this paper proposes a moving-target detection method based on a sensor network. First, a low-power motion detection wireless sensor network node is designed to obtain motion detection information in real time. Second, the background of the video scene is quickly extracted by the time-domain averaging method, and the video sequence and the background image are channel-merged to form the input of a deep fully convolutional network model. Finally, the network model learns the deep features of the video scene and outputs pixel-level classification results to achieve moving-target detection. The method adapts to complex video scenes of different sizes and uses a simple background extraction step, which effectively improves the detection speed.
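
The background extraction and channel-merging steps described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the function names are hypothetical, frames are assumed to be grayscale, and the fully convolutional network that consumes the merged input is omitted.

```python
import numpy as np

def temporal_mean_background(frames):
    """Estimate the static background as the per-pixel mean over time.

    frames: array of shape (T, H, W), grayscale (an assumption).
    """
    return frames.mean(axis=0)

def merge_with_background(frames, background):
    """Pair every frame with the background as a 2-channel input.

    Returns an array of shape (T, 2, H, W): channel 0 is the frame,
    channel 1 is the shared background image.
    """
    repeated = np.broadcast_to(background, frames.shape)
    return np.stack([frames, repeated], axis=1)

# Usage: a bright pixel in the last frame is diluted in the background estimate.
frames = np.zeros((4, 2, 2))
frames[3, 0, 0] = 8.0
background = temporal_mean_background(frames)
net_input = merge_with_background(frames, background)
```

Averaging over time is cheap but assumes that moving objects cover each pixel only briefly; a target that lingers will leak into the background estimate.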

