Search Results (18)

Search Parameters:
Keywords = video inpainting

20 pages, 8781 KiB  
Article
A Virtual View Acquisition Technique for Complex Scenes of Monocular Images Based on Layered Depth Images
by Qi Wang and Yan Piao
Appl. Sci. 2024, 14(22), 10557; https://doi.org/10.3390/app142210557 - 15 Nov 2024
Viewed by 629
Abstract
With the rapid development of stereoscopic display technology, generating high-quality virtual view images has become a key problem in 3D video, 3D TV, and virtual reality applications. Traditional virtual view rendering maps the reference view into the virtual view by means of a 3D transformation, but when a background area is occluded by a foreground object, the content of the occluded area cannot be inferred. To solve this problem, we propose a virtual view acquisition technique for complex scenes of monocular images based on a layered depth image (LDI). First, the depth discontinuities at the edge of the occluded area are grouped using the multilayer representation of the LDI, and the depth edge of the occluded area is inpainted by an edge inpainting network. Then, a generative adversarial network (GAN) is used to fill in the color and depth of the occluded area, producing an inpainted virtual view. Finally, a GAN is used to refine the color and depth of the virtual view, generating a high-quality result. Experiments demonstrate the effectiveness of the proposed method, which also applies to complex scenes.
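To make the warping step concrete, the following minimal Python sketch forward-warps a reference view into a horizontally shifted virtual view using its depth map and records the disoccluded pixels as a hole mask, which is the input that an edge-inpainting or GAN stage would then complete. The focal length, baseline, and simple z-buffer loop are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of the DIBR forward-warping step that produces the occlusion holes
# discussed above: pixels are shifted by a disparity derived from depth, a z-buffer
# keeps the nearest surface, and unreached pixels are reported as holes for a later
# inpainting stage. Focal length and baseline are illustrative assumptions.
import numpy as np

def forward_warp(color, depth, focal=500.0, baseline=0.05):
    """color: (H, W, 3) uint8, depth: (H, W) float32 depth in metres.
    Returns the warped virtual view and a boolean hole mask."""
    h, w = depth.shape
    disparity = np.round(focal * baseline / np.maximum(depth, 1e-3)).astype(int)
    warped = np.zeros_like(color)
    z_buffer = np.full((h, w), np.inf)
    hole = np.ones((h, w), dtype=bool)
    ys, xs = np.mgrid[0:h, 0:w]
    xt = xs - disparity                      # target column in the virtual view
    valid = (xt >= 0) & (xt < w)
    for y, x_src, x_dst, z in zip(ys[valid], xs[valid], xt[valid], depth[valid]):
        if z < z_buffer[y, x_dst]:           # z-buffering: keep the nearest surface
            z_buffer[y, x_dst] = z
            warped[y, x_dst] = color[y, x_src]
            hole[y, x_dst] = False
    return warped, hole
```

The returned hole mask corresponds to the occluded regions that the paper's edge-inpainting network and GAN stages are described as completing.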

31 pages, 13435 KiB  
Article
Real-Time Camera Operator Segmentation with YOLOv8 in Football Video Broadcasts
by Serhii Postupaiev, Robertas Damaševičius and Rytis Maskeliūnas
AI 2024, 5(2), 842-872; https://doi.org/10.3390/ai5020042 - 6 Jun 2024
Cited by 3 | Viewed by 3325
Abstract
Using instance segmentation and video inpainting provides a significant leap in real-time football video broadcast enhancement by removing potential visual distractions, such as an occasional person or another object accidentally occupying the frame. Despite its relevance and importance in the media industry, this area remains challenging and relatively understudied, and thus offers potential for research. Specifically, the segmentation and inpainting of camera operator instances from video remains an underexplored research area. To address this challenge, this paper proposes a framework designed to accurately detect and remove camera operators while seamlessly hallucinating the background in real-time football broadcasts. The approach aims to enhance the quality of the broadcast by maintaining its consistency and level of engagement to retain and attract users during the game. To implement the inpainting task, a camera operator instance segmentation method must first be developed. We used a YOLOv8 model for accurate real-time operator instance segmentation. The resulting model produces masked frames, which are used for subsequent camera operator inpainting. Moreover, this paper presents an extensive “Cameramen Instances” dataset with more than 7500 samples, which serves as a solid foundation for future investigations in this area. The experimental results show that the YOLOv8 model performs better than other baseline algorithms in different scenarios. A precision of 95.5%, a recall of 92.7%, an mAP50-95 of 79.6, and a high frame rate of 87 FPS in a low-volume environment demonstrate the solution’s efficacy for real-time applications.
(This article belongs to the Special Issue Artificial Intelligence-Based Image Processing and Computer Vision)
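As a rough illustration of the segmentation stage described above, the sketch below runs an off-the-shelf YOLOv8 segmentation checkpoint via the ultralytics package and converts person-class masks into a binary inpainting mask. The checkpoint name, the COCO person class id, and the OpenCV Telea inpaint used as a stand-in for real-time video inpainting are assumptions, not the authors' fine-tuned model or pipeline.

```python
# Sketch of the segmentation stage: an off-the-shelf YOLOv8 segmentation checkpoint
# produces person-class masks that become a binary inpainting mask. The checkpoint
# name, the COCO person class id (0), and the Telea inpaint used as a stand-in for
# real-time video inpainting are assumptions, not the paper's fine-tuned pipeline.
import cv2
import numpy as np
from ultralytics import YOLO

model = YOLO("yolov8n-seg.pt")  # generic segmentation weights, not the paper's model

def operator_mask(frame):
    """Return a uint8 mask (255 = pixels to inpaint) covering person-class instances."""
    result = model(frame, verbose=False)[0]
    mask = np.zeros(frame.shape[:2], dtype=np.uint8)
    if result.masks is None:
        return mask
    for seg, cls in zip(result.masks.data.cpu().numpy(), result.boxes.cls.cpu().numpy()):
        if int(cls) == 0:                                   # class 0 = "person" in COCO
            seg = cv2.resize(seg, (frame.shape[1], frame.shape[0]))
            mask[seg > 0.5] = 255
    return mask

frame = np.zeros((720, 1280, 3), dtype=np.uint8)            # stand-in for a broadcast frame
cleaned = cv2.inpaint(frame, operator_mask(frame), 3, cv2.INPAINT_TELEA)
```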

19 pages, 13136 KiB  
Article
DSOMF: A Dynamic Environment Simultaneous Localization and Mapping Technique Based on Machine Learning
by Shengzhe Yue, Zhengjie Wang and Xiaoning Zhang
Sensors 2024, 24(10), 3063; https://doi.org/10.3390/s24103063 - 11 May 2024
Viewed by 1205
Abstract
To address the reduced localization accuracy and incomplete map construction exhibited by classical semantic simultaneous localization and mapping (SLAM) algorithms in dynamic environments, this study introduces a dynamic scene SLAM technique that builds upon direct sparse odometry (DSO) and incorporates instance segmentation and video completion algorithms. While prioritizing the algorithm’s real-time performance, we leverage the rapid matching capabilities of DSO to link identical dynamic objects in consecutive frames. This association is achieved by merging semantic and geometric data, thereby enhancing the matching accuracy during image tracking through the inclusion of semantic probability. Furthermore, we incorporate a loop closure module based on video inpainting algorithms into our mapping thread. This allows our algorithm to rely on the completed static background for loop closure detection, further enhancing its localization accuracy. The efficacy of this approach is validated using the TUM and KITTI public datasets and an unmanned platform experiment. Experimental results show that, in various dynamic scenes, our method achieves an improvement exceeding 85% in localization accuracy compared with the DSO system.
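The frame-to-frame association of dynamic objects described above can be sketched as a greedy matching that combines geometric overlap with semantic agreement; the weights and acceptance threshold below are illustrative assumptions, not values from the paper.

```python
# Sketch of associating the same dynamic object across consecutive frames by combining
# geometric overlap (mask IoU) with semantic agreement, in the spirit of the description
# above. The weights and acceptance threshold are illustrative assumptions.
import numpy as np

def mask_iou(a, b):
    union = np.logical_or(a, b).sum()
    return np.logical_and(a, b).sum() / union if union else 0.0

def associate(prev, curr, w_geo=0.7, w_sem=0.3, thresh=0.3):
    """prev/curr: lists of dicts {"mask": HxW bool array, "probs": class-probability vector}.
    Returns greedily matched (prev_index, curr_index) pairs."""
    pairs, used = [], set()
    for i, p in enumerate(prev):
        best_j, best_s = -1, thresh
        for j, c in enumerate(curr):
            if j in used:
                continue
            score = w_geo * mask_iou(p["mask"], c["mask"]) + \
                    w_sem * float(np.dot(p["probs"], c["probs"]))
            if score > best_s:
                best_j, best_s = j, score
        if best_j >= 0:
            pairs.append((i, best_j))
            used.add(best_j)
    return pairs
```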

18 pages, 16066 KiB  
Article
A Novel Frame-Selection Metric for Video Inpainting to Enhance Urban Feature Extraction
by Yuhu Feng, Jiahuan Zhang, Guang Li, Ren Togo, Keisuke Maeda, Takahiro Ogawa and Miki Haseyama
Sensors 2024, 24(10), 3035; https://doi.org/10.3390/s24103035 - 10 May 2024
Cited by 1 | Viewed by 1116
Abstract
In our digitally driven society, advances in software and hardware for capturing video data allow extensive gathering and analysis of large datasets. This has stimulated interest in extracting information from video data, such as buildings and urban streets, to enhance understanding of the environment. Urban buildings and streets, as essential parts of cities, carry valuable information relevant to daily life. Extracting features from these elements and integrating them with technologies such as VR and AR can contribute to more intelligent and personalized urban public services. Despite its potential benefits, collecting videos of urban environments introduces challenges because of the presence of dynamic objects. The varying shape of the target building in each frame necessitates careful frame selection to ensure the extraction of quality features. To address this problem, we propose a novel evaluation metric that considers both the video-inpainting-restoration quality and the relevance of the target object, by minimizing areas with cars, maximizing areas with the target building, and minimizing overlapping areas. This metric extends existing video-inpainting-evaluation metrics by considering the relevance of the target object and the interconnectivity between objects. We conducted experiments to validate the proposed metric using real-world datasets from the Japanese cities of Sapporo and Yokohama. The results demonstrate the feasibility of selecting video frames conducive to building-feature extraction.
(This article belongs to the Topic Applications in Image Analysis and Pattern Recognition)
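A minimal sketch of a frame-selection score in the spirit of the proposed metric follows: it rewards target-building area and penalizes car area and car/building overlap, then keeps the highest-scoring frames. The weights are illustrative assumptions and the inpainting-restoration-quality term is omitted.

```python
# Sketch of a frame score in the spirit of the proposed metric: reward target-building
# area, penalise car area and car/building overlap, then keep the best frames. Weights
# are illustrative assumptions; the inpainting-restoration-quality term is omitted.
import numpy as np

def frame_score(building_mask, car_mask, w_building=1.0, w_car=0.5, w_overlap=2.0):
    total = building_mask.size
    building = building_mask.sum() / total
    car = car_mask.sum() / total
    overlap = np.logical_and(building_mask, car_mask).sum() / total
    return w_building * building - w_car * car - w_overlap * overlap

def select_frames(building_masks, car_masks, k=10):
    """Return the indices of the k highest-scoring frames for feature extraction."""
    scores = [frame_score(b, c) for b, c in zip(building_masks, car_masks)]
    return sorted(np.argsort(scores)[-k:].tolist())
```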

22 pages, 4002 KiB  
Article
UFCC: A Unified Forensic Approach to Locating Tampered Areas in Still Images and Detecting Deepfake Videos by Evaluating Content Consistency
by Po-Chyi Su, Bo-Hong Huang and Tien-Ying Kuo
Electronics 2024, 13(4), 804; https://doi.org/10.3390/electronics13040804 - 19 Feb 2024
Cited by 2 | Viewed by 1743
Abstract
Image inpainting and Deepfake techniques have the potential to drastically alter the meaning of visual content, posing a serious threat to the integrity of both images and videos. Addressing this challenge requires the development of effective methods to verify the authenticity of investigated visual data. This research introduces UFCC (Unified Forensic Scheme by Content Consistency), a novel forensic approach based on deep learning. UFCC can identify tampered areas in images and detect Deepfake videos by examining content consistency, assuming that manipulations can create dissimilarity between tampered and intact portions of visual data. The term “Unified” signifies that the same methodology is applicable to both still images and videos. Recognizing the challenge of collecting a diverse dataset for supervised learning due to the variety of tampering methods, we overcome this limitation by incorporating information from original or unaltered content in the training process rather than relying solely on tampered data. A neural network for feature extraction is trained to classify image patches, and a Siamese network measures the similarity between pairs of patches. For still images, tampered areas are identified as patches that deviate from the majority of the investigated image. For Deepfake video detection, the proposed scheme locates facial regions and determines authenticity by comparing facial-region similarity across consecutive frames. Extensive testing is conducted on publicly available image forensic datasets and Deepfake datasets with various manipulation operations. The experimental results highlight the superior accuracy and stability of the UFCC scheme compared to existing methods.
(This article belongs to the Special Issue Image/Video Processing and Encoding for Contemporary Applications)
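The content-consistency idea can be sketched as an outlier test over patch descriptors: each patch's mean similarity to the rest of the image is computed, and low-consistency patches are flagged as tampering candidates. The descriptors are assumed to come from a trained feature extractor such as the Siamese network the paper describes; the thresholding rule below is an illustrative assumption.

```python
# Sketch of the content-consistency test: given patch descriptors from a trained feature
# extractor (UFCC uses a Siamese network), each patch's mean similarity to the rest of
# the image is computed and low-consistency patches are flagged as tampering candidates.
# The k-sigma thresholding rule is an illustrative assumption.
import numpy as np

def consistency_scores(embeddings):
    """embeddings: (N, D) L2-normalised patch descriptors.
    Returns each patch's mean cosine similarity to the other patches."""
    sim = embeddings @ embeddings.T
    np.fill_diagonal(sim, 0.0)
    return sim.sum(axis=1) / (len(embeddings) - 1)

def flag_tampered(embeddings, k=2.0):
    """Flag patches whose consistency falls k standard deviations below the mean."""
    s = consistency_scores(embeddings)
    return s < s.mean() - k * s.std()
```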

20 pages, 9989 KiB  
Article
FSTT: Flow-Guided Spatial Temporal Transformer for Deep Video Inpainting
by Ruixin Liu and Yuesheng Zhu
Electronics 2023, 12(21), 4452; https://doi.org/10.3390/electronics12214452 - 29 Oct 2023
Cited by 1 | Viewed by 1818
Abstract
Video inpainting aims to complete missing regions with content that is consistent both spatially and temporally. How to effectively utilize the spatio-temporal information in videos is critical for video inpainting. Recent video inpainting methods combine both optical flow and transformers to capture spatio-temporal information. However, these methods fail to fully explore the potential of optical flow within the transformer. Furthermore, the designed transformer block cannot effectively integrate spatio-temporal information across frames. To address these problems, we propose a novel video inpainting model, named Flow-Guided Spatial Temporal Transformer (FSTT), which effectively establishes correspondences between missing regions and valid regions in both the spatial and temporal dimensions under the guidance of completed optical flow. Specifically, a Flow-Guided Fusion Feed-Forward module is developed to enhance features with the assistance of optical flow, mitigating the inaccuracies caused by hole pixels when performing multi-head self-attention (MHSA). Additionally, a decomposed spatio-temporal MHSA module is proposed to effectively capture spatio-temporal dependencies in videos. To improve the efficiency of the model, a Global–Local Temporal MHSA module is further designed based on a window partition strategy. Extensive quantitative and qualitative experiments on the DAVIS and YouTube-VOS datasets demonstrate the superiority of the proposed method.
(This article belongs to the Special Issue Application of Machine Learning in Graphics and Images)
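A generic building block behind flow-guided designs like FSTT is backward-warping the features of a neighbouring frame with a (completed) optical flow field before fusion or attention. The PyTorch sketch below shows only that warping utility; it is not the authors' Flow-Guided Fusion Feed-Forward module.

```python
# Generic flow-guided building block: backward-warp the features of a neighbouring frame
# to the current frame with a (completed) optical flow field before fusion or attention.
# This is a standard warping utility, not the paper's Flow-Guided Fusion Feed-Forward module.
import torch
import torch.nn.functional as F

def flow_warp(feat, flow):
    """feat: (N, C, H, W) neighbour-frame features.
    flow: (N, 2, H, W) flow from the current frame to the neighbour, in pixels.
    Returns the features sampled at the flow-displaced locations."""
    n, _, h, w = feat.shape
    ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    base = torch.stack((xs, ys), dim=0).float().to(feat.device)   # (2, H, W) pixel grid
    coords = base.unsqueeze(0) + flow                             # absolute sampling positions
    gx = 2.0 * coords[:, 0] / (w - 1) - 1.0                       # normalise to [-1, 1]
    gy = 2.0 * coords[:, 1] / (h - 1) - 1.0
    grid = torch.stack((gx, gy), dim=-1)                          # (N, H, W, 2)
    return F.grid_sample(feat, grid, mode="bilinear",
                         padding_mode="border", align_corners=True)
```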

19 pages, 4806 KiB  
Article
Video-Restoration-Net: Deep Generative Model with Non-Local Network for Inpainting and Super-Resolution Tasks
by Yuanfeng Zheng, Yuchen Yan and Hao Jiang
Appl. Sci. 2023, 13(18), 10001; https://doi.org/10.3390/app131810001 - 5 Sep 2023
Viewed by 1258
Abstract
Although deep learning-based approaches for video processing have been extensively investigated, the lack of generality in network construction makes them challenging to use in practical applications, particularly in video restoration. This paper therefore presents a universal video restoration model that can simultaneously tackle video inpainting and super-resolution tasks. The network, called Video-Restoration-Net (VRN), consists of four components: (1) an encoder to extract features from each frame, (2) a non-local network that recombines features from adjacent frames or different locations of a given frame, (3) a decoder to restore the coarse video from the output of the non-local block, and (4) a refinement network to refine the coarse video at the frame level. The framework is trained in a three-step pipeline to improve training stability for both tasks. Specifically, we first suggest an automated technique to generate a full video dataset for super-resolution reconstruction and a complete-incomplete video dataset for inpainting. A VRN is then trained to inpaint the incomplete videos. Meanwhile, the full video dataset is adopted to train another VRN frame-wise and validate it against authoritative datasets. We show quantitative comparisons with several baseline models, achieving 40.5042 dB/0.99473 in PSNR/SSIM on the inpainting task, while on the SR task we obtained 28.41 dB/0.7953 and 27.25 dB/0.8152 on BSD100 and Urban100, respectively. The qualitative comparisons demonstrate that our proposed model is able to complete masked regions and perform super-resolution reconstruction in videos at high quality. Furthermore, these results show that our method has greater versatility in both video inpainting and super-resolution tasks compared with recent models.
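As an illustration of the non-local component named above, here is a standard non-local block in PyTorch that lets every spatial position of a feature map attend to every other position. The channel reduction is illustrative and the block is a generic formulation, not the authors' exact VRN architecture.

```python
# Illustration of the non-local component named above: a standard non-local block that
# lets every spatial position attend to every other position of the feature map. The
# channel reduction is illustrative; this is a generic formulation, not the authors' VRN.
import torch
import torch.nn as nn

class NonLocalBlock(nn.Module):
    def __init__(self, channels, inter_channels=None):
        super().__init__()
        inter = inter_channels or channels // 2
        self.theta = nn.Conv2d(channels, inter, kernel_size=1)
        self.phi = nn.Conv2d(channels, inter, kernel_size=1)
        self.g = nn.Conv2d(channels, inter, kernel_size=1)
        self.out = nn.Conv2d(inter, channels, kernel_size=1)

    def forward(self, x):
        n, c, h, w = x.shape
        q = self.theta(x).flatten(2).transpose(1, 2)   # (N, HW, C')
        k = self.phi(x).flatten(2)                     # (N, C', HW)
        v = self.g(x).flatten(2).transpose(1, 2)       # (N, HW, C')
        attn = torch.softmax(q @ k, dim=-1)            # affinities between all positions
        y = (attn @ v).transpose(1, 2).reshape(n, -1, h, w)
        return x + self.out(y)                         # residual connection
```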

19 pages, 15710 KiB  
Article
Adaptable 2D to 3D Stereo Vision Image Conversion Based on a Deep Convolutional Neural Network and Fast Inpaint Algorithm
by Tomasz Hachaj
Entropy 2023, 25(8), 1212; https://doi.org/10.3390/e25081212 - 15 Aug 2023
Cited by 2 | Viewed by 4292
Abstract
Algorithms for converting 2D to 3D are gaining importance following the hiatus brought about by the discontinuation of 3D TV production; this is due to the high availability and popularity of virtual reality systems that use stereo vision. In this paper, several depth image-based rendering (DIBR) approaches using state-of-the-art single-frame depth estimation neural networks and inpaint algorithms are proposed and validated, including a novel, very fast inpaint algorithm (FAST). FAST significantly exceeds the speed of currently used inpaint algorithms by reducing computational complexity, without degrading the quality of the resulting image. The role of the inpaint algorithm is to fill in missing pixels in the stereo pair estimated by DIBR. Missing estimated pixels appear at the boundaries of areas that differ significantly in their estimated distance from the observer. In addition, we propose parameterizing DIBR with a single, easy-to-interpret, adaptable parameter that can be adjusted online according to the preferences of the user viewing the visualization. This single parameter governs both the camera parameters and the maximum binocular disparity. The proposed solutions are also compared with a fully automatic 2D-to-3D mapping solution. The algorithm proposed in this work, which features intuitive disparity steering, the foundational deep neural network MiDaS, and the FAST inpaint algorithm, received considerable acclaim from evaluators. The mean absolute error of the proposed solution does not differ in a statistically significant way from state-of-the-art approaches such as Deep3D and other DIBR-based approaches using different inpaint functions. Since both the source code and the generated videos are available for download, all experiments can be reproduced, and our algorithm can be applied to any selected video or single image.
(This article belongs to the Special Issue Deep Learning Models and Applications to Computer Vision)
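The single adaptable parameter described above can be sketched as a user-controlled scale on the maximum binocular disparity derived from a monocular depth map; missing pixels in the shifted view are then filled by an inpaint step. In the sketch below, OpenCV's Telea inpaint stands in for the paper's FAST algorithm and the depth map is assumed to be given (e.g., from MiDaS).

```python
# Sketch of single-parameter disparity steering: one user-adjustable strength scales the
# maximum binocular disparity derived from a monocular depth map, and holes in the shifted
# view are filled by an inpaint step. OpenCV's Telea inpaint stands in for the paper's
# FAST algorithm; the depth map is assumed to be given (e.g. from MiDaS).
import cv2
import numpy as np

def make_right_view(left, depth, strength=1.0, max_disp=32):
    """left: HxWx3 uint8, depth: HxW float (larger = farther). Returns the filled right view."""
    d = (depth - depth.min()) / (depth.max() - depth.min() + 1e-6)
    disp = np.round(strength * max_disp * (1.0 - d)).astype(int)   # nearer pixels shift more
    h, w = d.shape
    right = np.zeros_like(left)
    filled = np.zeros((h, w), dtype=np.uint8)
    ys, xs = np.mgrid[0:h, 0:w]
    xt = xs - disp
    ok = (xt >= 0) & (xt < w)
    right[ys[ok], xt[ok]] = left[ys[ok], xs[ok]]
    filled[ys[ok], xt[ok]] = 255
    return cv2.inpaint(right, cv2.bitwise_not(filled), 3, cv2.INPAINT_TELEA)
```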

21 pages, 4424 KiB  
Article
A Salient Object Detection Method Based on Boundary Enhancement
by Falin Wen, Qinghui Wang, Ruirui Zou, Ying Wang, Fenglin Liu, Yang Chen, Linghao Yu, Shaoyi Du and Chengzhi Yuan
Sensors 2023, 23(16), 7077; https://doi.org/10.3390/s23167077 - 10 Aug 2023
Viewed by 1633
Abstract
Visual saliency refers to the human ability to quickly focus on important parts of the visual field, which is a crucial aspect of image processing, particularly in fields such as medical imaging and robotics. Understanding and simulating this mechanism is crucial for solving complex visual problems. In this paper, we propose a salient object detection method based on boundary enhancement, which is applicable to both 2D and 3D sensor data. To address the problem of large-scale variation of salient objects, our method introduces a multi-level feature aggregation module that enhances the expressive ability of fixed-resolution features by allowing adjacent features to complement each other. Additionally, we propose a multi-scale information extraction module to capture local contextual information at different scales for the back-propagated, level-by-level features, which allows better measurement of the composition of the feature map after back-fusion. To tackle the low confidence of boundary pixels, we also introduce a boundary extraction module that extracts the boundary information of salient regions. This information is then fused with salient object information to further refine the saliency prediction results. During training, our method uses a mixed loss function to constrain the model at two levels: pixels and images. The experimental results demonstrate that our boundary-enhanced salient object detection method performs well on targets of different scales, multiple targets, linear targets, and targets in complex scenes. We compare our method with the best-performing methods on four conventional datasets and achieve an average improvement of 6.2% on the mean absolute error (MAE) indicator. Overall, our approach shows promise for improving the accuracy and efficiency of salient object detection in a variety of settings, including those involving 2D/3D semantic analysis and the reconstruction/inpainting of image, video, and point cloud data.
(This article belongs to the Special Issue Machine Learning Based 2D/3D Sensors Data Understanding and Analysis)
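A very rough sketch of the boundary-enhancement idea: extract the boundary of a predicted saliency map with a morphological gradient and blend it back to sharpen the edges of the prediction. The fixed blend weight replaces the paper's learned fusion and is only illustrative.

```python
# Very rough sketch of the boundary-enhancement idea: extract the boundary of a predicted
# saliency map with a morphological gradient and blend it back to sharpen the prediction
# edges. The fixed blend weight replaces the paper's learned fusion and is only illustrative.
import cv2
import numpy as np

def refine_with_boundary(saliency, alpha=0.3):
    """saliency: float32 map in [0, 1]. Returns a boundary-sharpened map in [0, 1]."""
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
    boundary = cv2.morphologyEx(saliency, cv2.MORPH_GRADIENT, kernel)
    return np.clip(saliency + alpha * boundary, 0.0, 1.0)
```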

13 pages, 22083 KiB  
Article
SR-Inpaint: A General Deep Learning Framework for High Resolution Image Inpainting
by Haoran Xu, Xinya Li, Kaiyi Zhang, Yanbai He, Haoran Fan, Sijiang Liu, Chuanyan Hao and Bo Jiang
Algorithms 2021, 14(8), 236; https://doi.org/10.3390/a14080236 - 10 Aug 2021
Cited by 4 | Viewed by 5926
Abstract
Recently, deep learning has enabled a huge leap forward in image inpainting. However, due to memory and computational limitations, most existing methods can handle only low-resolution inputs, typically less than 1 K. With the improvement of Internet transmission capacity and mobile device cameras, the resolution of image and video sources available to users via the cloud or locally is increasing. For high-resolution images, common inpainting methods simply upsample the inpainted result of a downscaled image, yielding a blurry result. There is thus an urgent need to reconstruct the missing high-frequency information in high-resolution images and generate sharp texture details. Hence, we propose a general deep learning framework for high-resolution image inpainting, which first hallucinates a semantically continuous but blurred result using low-resolution inpainting, keeping the computational overhead low; the sharp high-frequency details at the original resolution are then reconstructed using super-resolution refinement. Experimentally, our method achieves inspiring inpainting quality on 2K and 4K resolution images, ahead of the state-of-the-art high-resolution inpainting technique. This framework is expected to be popularized for high-resolution image editing tasks on personal computers and mobile devices in the future.
(This article belongs to the Special Issue Algorithmic Aspects of Neural Networks)
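The two-stage strategy can be sketched with classical stand-ins: inpaint at a reduced working resolution to bound memory, then restore the original resolution and composite only the missing pixels back. In the sketch below, OpenCV's Telea inpaint and bicubic upscaling stand in for the paper's learned low-resolution inpainting and super-resolution refinement networks.

```python
# Sketch of the two-stage strategy with classical stand-ins: inpaint at a reduced working
# resolution to bound memory, upscale, and composite only the missing pixels back into the
# original image. Telea inpainting and bicubic upscaling stand in for the learned
# low-resolution inpainting and super-resolution refinement networks.
import cv2
import numpy as np

def hires_inpaint(image, mask, work_size=512):
    """image: HxWx3 uint8, mask: HxW uint8 (255 = missing). Returns the inpainted image."""
    h, w = image.shape[:2]
    scale = work_size / max(h, w)
    small = cv2.resize(image, (int(w * scale), int(h * scale)), interpolation=cv2.INTER_AREA)
    small_mask = cv2.resize(mask, (small.shape[1], small.shape[0]), interpolation=cv2.INTER_NEAREST)
    coarse = cv2.inpaint(small, small_mask, 3, cv2.INPAINT_TELEA)    # low-resolution inpainting
    up = cv2.resize(coarse, (w, h), interpolation=cv2.INTER_CUBIC)   # stand-in for SR refinement
    out = image.copy()
    out[mask > 0] = up[mask > 0]                                     # keep original sharp pixels
    return out
```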

15 pages, 36464 KiB  
Article
Joint Subtitle Extraction and Frame Inpainting for Videos with Burned-In Subtitles
by Haoran Xu, Yanbai He, Xinya Li, Xiaoying Hu, Chuanyan Hao and Bo Jiang
Information 2021, 12(6), 233; https://doi.org/10.3390/info12060233 - 29 May 2021
Viewed by 5210
Abstract
Subtitles are crucial for video content understanding. However, a large number of videos have only burned-in, hardcoded subtitles, which prevents video re-editing, translation, etc. In this paper, we construct a deep-learning-based system for the inverse conversion of a burned-in subtitle video into a subtitle file and an inpainted video, by coupling three deep neural networks (CTPN, CRNN, and EdgeConnect). We evaluated the performance of the proposed method and found that it achieves high-precision separation of the subtitles and video frames and significantly improves the video inpainting results compared with existing methods. This research fills a gap in the application of deep learning to burned-in subtitle video reconstruction and is expected to be widely applied in the reconstruction and re-editing of videos with subtitles, advertisements, logos, and other occlusions.
(This article belongs to the Special Issue Recent Advances in Video Compression and Coding)
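The glue between the three networks can be sketched as follows: detected subtitle boxes become a per-frame inpainting mask, and recognised text is written out as an SRT cue. Text detection (CTPN), recognition (CRNN), and learned inpainting (EdgeConnect) are assumed to be provided elsewhere; the box coordinates and the OpenCV inpaint below are placeholders.

```python
# Sketch of the glue between the three networks: detected subtitle boxes become a per-frame
# inpainting mask, and recognised text is written out as an SRT cue. CTPN detections, CRNN
# text, and EdgeConnect inpainting are assumed to be provided elsewhere; the box
# coordinates and OpenCV inpaint below are placeholders.
import cv2
import numpy as np

def boxes_to_mask(shape, boxes):
    """boxes: list of (x1, y1, x2, y2) subtitle regions. Returns a uint8 mask (255 = subtitle)."""
    mask = np.zeros(shape[:2], dtype=np.uint8)
    for x1, y1, x2, y2 in boxes:
        mask[y1:y2, x1:x2] = 255
    return mask

def srt_cue(index, start_s, end_s, text):
    fmt = lambda t: ("%02d:%02d:%06.3f" % (t // 3600, (t % 3600) // 60, t % 60)).replace(".", ",")
    return "%d\n%s --> %s\n%s\n" % (index, fmt(start_s), fmt(end_s), text)

frame = np.zeros((720, 1280, 3), dtype=np.uint8)        # stand-in for a decoded video frame
boxes = [(100, 620, 1180, 690)]                         # hypothetical detected subtitle box
clean = cv2.inpaint(frame, boxes_to_mask(frame.shape, boxes), 3, cv2.INPAINT_TELEA)
print(srt_cue(1, 12.0, 14.5, "Recognised subtitle text"))
```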

22 pages, 32604 KiB  
Article
Object-Wise Video Editing
by Ashraf Siddique and Seungkyu Lee
Appl. Sci. 2021, 11(2), 671; https://doi.org/10.3390/app11020671 - 12 Jan 2021
Cited by 1 | Viewed by 2966
Abstract
Beyond time-frame editing of video data, object-level video editing, such as object removal or viewpoint change, is a challenging task. These tasks involve dynamic object segmentation, novel view video synthesis, and background inpainting. Background inpainting is the reconstruction of unseen regions exposed by object removal or viewpoint change. In this paper, we propose a video editing method comprising foreground object removal, background inpainting, and novel view video synthesis under challenging conditions such as complex visual patterns, occlusion, overlaid clutter, and depth variation with a moving camera. Our proposed method calculates a weighted confidence score on the basis of the normalized difference between the observed depth and the predicted distance in 3D space. A set of candidate points along the epipolar lines of neighboring frames is collected, refined, and weighted to select a small number of highly qualified observations that fill the desired region of interest in the current frame. Based on the background inpainting method, novel view video synthesis is conducted for arbitrary viewpoints. Our method is evaluated on both a public dataset and our own video clips and compared with multiple state-of-the-art methods, showing superior performance.
(This article belongs to the Special Issue Advanced Intelligent Imaging Technology Ⅱ)
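The weighted confidence score described above can be sketched as a depth-agreement weight over candidate pixels gathered along epipolar lines of neighbouring frames, keeping only the few most confident observations. The Gaussian form, sigma, and top-k value are illustrative assumptions.

```python
# Sketch of the confidence weighting: candidate pixels gathered along epipolar lines of
# neighbouring frames are scored by how well their observed depth agrees with the distance
# predicted from geometry, and only the most confident few are blended. The Gaussian form,
# sigma, and top-k value are illustrative assumptions.
import numpy as np

def candidate_confidence(observed_depth, predicted_distance, sigma=0.1):
    """Normalised depth agreement in (0, 1]; 1 means perfect agreement."""
    diff = np.abs(observed_depth - predicted_distance) / (predicted_distance + 1e-6)
    return np.exp(-(diff ** 2) / (2.0 * sigma ** 2))

def fill_pixel(candidate_colors, observed_depths, predicted_distances, top_k=3):
    """candidate_colors: (N, 3) colours of epipolar candidates. Returns the blended colour."""
    conf = candidate_confidence(np.asarray(observed_depths), np.asarray(predicted_distances))
    idx = np.argsort(conf)[-top_k:]                 # keep the few most confident observations
    weights = conf[idx] / conf[idx].sum()
    return (weights[:, None] * np.asarray(candidate_colors, dtype=float)[idx]).sum(axis=0)
```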

25 pages, 9089 KiB  
Article
Fast Hole Filling for View Synthesis in Free Viewpoint Video
by Hui-Yu Huang and Shao-Yu Huang
Electronics 2020, 9(6), 906; https://doi.org/10.3390/electronics9060906 - 29 May 2020
Cited by 7 | Viewed by 3551
Abstract
The recent emergence of three-dimensional (3D) movies and 3D television (TV) indicates an increasing interest in 3D content. Stereoscopic displays have enabled visual experiences to be enhanced, allowing the world to be viewed in 3D. Virtual view synthesis is the key technology for presenting 3D content, and depth image-based rendering (DIBR) is a classic virtual view synthesis method. With a texture image and its corresponding depth map, a virtual view can be generated using the DIBR technique. The depth and camera parameters are used to project every pixel in the image into the 3D world coordinate system. The results in world coordinates are then reprojected into the virtual view based on 3D warping. However, these projections result in cracks (holes). Hence, we herein propose a new DIBR method for free viewpoint videos to solve the hole problem caused by these projection processes. First, the depth map is preprocessed to reduce the number of holes without producing large-scale geometric distortions; subsequently, an improved 3D warping projection is performed to create the virtual view. A median filter is used to filter the hole regions in the virtual view, followed by 3D inverse warping blending to remove the holes. Next, brightness adjustment and adaptive image blending are performed. Finally, the synthesized virtual view is obtained using an inpainting method. Experimental results verify that our proposed method can produce pleasing visual quality in the synthesized virtual view, maintain a high peak signal-to-noise ratio (PSNR), and reduce execution time compared with state-of-the-art methods.
(This article belongs to the Special Issue Multimedia Systems and Signal Processing)
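One step of the pipeline, filtering the hole regions with a median filter, can be sketched by applying the filter only where the warped view is empty so that valid pixels stay untouched; the kernel size is an illustrative choice, not the paper's setting.

```python
# Sketch of the median-filter step only: the filter is applied just inside the hole regions
# of the warped view so valid pixels stay untouched. The kernel size is an illustrative choice.
import cv2
import numpy as np

def fill_cracks(warped, hole_mask, ksize=5):
    """warped: HxWx3 uint8 virtual view, hole_mask: HxW uint8 (255 = hole)."""
    blurred = cv2.medianBlur(warped, ksize)
    out = warped.copy()
    out[hole_mask > 0] = blurred[hole_mask > 0]
    return out
```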

8 pages, 219 KiB  
Editorial
Deep Learning Applications with Practical Measured Results in Electronics Industries
by Mong-Fong Horng, Hsu-Yang Kung, Chi-Hua Chen and Feng-Jang Hwang
Electronics 2020, 9(3), 501; https://doi.org/10.3390/electronics9030501 - 19 Mar 2020
Cited by 8 | Viewed by 3591
Abstract
This editorial introduces the Special Issue, entitled “Deep Learning Applications with Practical Measured Results in Electronics Industries”, of Electronics. Topics covered in this issue include four main parts: (I) environmental information analyses and predictions, (II) unmanned aerial vehicle (UAV) and object tracking applications, (III) measurement and denoising techniques, and (IV) recommendation systems and education systems. Four papers on environmental information analyses and predictions are as follows: (1) “A Data-Driven Short-Term Forecasting Model for Offshore Wind Speed Prediction Based on Computational Intelligence” by Panapakidis et al.; (2) “Multivariate Temporal Convolutional Network: A Deep Neural Networks Approach for Multivariate Time Series Forecasting” by Wan et al.; (3) “Modeling and Analysis of Adaptive Temperature Compensation for Humidity Sensors” by Xu et al.; (4) “An Image Compression Method for Video Surveillance System in Underground Mines Based on Residual Networks and Discrete Wavelet Transform” by Zhang et al. Three papers on UAV and object tracking applications are as follows: (1) “Trajectory Planning Algorithm of UAV Based on System Positioning Accuracy Constraints” by Zhou et al.; (2) “OTL-Classifier: Towards Imaging Processing for Future Unmanned Overhead Transmission Line Maintenance” by Zhang et al.; (3) “Model Update Strategies about Object Tracking: A State of the Art Review” by Wang et al. Five papers on measurement and denoising techniques are as follows: (1) “Characterization and Correction of the Geometric Errors in Using Confocal Microscope for Extended Topography Measurement. Part I: Models, Algorithms Development and Validation” by Wang et al.; (2) “Characterization and Correction of the Geometric Errors Using a Confocal Microscope for Extended Topography Measurement, Part II: Experimental Study and Uncertainty Evaluation” by Wang et al.; (3) “Deep Transfer HSI Classification Method Based on Information Measure and Optimal Neighborhood Noise Reduction” by Lin et al.; (4) “Quality Assessment of Tire Shearography Images via Ensemble Hybrid Faster Region-Based ConvNets” by Chang et al.; (5) “High-Resolution Image Inpainting Based on Multi-Scale Neural Network” by Sun et al. Two papers on recommendation systems and education systems are as follows: (1) “Deep Learning-Enhanced Framework for Performance Evaluation of a Recommending Interface with Varied Recommendation Position and Intensity Based on Eye-Tracking Equipment Data Processing” by Sulikowski et al. and (2) “Generative Adversarial Network Based Neural Audio Caption Model for Oral Evaluation” by Zhang et al.
19 pages, 12068 KiB  
Article
Virtual View Synthesis Based on Asymmetric Bidirectional DIBR for 3D Video and Free Viewpoint Video
by Xiaodong Chen, Haitao Liang, Huaiyuan Xu, Siyu Ren, Huaiyu Cai and Yi Wang
Appl. Sci. 2020, 10(5), 1562; https://doi.org/10.3390/app10051562 - 25 Feb 2020
Cited by 12 | Viewed by 2763
Abstract
Depth image-based rendering (DIBR) plays an important role in 3D video and free viewpoint video synthesis. However, artifacts can occur in the synthesized view due to viewpoint changes and stereo depth estimation errors. Holes are usually out-of-field regions and disocclusions, and filling them appropriately is a challenge. In this paper, a virtual view synthesis approach based on asymmetric bidirectional DIBR is proposed. A depth image preprocessing method is applied to detect and correct unreliable depth values around the foreground edges. For the primary view, all pixels are warped to the virtual view by the modified DIBR method. For the auxiliary view, only selected regions are warped, namely those containing content that is not visible in the primary view. This approach reduces the computational cost and prevents irrelevant foreground pixels from being warped into the holes. During the merging process, a color correction approach is introduced to make the result appear more natural. In addition, a depth-guided inpainting method is proposed to handle the remaining holes in the merged image. Experimental results show that, compared with bidirectional DIBR, the proposed rendering method reduces rendering time by about 37% and achieves a 97% reduction in holes. In terms of visual quality and objective evaluation, our approach performs better than previous methods.
(This article belongs to the Section Computing and Artificial Intelligence)
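The merging stage can be sketched as follows: the virtual view takes the warped primary view wherever it is valid, fills its holes from the warped auxiliary view after a simple colour correction, and hands any remaining holes to inpainting. The gain-based colour correction and OpenCV Telea inpaint below are stand-ins, not the paper's colour-correction or depth-guided inpainting methods.

```python
# Sketch of the merging stage: take the warped primary view where it is valid, fill its
# holes from the warped auxiliary view after a simple colour correction, and hand any
# remaining holes to inpainting. The gain-based colour correction and Telea inpaint are
# stand-ins, not the paper's depth-guided method.
import cv2
import numpy as np

def merge_views(primary, primary_valid, auxiliary, auxiliary_valid):
    """Images: HxWx3 uint8; validity masks: HxW bool. Returns the merged view and hole mask."""
    both = primary_valid & auxiliary_valid
    if both.any():                                   # match auxiliary colours to the primary view
        gain = primary[both].mean(axis=0) / (auxiliary[both].mean(axis=0) + 1e-6)
        auxiliary = np.clip(auxiliary * gain, 0, 255).astype(np.uint8)
    merged = primary.copy()
    fill = ~primary_valid & auxiliary_valid
    merged[fill] = auxiliary[fill]
    holes = (~(primary_valid | auxiliary_valid)).astype(np.uint8) * 255
    return cv2.inpaint(merged, holes, 3, cv2.INPAINT_TELEA), holes
```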
