Discriminative Visual Similarity Search with Semantically Cycle-consistent Hashing Networks
Deep hashing has great potential in large-scale visual similarity search due to its preferable efficiency in storage and computation. Technically, deep hashing for visual similarity search inherits the powerful representation capability of deep neural ...
Deepfake Video Detection via Predictive Representation Learning
Increasingly advanced deepfake approaches have made the detection of deepfake videos very challenging. We observe that the general deepfake videos often exhibit appearance-level temporal inconsistencies in some facial components between frames, resulting ...
LANBIQUE: LANguage-based Blind Image QUality Evaluation
Image quality assessment is often performed with deep networks that are fine-tuned to regress a human-provided quality score of a given image. Usually, this approach may lack generalization capabilities and, while being highly precise on similar image ...
Smart City Construction and Management by Digital Twins and BIM Big Data in COVID-19 Scenario
With the rapid development of information technology and the spread of Corona Virus Disease 2019 (COVID-19), the government and urban managers are looking for ways to use technology to make the city smarter and safer. Intelligent transportation can play a ...
A Comprehensive Study of Deep Learning-based Covert Communication
Deep learning-based methods have been popular in multimedia analysis tasks, including classification, detection, segmentation, and so on. In addition to conventional applications, this model can be widely used for covert communication, i.e., information ...
Exploiting Attention-Consistency Loss For Spatial-Temporal Stream Action Recognition
Currently, many action recognition methods mainly consider information from spatial streams. We propose a new perspective inspired by the human visual system to combine both spatial and temporal streams to measure their attention consistency. ...
Perturbation-enabled Deep Federated Learning for Preserving Internet of Things-based Social Networks
Federated Learning (FL), as an emerging form of distributed machine learning (ML), can protect participants’ private data from being substantially disclosed to cyber adversaries. It has potential uses in many large-scale, data-rich environments, such as ...
Dynamic Transfer Exemplar based Facial Emotion Recognition Model Toward Online Video
In this article, we focus on dynamic facial emotion recognition from online video. We combine deep neural networks with transfer learning theory and propose a novel model named DT-EFER. In detail, DT-EFER uses GoogLeNet to extract the deep features of ...
SETTI: A Self-supervised AdvErsarial Malware DeTection ArchiTecture in an IoT Environment
In recent years, malware detection has become an active research topic in the area of Internet of Things (IoT) security. The principle is to exploit knowledge from large quantities of continuously generated malware. Existing algorithms practise available ...
PMAL: A Proxy Model Active Learning Approach for Vision Based Industrial Applications
- Abbas Khan,
- Ijaz Ul Haq,
- Tanveer Hussain,
- Khan Muhammad,
- Mohammad Hijji,
- Muhammad Sajjad,
- Victor Hugo C. De Albuquerque,
- Sung Wook Baik
Deep Learning models’ performance strongly correlates with the availability of annotated data; however, massive data labeling is laborious, expensive, and error-prone when performed by human experts. Active Learning (AL) effectively handles this challenge by ...
Deep Q Network–Driven Task Offloading for Efficient Multimedia Data Analysis in Edge Computing–Assisted IoV
With the prosperity of Industry 4.0, numerous emerging industries continue to gain popularity and their market scales are expanding ceaselessly. The Internet of Vehicles (IoV), one of the thriving intelligent industries, enjoys bright development ...
Optimized Deep-Neural Network for Content-based Medical Image Retrieval in a Brownfield IoMT Network
In this paper, a brownfield Internet of Medical Things network is introduced for imaging data that can be easily scaled out depending on the objectives, functional requirements, and the number of facilities and devices connected to it. This is further ...
A Sorting Fuzzy Min-Max Model in an Embedded System for Atrial Fibrillation Detection
Atrial fibrillation detection (AFD) has attracted much attention in the field of embedded systems. In this study, we propose a sorting fuzzy min-max (SFMM) model, and then develop an SFMM-based embedded system for AF detection. The proposed SFMM model is ...
Revisiting Local Descriptor for Improved Few-Shot Classification
Few-shot classification studies the problem of quickly adapting a deep learner to understanding novel classes based on few support images. In this context, recent research efforts have been aimed at designing more and more complex classifiers that measure ...
GLPose: Global-Local Representation Learning for Human Pose Estimation
Multi-frame human pose estimation is at the core of many computer vision tasks. Although state-of-the-art approaches have demonstrated remarkable results for human pose estimation on static images, their performance inevitably falls short when being ...
3D Skeleton and Two Streams Approach to Person Re-identification Using Optimized Region Matching
Person re-identification (Re-ID) is a challenging and arduous task due to non-overlapping views, complex background, and uncontrollable occlusion in video surveillance. An existing method for capturing pedestrian local region information is to divide ...
Rank-in-Rank Loss for Person Re-identification
Person re-identification (re-ID) is commonly investigated as a ranking problem. However, the performance of existing re-ID models drops dramatically when they encounter extreme positive-negative class imbalance (e.g., very small ratio of positive and ...
Guided Graph Attention Learning for Video-Text Matching
As a bridge between videos and natural languages, video-text matching has been a hot multimedia research topic in recent years. Such cross-modal retrieval is usually achieved by learning a common embedding space where videos and text captions are directly ...
CL2R: Compatible Lifelong Learning Representations
In this article, we propose a method to partially mimic natural intelligence for the problem of lifelong learning representations that are compatible. We take the perspective of a learning agent that is interested in recognizing object instances in an ...