Application of neuroimaging in diagnosis of focal cortical dysplasia: A survey of computational techniques
Focal Cortical Dysplasia (FCD) is a neurodevelopmental disorder characterized by abnormal neuronal migration, differentiation, and maturation, resulting in a range of clinical symptoms including drug-resistant epilepsy. Accurate and timely ...
Voice-based age, gender, and language recognition based on ResNet deep model and transfer learning in spectro-temporal domain
In personal identity recognition systems, detecting a person's age, gender, and language using voice signal characteristics is a crucial issue, especially because of the importance of security considerations. Age, gender, and language ...
AdaMO: Adaptive Meta-Optimization for cold-start recommendation
Meta-learning has been proven effective in tackling the cold-start problem of recommendation systems. Most work in this line adopts the meta-optimization idea that learns global knowledge to initialize the base recommender and adapts it for ...
LPSRGAN: Generative adversarial networks for super-resolution of license plate image
This paper proposes a super-resolution algorithm for reconstructing license plate images based on generative adversarial networks (GAN) for improving the recognition rate of low-resolution license plate images. To this end, this paper first ...
Highlights
- A novel image degradation model called n-RCD has been designed.
- A powerful license plate image super-resolution algorithm, LPSRGAN, is designed.
- A downstream task-based perceptual OCR loss is introduced for license plate images.
SparseSwin: Swin transformer with sparse transformer block
Krisna Pinasthika, Blessius Sheldo Putra Laksono, Riyandi Banovbi Putera Irsal, Syifa’ Hukma Shabiyya, Novanto Yudistira
Advancements in computer vision research have established the transformer architecture as the state of the art in computer vision tasks. One known drawback of the transformer architecture is its high number of parameters, which can lead to a more ...
Highlights
- Proposing Sparse Transformer Block with limited latent tokens to enhance Swin Transformer efficiency and performance.
- We examined the effect of regularization on the attention weight obtained in SparTa Block.
- We obtained accuracy ...
CAT: Continual Adapter Tuning for aspect sentiment classification
Humans can continually acquire, improve, and transfer knowledge throughout their lifespan so that they can accurately identify sentiment polarities of the data attributed to different domains. However, the continual learning of incrementally ...
CSPFormer: A cross-spatial pyramid transformer for visual place recognition
Recently, the Vision Transformer (ViT), which applied the Transformer structure to various visual detection tasks, has outperformed convolutional neural networks (CNNs). Nonetheless, due to the lack of scale representation ability of the ...
Learning a physics-based filter attachment for hyperspectral imaging with RGB cameras
Countless RGB cameras are ubiquitously distributed in our daily lives, serving to perceive and depict the diverse colors of the world. Reconstructing hyperspectral images (HSI) from these trichromatic cameras emerges as a promising solution to ...
Recurrent context layered radial basis function neural network for the identification of nonlinear dynamical systems
This paper proposes a novel recurrent context layered radial basis function neural network (RCLRBFNN) for the identification of nonlinear dynamical systems. The proposed model consists of an additional context layer in which the nodes represent ...
MRSLN: A Multimodal Residual Speaker-LSTM Network to alleviate the over-smoothing issue for Emotion Recognition in Conversation
Multimodal Emotion Recognition in Conversation (ERC) plays a significant role in the field of human–computer intelligent interaction since it enables computers to perceive and infer the emotions expressed by the individuals, thereby intelligently ...
Model-free adaptive optimal control for nonlinear multiplayer games with input disturbances
In this paper, we investigate a model-free identifier-critic-based optimal adaptive controller for multiplayer games with input disturbances. Specifically, we first adopt an identifier neural network to identify the system dynamics. ...
Adaptive attention fusion network for cross-device GUI element re-identification in crowdsourced testing
The rapid growth of mobile devices has ushered in an era of different device platforms. Different devices require a consistent user experience, especially with similar graphical user interfaces (GUIs). However, the different code bases of the ...
Highlights
- ERINet: A novel CNN model for re-identifying GUI elements across devices.
- Attention mechanism with Learnable Factor enhances GUI element re-identification.
- Dataset: 31,098 training, 115,704 testing GUI images, and 67 backgrounds.
Graph-based geometric structure line parsing
Line drawings by artists capture the organization, relationships, and semantics of observable objects. To endow machines with similar capacities and improve the storage and processing of such drawings, we investigate uniform representation and ...
A nonlocal feature self-similarity based tensor completion method for video recovery
The nuclear norm-based tensor completion method effectively recovers missing multidimensional data in videos by minimizing the truncated nuclear norm. However, the conventional thresholding approach might overly punish larger singular values, ...
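The concern about conventional thresholding can be illustrated with a minimal sketch. It assumes the singular values are already computed, and it is not the method proposed in the paper: the function names `soft_threshold` and `truncated_threshold`, the sample values, and the parameters `tau` and `r` are all hypothetical. The sketch only contrasts uniform soft-thresholding, which shrinks every singular value by the same amount, with a simplified truncated variant that leaves the `r` largest values untouched.

```python
def soft_threshold(singular_values, tau):
    """Shrink every singular value by tau, clipping at zero."""
    return [max(s - tau, 0.0) for s in singular_values]


def truncated_threshold(singular_values, tau, r):
    """Keep the r largest singular values untouched; shrink only the rest."""
    ordered = sorted(singular_values, reverse=True)
    return ordered[:r] + [max(s - tau, 0.0) for s in ordered[r:]]


# Made-up singular values for illustration only, largest first.
sigma = [10.0, 6.0, 1.5, 0.5]

print(soft_threshold(sigma, 1.0))          # large values are shrunk too
print(truncated_threshold(sigma, 1.0, 2))  # the two largest values are preserved
```

In this toy setting, uniform thresholding reduces the dominant singular values, which typically carry most of the signal, by the same amount as the small ones, whereas the truncated variant penalizes only the tail.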
Fixed-precision randomized quaternion singular value decomposition algorithm for low-rank quaternion matrix approximations
The fixed-precision randomized quaternion singular value decomposition algorithm (FPRQSVD) is presented to compute the low-rank quaternion matrix approximation. The FPRQSVD algorithm estimates the appropriate rank according to the given tolerance ...
DARTS-PT-CORE: Collaborative and Regularized Perturbation-based Architecture Selection for differentiable NAS
DARTS-PT is a well-known differentiable NAS method that measures the operation strength through its contribution to the supernet performance, extracting architecture from the underlying supernet. However, persistent issues of degraded ...
Bumpless transfer consensus control for linear multi-agent systems under agent-dependent switching directed topologies
The bumpless transfer consensus control problem for linear homogeneous multi-agent systems subject to agent-dependent switching directed communication topologies is investigated in this paper. According to the information of graph Laplacian ...
A motion-aware and temporal-enhanced Spatial–Temporal Graph Convolutional Network for skeleton-based human action segmentation
Action segmentation is an important task for understanding the actions in a video. Most conventional action recognition methods can recognize only a single action from a given input video, so we need to input a pre-trimmed ...
DESReg: Dynamic Ensemble Selection library for Regression tasks
María D. Pérez-Godoy, Marta Molina, Francisco Martínez, David Elizondo, Francisco Charte, Antonio J. Rivera
Regression is a predictive task in high demand, used to solve a wide range of problems across different areas of research and society. Example application areas include industry, economics, medicine, and energy. Ensemble methodology ...
Non-intrusive speech quality assessment: A survey
Speech quality is a critical consideration for applications such as speech enhancement, coding, transmission, and synthesis. Accurately evaluating the quality of degraded speech without a reference is particularly challenging. As a result, non-...
An interpretable lightweight deep network with ℓ_p (0 < p < 1) model-driven for single image super-resolution
To address the expensive computation cost of deep networks, some Single Image Super-Resolution (SISR) methods have tried to design lightweight networks by means of recursion or expert prior. However, they discuss the theoretical ...
Domain adaptive remote sensing image semantic segmentation with prototype guidance
Current unsupervised domain adaptation (UDA) techniques in semantic segmentation effectively decrease the domain discrepancy between the labeled source domain and unlabeled target domain, thereby enhancing the model’s pixel-wise discriminative ...
ZVQAF: Zero-shot visual question answering with feedback from large language models
Due to the prominent zero-shot generalization in new language tasks shown by large language models (LLMs), applying LLMs for zero-shot visual question answering (VQA) has been a new trend. However, most prior approaches directly use off-the-shelf ...
Semi-supervised domain adaptation on graphs with contrastive learning and minimax entropy
Label scarcity in a graph is frequently encountered in real-world applications due to the high cost of data labeling. To this end, semi-supervised domain adaptation (SSDA) on graphs aims to leverage the knowledge of a labeled source graph to aid ...
Hierarchical vector transformer vehicle trajectories prediction with diffusion convolutional neural networks
In dynamic and interactive autonomous driving scenarios, accurately predicting the future movements of vehicle agents is crucial. However, current methods often fail to capture trajectory uncertainty, leading to limitations in trajectory ...
Enhancing long-term person re-identification using global, local body part, and head streams
This work addresses the task of long-term person re-identification. Typically, person re-identification assumes that people do not change their clothes, which limits its applications to short-term scenarios. To overcome this limitation, we ...
SOIRP: Subject-Object Interaction and Reasoning Path based joint relational triple extraction by table filling
Joint relational triple extraction methods based on table filling have gained considerable attention in recent years due to their remarkable effectiveness and ability to extract relational triples from complicated sentences. However, most of ...
EAP: An effective black-box impersonation adversarial patch attack method on face recognition in the physical world
Face recognition models and systems based on deep neural networks are vulnerable to adversarial examples. However, existing attacks on face recognition are either impractical or ineffective for black-box impersonation attacks in the physical ...