RL-NBV: A deep reinforcement learning based next-best-view method for unknown object reconstruction
The Next-Best-View (NBV) algorithm is a key component in autonomous unknown object reconstruction. It iteratively determines the optimal sensor pose to capture the maximum information about the object under reconstruction. However, prevailing ...
Highlights
- We propose an innovative RL-NBV method for unknown object reconstruction.
- The proposed method leverages point cloud data as a component of the observation.
- Experiments demonstrate the effectiveness and efficiency of the proposed ...
Weight Saliency search with Semantic Constraint for Neural Machine Translation attacks
Textual adversarial attacks are an effective way to improve the robustness of Neural Machine Translation (NMT) models. Existing NMT attacks are often carried out by replacing words. However, most previous works pursue a high attack success rate ...
Highlights
- Optimize word substitution with word saliency to reduce word replacement rate.
- Constrain objective function with semantic similarity loss to ensure inconspicuous semantic changes.
- Generate higher grammar accuracy and ...
Graph node matching for edit distance
Graphs are commonly used to model interactions between elements of a set, but computing the Graph Edit Distance between two graphs is an NP-complete problem that is particularly challenging for large graphs. To address this problem, we propose a ...
Highlights
- GNOME is a deep neural architecture for the approximation of the graph edit distance.
- Graph features are enriched by Random Walk descriptors to improve expressivity.
- Embeddings are learned through GIN and matched through Linear Sum ...
Social domain integrated semantic self-discovery method for recommendation
Recommender systems make it considerably easier for users to find interesting information in vast amounts of data. However, the issue of data sparsity significantly impacts recommendation performance. Heterogeneous ...
Highlights
- A social domain integrated semantic self-discovery recommendation method is proposed.
- A multidimensional semantic knowledge mining method is designed.
- A semantic self-discovery method is proposed to find the optimal meta-paths.
VTHSC-MIR: Vision Transformer Hashing with Supervised Contrastive learning based medical image retrieval
In the past few years, deep learning based medical image analysis technologies have significantly improved computer-assisted tasks like detecting, diagnosing, and predicting medical outcomes. The monitoring and diagnosis of ailments such as cancer ...
Highlights
- Novel framework of Vision Transformer Hashing with Supervised Contrastive Learning Loss (VTHSC) model for medical image retrieval.
- Leverages the ViT model as a generic feature extractor utilizing a hashing module, and training within a ...
Reading QR Codes on challenging surfaces using thin-plate splines
- Ismael Benito-Altamirano,
- David Martínez-Carpena,
- Hanna Lizarzaburu-Aguilar,
- Cristian Fàbrega,
- Joan Daniel Prades
In real-world use, QR Codes are printed or overlaid on top of complex surfaces, such as cylindrical bottles or other everyday objects with random topographies, which pose big challenges to their readout with the conventional planar algorithms proposed ...
Highlights
- QR Codes on top of everyday objects with complex surfaces are challenging to read.
- The visual features of standard QR Codes can serve as landmarks to predict the underlying surface.
- Basic transformations (like affine, projective or ...
Neighbors selective Graph Convolutional Network for homophily and heterophily
Graph Convolutional Networks (GCNs) have achieved remarkable success in graph-related tasks under the homophily assumption, i.e., that most connected nodes have the same label. However, this assumption is fragile since heterophily is common in real-world networks, ...
Highlights
- We develop a Neighborhood Distribution-induced Similarity measure.
- We design a Selective-Neighbors Gated Unit.
- We propose a novel Neighbors Selective Graph Convolutional Network.
- Better or comparable performance on ten datasets ...
Query-guided generalizable medical image segmentation
The practical implementation of deep neural networks in clinical settings faces hurdles due to variations in data distribution across different centers. While the incorporation of query-guided Transformer has improved performance across diverse ...
Highlights
- Introducing a plug-and-play module for adapting to varying distribution shifts.
- Segmenting directly on updated queries rather than parametric classification.
- Incorporating an auxiliary task to improve model convergence and ...
MFAE: Multimodal Fusion and Alignment for Entity-level Disinformation Detection
Nowadays, the dissemination of disinformation on social media has evolved from a purely textual form to multiple modalities consisting of both text and images. This further amplifies the misleading and deceptive nature of disinformation. ...
Highlights
- We propose a new disinformation detection model MFAE.
- We propose an improved dynamic routing algorithm that surpasses the attention mechanism.
- We use a graph neural network to further align features.
- The MFAE model achieves the ...
Transforming gradient-based techniques into interpretable methods
The explication of Convolutional Neural Networks (CNN) through xAI techniques often poses challenges in interpretation. The inherent complexity of input features, notably pixels extracted from images, engenders complex correlations. Gradient-...
Highlights
- We propose GAD method to simplify explanations, providing easy-to-interpret maps.
- Our gradient-based method focuses on revealing feature importance in CNNs.
- GAD minimizes noise in explanations compared to usual gradient-based ...
Structural self-similarity pattern in global food prices: Utilizing a segmented multifractal detrended fluctuation analysis
This paper provides a comprehensive analysis of the structural self-similarity observed in global food prices, focusing specifically on key commodities such as olive oil, eggs, bread, chicken, and beef. Employing Segmented Multifractal Detrended ...
Highlights
- The structural self-similarity in global food prices is studied.
- We use a segmented multifractal detrended fluctuation analysis (SMF-DFA) for this aim.
- SMF-DFA provides tools for piecewise multifractal analysis.
- Levene’s test, ...
Edge-preserving image restoration based on a weighted anisotropic diffusion model
Partial differential equation-based methods have been widely applied in image restoration. The anisotropic diffusion model has a good noise removal capability without affecting significant edges. However, existing anisotropic diffusion-based ...
Highlights
- We find the weighted anisotropic diffusion coefficient function with high convergence speed.
- The adaptive threshold parameter helps keep more details in restored images.
- Multi-scale feature map fusing can reduce staircase artifacts ...
Self-supervised assisted multi-task learning network for one-shot defect segmentation with fake defect generation
Object surface defect segmentation is extremely crucial for automatic quality inspection in the field of industrial production. However, following deficiencies of most current defect semantic algorithms such as over-reliance on large-scale ...
Highlights
- Multi-task learning for one-shot segmentation.
- High-level attention for the defect details.
- A method that addresses the domain gap.
Discovering the signal subgraph: An iterative screening approach on graphs
Supervised learning on graphs is a challenging task due to the high dimensionality and inherent structural dependencies in the data, where each edge depends on a pair of vertices. Existing conventional methods are designed for standard Euclidean ...
Highlights
- An iterative feature screening method for identifying signal vertices in graphs.
- Theoretical guarantee for high-probability recovery of ground-truth vertices.
- The signal subgraph is Bayes optimal under the Erdős-Rényi graph model.
Global-local graph neural networks for node-classification
The task of graph node classification is often approached by utilizing a local Graph Neural Network (GNN) that learns only local information from the node input features and their adjacency. In this paper, we propose to improve the performance ...
Highlights
- We propose to learn label features to capture global information of the input graph.
- We fuse label and node features to predict a node-classification map.
- We qualitatively demonstrate our method by illustrating the learnt label and ...
Zigzag persistence for image processing: New software and applications
Topological image analysis is a powerful tool for understanding the structure and topology of images, with persistent homology being one of its most popular methods. However, persistent homology requires a chain of inclusions of topological spaces, ...
Highlights
- Algorithm to build a simplicial complex associated to a binary digital image.
- Algorithm computing the homology classes of a list of images via zigzag persistence.
- Easy-to-use software (with GUI) to compute the previous algorithm ...
Latent spectral regularization for continual learning
- Emanuele Frascaroli,
- Riccardo Benaglia,
- Matteo Boschini,
- Luca Moschella,
- Cosimo Fiorini,
- Emanuele Rodolà,
- Simone Calderara
While biological intelligence grows organically as new knowledge is gathered throughout life, Artificial Neural Networks forget catastrophically whenever they face a changing training data distribution. Rehearsal-based Continual Learning (CL) ...
Highlights
- We study the geometry of a model’s latent space in a Continual Learning setting.
- We propose Continual Spectral Regularizer, a geometrically motivated regularizer.
- We combine CaSpeR with SOTA rehearsal-based CL approaches in ...
HARWE: A multi-modal large-scale dataset for context-aware human activity recognition in smart working environments
- Alireza Esmaeilzehi,
- Ensieh Khazaei,
- Kai Wang,
- Navjot Kaur Kalsi,
- Pai Chet Ng,
- Huan Liu,
- Yuanhao Yu,
- Dimitrios Hatzinakos,
- Konstantinos Plataniotis
In recent years, deep neural networks (DNNs) have delivered high performance on various tasks, such as human activity recognition (HAR), in view of their end-to-end training process between the input data and output labels. However, the ...
Highlights
- We propose a novel dataset for the task of human activity recognition.
- Our human activity recognition dataset is designed for smart workplaces.
- The proposed human activity recognition dataset is multi-modal and large-scale.
- ...
Self-supervised learning with automatic data augmentation for enhancing representation
Self-supervised learning has become an increasingly popular method for learning effective representations from unlabeled data. One prominent approach in self-supervised learning is contrastive learning, which trains models to distinguish between ...
Highlights
- Optimal augmentation for robust, discriminative representations in contrastive learning.
- Diverse transformations for adaptable augmentation strategies across datasets.
- Bayesian optimization to find effective augmentation policies ...
Learning to learn point signature for 3D shape geometry
Point signature is a representation that describes the structural geometry of a point within a neighborhood in 3D shapes. Conventional approaches apply a weight-sharing network, e.g., Graph Neural Network (GNN), to all neighborhoods of all points ...
Highlights
- A meta-learning-based 3D point signature generation for 3D shape geometry learning.
- A theoretical proof justifying the necessity of the meta-learning process.
- A bi-level optimization framework to instantiate the 3D meta point ...
HDRfeat: A feature-rich network for high dynamic range image reconstruction
A major challenge for high dynamic range (HDR) image reconstruction from multi-exposed low dynamic range (LDR) images, especially with dynamic scenes, is the extraction and merging of relevant contextual features in order to suppress any ghosting ...
Highlights
- A novel, feature-rich extraction network for HDR image reconstruction from multi-exposure images.
- Hierarchical feature extraction, channel expansion, and bottleneck merging for summary features in reconstruction.
- Residual attention ...
Neural network based cognitive approaches from face perception with human performance benchmark
Artificial neural network models are able to achieve great performance at numerous computationally challenging tasks like face recognition. It is of significant importance to explore the difference between neural network models and human brains ...
Highlights
- Neural networks are able to provide cognitive approaches from face perception.
- Human perception is used as a benchmark to evaluate the performance of neural networks.
- Neural networks utilizing eye regions are confirmed to be better ...
Beyond supervision: An unsupervised spatio-temporal point cloud noise modeling for event vision sensor
Noise modeling is a fundamental unexplored problem in many event camera applications. The noise modeling methods of conventional cameras are inadequate for capturing the intricate noise patterns unique to event cameras. To address this gap, we ...
Highlights
- Noise modeling algorithm in the event camera domain.
- Applications include noise simulation, denoising, enhancement, etc.
- Asynchronous and sparse statistical processing.
- Camera aware noise modeling.
- Adapts unsupervised ...
End-to-end latent fingerprint enhancement using multi-scale Generative Adversarial Network
Latent fingerprint enhancement is paramount as it dramatically influences matching accuracy. This process is often challenging due to varying structured noise and background patterns. The prints may be of arbitrary sizes and scales with a high ...
Highlights
- Handles fingerprints of multiple scales without the need for any pre-processing.
- The network learns the fingerprint features through the generation of auxiliary maps.
- Robust against various noise patterns due to well representative ...
Ensemble clustering via synchronized relabelling
Ensemble clustering is an important problem in unsupervised learning that aims at aggregating multiple noisy partitions into a unique clustering solution. It can be formulated in terms of relabelling and voting, where relabelling refers to the ...
Highlights
- Novel relabelling method for Ensemble Clustering based on permutation synchronization.
- Flexible formulation that can manage partitions with different numbers of clusters.
- Compares favourably against previous Ensemble Clustering ...
FDM: Document image seen-through removal via Fuzzy Diffusion Models
While scanning or shooting a document, factors like ink density and paper transparency may cause the content from the reverse side to become visible through the paper, resulting in a digital image with a ‘seen-through’ phenomenon, which will ...
Student State-aware knowledge tracing based on attention mechanism: A cognitive theory view
Knowledge tracing evaluates students’ knowledge state and predicts future performance by analyzing their past interactions. Recent research integrates features of learning activities into knowledge tracing to enhance interpretability. Ausubel’s ...
Highlights
- A novel knowledge tracing model, SSKT, is proposed to enhance the ability to predict student performance.
- Construct students’ learning process from the perspective of educational cognitive theory.
- The visualization experiment is ...
An efficient ensemble explainable AI (XAI) approach for morphed face detection
Numerous deep convolutional neural architectures have been proposed in the literature for face Morphing Attack Detection (MAD) to prevent such attacks and lessen the risks associated with them. Although deep learning models have achieved optimal results ...
Highlights
- The ensemble of CAM, Grad-CAM and Saliency Map provides far more visual interpretability than the individual techniques.
- Heatmaps obtained after image pre-processing are eventually used for both classification and explainability.
Patch-wise vector quantization for unsupervised medical anomaly detection
Radiography images inherently possess globally consistent structures while exhibiting significant diversity in local anatomical regions, making it challenging to model their normal features through unsupervised anomaly detection. Since ...
Highlights
- Diverse radiographic imaging conditions challenge unsupervised pathology detection.
- Learning normal features is essential under complex training data distribution.
- We improve the existing vector quantization to learn normal ...
Conditional Information Gain Trellis
Conditional computing processes an input using only part of the neural network’s computational units. Learning to execute parts of a deep convolutional network by routing individual samples has several advantages: This can facilitate the ...
Highlights
- We introduce Conditional Information Gain Trellis (CIGT) for conditional computing.
- We derive the CIGT loss function based on classification and information gain losses.
- CIGT performs better or comparably using a fraction of the ...
Self-supervised scheme for generalizing GAN image detection
Although recent advances in generative models bring diverse advantages to society, they can also be abused for malicious purposes, such as fraud, defamation, and fake news. To prevent such cases, vigorous research is being conducted to ...
Highlights
- A novel framework to train a GAN detector in the self-supervision scheme.
- New architecture employing multiple autoencoders to reproduce the fingerprints of GANs.
- Outstanding robustness to unknown GANs compared to the supervised GAN ...
DSR-Diff: Depth map super-resolution with diffusion model
Color-guided depth map super-resolution (CDSR) improves the spatial resolution of a low-quality depth map with the corresponding high-quality color map, benefiting various applications such as 3D reconstruction, virtual reality, and augmented ...
Highlights
- We propose the first diffusion model-based framework for depth map super-resolution.
- The diffusion model in our DSR-Diff is computationally efficient and highly flexible in use.
- DSR-Diff alleviates the impact of ...
Deep motion estimation through adversarial learning for gait recognition
Gait recognition is a form of identity verification that can be performed over long distances without requiring the subject’s cooperation, making it particularly valuable for applications such as access control, surveillance, and criminal ...
Highlights
- A novel GAN-based learning approach for gait motion extraction.
- A W-Net for enhanced gait silhouette extraction.
- A new dataset containing 40 subjects for gait recognition.