- research-article, January 2025, Just Accepted
Towards Scene-Centric Multi-Level Interest Mining for Video Recommendation
ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), Just Accepted, https://doi.org/10.1145/3712600
Knowledge-aware video recommendation requires the ability of associating external knowledge to capture high-order connectivities between users and videos. One limitation of existing methods is that they only extract user interests at a granular level of ...
- Article, December 2024
FOTV-HQS: A Fractional-Order Total Variation Model for LiDAR Super-Resolution with Deep Unfolding Network
LiDAR super-resolution can improve the quality of point cloud data, which is critical for improving many downstream tasks such as object detection, identification, and tracking. Traditional LiDAR super-resolution models often struggle with issues ...
- Article, August 2024
LDCM-MVIT: A Lightweight Depth Completion Model Based on MViT
Advanced Intelligent Computing Technology and Applications, Pages 481–489, https://doi.org/10.1007/978-981-97-5666-7_41
In the field of computer vision, many perception methods rely on depth information captured by depth cameras. However, the integrity of depth maps is hindered by the reflection and refraction of light on transparent objects. Existing methods of ...
- research-article, June 2024
TOMGPT: Reliable Text-Only Training Approach for Cost-Effective Multi-modal Large Language Model
ACM Transactions on Knowledge Discovery from Data (TKDD), Volume 18, Issue 7, Article No.: 171, Pages 1–19, https://doi.org/10.1145/3654674
Multi-modal large language models (MLLMs), such as GPT-4, exhibit great comprehension capabilities on human instruction, as well as zero-shot ability on new downstream multi-modal tasks. To integrate the different modalities within a unified embedding ...
- research-article, June 2024
When I Fall in Love: Capturing Video-Oriented Social Relationship Evolution via Attentive GNN
IEEE Transactions on Circuits and Systems for Video Technology (IEEETCSVT), Volume 34, Issue 6, Pages 5160–5175, https://doi.org/10.1109/TCSVT.2023.3337838
With the booming of streaming media platforms, viewers now get used to watching dramas and movies via online platforms with more intelligent services. Usually, character relationships may dynamically evolve with stories promoting in long videos. Therefore, ...
- research-article, May 2024
NoteLLM: A Retrievable Large Language Model for Note Recommendation
WWW '24: Companion Proceedings of the ACM Web Conference 2024, Pages 170–179, https://doi.org/10.1145/3589335.3648314
People enjoy sharing "notes" including their experiences within online communities. Therefore, recommending notes aligned with user interests has become a crucial task. Existing online methods only input notes into BERT-based models to generate note ...
- Article, October 2023
Multi-angle Prediction Based on Prompt Learning for Text Classification
Natural Language Processing and Chinese Computing, Pages 281–291, https://doi.org/10.1007/978-3-031-44699-3_25
The assessment of Chinese essays with respect to text coherence using deep learning has been relatively understudied due to the lack of large-scale, high-quality discourse coherence evaluation data resources. Existing research predominantly ...
- research-article, October 2023
A Ku-band common-leg transceiver with built-in configurable register in 130-nm CMOS technology for phased-array systems
To solve the bottleneck of amplitude and phase accuracy deterioration caused by process and unpredictable errors, this paper proposes a Ku-band common-leg transceiver with built-in configurable register to achieve 64-state amplitude ...
- research-article, October 2023
A robust and accurate centerline extraction method of multiple laser stripe for complex 3D measurement
Advanced Engineering Informatics (ADEI), Volume 58, Issue C, https://doi.org/10.1016/j.aei.2023.102207
Multiple laser stripe measurement (MLSM) is a vital technique in optical three-dimensional (3D) measurement. The accurate extraction of the laser stripe centerlines plays a decisive role in achieving high measurement accuracy. However, complex 3D ...
- research-article, August 2023
Comprehending the Gossips: Meme Explanation in Time-Sync Video Comment via Multimodal Cues
ACM Transactions on Asian and Low-Resource Language Information Processing (TALLIP), Volume 22, Issue 8, Article No.: 216, Pages 1–17, https://doi.org/10.1145/3612920
Recent years have witnessed the booming of online social media platforms with embracing the popular service called “Time-Sync Comment”, which supports the viewers to share their time-sync opinions along with video content. In this way, we observe that ...
- research-article, August 2023
Multi-Grained Multimodal Interaction Network for Entity Linking
KDD '23: Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Pages 1583–1594, https://doi.org/10.1145/3580305.3599439
Multimodal entity linking (MEL) task, which aims at resolving ambiguous mentions to a multimodal knowledge graph, has attracted wide attention in recent years. Though large efforts have been made to explore the complementary effect among multiple ...
- research-article, June 2023
SGAT: Scene Graph Attention Network for Video Recommendation
IVSP '23: Proceedings of the 2023 5th International Conference on Image, Video and Signal Processing, Pages 117–125, https://doi.org/10.1145/3591156.3591173
As a widely studied topic in recommender systems, collaborative filtering (CF) methods help users discover potential items of interest by assuming that behavioral similar users would have similar preferences on items. A recent trend is to develop models ...
- research-article, October 2022
Unified QA-aware Knowledge Graph Generation Based on Multi-modal Modeling
MM '22: Proceedings of the 30th ACM International Conference on Multimedia, Pages 7185–7189, https://doi.org/10.1145/3503161.3551604
Understanding the long duration videos' storyline is often considered a major challenge in the field of video understanding. To promote research on understanding longer videos in the community, the deep video understanding (DVU) task is suggested for ...
- research-article, October 2022
Relation-enhanced Negative Sampling for Multimodal Knowledge Graph Completion
MM '22: Proceedings of the 30th ACM International Conference on Multimedia, Pages 3857–3866, https://doi.org/10.1145/3503161.3548388
Knowledge Graph Completion (KGC), aiming to infer the missing part of Knowledge Graphs (KGs), has long been treated as a crucial task to support downstream applications of KGs, especially for the multimodal KGs (MKGs) which suffer the incomplete ...
- research-article, October 2022
A spatiotemporal multi-stream learning framework based on attention mechanism for automatic modulation recognition
Automatic modulation recognition (AMR) plays an essential role in wireless communication systems. Our paper proposes a novel multi-stream neural network (MSNN) to extract the features in parallel from the amplitude, phase, frequency, ...
- research-article, January 2022
Deep Text Matching in Medical Question Answering System
ACM ICEA '21: Proceedings of the 2021 ACM International Conference on Intelligent Computing and its Emerging Applications, Pages 134–138, https://doi.org/10.1145/3491396.3506536
The retrieval question-answering (Q&A) system based on Q&A library is a system that can retrieve the most similar question from Q&A library to get the correct answer. Classic approaches only use TF-IDF, BM25 and other algorithms to calculate the shallow ...
- research-article, October 2021
Linking the Characters: Video-oriented Social Graph Generation via Hierarchical-cumulative GCN
MM '21: Proceedings of the 29th ACM International Conference on Multimedia, Pages 4716–4724, https://doi.org/10.1145/3474085.3475684
Recent years have witnessed the booming of online video platforms. Along this line, a graph to illustrate social relation among characters has been long expected to not only benefit the audiences for better understanding the story, but also support the ...
- research-article, January 2021
Is Heuristic Sampling Necessary in Training Deep Object Detectors?
IEEE Transactions on Image Processing (TIP), Volume 30, Pages 8454–8467, https://doi.org/10.1109/TIP.2021.3106802
To train accurate deep object detectors under the extreme foreground-background imbalance, heuristic sampling methods are always necessary, which either re-sample a subset of all training samples (hard sampling methods, e.g. biased ...
- Article, October 2012
Mining hub-based protein complexes in massive biological networks
BIBMW '12: Proceedings of the 2012 IEEE International Conference on Bioinformatics and Biomedicine Workshops (BIBMW), Pages 166–173, https://doi.org/10.1109/BIBMW.2012.6470299
Advanced technologies are producing large-scale protein-protein interaction data at an ever increasing pace. Finding protein-protein interaction complexes from large PPI networks is a fundamental problem in bioinformatics. As a group of core proteins ...