Author: Ju, Chen : Search

Article

Wear-Any-Way: Manipulable Virtual Try-on via Sparse Correspondence Alignment

Computer Vision – ECCV 2024, Pages 124–142, https://doi.org/10.1007/978-3-031-72630-9_8

Abstract

This paper introduces a novel framework for virtual try-on, termed Wear-Any-Way. Different from previous methods, Wear-Any-Way is a customizable solution. Besides generating high-fidelity results, our method supports users to precisely manipulate ...

Article

Turbo: Informativity-Driven Acceleration Plug-In for Vision-Language Large Models

Computer Vision – ECCV 2024, Pages 436–455, https://doi.org/10.1007/978-3-031-72952-2_25

Abstract

Vision-Language Large Models (VLMs) recently become primary backbone of AI, due to the impressive performance. However, their expensive computation costs, i.e., throughput and delay, impede potentials in the real-world scenarios. To achieve ...

research-article

Multi-modal Prototypes for Open-World Semantic Segmentation

International Journal of Computer Vision (IJCV), Volume 132, Issue 12, Pages 6004–6020, https://doi.org/10.1007/s11263-024-02165-w

Abstract

In semantic segmentation, generalizing a visual system to both seen categories and novel categories at inference time has always been practically valuable yet challenging. To enable such functionality, existing methods mainly rely on either ...

research-article

Enhancing Cross-Domain Click-Through Rate Prediction via Explicit Feature Augmentation

WWW '24: Companion Proceedings of the ACM Web Conference 2024, Pages 423–432, https://doi.org/10.1145/3589335.3648341

Cross-domain CTR (CDCTR) prediction is an important research topic that studies how to leverage meaningful data from a related domain to help CTR prediction in target domain. Most existing CDCTR works design implicit ways to transfer knowledge across ...

research-article

AttrSeg: open-vocabulary semantic segmentation via attribute decomposition-aggregation

NIPS '23: Proceedings of the 37th International Conference on Neural Information Processing Systems, Article No.: 450, Pages 10258–10270

Open-vocabulary semantic segmentation is a challenging task that requires segmenting novel object categories at inference time. Recent works explore vision-language pre-training to handle this task, but suffer from unrealistic assumptions in practical ...

research-article

Adaptive Mutual Supervision for Weakly-Supervised Temporal Action Localization

IEEE Transactions on Multimedia (TOM), Volume 25, Pages 6688–6701, https://doi.org/10.1109/TMM.2022.3213478

Weakly-supervised temporal action localization aims to localize actions from untrimmed long videos with only video-level category labels. Most previous methods ignore the incompleteness issue of Class Activation Sequences (CAS), suffering from trivial ...

Article

Prompting Visual-Language Models for Efficient Video Understanding

Computer Vision – ECCV 2022, Pages 105–124, https://doi.org/10.1007/978-3-031-19833-5_7

Abstract

Image-based visual-language (I-VL) pre-training has shown great success for learning joint visual-textual representations from large-scale web data, revealing remarkable ability for “zero-shot” generalisation. This paper presents a simple but ...

research-article

Exploiting Transformation Invariance and Equivariance for Self-supervised Sound Localisation

MM '22: Proceedings of the 30th ACM International Conference on Multimedia, Pages 3742–3753, https://doi.org/10.1145/3503161.3548317

We present a simple yet effective self-supervised framework for audio-visual representation learning, to localize the sound source in videos. To understand what enables to learn useful representations, we systematically investigate the effects of data ...

research-article

MePark: Using Meters as Sensors for Citywide On-Street Parking Availability Prediction

IEEE Transactions on Intelligent Transportation Systems (ITS-TRANSACTIONS), Volume 23, Issue 7, Pages 7244–7257, https://doi.org/10.1109/TITS.2021.3067675

Real-time parking availability prediction is of great value to optimize the on-street parking resource utilization and improve traffic conditions, while the expensive costs of the existing parking availability sensing systems have limited their large-...

Article

The Introduction of Positive Position in “V_Wish + VP_neg” and Its Pragmatic Consequences

Chinese Lexical Semantics, Pages 283–294, https://doi.org/10.1007/978-3-031-28953-8_22

Abstract

When verbs about wish are followed by a negative clause, the positive attitude within these verbs is often suppressed. The sentence cannot express a “pure” wish, but rather express concern or curse. In interactive communication, negation always ...

Article

Unsupervised Domain Adaption via Similarity-Based Prototypes for Cross-Modality Segmentation

Domain Adaptation and Representation Transfer, and Affordable Healthcare and AI for Resource Diverse Global Health, Pages 133–143, https://doi.org/10.1007/978-3-030-87722-4_13

Abstract

Deep learning models have achieved great success on various vision challenges, but a well-trained model would face drastic performance degradation when applied to unseen data. Since the model is sensitive to domain shift, unsupervised domain ...

research-article

D2Park: Diversified Demand-aware On-street Parking Guidance

Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies (IMWUT), Volume 4, Issue 4, Article No.: 163, Pages 1–25, https://doi.org/10.1145/3432214

To address the increasingly serious parking pain, numerous mobile Apps have emerged to help drivers to find a convenient parking spot with various auxiliary information. However, the phenomenon of "multiple cars chasing the same spot" still exists, ...

Article

Bottom-Up Temporal Action Localization with Mutual Regularization

Computer Vision – ECCV 2020, Pages 539–555, https://doi.org/10.1007/978-3-030-58598-3_32

Abstract

Recently, temporal action localization (TAL), i.e., finding specific action segments in untrimmed videos, has attracted increasing attentions of the computer vision community. State-of-the-art solutions for TAL involves evaluating the frame-level ...
