Showing 1–48 of 48 results for author: Zhuo, J

Searching in archive cs.
  1. arXiv:2408.10556  [pdf, other]

    cs.AI cs.LG

    Hokoff: Real Game Dataset from Honor of Kings and its Offline Reinforcement Learning Benchmarks

    Authors: Yun Qu, Boyuan Wang, Jianzhun Shao, Yuhang Jiang, Chen Chen, Zhenbin Ye, Lin Liu, Junfeng Yang, Lin Lai, Hongyang Qin, Minwen Deng, Juchao Zhuo, Deheng Ye, Qiang Fu, Wei Yang, Guang Yang, Lanxiao Huang, Xiangyang Ji

    Abstract: The advancement of Offline Reinforcement Learning (RL) and Offline Multi-Agent Reinforcement Learning (MARL) critically depends on the availability of high-quality, pre-collected offline datasets that represent real-world complexities and practical applications. However, existing datasets often fall short in their simplicity and lack of realism. To address this gap, we propose Hokoff, a comprehens…

    Submitted 20 August, 2024; originally announced August 2024.

  2. arXiv:2407.21488  [pdf, other]

    cs.IR cs.AI

    Breaking the Hourglass Phenomenon of Residual Quantization: Enhancing the Upper Bound of Generative Retrieval

    Authors: Zhirui Kuai, Zuxu Chen, Huimu Wang, Mingming Li, Dadong Miao, Binbin Wang, Xusong Chen, Li Kuang, Yuxing Han, Jiaxing Wang, Guoyu Tang, Lin Liu, Songlin Wang, Jingwei Zhuo

    Abstract: Generative retrieval (GR) has emerged as a transformative paradigm in search and recommender systems, leveraging numeric-based identifier representations to enhance efficiency and generalization. Notably, methods like TIGER employing Residual Quantization-based Semantic Identifiers (RQ-SID), have shown significant promise in e-commerce scenarios by effectively managing item IDs. However, a critica…

    Submitted 31 July, 2024; originally announced July 2024.

  3. arXiv:2407.19829  [pdf, other]

    cs.IR cs.AI

    Generative Retrieval with Preference Optimization for E-commerce Search

    Authors: Mingming Li, Huimu Wang, Zuxu Chen, Guangtao Nie, Yiming Qiu, Binbin Wang, Guoyu Tang, Lin Liu, Jingwei Zhuo

    Abstract: Generative retrieval introduces a groundbreaking paradigm to document retrieval by directly generating the identifier of a pertinent document in response to a specific query. This paradigm has demonstrated considerable benefits and potential, particularly in representation and generalization capabilities, within the context of large language models. However, it faces significant challenges in E-co…

    Submitted 29 July, 2024; originally announced July 2024.

  4. arXiv:2406.11546  [pdf, other]

    eess.AS cs.CL cs.SD

    GigaSpeech 2: An Evolving, Large-Scale and Multi-domain ASR Corpus for Low-Resource Languages with Automated Crawling, Transcription and Refinement

    Authors: Yifan Yang, Zheshu Song, Jianheng Zhuo, Mingyu Cui, Jinpeng Li, Bo Yang, Yexing Du, Ziyang Ma, Xunying Liu, Ziyuan Wang, Ke Li, Shuai Fan, Kai Yu, Wei-Qiang Zhang, Guoguo Chen, Xie Chen

    Abstract: The evolution of speech technology has been spurred by the rapid increase in dataset sizes. Traditional speech models generally depend on a large amount of labeled training data, which is scarce for low-resource languages. This paper presents GigaSpeech 2, a large-scale, multi-domain, multilingual speech recognition corpus. It is designed for low-resource languages and does not rely on paired spee…

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: Under review

  5. arXiv:2406.06619  [pdf, other]

    eess.AS cs.AI cs.CL

    LoRA-Whisper: Parameter-Efficient and Extensible Multilingual ASR

    Authors: Zheshu Song, Jianheng Zhuo, Yifan Yang, Ziyang Ma, Shixiong Zhang, Xie Chen

    Abstract: Recent years have witnessed significant progress in multilingual automatic speech recognition (ASR), driven by the emergence of end-to-end (E2E) models and the scaling of multilingual datasets. Despite that, two main challenges persist in multilingual ASR: language interference and the incorporation of new languages without degrading the performance of the existing ones. This paper proposes LoRA-W…

    Submitted 7 June, 2024; originally announced June 2024.

    Comments: 5 pages, 2 figures, conference

  6. arXiv:2403.17297  [pdf, other]

    cs.CL cs.AI

    InternLM2 Technical Report

    Authors: Zheng Cai, Maosong Cao, Haojiong Chen, Kai Chen, Keyu Chen, Xin Chen, Xun Chen, Zehui Chen, Zhi Chen, Pei Chu, Xiaoyi Dong, Haodong Duan, Qi Fan, Zhaoye Fei, Yang Gao, Jiaye Ge, Chenya Gu, Yuzhe Gu, Tao Gui, Aijia Guo, Qipeng Guo, Conghui He, Yingfan Hu, Ting Huang, Tao Jiang , et al. (75 additional authors not shown)

    Abstract: The evolution of Large Language Models (LLMs) like ChatGPT and GPT-4 has sparked discussions on the advent of Artificial General Intelligence (AGI). However, replicating such advancements in open-source models has been challenging. This paper introduces InternLM2, an open-source LLM that outperforms its predecessors in comprehensive evaluations across 6 dimensions and 30 benchmarks, long-context m…

    Submitted 25 March, 2024; originally announced March 2024.

  7. arXiv:2403.12883  [pdf, other]

    cs.CV

    Confusing Pair Correction Based on Category Prototype for Domain Adaptation under Noisy Environments

    Authors: Churan Zhi, Junbao Zhuo, Shuhui Wang

    Abstract: In this paper, we address unsupervised domain adaptation under noisy environments, which is more challenging and practical than traditional domain adaptation. In this scenario, the model is prone to overfitting noisy labels, resulting in a more pronounced domain shift and a notable decline in the overall model performance. Previous methods employed prototype methods for domain adaptation on robust…

    Submitted 19 March, 2024; originally announced March 2024.

    Comments: AAAI 2024

  8. arXiv:2402.06984  [pdf, other]

    cs.SD cs.CV cs.MM eess.AS eess.IV

    Speech motion anomaly detection via cross-modal translation of 4D motion fields from tagged MRI

    Authors: Xiaofeng Liu, Fangxu Xing, Jiachen Zhuo, Maureen Stone, Jerry L. Prince, Georges El Fakhri, Jonghye Woo

    Abstract: Understanding the relationship between tongue motion patterns during speech and their resulting speech acoustic outcomes -- i.e., articulatory-acoustic relation -- is of great importance in assessing speech quality and developing innovative treatment and rehabilitative strategies. This is especially important when evaluating and detecting abnormal articulatory features in patients with speech-rela…

    Submitted 10 February, 2024; originally announced February 2024.

    Comments: SPIE Medical Imaging 2024: Image Processing

  9. arXiv:2401.17571  [pdf, other]

    eess.IV cs.CV

    Is Registering Raw Tagged-MR Enough for Strain Estimation in the Era of Deep Learning?

    Authors: Zhangxing Bian, Ahmed Alshareef, Shuwen Wei, Junyu Chen, Yuli Wang, Jonghye Woo, Dzung L. Pham, Jiachen Zhuo, Aaron Carass, Jerry L. Prince

    Abstract: Magnetic Resonance Imaging with tagging (tMRI) has long been utilized for quantifying tissue motion and strain during deformation. However, a phenomenon known as tag fading, a gradual decrease in tag visibility over time, often complicates post-processing. The first contribution of this study is to model tag fading by considering the interplay between $T_1$ relaxation and the repeated application…

    Submitted 30 January, 2024; originally announced January 2024.

    Comments: Accepted to SPIE Medical Imaging 2024 (oral)

  10. arXiv:2312.14033  [pdf, other]

    cs.CL

    T-Eval: Evaluating the Tool Utilization Capability of Large Language Models Step by Step

    Authors: Zehui Chen, Weihua Du, Wenwei Zhang, Kuikun Liu, Jiangning Liu, Miao Zheng, Jingming Zhuo, Songyang Zhang, Dahua Lin, Kai Chen, Feng Zhao

    Abstract: Large language models (LLM) have achieved remarkable performance on various NLP tasks and are augmented by tools for broader applications. Yet, how to evaluate and analyze the tool-utilization capability of LLMs is still under-explored. In contrast to previous works that evaluate models holistically, we comprehensively decompose the tool utilization into multiple sub-processes, including instructi…

    Submitted 14 January, 2024; v1 submitted 21 December, 2023; originally announced December 2023.

    Comments: Project: https://open-compass.github.io/T-Eval

  11. arXiv:2311.18259  [pdf, other]

    cs.CV cs.AI

    Ego-Exo4D: Understanding Skilled Human Activity from First- and Third-Person Perspectives

    Authors: Kristen Grauman, Andrew Westbury, Lorenzo Torresani, Kris Kitani, Jitendra Malik, Triantafyllos Afouras, Kumar Ashutosh, Vijay Baiyya, Siddhant Bansal, Bikram Boote, Eugene Byrne, Zach Chavis, Joya Chen, Feng Cheng, Fu-Jen Chu, Sean Crane, Avijit Dasgupta, Jing Dong, Maria Escobar, Cristhian Forigua, Abrham Gebreselasie, Sanjay Haresh, Jing Huang, Md Mohaiminul Islam, Suyog Jain , et al. (76 additional authors not shown)

    Abstract: We present Ego-Exo4D, a diverse, large-scale multimodal multiview video dataset and benchmark challenge. Ego-Exo4D centers around simultaneously-captured egocentric and exocentric video of skilled human activities (e.g., sports, music, dance, bike repair). 740 participants from 13 cities worldwide performed these activities in 123 different natural scene contexts, yielding long-form captures from…

    Submitted 29 April, 2024; v1 submitted 30 November, 2023; originally announced November 2023.

    Comments: updated baseline results and dataset statistics to match the released v2 data; added table to appendix comparing stats of Ego-Exo4D alongside other datasets

  12. arXiv:2309.14586  [pdf, other]

    cs.SD cs.AI cs.CV eess.AS eess.SP

    Speech Audio Synthesis from Tagged MRI and Non-Negative Matrix Factorization via Plastic Transformer

    Authors: Xiaofeng Liu, Fangxu Xing, Maureen Stone, Jiachen Zhuo, Sidney Fels, Jerry L. Prince, Georges El Fakhri, Jonghye Woo

    Abstract: The tongue's intricate 3D structure, comprising localized functional units, plays a crucial role in the production of speech. When measured using tagged MRI, these functional units exhibit cohesive displacements and derived quantities that facilitate the complex process of speech production. Non-negative matrix factorization-based approaches have been shown to estimate the functional units through…

    Submitted 25 September, 2023; originally announced September 2023.

    Comments: MICCAI 2023 (Oral presentation)

  13. Orthogonal Temporal Interpolation for Zero-Shot Video Recognition

    Authors: Yan Zhu, Junbao Zhuo, Bin Ma, Jiajia Geng, Xiaoming Wei, Xiaolin Wei, Shuhui Wang

    Abstract: Zero-shot video recognition (ZSVR) is a task that aims to recognize video categories that have not been seen during the model training process. Recently, vision-language models (VLMs) pre-trained on large-scale image-text pairs have demonstrated impressive transferability for ZSVR. To make VLMs applicable to the video domain, existing methods often use an additional temporal learning module after…

    Submitted 13 August, 2023; originally announced August 2023.

    Journal ref: Proceedings of the 31st ACM International Conference on Multimedia (MM '23), October 29-November 3, 2023

  14. arXiv:2308.02949  [pdf, other]

    eess.IV cs.CV physics.med-ph

    MomentaMorph: Unsupervised Spatial-Temporal Registration with Momenta, Shooting, and Correction

    Authors: Zhangxing Bian, Shuwen Wei, Yihao Liu, Junyu Chen, Jiachen Zhuo, Fangxu Xing, Jonghye Woo, Aaron Carass, Jerry L. Prince

    Abstract: Tagged magnetic resonance imaging (tMRI) has been employed for decades to measure the motion of tissue undergoing deformation. However, registration-based motion estimation from tMRI is difficult due to the periodic patterns in these images, particularly when the motion is large. With a larger motion the registration approach gets trapped in a local optima, leading to motion estimation errors. We…

    Submitted 5 August, 2023; originally announced August 2023.

    Comments: Accepted by MICCAI Workshop 2023: Time-Series Data Analytics and Learning (MTSAIL)

  15. arXiv:2305.14589  [pdf, other]

    eess.IV cs.CV cs.LG physics.med-ph

    Attentive Continuous Generative Self-training for Unsupervised Domain Adaptive Medical Image Translation

    Authors: Xiaofeng Liu, Jerry L. Prince, Fangxu Xing, Jiachen Zhuo, Reese Timothy, Maureen Stone, Georges El Fakhri, Jonghye Woo

    Abstract: Self-training is an important class of unsupervised domain adaptation (UDA) approaches that are used to mitigate the problem of domain shift, when applying knowledge learned from a labeled source domain to unlabeled and heterogeneous target domains. While self-training-based UDA has shown considerable promise on discriminative tasks, including classification and segmentation, through reliable pseu…

    Submitted 23 May, 2023; originally announced May 2023.

    Comments: Accepted to Medical Image Analysis

  16. Learning Multi-Stage Multi-Grained Semantic Embeddings for E-Commerce Search

    Authors: Binbin Wang, Mingming Li, Zhixiong Zeng, Jingwei Zhuo, Songlin Wang, Sulong Xu, Bo Long, Weipeng Yan

    Abstract: Retrieving relevant items that match users' queries from billion-scale corpus forms the core of industrial e-commerce search systems, in which embedding-based retrieval (EBR) methods are prevailing. These methods adopt a two-tower framework to learn embedding vectors for query and item separately and thus leverage efficient approximate nearest neighbor (ANN) search to retrieve relevant items. Howe…

    Submitted 20 March, 2023; originally announced March 2023.

  17. arXiv:2301.07234  [pdf, other]

    eess.IV cs.CV

    DRIMET: Deep Registration for 3D Incompressible Motion Estimation in Tagged-MRI with Application to the Tongue

    Authors: Zhangxing Bian, Fangxu Xing, Jinglun Yu, Muhan Shao, Yihao Liu, Aaron Carass, Jiachen Zhuo, Jonghye Woo, Jerry L. Prince

    Abstract: Tagged magnetic resonance imaging (MRI) has been used for decades to observe and quantify the detailed motion of deforming tissue. However, this technique faces several challenges such as tag fading, large motion, long computation times, and difficulties in obtaining diffeomorphic incompressible flow fields. To address these issues, this paper presents a novel unsupervised phase-based 3D motion es…

    Submitted 30 April, 2023; v1 submitted 17 January, 2023; originally announced January 2023.

    Comments: Accepted to MIDL 2023 (oral)

  18. arXiv:2301.06114  [pdf, other]

    eess.IV cs.LG

    Segmenting thalamic nuclei from manifold projections of multi-contrast MRI

    Authors: Chang Yan, Muhan Shao, Zhangxing Bian, Anqi Feng, Yuan Xue, Jiachen Zhuo, Rao P. Gullapalli, Aaron Carass, Jerry L. Prince

    Abstract: The thalamus is a subcortical gray matter structure that plays a key role in relaying sensory and motor signals within the brain. Its nuclei can atrophy or otherwise be affected by neurological disease and injuries including mild traumatic brain injury. Segmenting both the thalamus and its nuclei is challenging because of the relatively low contrast within and around the thalamus in conventional m…

    Submitted 31 January, 2023; v1 submitted 15 January, 2023; originally announced January 2023.

    Comments: 8 pages, 3 figures, 2023 SPIE-MI Image Processing

  19. arXiv:2209.02970  [pdf, other]

    cs.CL

    Fengshenbang 1.0: Being the Foundation of Chinese Cognitive Intelligence

    Authors: Jiaxing Zhang, Ruyi Gan, Junjie Wang, Yuxiang Zhang, Lin Zhang, Ping Yang, Xinyu Gao, Ziwei Wu, Xiaoqun Dong, Junqing He, Jianheng Zhuo, Qi Yang, Yongfeng Huang, Xiayu Li, Yanghan Wu, Junyu Lu, Xinyu Zhu, Weifeng Chen, Ting Han, Kunhao Pan, Rui Wang, Hao Wang, Xiaojun Wu, Zhongshen Zeng, Chongpei Chen

    Abstract: Nowadays, foundation models become one of fundamental infrastructures in artificial intelligence, paving ways to the general intelligence. However, the reality presents two urgent challenges: existing foundation models are dominated by the English-language community; users are often given limited resources and thus cannot always use foundation models. To support the development of the Chinese-lang…

    Submitted 30 March, 2023; v1 submitted 7 September, 2022; originally announced September 2022.

    Comments: Added the Chinese version and is now a bilingual paper

  20. arXiv:2208.06150  [pdf, other]

    cs.IR

    Pre-training Tasks for User Intent Detection and Embedding Retrieval in E-commerce Search

    Authors: Yiming Qiu, Chenyu Zhao, Han Zhang, Jingwei Zhuo, Tianhao Li, Xiaowei Zhang, Songlin Wang, Sulong Xu, Bo Long, Wen-Yun Yang

    Abstract: BERT-style models pre-trained on the general corpus (e.g., Wikipedia) and fine-tuned on specific task corpus, have recently emerged as breakthrough techniques in many NLP tasks: question answering, text classification, sequence labeling and so on. However, this technique may not always work, especially for two scenarios: a corpus that contains very different text from the general corpus Wikipedia,…

    Submitted 22 August, 2022; v1 submitted 12 August, 2022; originally announced August 2022.

    Comments: 5 pages, 3 figures; accepted by CIKM2022

    ACM Class: H.3.3

  21. arXiv:2206.02284  [pdf, other]

    cs.SD cs.CV cs.MM eess.AS

    Tagged-MRI Sequence to Audio Synthesis via Self Residual Attention Guided Heterogeneous Translator

    Authors: Xiaofeng Liu, Fangxu Xing, Jerry L. Prince, Jiachen Zhuo, Maureen Stone, Georges El Fakhri, Jonghye Woo

    Abstract: Understanding the underlying relationship between tongue and oropharyngeal muscle deformation seen in tagged-MRI and intelligible speech plays an important role in advancing speech motor control theories and treatment of speech related-disorders. Because of their heterogeneous representations, however, direct mapping between the two modalities -- i.e., two-dimensional (mid-sagittal slice) plus tim…

    Submitted 25 September, 2022; v1 submitted 5 June, 2022; originally announced June 2022.

    Comments: MICCAI 2022 (early accept, Oral Presentation ~3%)

  22. arXiv:2202.13616  [pdf, other]

    cs.IR cs.LG

    WSLRec: Weakly Supervised Learning for Neural Sequential Recommendation Models

    Authors: Jingwei Zhuo, Bin Liu, Xiang Li, Han Zhu, Xiaoqiang Zhu

    Abstract: Learning the user-item relevance hidden in implicit feedback data plays an important role in modern recommender systems. Neural sequential recommendation models, which formulates learning the user-item relevance as a sequential classification problem to distinguish items in future behaviors from others based on the user's historical behaviors, have attracted a lot of interest in both industry and…

    Submitted 28 February, 2022; originally announced February 2022.

    Comments: 9 pages

  23. arXiv:2202.04219  [pdf, other]

    stat.ML cs.LG math.ST

    Improving Computational Complexity in Statistical Models with Second-Order Information

    Authors: Tongzheng Ren, Jiacheng Zhuo, Sujay Sanghavi, Nhat Ho

    Abstract: It is known that when the statistical models are singular, i.e., the Fisher information matrix at the true parameter is degenerate, the fixed step-size gradient descent algorithm takes polynomial number of steps in terms of the sample size $n$ to converge to a final statistical radius around the true parameter, which can be unsatisfactory for the application. To further improve that computational…

    Submitted 13 April, 2022; v1 submitted 8 February, 2022; originally announced February 2022.

    Comments: 27 pages, 2 figures. Fixing a bug in the proof of Lemma 7

  24. Learning Explicit User Interest Boundary for Recommendation

    Authors: Jianhuan Zhuo, Qiannan Zhu, Yinliang Yue, Yuhong Zhao

    Abstract: The core objective of modelling recommender systems from implicit feedback is to maximize the positive sample score $s_p$ and minimize the negative sample score $s_n$, which can usually be summarized into two paradigms: the pointwise and the pairwise. The pointwise approaches fit each sample with its label individually, which is flexible in weighting and sampling on instance-level but ignores the…

    Submitted 22 November, 2021; originally announced November 2021.

    Comments: 12 pages, 9 figures, 5 tables

  25. arXiv:2107.06154  [pdf, other]

    cs.CV

    Fast Batch Nuclear-norm Maximization and Minimization for Robust Domain Adaptation

    Authors: Shuhao Cui, Shuhui Wang, Junbao Zhuo, Liang Li, Qingming Huang, Qi Tian

    Abstract: Due to the domain discrepancy in visual domain adaptation, the performance of source model degrades when bumping into the high data density near decision boundary in target domain. A common solution is to minimize the Shannon Entropy to push the decision boundary away from the high density area. However, entropy minimization also leads to severe reduction of prediction diversity, and unfortunately…

    Submitted 4 August, 2021; v1 submitted 13 July, 2021; originally announced July 2021.

    Comments: TPAMI under review. arXiv admin note: text overlap with arXiv:2003.12237

  26. arXiv:2107.03008  [pdf, other]

    cs.CV

    Learning Invariant Representation with Consistency and Diversity for Semi-supervised Source Hypothesis Transfer

    Authors: Xiaodong Wang, Junbao Zhuo, Shuhao Cui, Shuhui Wang

    Abstract: Semi-supervised domain adaptation (SSDA) aims to solve tasks in target domain by utilizing transferable information learned from the available source domain and a few labeled target data. However, source data is not always accessible in practical scenarios, which restricts the application of SSDA in real world circumstances. In this paper, we propose a novel task named Semi-supervised Source Hypot…

    Submitted 19 July, 2021; v1 submitted 7 July, 2021; originally announced July 2021.

    Comments: 10 pages, 4 figures

  27. arXiv:2106.12499  [pdf, other]

    cs.CV cs.AI cs.LG cs.NE

    Generative Self-training for Cross-domain Unsupervised Tagged-to-Cine MRI Synthesis

    Authors: Xiaofeng Liu, Fangxu Xing, Maureen Stone, Jiachen Zhuo, Reese Timothy, Jerry L. Prince, Georges El Fakhri, Jonghye Woo

    Abstract: Self-training based unsupervised domain adaptation (UDA) has shown great potential to address the problem of domain shift, when applying a trained deep learning model in a source domain to unlabeled target domains. However, while the self-training UDA has demonstrated its effectiveness on discriminative tasks, such as classification and segmentation, via the reliable pseudo-label selection based o…

    Submitted 23 June, 2021; originally announced June 2021.

    Comments: MICCAI 2021 (early accept <13%)

  28. arXiv:2102.05231  [pdf, other]

    cs.CV

    Culture-inspired Multi-modal Color Palette Generation and Colorization: A Chinese Youth Subculture Case

    Authors: Yufan Li, Jinggang Zhuo, Ling Fan, Harry Jiannan Wang

    Abstract: Color is an essential component of graphic design, acting not only as a visual factor but also carrying cultural implications. However, existing research on algorithmic color palette generation and colorization largely ignores the cultural aspect. In this paper, we contribute to this line of research by first constructing a unique color dataset inspired by a specific culture, i.e., Chinese Youth S…

    Submitted 9 February, 2021; originally announced February 2021.

    Comments: accepted by the 3rd IEEE Workshop on Artificial Intelligence for Art Creation

  29. arXiv:2102.02756  [pdf, other]

    cs.LG stat.ML

    On the computational and statistical complexity of over-parameterized matrix sensing

    Authors: Jiacheng Zhuo, Jeongyeol Kwon, Nhat Ho, Constantine Caramanis

    Abstract: We consider solving the low rank matrix sensing problem with Factorized Gradient Descend (FGD) method when the true rank is unknown and over-specified, which we refer to as over-parameterized matrix sensing. If the ground truth signal $\mathbf{X}^* \in \mathbb{R}^{d*d}$ is of rank $r$, but we try to recover it using $\mathbf{F} \mathbf{F}^\top$ where $\mathbf{F} \in \mathbb{R}^{d*k}$ and $k>r$, th…

    Submitted 26 January, 2021; originally announced February 2021.

  30. arXiv:2012.00744  [pdf, other]

    cs.CV

    A Framework and Dataset for Abstract Art Generation via CalligraphyGAN

    Authors: Jinggang Zhuo, Ling Fan, Harry Jiannan Wang

    Abstract: With the advancement of deep learning, artificial intelligence (AI) has made many breakthroughs in recent years and achieved superhuman performance in various tasks such as object detection, reading comprehension, and video games. Generative Modeling, such as various Generative Adversarial Networks (GAN) models, has been applied to generate paintings and music. Research in Natural Language Process…

    Submitted 2 December, 2020; originally announced December 2020.

    Comments: Accepted by NeurIPS 2020 Workshop on Machine Learning for Creativity and Design, Vancouver, Canada

  31. arXiv:2008.01064  [pdf, other]

    cs.LG stat.ML

    Predicting What You Already Know Helps: Provable Self-Supervised Learning

    Authors: Jason D. Lee, Qi Lei, Nikunj Saunshi, Jiacheng Zhuo

    Abstract: Self-supervised representation learning solves auxiliary prediction tasks (known as pretext tasks) without requiring labeled data to learn useful semantic representations. These pretext tasks are created solely using the input features, such as predicting a missing image patch, recovering the color channels of an image from context, or predicting missing words in text; yet predicting this \textit{…

    Submitted 13 November, 2021; v1 submitted 3 August, 2020; originally announced August 2020.

    Comments: NeurIPS 2021

  32. arXiv:2007.03572  [pdf, other]

    cs.LG math.OC stat.ML

    Robust Structured Statistical Estimation via Conditional Gradient Type Methods

    Authors: Jiacheng Zhuo, Liu Liu, Constantine Caramanis

    Abstract: Structured statistical estimation problems are often solved by Conditional Gradient (CG) type methods to avoid the computationally expensive projection operation. However, the existing CG type methods are not robust to data corruption. To address this, we propose to robustify CG type methods against Huber's corruption model and heavy-tailed data. First, we show that the two Pairwise CG methods are…

    Submitted 7 July, 2020; originally announced July 2020.

  33. arXiv:2006.15408  [pdf, other]

    stat.ML cs.LG

    Learning Optimal Tree Models Under Beam Search

    Authors: Jingwei Zhuo, Ziru Xu, Wei Dai, Han Zhu, Han Li, Jian Xu, Kun Gai

    Abstract: Retrieving relevant targets from an extremely large target set under computational limits is a common challenge for information retrieval and recommendation systems. Tree models, which formulate targets as leaves of a tree with trainable node-wise scorers, have attracted a lot of interests in tackling this challenge due to their logarithmic computational complexity in both training and testing. Tr…

    Submitted 27 June, 2020; originally announced June 2020.

    Comments: To appear in the 37th International Conference on Machine Learning (ICML 2020)

  34. arXiv:2003.13183  [pdf, other]

    cs.CV

    Gradually Vanishing Bridge for Adversarial Domain Adaptation

    Authors: Shuhao Cui, Shuhui Wang, Junbao Zhuo, Chi Su, Qingming Huang, Qi Tian

    Abstract: In unsupervised domain adaptation, rich domain-specific characteristics bring great challenge to learn domain-invariant representations. However, domain discrepancy is considered to be directly minimized in existing solutions, which is difficult to achieve in practice. Some methods alleviate the difficulty by explicitly modeling domain-invariant and domain-specific parts in the representations, bu…

    Submitted 29 March, 2020; originally announced March 2020.

    Comments: CVPR2020

  35. arXiv:2003.12237  [pdf, other]

    cs.CV

    Towards Discriminability and Diversity: Batch Nuclear-norm Maximization under Label Insufficient Situations

    Authors: Shuhao Cui, Shuhui Wang, Junbao Zhuo, Liang Li, Qingming Huang, Qi Tian

    Abstract: The learning of the deep networks largely relies on the data with human-annotated labels. In some label insufficient situations, the performance degrades on the decision boundary with high data density. A common solution is to directly minimize the Shannon Entropy, but the side effect caused by entropy minimization, i.e., reduction of the prediction diversity, is mostly ignored. To address this is…

    Submitted 27 March, 2020; originally announced March 2020.

    Comments: Accepted to CVPR 2020 as Oral

  36. arXiv:1912.00858  [pdf, other]

    cs.LG cs.CV math.OC stat.ML

    Efficient Relaxed Gradient Support Pursuit for Sparsity Constrained Non-convex Optimization

    Authors: Fanhua Shang, Bingkun Wei, Hongying Liu, Yuanyuan Liu, Jiacheng Zhuo

    Abstract: Large-scale non-convex sparsity-constrained problems have recently gained extensive attention. Most existing deterministic optimization methods (e.g., GraSP) are not suitable for large-scale and high-dimensional problems, and thus stochastic optimization methods with hard thresholding (e.g., SVRGHT) become more attractive. Inspired by GraSP, this paper proposes a new general relaxed gradient suppo…

    Submitted 2 December, 2019; originally announced December 2019.

    Comments: 7 pages, 3 figures, Appeared at the Data Science Meets Optimization Workshop (DSO) at IJCAI'19

  37. arXiv:1910.07703  [pdf, other]

    cs.LG cs.DC math.NA stat.ML

    Communication-Efficient Asynchronous Stochastic Frank-Wolfe over Nuclear-norm Balls

    Authors: Jiacheng Zhuo, Qi Lei, Alexandros G. Dimakis, Constantine Caramanis

    Abstract: Large-scale machine learning training suffers from two prior challenges, specifically for nuclear-norm constrained problems with distributed systems: the synchronization slowdown due to the straggling workers, and high communication costs. In this work, we propose an asynchronous Stochastic Frank Wolfe (SFW-asyn) method, which, for the first time, solves the two problems simultaneously, while succ…

    Submitted 17 October, 2019; originally announced October 2019.

  38. arXiv:1907.03253  [pdf, other]

    cs.CV

    A Novel Teacher-Student Learning Framework For Occluded Person Re-Identification

    Authors: Jiaxuan Zhuo, Jianhuang Lai, Peijia Chen

    Abstract: Person re-identification (re-id) has made great progress in recent years, but occlusion is still a challenging problem which significantly degenerates the identification performance. In this paper, we design a teacher-student learning framework to learn an occlusion-robust model from the full-body person domain to the occluded person domain. Notably, the teacher network only uses large-scale full-…

    Submitted 7 July, 2019; originally announced July 2019.

  39. arXiv:1906.02436  [pdf, other]

    cs.LG math.OC stat.ML

    Primal-Dual Block Frank-Wolfe

    Authors: Qi Lei, Jiacheng Zhuo, Constantine Caramanis, Inderjit S. Dhillon, Alexandros G. Dimakis

    Abstract: We propose a variant of the Frank-Wolfe algorithm for solving a class of sparse/low-rank optimization problems. Our formulation includes Elastic Net, regularized SVMs and phase retrieval as special cases. The proposed Primal-Dual Block Frank-Wolfe algorithm reduces the per-iteration cost while maintaining linear convergence rate. The per iteration cost of our method depends on the structural compl…

    Submitted 6 June, 2019; originally announced June 2019.

  40. arXiv:1904.08631  [pdf, other]

    cs.CV

    Unsupervised Open Domain Recognition by Semantic Discrepancy Minimization

    Authors: Junbao Zhuo, Shuhui Wang, Shuhao Cui, Qingming Huang

    Abstract: We address the unsupervised open domain recognition (UODR) problem, where categories in labeled source domain S is only a subset of those in unlabeled target domain T. The task is to correctly classify all samples in T including known and unknown categories. UODR is challenging due to the domain discrepancy, which becomes even harder to bridge when a large number of unknown categories exist in T.…

    Submitted 18 April, 2019; originally announced April 2019.

    Comments: Accepted to CVPR 2019, 10 pages, 4 figures

  41. arXiv:1902.00282  [pdf, other]

    stat.ML cs.LG

    Understanding MCMC Dynamics as Flows on the Wasserstein Space

    Authors: Chang Liu, Jingwei Zhuo, Jun Zhu

    Abstract: It is known that the Langevin dynamics used in MCMC is the gradient flow of the KL divergence on the Wasserstein space, which helps convergence analysis and inspires recent particle-based variational inference methods (ParVIs). But no more MCMC dynamics is understood in this way. In this work, by developing novel concepts, we propose a theoretical framework that recognizes a general MCMC dynamics…

    Submitted 4 July, 2019; v1 submitted 1 February, 2019; originally announced February 2019.

    Comments: References refined

  42. arXiv:1901.08043  [pdf, other]

    cs.CV

    Bottom-up Object Detection by Grouping Extreme and Center Points

    Authors: Xingyi Zhou, Jiacheng Zhuo, Philipp Krähenbühl

    Abstract: With the advent of deep learning, object detection drifted from a bottom-up to a top-down recognition problem. State of the art algorithms enumerate a near-exhaustive list of object locations and classify each into: object or not. In this paper, we show that bottom-up approaches still perform competitively. We detect four extreme points (top-most, left-most, bottom-most, right-most) and one center…

    Submitted 25 April, 2019; v1 submitted 23 January, 2019; originally announced January 2019.

  43. arXiv:1807.01750  [pdf, other]

    stat.ML cs.LG

    Understanding and Accelerating Particle-Based Variational Inference

    Authors: Chang Liu, Jingwei Zhuo, Pengyu Cheng, Ruiyi Zhang, Jun Zhu, Lawrence Carin

    Abstract: Particle-based variational inference methods (ParVIs) have gained attention in the Bayesian inference literature, for their capacity to yield flexible and accurate approximations. We explore ParVIs from the perspective of Wasserstein gradient flows, and make both theoretical and practical contributions. We unify various finite-particle approximations that existing ParVIs use, and recognize that th…

    Submitted 16 July, 2019; v1 submitted 4 July, 2018; originally announced July 2018.

    Comments: A typo for citation corrected

  44. arXiv:1804.02792  [pdf, other]

    cs.CV cs.AI cs.MM

    Occluded Person Re-identification

    Authors: Jiaxuan Zhuo, Zeyu Chen, Jianhuang Lai, Guangcong Wang

    Abstract: Person re-identification (re-id) suffers from a serious occlusion problem when applied to crowded public places. In this paper, we propose to retrieve a full-body person image by using a person image with occlusions. This differs significantly from the conventional person re-id problem where it is assumed that person images are detected without any occlusion. We thus call this new problem the occl…

    Submitted 20 April, 2018; v1 submitted 8 April, 2018; originally announced April 2018.

    Comments: 6 pages, 7 figures, IEEE International Conference of Multimedia and Expo 2018

  45. arXiv:1712.02527  [pdf, other]

    stat.ML cs.CV cs.LG

    Learning Random Fourier Features by Hybrid Constrained Optimization

    Authors: Jianqiao Wangni, Jingwei Zhuo, Jun Zhu

    Abstract: The kernel embedding algorithm is an important component for adapting kernel methods to large datasets. Since the algorithm consumes a major computation cost in the testing phase, we propose a novel teacher-learner framework of learning computation-efficient kernel embeddings from specific data. In the framework, the high-precision embeddings (teacher) transfer the data information to the computat…

    Submitted 7 December, 2017; originally announced December 2017.

  46. arXiv:1708.04781  [pdf, other]

    cs.LG stat.ML

    Racing Thompson: an Efficient Algorithm for Thompson Sampling with Non-conjugate Priors

    Authors: Yichi Zhou, Jun Zhu, Jingwei Zhuo

    Abstract: Thompson sampling has impressive empirical performance for many multi-armed bandit problems. But current algorithms for Thompson sampling only work for the case of conjugate priors since these algorithms require to infer the posterior, which is often computationally intractable when the prior is not conjugate. In this paper, we propose a novel algorithm for Thompson sampling which only requires to…

    Submitted 16 August, 2017; originally announced August 2017.

  47. arXiv:1703.07948  [pdf, other]

    cs.LG cs.AI math.OC stat.ML

    Fast Stochastic Variance Reduced Gradient Method with Momentum Acceleration for Machine Learning

    Authors: Fanhua Shang, Yuanyuan Liu, James Cheng, Jiacheng Zhuo

    Abstract: Recently, research on accelerated stochastic gradient descent methods (e.g., SVRG) has made exciting progress (e.g., linear convergence for strongly convex problems). However, the best-known methods (e.g., Katyusha) requires at least two auxiliary variables and two momentum parameters. In this paper, we propose a fast stochastic variance reduction gradient (FSVRG) method, in which we design a nove…

    Submitted 17 April, 2017; v1 submitted 23 March, 2017; originally announced March 2017.

    Comments: Corrected a few typos in this version

  48. Estimation of Fiber Orientations Using Neighborhood Information

    Authors: Chuyang Ye, Jiachen Zhuo, Rao P. Gullapalli, Jerry L. Prince

    Abstract: Data from diffusion magnetic resonance imaging (dMRI) can be used to reconstruct fiber tracts, for example, in muscle and white matter. Estimation of fiber orientations (FOs) is a crucial step in the reconstruction process and these estimates can be corrupted by noise. In this paper, a new method called Fiber Orientation Reconstruction using Neighborhood Information (FORNI) is described and shown…

    Submitted 16 May, 2016; v1 submitted 15 January, 2016; originally announced January 2016.

    Comments: Journal paper accepted in Medical Image Analysis. 35 pages and 16 figures