Showing 1–50 of 128 results for author: Yeh, C

Searching in archive cs. Search in all archives.
  1. arXiv:2406.07882  [pdf, other]

    cs.CL cs.AI cs.HC

    Designing a Dashboard for Transparency and Control of Conversational AI

    Authors: Yida Chen, Aoyu Wu, Trevor DePodesta, Catherine Yeh, Kenneth Li, Nicholas Castillo Marin, Oam Patel, Jan Riecke, Shivam Raval, Olivia Seow, Martin Wattenberg, Fernanda Viégas

    Abstract: Conversational LLMs function as black box systems, leaving users guessing about why they see the output they do. This lack of transparency is potentially problematic, especially given concerns around bias and truthfulness. To address this issue, we present an end-to-end prototype, connecting interpretability techniques with user experience design, that seeks to make chatbots more transparent. We beg…

    Submitted 15 June, 2024; v1 submitted 12 June, 2024; originally announced June 2024.

    Comments: Project page: https://bit.ly/talktuner-project-page; 38 pages, 23 figures

  2. arXiv:2406.06523  [pdf, other]

    cs.CV

    NaRCan: Natural Refined Canonical Image with Integration of Diffusion Prior for Video Editing

    Authors: Ting-Hsuan Chen, Jiewen Chan, Hau-Shiang Shiu, Shih-Han Yen, Chang-Han Yeh, Yu-Lun Liu

    Abstract: We propose a video editing framework, NaRCan, which integrates a hybrid deformation field and diffusion prior to generate high-quality natural canonical images to represent the input video. Our approach utilizes homography to model global motion and employs multi-layer perceptrons (MLPs) to capture local residual deformations, enhancing the model's ability to handle complex video dynamics. By intr…

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: Project page: https://koi953215.github.io/NaRCan_page/

  3. arXiv:2405.06345  [pdf, other]

    cs.CV

    Evaluating Adversarial Robustness in the Spatial Frequency Domain

    Authors: Keng-Hsin Liao, Chin-Yuan Yeh, Hsi-Wen Chen, Ming-Syan Chen

    Abstract: Convolutional Neural Networks (CNNs) have dominated the majority of computer vision tasks. However, CNNs' vulnerability to adversarial attacks has raised concerns about deploying these models to safety-critical applications. In contrast, the Human Visual System (HVS), which utilizes spatial frequency channels to process visual signals, is immune to adversarial attacks. As such, this paper presents…

    Submitted 10 May, 2024; originally announced May 2024.

    Comments: 14 pages

  4. Masked Graph Transformer for Large-Scale Recommendation

    Authors: Huiyuan Chen, Zhe Xu, Chin-Chia Michael Yeh, Vivian Lai, Yan Zheng, Minghua Xu, Hanghang Tong

    Abstract: Graph Transformers have garnered significant attention for learning graph-structured data, thanks to their superb ability to capture long-range dependencies among nodes. However, the quadratic space and time complexity hinders the scalability of Graph Transformers, particularly for large-scale recommendation. Here we propose an efficient Masked Graph Transformer, named MGFormer, capable of capturi…

    Submitted 7 May, 2024; originally announced May 2024.

  5. arXiv:2405.00483  [pdf, other]

    cs.CV cs.MM

    In Anticipation of Perfect Deepfake: Identity-anchored Artifact-agnostic Detection under Rebalanced Deepfake Detection Protocol

    Authors: Wei-Han Wang, Chin-Yuan Yeh, Hsi-Wen Chen, De-Nian Yang, Ming-Syan Chen

    Abstract: As deep generative models advance, we anticipate deepfakes achieving "perfection", generating no discernible artifacts or noise. However, current deepfake detectors, intentionally or inadvertently, rely on such artifacts for detection, as they are exclusive to deepfakes and absent in genuine examples. To bridge this gap, we introduce the Rebalanced Deepfake Detection Protocol (RDDP) to stress-test…

    Submitted 1 May, 2024; originally announced May 2024.

  6. arXiv:2403.05530  [pdf, other]

    cs.CL cs.AI

    Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

    Authors: Gemini Team, Petko Georgiev, Ving Ian Lei, Ryan Burnell, Libin Bai, Anmol Gulati, Garrett Tanzer, Damien Vincent, Zhufeng Pan, Shibo Wang, Soroosh Mariooryad, Yifan Ding, Xinyang Geng, Fred Alcober, Roy Frostig, Mark Omernick, Lexi Walker, Cosmin Paduraru, Christina Sorokin, Andrea Tacchetti, Colin Gaffney, Samira Daruki, Olcan Sercinoglu, Zach Gleicher, Juliette Love, et al. (1092 additional authors not shown)

    Abstract: In this report, we introduce the Gemini 1.5 family of models, representing the next generation of highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. The family includes two new models: (1) an updated Gemini 1.5 Pro, which exceeds the February…

    Submitted 14 June, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

  7. arXiv:2402.15504  [pdf, other]

    cs.CV cs.AI

    Gen4Gen: Generative Data Pipeline for Generative Multi-Concept Composition

    Authors: Chun-Hsiao Yeh, Ta-Ying Cheng, He-Yen Hsieh, Chuan-En Lin, Yi Ma, Andrew Markham, Niki Trigoni, H. T. Kung, Yubei Chen

    Abstract: Recent text-to-image diffusion models are able to learn and synthesize images containing novel, personalized concepts (e.g., their own pets or specific items) with just a few examples for training. This paper tackles two interconnected issues within this realm of personalizing text-to-image diffusion models. First, current personalization techniques fail to reliably extend to multiple concepts --…

    Submitted 23 February, 2024; originally announced February 2024.

    Comments: Preprint; Project Page: https://danielchyeh.github.io/Gen4Gen/

  8. arXiv:2402.13025  [pdf, other]

    cs.CL cs.AI

    CFEVER: A Chinese Fact Extraction and VERification Dataset

    Authors: Ying-Jia Lin, Chun-Yi Lin, Chia-Jen Yeh, Yi-Ting Li, Yun-Yu Hu, Chih-Hao Hsu, Mei-Feng Lee, Hung-Yu Kao

    Abstract: We present CFEVER, a Chinese dataset designed for Fact Extraction and VERification. CFEVER comprises 30,012 manually created claims based on content in Chinese Wikipedia. Each claim in CFEVER is labeled as "Supports", "Refutes", or "Not Enough Info" to depict its degree of factualness. Similar to the FEVER dataset, claims in the "Supports" and "Refutes" categories are also annotated with correspon…

    Submitted 20 February, 2024; originally announced February 2024.

    Comments: AAAI-24

  9. arXiv:2402.10487  [pdf, other]

    cs.LG cs.AI

    RPMixer: Shaking Up Time Series Forecasting with Random Projections for Large Spatial-Temporal Data

    Authors: Chin-Chia Michael Yeh, Yujie Fan, Xin Dai, Uday Singh Saini, Vivian Lai, Prince Osei Aboagye, Junpeng Wang, Huiyuan Chen, Yan Zheng, Zhongfang Zhuang, Liang Wang, Wei Zhang

    Abstract: Spatial-temporal forecasting systems play a crucial role in addressing numerous real-world challenges. In this paper, we investigate the potential of addressing spatial-temporal forecasting problems using general time series forecasting models, i.e., models that do not leverage the spatial relationships among the nodes. We propose an all-Multi-Layer Perceptron (all-MLP) time series forecasting arch…

    Submitted 12 June, 2024; v1 submitted 16 February, 2024; originally announced February 2024.

  10. arXiv:2402.09368  [pdf, other]

    cs.CV cs.AI

    Magic-Me: Identity-Specific Video Customized Diffusion

    Authors: Ze Ma, Daquan Zhou, Chun-Hsiao Yeh, Xue-She Wang, Xiuyu Li, Huanrui Yang, Zhen Dong, Kurt Keutzer, Jiashi Feng

    Abstract: Creating content with specified identities (ID) has attracted significant interest in the field of generative models. In the field of text-to-image generation (T2I), subject-driven creation has achieved great progress with the identity controlled via reference images. However, its extension to video generation is not well explored. In this work, we propose a simple yet effective subject identity c…

    Submitted 20 March, 2024; v1 submitted 14 February, 2024; originally announced February 2024.

    Comments: Project Page at https://magic-me-webpage.github.io

  11. arXiv:2402.08855  [pdf, other]

    cs.HC cs.AI

    GhostWriter: Augmenting Collaborative Human-AI Writing Experiences Through Personalization and Agency

    Authors: Catherine Yeh, Gonzalo Ramos, Rachel Ng, Andy Huntington, Richard Banks

    Abstract: Large language models (LLMs) are becoming more prevalent and have found ubiquitous use in providing different forms of writing assistance. However, LLM-powered writing systems can frustrate users due to their limited personalization and control, which can be exacerbated when users lack experience with prompt engineering. We see design as one way to address these challenges and introduce GhostWri…

    Submitted 13 February, 2024; originally announced February 2024.

    Comments: 29 pages, 12 figures

  12. arXiv:2402.08096  [pdf, other]

    cs.LG

    Which Pretrain Samples to Rehearse when Finetuning Pretrained Models?

    Authors: Andrew Bai, Chih-Kuan Yeh, Cho-Jui Hsieh, Ankur Taly

    Abstract: Fine-tuning pretrained foundational models on specific tasks is now the de facto approach for text and vision tasks. A known pitfall of this approach is the forgetting of pretraining knowledge that happens during finetuning. Rehearsing samples randomly from the pretrain dataset is a common approach to alleviate such forgetting. However, we find that random mixing unintentionally includes samples w…

    Submitted 12 February, 2024; originally announced February 2024.

    Comments: 17 pages, 13 figures

  13. arXiv:2401.09489  [pdf]

    cs.LG cs.AI

    PUPAE: Intuitive and Actionable Explanations for Time Series Anomalies

    Authors: Audrey Der, Chin-Chia Michael Yeh, Yan Zheng, Junpeng Wang, Zhongfang Zhuang, Liang Wang, Wei Zhang, Eamonn J. Keogh

    Abstract: In recent years there has been significant progress in time series anomaly detection. However, after detecting a (perhaps tentative) anomaly, can we explain it? Such explanations would be useful to triage anomalies. For example, in an oil refinery, should we respond to an anomaly by dispatching a hydraulic engineer, or an intern to replace the battery on a sensor? There have been some parallel ef…

    Submitted 16 January, 2024; originally announced January 2024.

    Comments: 9 Page Manuscript, 1 Page Supplementary (Supplement not published in conference proceedings.)

    Journal ref: SIAM SDM 2024

  14. arXiv:2312.15393  [pdf, other]

    cs.CV

    Debiased Learning for Remote Sensing Data

    Authors: Chun-Hsiao Yeh, Xudong Wang, Stella X. Yu, Charles Hill, Zackery Steck, Scott Kangas, Aaron Reite

    Abstract: Deep learning has had remarkable success at analyzing handheld imagery such as consumer photos due to the availability of large-scale human annotations (e.g., ImageNet). However, remote sensing data lacks such extensive annotation and thus potential for supervised learning. To address this, we propose a highly effective semi-supervised approach tailored specifically to remote sensing data. Our app…

    Submitted 23 December, 2023; originally announced December 2023.

    Comments: Accepted to CVPR 2023 MultiEarth Workshop

  15. arXiv:2312.11805  [pdf, other]

    cs.CL cs.AI cs.CV

    Gemini: A Family of Highly Capable Multimodal Models

    Authors: Gemini Team, Rohan Anil, Sebastian Borgeaud, Jean-Baptiste Alayrac, Jiahui Yu, Radu Soricut, Johan Schalkwyk, Andrew M. Dai, Anja Hauth, Katie Millican, David Silver, Melvin Johnson, Ioannis Antonoglou, Julian Schrittwieser, Amelia Glaese, Jilin Chen, Emily Pitler, Timothy Lillicrap, Angeliki Lazaridou, Orhan Firat, James Molloy, Michael Isard, Paul R. Barham, Tom Hennigan, Benjamin Lee, et al. (1325 additional authors not shown)

    Abstract: This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding. The Gemini family consists of Ultra, Pro, and Nano sizes, suitable for applications ranging from complex reasoning tasks to on-device memory-constrained use-cases. Evaluation on a broad range of benchmarks shows that our most-capable Gemini Ultr…

    Submitted 17 June, 2024; v1 submitted 18 December, 2023; originally announced December 2023.

  16. arXiv:2312.06134  [pdf, other]

    cs.CL cs.LG

    Order Matters in the Presence of Dataset Imbalance for Multilingual Learning

    Authors: Dami Choi, Derrick Xin, Hamid Dadkhahi, Justin Gilmer, Ankush Garg, Orhan Firat, Chih-Kuan Yeh, Andrew M. Dai, Behrooz Ghorbani

    Abstract: In this paper, we empirically study the optimization dynamics of multi-task learning, particularly focusing on those that govern a collection of tasks with significant data imbalance. We present a simple yet effective method of pre-training on high-resource tasks, followed by fine-tuning on a mixture of high/low-resource tasks. We provide a thorough empirical study and analysis of this method's be…

    Submitted 11 December, 2023; originally announced December 2023.

  17. arXiv:2311.03393  [pdf, other]

    cs.DB cs.AI

    Sketching Multidimensional Time Series for Fast Discord Mining

    Authors: Chin-Chia Michael Yeh, Yan Zheng, Menghai Pan, Huiyuan Chen, Zhongfang Zhuang, Junpeng Wang, Liang Wang, Wei Zhang, Jeff M. Phillips, Eamonn Keogh

    Abstract: Time series discords are a useful primitive for time series anomaly detection, and the matrix profile is capable of capturing discord effectively. There exist many research efforts to improve the scalability of discord discovery with respect to the length of time series. However, there is surprisingly little work focused on reducing the time complexity of matrix profile computation associated with…

    Submitted 7 December, 2023; v1 submitted 5 November, 2023; originally announced November 2023.

  18. arXiv:2311.02772  [pdf, ps, other]

    cs.SD cs.CL eess.AS

    Attention or Convolution: Transformer Encoders in Audio Language Models for Inference Efficiency

    Authors: Sungho Jeon, Ching-Feng Yeh, Hakan Inan, Wei-Ning Hsu, Rashi Rungta, Yashar Mehdad, Daniel Bikel

    Abstract: In this paper, we show that a simple self-supervised pre-trained audio model can achieve comparable inference efficiency to more complicated pre-trained models with speech transformer encoders. These speech transformers rely on mixing convolutional modules with self-attention modules. They achieve state-of-the-art performance on ASR with top efficiency. We first show that employing these speech tr…

    Submitted 8 February, 2024; v1 submitted 5 November, 2023; originally announced November 2023.

    Comments: 5 pages; accepted to Self-supervision in Audio, Speech and Beyond (SASB) workshop in ICASSP24

  19. arXiv:2311.02563  [pdf, other]

    cs.DB cs.AI cs.CR cs.LG

    Time Series Synthesis Using the Matrix Profile for Anonymization

    Authors: Audrey Der, Chin-Chia Michael Yeh, Yan Zheng, Junpeng Wang, Huiyuan Chen, Zhongfang Zhuang, Liang Wang, Wei Zhang, Eamonn Keogh

    Abstract: Publishing and sharing data is crucial for the data mining community, allowing collaboration and driving open innovation. However, many researchers cannot release their data due to privacy regulations or fear of leaking confidential business information. To alleviate such issues, we propose the Time Series Synthesis Using the Matrix Profile (TSSUMP) method, where synthesized time series can be rel…

    Submitted 5 November, 2023; originally announced November 2023.

  20. arXiv:2311.02561  [pdf, other]

    cs.LG cs.AI

    Ego-Network Transformer for Subsequence Classification in Time Series Data

    Authors: Chin-Chia Michael Yeh, Huiyuan Chen, Yujie Fan, Xin Dai, Yan Zheng, Vivian Lai, Junpeng Wang, Zhongfang Zhuang, Liang Wang, Wei Zhang, Eamonn Keogh

    Abstract: Time series classification is a widely studied problem in the field of time series data mining. Previous research has predominantly focused on scenarios where relevant or foreground subsequences have already been extracted, with each subsequence corresponding to a single label. However, real-world time series data often contain foreground subsequences that are intertwined with background subsequen…

    Submitted 5 November, 2023; originally announced November 2023.

  21. arXiv:2311.02560  [pdf, other]

    cs.IR cs.LG

    Temporal Treasure Hunt: Content-based Time Series Retrieval System for Discovering Insights

    Authors: Chin-Chia Michael Yeh, Huiyuan Chen, Xin Dai, Yan Zheng, Yujie Fan, Vivian Lai, Junpeng Wang, Audrey Der, Zhongfang Zhuang, Liang Wang, Wei Zhang

    Abstract: Time series data is ubiquitous across various domains such as finance, healthcare, and manufacturing, but their properties can vary significantly depending on the domain they originate from. The ability to perform Content-based Time Series Retrieval (CTSR) is crucial for identifying unknown time series examples. However, existing CTSR works typically focus on retrieving time series from a single d…

    Submitted 5 November, 2023; originally announced November 2023.

  22. arXiv:2311.01615  [pdf, other]

    cs.SD cs.CL eess.AS

    FLAP: Fast Language-Audio Pre-training

    Authors: Ching-Feng Yeh, Po-Yao Huang, Vasu Sharma, Shang-Wen Li, Gargi Gosh

    Abstract: We propose Fast Language-Audio Pre-training (FLAP), a self-supervised approach that efficiently and effectively learns aligned audio and language representations through masking, contrastive learning and reconstruction. For efficiency, FLAP randomly drops audio spectrogram tokens, focusing solely on the remaining ones for self-supervision. Through inter-modal contrastive learning, FLAP learns to a…

    Submitted 2 November, 2023; originally announced November 2023.

    Comments: 6 pages

  23. arXiv:2310.18526  [pdf, other]

    cs.LG cs.AI

    Sample based Explanations via Generalized Representers

    Authors: Che-Ping Tsai, Chih-Kuan Yeh, Pradeep Ravikumar

    Abstract: We propose a general class of sample based explanations of machine learning models, which we term generalized representers. To measure the effect of a training sample on a model's test prediction, generalized representers use two components: a global sample importance that quantifies the importance of the training point to the model and is invariant to test samples, and a local sample importance t…

    Submitted 27 October, 2023; originally announced October 2023.

    Comments: Accepted by NeurIPS 2023

  24. arXiv:2310.17419  [pdf, other]

    cs.CV

    AntifakePrompt: Prompt-Tuned Vision-Language Models are Fake Image Detectors

    Authors: You-Ming Chang, Chen Yeh, Wei-Chen Chiu, Ning Yu

    Abstract: Deep generative models can create remarkably photorealistic fake images while raising concerns about misinformation and copyright infringement, known as deepfake threats. Deepfake detection techniques are developed to distinguish between real and fake images, where the existing methods typically train classifiers in the image domain or various feature domains. However, the generalizability of deepfa…

    Submitted 2 November, 2023; v1 submitted 26 October, 2023; originally announced October 2023.

  25. FATA-Trans: Field And Time-Aware Transformer for Sequential Tabular Data

    Authors: Dongyu Zhang, Liang Wang, Xin Dai, Shubham Jain, Junpeng Wang, Yujie Fan, Chin-Chia Michael Yeh, Yan Zheng, Zhongfang Zhuang, Wei Zhang

    Abstract: Sequential tabular data is one of the most commonly used data types in real-world applications. Different from conventional tabular data, where rows in a table are independent, sequential tabular data contains rich contextual and sequential information, where some fields are dynamically changing over time and others are static. Existing transformer-based approaches analyzing sequential tabular dat…

    Submitted 20 October, 2023; originally announced October 2023.

    Comments: This work is accepted by ACM International Conference on Information and Knowledge Management (CIKM) 2023

  26. arXiv:2310.03925  [pdf, other]

    cs.LG cs.AI

    Multitask Learning for Time Series Data with 2D Convolution

    Authors: Chin-Chia Michael Yeh, Xin Dai, Yan Zheng, Junpeng Wang, Huiyuan Chen, Yujie Fan, Audrey Der, Zhongfang Zhuang, Liang Wang, Wei Zhang

    Abstract: Multitask learning (MTL) aims to develop a unified model that can handle a set of closely related tasks simultaneously. By optimizing the model across multiple tasks, MTL generally surpasses its non-MTL counterparts in terms of generalizability. Although MTL has been extensively researched in various domains such as computer vision, natural language processing, and recommendation systems, its appl…

    Submitted 10 October, 2023; v1 submitted 5 October, 2023; originally announced October 2023.

  27. arXiv:2310.03919  [pdf, other]

    cs.IR cs.AI cs.LG

    An Efficient Content-based Time Series Retrieval System

    Authors: Chin-Chia Michael Yeh, Huiyuan Chen, Xin Dai, Yan Zheng, Junpeng Wang, Vivian Lai, Yujie Fan, Audrey Der, Zhongfang Zhuang, Liang Wang, Wei Zhang, Jeff M. Phillips

    Abstract: A Content-based Time Series Retrieval (CTSR) system is an information retrieval system for users to interact with time series that emerge from multiple domains, such as finance, healthcare, and manufacturing. For example, users seeking to learn more about the source of a time series can submit the time series as a query to the CTSR system and retrieve a list of relevant time series with associated met…

    Submitted 5 October, 2023; originally announced October 2023.

  28. arXiv:2310.03916  [pdf, other]

    cs.LG cs.AI

    Toward a Foundation Model for Time Series Data

    Authors: Chin-Chia Michael Yeh, Xin Dai, Huiyuan Chen, Yan Zheng, Yujie Fan, Audrey Der, Vivian Lai, Zhongfang Zhuang, Junpeng Wang, Liang Wang, Wei Zhang

    Abstract: A foundation model is a machine learning model trained on a large and diverse set of data, typically using self-supervised learning-based pre-training techniques, that can be adapted to various downstream tasks. However, current research on time series pre-training has mostly focused on models pre-trained solely on data from a single domain, resulting in a lack of knowledge about other types of ti…

    Submitted 5 October, 2023; originally announced October 2023.

  29. arXiv:2309.16058  [pdf, other]

    cs.LG cs.CL cs.CV

    AnyMAL: An Efficient and Scalable Any-Modality Augmented Language Model

    Authors: Seungwhan Moon, Andrea Madotto, Zhaojiang Lin, Tushar Nagarajan, Matt Smith, Shashank Jain, Chun-Fu Yeh, Prakash Murugesan, Peyman Heidari, Yue Liu, Kavya Srinet, Babak Damavandi, Anuj Kumar

    Abstract: We present Any-Modality Augmented Language Model (AnyMAL), a unified model that reasons over diverse input modality signals (i.e. text, image, video, audio, IMU motion sensor), and generates textual responses. AnyMAL inherits the powerful text-based reasoning abilities of the state-of-the-art LLMs including LLaMA-2 (70B), and converts modality-specific signals to the joint textual space through a…

    Submitted 27 September, 2023; originally announced September 2023.

  30. arXiv:2309.15169  [pdf, other]

    cs.LG cs.AI

    Revealing the Power of Spatial-Temporal Masked Autoencoders in Multivariate Time Series Forecasting

    Authors: Jiarui Sun, Yujie Fan, Chin-Chia Michael Yeh, Wei Zhang, Girish Chowdhary

    Abstract: Multivariate time series (MTS) forecasting involves predicting future time series data based on historical observations. Existing research primarily emphasizes the development of complex spatial-temporal models that capture spatial dependencies and temporal correlations among time series variables explicitly. However, recent advances have been impeded by challenges relating to data scarcity and mo…

    Submitted 26 September, 2023; originally announced September 2023.

  31. Hessian-aware Quantized Node Embeddings for Recommendation

    Authors: Huiyuan Chen, Kaixiong Zhou, Kwei-Herng Lai, Chin-Chia Michael Yeh, Yan Zheng, Xia Hu, Hao Yang

    Abstract: Graph Neural Networks (GNNs) have achieved state-of-the-art performance in recommender systems. Nevertheless, the process of searching and ranking from a large item corpus usually requires high latency, which limits the widespread deployment of GNNs in industry-scale applications. To address this issue, many methods compress user/item representations into the binary embedding space to reduce space…

    Submitted 2 September, 2023; originally announced September 2023.

  32. Adversarial Collaborative Filtering for Free

    Authors: Huiyuan Chen, Xiaoting Li, Vivian Lai, Chin-Chia Michael Yeh, Yujie Fan, Yan Zheng, Mahashweta Das, Hao Yang

    Abstract: Collaborative Filtering (CF) has been successfully used to help users discover the items of interest. Nevertheless, existing CF methods suffer from noisy data, which negatively impacts the quality of recommendation. To tackle this problem, many prior studies leverage adversarial learning to regularize the representations of users/items, which improves both generalizability and robustness. Th…

    Submitted 20 August, 2023; originally announced August 2023.

  33. Enhancing Transformers without Self-supervised Learning: A Loss Landscape Perspective in Sequential Recommendation

    Authors: Vivian Lai, Huiyuan Chen, Chin-Chia Michael Yeh, Minghua Xu, Yiwei Cai, Hao Yang

    Abstract: Transformer and its variants are a powerful class of architectures for sequential recommendation, owing to their ability to capture a user's dynamic interests from their past interactions. Despite their success, Transformer-based models often require the optimization of a large number of parameters, making them difficult to train from sparse data in sequential recommendation. To address the prob…

    Submitted 20 August, 2023; originally announced August 2023.

  34. EmbeddingTree: Hierarchical Exploration of Entity Features in Embedding

    Authors: Yan Zheng, Junpeng Wang, Chin-Chia Michael Yeh, Yujie Fan, Huiyuan Chen, Liang Wang, Wei Zhang

    Abstract: Embedding learning transforms discrete data entities into continuous numerical representations, encoding features/properties of the entities. Despite the outstanding performance reported from different embedding learning algorithms, few efforts were devoted to structurally interpreting how features are encoded in the learned embedding space. This work proposes EmbeddingTree, a hierarchical embeddi…

    Submitted 2 August, 2023; originally announced August 2023.

    Comments: 5 pages, 3 figures, accepted by PacificVis 2023

  35. arXiv:2307.08910  [pdf, other]

    cs.LG cs.IR

    Sharpness-Aware Graph Collaborative Filtering

    Authors: Huiyuan Chen, Chin-Chia Michael Yeh, Yujie Fan, Yan Zheng, Junpeng Wang, Vivian Lai, Mahashweta Das, Hao Yang

    Abstract: Graph Neural Networks (GNNs) have achieved impressive performance in collaborative filtering. However, GNNs tend to yield inferior performance when the distributions of training and test data are not aligned well. Also, training GNNs requires optimizing non-convex neural networks with an abundance of local and global minima, which may differ widely in their performance at test time. Thus, it is es…

    Submitted 17 July, 2023; originally announced July 2023.

  36. arXiv:2306.14649  [pdf, other]

    cs.NE

    CIMulator: A Comprehensive Simulation Platform for Computing-In-Memory Circuit Macros with Low Bit-Width and Real Memory Materials

    Authors: Hoang-Hiep Le, Md. Aftab Baig, Wei-Chen Hong, Cheng-Hsien Tsai, Cheng-Jui Yeh, Fu-Xiang Liang, I-Ting Huang, Wei-Tzu Tsai, Ting-Yin Cheng, Sourav De, Nan-Yow Chen, Wen-Jay Lee, Ing-Chao Lin, Da-Wei Chang, Darsen D. Lu

    Abstract: This paper presents a simulation platform, namely CIMulator, for quantifying the efficacy of various synaptic devices in neuromorphic accelerators for different neural network architectures. Nonvolatile memory devices, such as resistive random-access memory, ferroelectric field-effect transistor, and volatile static random-access memory devices, can be selected as synaptic devices. A multilayer pe…

    Submitted 26 June, 2023; originally announced June 2023.

  37. arXiv:2306.10169  [pdf, other]

    cs.CV cs.CL cs.LG

    Meta-Personalizing Vision-Language Models to Find Named Instances in Video

    Authors: Chun-Hsiao Yeh, Bryan Russell, Josef Sivic, Fabian Caba Heilbron, Simon Jenni

    Abstract: Large-scale vision-language models (VLM) have shown impressive results for language-guided search applications. While these models allow category-level queries, they currently struggle with personalized searches for moments in a video where a specific object instance such as "My dog Biscuit" appears. We present the following three contributions to address this problem. First, we describe a metho…

    Submitted 16 June, 2023; originally announced June 2023.

    Comments: Accepted to CVPR 2023. Project webpage: https://danielchyeh.github.io/metaper/

  38. arXiv:2306.04997  [pdf, other]

    eess.SP cs.AI

    Blockage Prediction in Directional mmWave Links Using Liquid Time Constant Network

    Authors: Martin H. Nielsen, Chia-Yi Yeh, Ming Shen, Muriel Médard

    Abstract: We propose to use a liquid time constant (LTC) network to predict the future blockage status of a millimeter wave (mmWave) link using only the received signal power as the input to the system. The LTC network is based on an ordinary differential equation (ODE) system inspired by biology and specialized for near-future prediction for time sequence observation as the input. Using an experimental dat…

    Submitted 8 June, 2023; originally announced June 2023.

    Comments: 2 pages, pre-print for IRMMW 2023 conference

  39. arXiv:2306.01913  [pdf, other]

    cs.AI

    PDT: Pretrained Dual Transformers for Time-aware Bipartite Graphs

    Authors: Xin Dai, Yujie Fan, Zhongfang Zhuang, Shubham Jain, Chin-Chia Michael Yeh, Junpeng Wang, Liang Wang, Yan Zheng, Prince Osei Aboagye, Wei Zhang

    Abstract: Pre-training on large models is prevalent and emerging with the ever-growing user-generated content in many machine learning application categories. It has been recognized that learning contextual knowledge from the datasets depicting user-content interaction plays a vital role in downstream tasks. Despite several studies attempting to learn contextual knowledge via pre-training methods, finding a…

    Submitted 25 September, 2023; v1 submitted 2 June, 2023; originally announced June 2023.

  40. arXiv:2306.01449  [pdf, other]

    cs.CV

    SASMU: boost the performance of generalized recognition model using synthetic face dataset

    Authors: Chia-Chun Chung, Pei-Chun Chang, Yong-Sheng Chen, HaoYuan He, Chinson Yeh

    Abstract: Nowadays, deploying a robust face recognition product has become easy thanks to decades of development in face recognition techniques. State-of-the-art methods can handle not only profile image verification but also in-the-wild images almost perfectly. However, privacy concerns have risen rapidly, since mainstream research results are powered by tons of web-crawled data, which faces the…

    Submitted 2 June, 2023; originally announced June 2023.

    Comments: under review

  41. arXiv:2305.03210  [pdf, other]

    cs.HC cs.CL cs.CV cs.LG

    AttentionViz: A Global View of Transformer Attention

    Authors: Catherine Yeh, Yida Chen, Aoyu Wu, Cynthia Chen, Fernanda Viégas, Martin Wattenberg

    Abstract: Transformer models are revolutionizing machine learning, but their inner workings remain mysterious. In this work, we present a new visualization technique designed to help researchers understand the self-attention mechanism in transformers that allows these models to learn rich, contextual relationships between elements of a sequence. The main idea behind our method is to visualize a joint embedd…

    Submitted 9 August, 2023; v1 submitted 4 May, 2023; originally announced May 2023.

    Comments: 11 pages, 13 figures

  42. arXiv:2305.02301  [pdf, other]

    cs.CL cs.AI cs.LG

    Distilling Step-by-Step! Outperforming Larger Language Models with Less Training Data and Smaller Model Sizes

    Authors: Cheng-Yu Hsieh, Chun-Liang Li, Chih-Kuan Yeh, Hootan Nakhost, Yasuhisa Fujii, Alexander Ratner, Ranjay Krishna, Chen-Yu Lee, Tomas Pfister

    Abstract: Deploying large language models (LLMs) is challenging because they are memory inefficient and compute-intensive for practical applications. In reaction, researchers train smaller task-specific models by either finetuning with human labels or distilling using LLM-generated labels. However, finetuning and distillation require large amounts of training data to achieve comparable performance to LLMs.…

    Submitted 5 July, 2023; v1 submitted 3 May, 2023; originally announced May 2023.

    Comments: Accepted to Findings of ACL 2023

  43. arXiv:2303.13731  [pdf, other]

    cs.LG cs.CV cs.HC

    How Does Attention Work in Vision Transformers? A Visual Analytics Attempt

    Authors: Yiran Li, Junpeng Wang, Xin Dai, Liang Wang, Chin-Chia Michael Yeh, Yan Zheng, Wei Zhang, Kwan-Liu Ma

    Abstract: Vision transformer (ViT) expands the success of transformer models from sequential data to images. The model decomposes an image into many smaller patches and arranges them into a sequence. Multi-head self-attentions are then applied to the sequence to learn the attention between patches. Despite many successful interpretations of transformers on sequential data, little effort has been devoted to…

    Submitted 23 March, 2023; originally announced March 2023.

    Comments: Accepted by PacificVis 2023 and selected to be published in TVCG

  44. arXiv:2302.07492  [pdf, other]

    cs.CL cs.AI cs.HC cs.LG

    Envisioning the Next-Gen Document Reader

    Authors: Catherine Yeh, Nedim Lipka, Franck Dernoncourt

    Abstract: People read digital documents on a daily basis to share, exchange, and understand information in electronic settings. However, current document readers create a static, isolated reading experience, which does not support users' goals of gaining more knowledge and performing additional tasks through document interaction. In this work, we present our vision for the next-gen document reader that stri…

    Submitted 15 February, 2023; originally announced February 2023.

    Comments: Paper accepted at the AAAI 2023 Workshop on Scientific Document Understanding

  45. arXiv:2301.00652  [pdf, other]

    eess.AS cs.CL

    Efficient Speech Representation Learning with Low-Bit Quantization

    Authors: Ching-Feng Yeh, Wei-Ning Hsu, Paden Tomasello, Abdelrahman Mohamed

    Abstract: With the development of hardware for machine learning, newer models often come at the cost of both increased sizes and computational complexity. In effort to improve the efficiency for these models, we apply and investigate recent quantization techniques on speech representation learning models. The quantization techniques were evaluated on the SUPERB benchmark. On the ASR task, with aggressive qu…

    Submitted 14 December, 2022; originally announced January 2023.

    Comments: 7 pages

  46. arXiv:2212.06146  [pdf]

    cs.LG cs.AI

    Matrix Profile XXVII: A Novel Distance Measure for Comparing Long Time Series

    Authors: Audrey Der, Chin-Chia Michael Yeh, Renjie Wu, Junpeng Wang, Yan Zheng, Zhongfang Zhuang, Liang Wang, Wei Zhang, Eamonn Keogh

    Abstract: The most useful data mining primitives are distance measures. With an effective distance measure, it is possible to perform classification, clustering, anomaly detection, segmentation, etc. For single-event time series Euclidean Distance and Dynamic Time Warping distance are known to be extremely effective. However, for time series containing cyclical behaviors, the semantic meaningfulness of such…

    Submitted 9 December, 2022; originally announced December 2022.

    Comments: Accepted at IEEE ICKG 2022. (Previously entitled IEEE ICBK.) Abridged abstract as per arxiv's requirements

  47. TinyKG: Memory-Efficient Training Framework for Knowledge Graph Neural Recommender Systems

    Authors: Huiyuan Chen, Xiaoting Li, Kaixiong Zhou, Xia Hu, Chin-Chia Michael Yeh, Yan Zheng, Hao Yang

    Abstract: There has been an explosion of interest in designing various Knowledge Graph Neural Networks (KGNNs), which achieve state-of-the-art performance and provide great explainability for recommendation. The promising performance is mainly resulting from their capability of capturing high-order proximity messages over the knowledge graphs. However, training KGNNs at scale is challenging due to the high…

    Submitted 8 December, 2022; originally announced December 2022.

  48. Denoising Self-attentive Sequential Recommendation

    Authors: Huiyuan Chen, Yusan Lin, Menghai Pan, Lan Wang, Chin-Chia Michael Yeh, Xiaoting Li, Yan Zheng, Fei Wang, Hao Yang

    Abstract: Transformer-based sequential recommenders are very powerful for capturing both short-term and long-term sequential item dependencies. This is mainly attributed to their unique self-attention networks to exploit pairwise item-item interactions within the sequence. However, real-world item sequences are often noisy, which is particularly true for implicit feedback. For example, a large portion of cl…

    Submitted 8 December, 2022; originally announced December 2022.

  49. arXiv:2212.03449  [pdf, other]

    cs.LG cs.SI

    Dynamic Graph Node Classification via Time Augmentation

    Authors: Jiarui Sun, Mengting Gu, Chin-Chia Michael Yeh, Yujie Fan, Girish Chowdhary, Wei Zhang

    Abstract: Node classification for graph-structured data aims to classify nodes whose labels are unknown. While studies on static graphs are prevalent, few studies have focused on dynamic graph node classification. Node classification on dynamic graphs is challenging for two reasons. First, the model needs to capture both structural and temporal information, particularly on dynamic graphs with a long history…

    Submitted 6 December, 2022; originally announced December 2022.

    Comments: Accepted by IEEE Big Data 2022

  50. arXiv:2212.01393  [pdf, other]

    eess.AS cs.CL cs.LG cs.SD

    Continual Learning for On-Device Speech Recognition using Disentangled Conformers

    Authors: Anuj Diwan, Ching-Feng Yeh, Wei-Ning Hsu, Paden Tomasello, Eunsol Choi, David Harwath, Abdelrahman Mohamed

    Abstract: Automatic speech recognition research focuses on training and evaluating on static datasets. Yet, as speech models are increasingly deployed on personal devices, such models encounter user-specific distributional shifts. To simulate this real-world scenario, we introduce LibriContinual, a continual learning benchmark for speaker-specific domain adaptation derived from LibriVox audiobooks, with dat…

    Submitted 13 December, 2022; v1 submitted 2 December, 2022; originally announced December 2022.

    Comments: 8 pages, 2 figures. Submitted to ICASSP 2023