Showing 1–50 of 75 results for author: Lu, A

Searching in archive cs.

  1. arXiv:2408.07921  [pdf]

    cs.LG

    Physics-Informed Neural Network for Predicting Out-of-Training-Range TCAD Solution with Minimized Domain Expertise

    Authors: Albert Lu, Yu Foon Chau, Hiu Yung Wong

    Abstract: Machine learning (ML) is promising in assisting technology computer-aided design (TCAD) simulations to alleviate difficulty in convergence and prolonged simulation time. While ML is widely used in TCAD, they either require access to the internal solver, require extensive domain expertise, are only trained by terminal quantities such as currents and voltages, and/or lack out-of-training-range predi… ▽ More

    Submitted 15 August, 2024; originally announced August 2024.

  2. arXiv:2408.06907  [pdf, other]

    cs.IT

    An Information Geometry Interpretation for Approximate Message Passing

    Authors: Bingyan Liu, An-An Lu, Mingrui Fan, Jiyuan Yang, Xiqi Gao

    Abstract: In this paper, we propose an information geometry (IG) framework to solve the standard linear regression problem. The proposed framework is an extension of the one for computing the mean of complex multivariate Gaussian distribution. By applying the proposed framework, the information geometry approach (IGA) and the approximate information geometry approach (AIGA) for basis pursuit de-noising (BPD… ▽ More

    Submitted 13 August, 2024; originally announced August 2024.

    Comments: 30 pages, 5 figures

  3. arXiv:2408.04579  [pdf, other]

    cs.CV

    SAM2-Adapter: Evaluating & Adapting Segment Anything 2 in Downstream Tasks: Camouflage, Shadow, Medical Image Segmentation, and More

    Authors: Tianrun Chen, Ankang Lu, Lanyun Zhu, Chaotao Ding, Chunan Yu, Deyi Ji, Zejian Li, Lingyun Sun, Papa Mao, Ying Zang

    Abstract: The advent of large models, also known as foundation models, has significantly transformed the AI research landscape, with models like Segment Anything (SAM) achieving notable success in diverse image segmentation scenarios. Despite its advancements, SAM encountered limitations in handling some complex low-level segmentation tasks like camouflaged object and medical imaging. In response, in 2023,… ▽ More

    Submitted 10 August, 2024; v1 submitted 8 August, 2024; originally announced August 2024.

    Comments: arXiv admin note: text overlap with arXiv:2304.09148

  4. arXiv:2408.02222  [pdf, other]

    cs.CV

    Cross-modulated Attention Transformer for RGBT Tracking

    Authors: Yun Xiao, Jiacong Zhao, Andong Lu, Chenglong Li, Yin Lin, Bing Yin, Cong Liu

    Abstract: Existing Transformer-based RGBT trackers achieve remarkable performance benefits by leveraging self-attention to extract uni-modal features and cross-attention to enhance multi-modal feature interaction and template-search correlation computation. Nevertheless, the independent search-template correlation calculations ignore the consistency between branches, which can result in ambiguous and inappr… ▽ More

    Submitted 4 August, 2024; originally announced August 2024.

    Comments: 10 pages, 5 figures

  5. arXiv:2407.18175  [pdf, other]

    cs.LG cs.AI cs.CV

    Quasar-ViT: Hardware-Oriented Quantization-Aware Architecture Search for Vision Transformers

    Authors: Zhengang Li, Alec Lu, Yanyue Xie, Zhenglun Kong, Mengshu Sun, Hao Tang, Zhong Jia Xue, Peiyan Dong, Caiwen Ding, Yanzhi Wang, Xue Lin, Zhenman Fang

    Abstract: Vision transformers (ViTs) have demonstrated their superior accuracy for computer vision tasks compared to convolutional neural networks (CNNs). However, ViT models are often computation-intensive for efficient deployment on resource-limited edge devices. This work proposes Quasar-ViT, a hardware-oriented quantization-aware architecture search framework for ViTs, to design efficient ViT models for… ▽ More

    Submitted 25 July, 2024; originally announced July 2024.

    Comments: Accepted by ICS 2024

  6. arXiv:2407.12322  [pdf, other]

    cs.CV

    Frequency Guidance Matters: Skeletal Action Recognition by Frequency-Aware Mixed Transformer

    Authors: Wenhan Wu, Ce Zheng, Zihao Yang, Chen Chen, Srijan Das, Aidong Lu

    Abstract: Recently, transformers have demonstrated great potential for modeling long-term dependencies from skeleton sequences and thereby gained ever-increasing attention in skeleton action recognition. However, the existing transformer-based approaches heavily rely on the naive attention mechanism for capturing the spatiotemporal features, which falls short in learning discriminative representations that… ▽ More

    Submitted 29 July, 2024; v1 submitted 17 July, 2024; originally announced July 2024.

    Comments: Accepted by ACM Multimedia 2024

  7. arXiv:2406.19583  [pdf, other]

    cs.IT

    Interference Cancellation Information Geometry Approach for Massive MIMO Channel Estimation

    Authors: An-An Lu, Bingyan Liu, Xiqi Gao

    Abstract: In this paper, the interference cancellation information geometry approaches (IC-IGAs) for massive MIMO channel estimation are proposed. The proposed algorithms are low-complexity approximations of the minimum mean square error (MMSE) estimation. To illustrate the proposed algorithms, a unified framework of the information geometry approach for channel estimation and its geometric explanation are… ▽ More

    Submitted 1 July, 2024; v1 submitted 27 June, 2024; originally announced June 2024.

    Comments: 38 pages, 9 figures

  8. arXiv:2405.05428  [pdf, other]

    cs.CV cs.CR cs.LG

    Adversary-Guided Motion Retargeting for Skeleton Anonymization

    Authors: Thomas Carr, Depeng Xu, Aidong Lu

    Abstract: Skeleton-based motion visualization is a rising field in computer vision, especially in the case of virtual reality (VR). With further advancements in human-pose estimation and skeleton extracting sensors, more and more applications that utilize skeleton data have come about. These skeletons may appear to be anonymous but they contain embedded personally identifiable information (PII). In this pap… ▽ More

    Submitted 8 May, 2024; originally announced May 2024.

  9. arXiv:2405.02717  [pdf, other]

    cs.CV

    AFter: Attention-based Fusion Router for RGBT Tracking

    Authors: Andong Lu, Wanyu Wang, Chenglong Li, Jin Tang, Bin Luo

    Abstract: Multi-modal feature fusion as a core investigative component of RGBT tracking emerges numerous fusion studies in recent years. However, existing RGBT tracking methods widely adopt fixed fusion structures to integrate multi-modal feature, which are hard to handle various challenges in dynamic scenarios. To address this problem, this work presents a novel Attention-based Fusion rou… ▽ More

    Submitted 4 May, 2024; originally announced May 2024.

    Comments: Peer review

  10. arXiv:2404.14829  [pdf, other]

    cs.LG cs.CV

    Revisiting Neural Networks for Continual Learning: An Architectural Perspective

    Authors: Aojun Lu, Tao Feng, Hangjie Yuan, Xiaotian Song, Yanan Sun

    Abstract: Efforts to overcome catastrophic forgetting have primarily centered around developing more effective Continual Learning (CL) methods. In contrast, less attention was devoted to analyzing the role of network architecture design (e.g., network depth, width, and components) in contributing to CL. This paper seeks to bridge this gap between network architecture design and CL, and to present a holistic… ▽ More

    Submitted 28 April, 2024; v1 submitted 23 April, 2024; originally announced April 2024.

  11. arXiv:2404.07425  [pdf, ps, other]

    eess.SP cs.IT

    Precoder Design for User-Centric Network Massive MIMO with Matrix Manifold Optimization

    Authors: Rui Sun, Li You, An-An Lu, Chen Sun, Xiqi Gao, Xiang-Gen Xia

    Abstract: In this paper, we investigate the precoder design for user-centric network (UCN) massive multiple-input multiple-output (mMIMO) downlink with matrix manifold optimization. In UCN mMIMO systems, each user terminal (UT) is served by a subset of base stations (BSs) instead of all the BSs, facilitating the implementation of the system and lowering the dimension of the precoders to be designed. By prov… ▽ More

    Submitted 10 April, 2024; originally announced April 2024.

    Comments: 13 pages, 9 figures, journal

  12. arXiv:2404.00986  [pdf, other]

    cs.LG cs.CV

    Make Continual Learning Stronger via C-Flat

    Authors: Ang Bian, Wei Li, Hangjie Yuan, Chengrong Yu, Zixiang Zhao, Mang Wang, Aojun Lu, Tao Feng

    Abstract: Model generalization ability upon incrementally acquiring dynamically updating knowledge from sequentially arriving tasks is crucial to tackle the sensitivity-stability dilemma in Continual Learning (CL). Weight loss landscape sharpness minimization seeking for flat minima lying in neighborhoods with uniform low loss or smooth gradient is proven to be a strong training regime improving model gener… ▽ More

    Submitted 1 April, 2024; originally announced April 2024.

  13. arXiv:2403.16151  [pdf, other]

    cs.MA cs.IR

    Ultra Low-Cost Two-Stage Multimodal System for Non-Normative Behavior Detection

    Authors: Albert Lu, Stephen Cranefield

    Abstract: The online community has increasingly been inundated by a toxic wave of harmful comments. In response to this growing challenge, we introduce a two-stage ultra-low-cost multimodal harmful behavior detection method designed to identify harmful comments and images with high precision and recall rates. We first utilize the CLIP-ViT model to transform tweets and images into embeddings, effectively cap… ▽ More

    Submitted 24 March, 2024; originally announced March 2024.

    Comments: to appear in International Workshop on Coordination, Organizations, Institutions, Norms and Ethics for Governance of Multi-Agent Systems

  14. arXiv:2403.13588  [pdf, other]

    cs.SE cs.CL

    Genetic Auto-prompt Learning for Pre-trained Code Intelligence Language Models

    Authors: Chengzhe Feng, Yanan Sun, Ke Li, Pan Zhou, Jiancheng Lv, Aojun Lu

    Abstract: As Pre-trained Language Models (PLMs), a popular approach for code intelligence, continue to grow in size, the computational cost of their usage has become prohibitively expensive. Prompt learning, a recent development in the field of natural language processing, emerges as a potential solution to address this challenge. In this paper, we investigate the effectiveness of prompt learning in code in… ▽ More

    Submitted 20 March, 2024; originally announced March 2024.

  15. arXiv:2403.02563  [pdf, ps, other]

    cs.CV cs.CL

    Systemic Biases in Sign Language AI Research: A Deaf-Led Call to Reevaluate Research Agendas

    Authors: Aashaka Desai, Maartje De Meulder, Julie A. Hochgesang, Annemarie Kocab, Alex X. Lu

    Abstract: Growing research in sign language recognition, generation, and translation AI has been accompanied by calls for ethical development of such technologies. While these works are crucial to helping individual researchers do better, there is a notable lack of discussion of systemic biases or analysis of rhetoric that shape the research questions and methods in the field, especially as it remains domin… ▽ More

    Submitted 4 March, 2024; originally announced March 2024.

  16. arXiv:2402.00033  [pdf, other]

    cs.CV cs.AI

    LF-ViT: Reducing Spatial Redundancy in Vision Transformer for Efficient Image Recognition

    Authors: Youbing Hu, Yun Cheng, Anqi Lu, Zhiqiang Cao, Dawei Wei, Jie Liu, Zhijun Li

    Abstract: The Vision Transformer (ViT) excels in accuracy when handling high-resolution images, yet it confronts the challenge of significant spatial redundancy, leading to increased computational and memory requirements. To address this, we present the Localization and Focus Vision Transformer (LF-ViT). This model operates by strategically curtailing computational demands without impinging on performance.… ▽ More

    Submitted 7 January, 2024; originally announced February 2024.

  17. arXiv:2401.02035  [pdf, ps, other]

    cs.IT

    Efficient Information Geometry Approach for Massive MIMO-OFDM Channel Estimation

    Authors: Jiyuan Yang, Yan Chen, Mingrui Fan, An-An Lu, Wen Zhong, Xiqi Gao, Xiaohu You, Xiang-Gen Xia, Dirk Slock

    Abstract: We investigate the channel estimation for massive multiple-input multiple-output orthogonal frequency division multiplexing (MIMO-OFDM) systems. We revisit the information geometry approach (IGA) for massive MIMO-OFDM channel estimation. By using the constant magnitude property of the entries of the measurement matrix, we find that the second-order natural parameters of the distributions on all th… ▽ More

    Submitted 3 June, 2024; v1 submitted 3 January, 2024; originally announced January 2024.

  18. arXiv:2401.01674  [pdf, other]

    cs.CV

    Transformer RGBT Tracking with Spatio-Temporal Multimodal Tokens

    Authors: Dengdi Sun, Yajie Pan, Andong Lu, Chenglong Li, Bin Luo

    Abstract: Many RGBT tracking researches primarily focus on modal fusion design, while overlooking the effective handling of target appearance changes. While some approaches have introduced historical frames or fuse and replace initial templates to incorporate temporal information, they have the risk of disrupting the original target appearance and accumulating errors over time. To alleviate these limitation… ▽ More

    Submitted 3 January, 2024; originally announced January 2024.

  19. arXiv:2312.16246  [pdf, other]

    cs.CV

    Nighttime Person Re-Identification via Collaborative Enhancement Network with Multi-domain Learning

    Authors: Andong Lu, Tianrui Zha, Chenglong Li, Jin Tang, Xiaofeng Wang, Bin Luo

    Abstract: Prevalent nighttime ReID methods typically combine relighting networks and ReID networks in a sequential manner, which not only restricts the ReID performance by the quality of relighting images, but also neglects the effective collaborative modeling between image relighting and person ReID tasks. To handle these problems, we propose a novel Collaborative Enhancement Network called CENet, which pe… ▽ More

    Submitted 25 December, 2023; originally announced December 2023.

  20. arXiv:2312.16244  [pdf, other]

    cs.CV

    Modality-missing RGBT Tracking: Invertible Prompt Learning and High-quality Benchmarks

    Authors: Andong Lu, Jiacong Zhao, Chenglong Li, Jin Tang, Bin Luo

    Abstract: Current RGBT tracking research relies on the complete multi-modal input, but modal information might miss due to some factors such as thermal sensor self-calibration and data transmission error, called modality-missing challenge in this work. To address this challenge, we propose a novel invertible prompt learning approach, which integrates the content-preserving prompts into a well-trained tracki… ▽ More

    Submitted 20 March, 2024; v1 submitted 25 December, 2023; originally announced December 2023.

  21. arXiv:2311.00690  [pdf, other]

    cs.HC cs.CV cs.LG

    What User Behaviors Make the Differences During the Process of Visual Analytics?

    Authors: Zekun Wu, Shahin Doroudian, Aidong Lu

    Abstract: The understanding of visual analytics process can benefit visualization researchers from multiple aspects, including improving visual designs and developing advanced interaction functions. However, the log files of user behaviors are still hard to analyze due to the complexity of sensemaking and our lack of knowledge on the related user behaviors. This work presents a study on a comprehensive data… ▽ More

    Submitted 3 December, 2023; v1 submitted 1 November, 2023; originally announced November 2023.

    Comments: This version corrects the issues of previous versions

  22. arXiv:2310.07822  [pdf, other]

    cs.RO

    Body-mounted MR-conditional Robot for Minimally Invasive Liver Intervention

    Authors: Zhefeng Huang, Anthony L. Gunderman, Samuel E. Wilcox, Saikat Sengupta, Jay Shah, Aiming Lu, David Woodrum, Yue Chen

    Abstract: MR-guided microwave ablation (MWA) has proven effective in treating hepatocellular carcinoma (HCC) with small-sized tumors, but the state-of-the-art technique suffers from sub-optimal workflow due to speed and accuracy of needle placement. This paper presents a compact body-mounted MR-conditional robot that can operate in closed-bore MR scanners for accurate needle guidance. The robotic platform c… ▽ More

    Submitted 25 March, 2024; v1 submitted 11 October, 2023; originally announced October 2023.

    Comments: 10 figures

  23. arXiv:2308.16486  [pdf, other]

    cs.CV

    Illumination Distillation Framework for Nighttime Person Re-Identification and A New Benchmark

    Authors: Andong Lu, Zhang Zhang, Yan Huang, Yifan Zhang, Chenglong Li, Jin Tang, Liang Wang

    Abstract: Nighttime person Re-ID (person re-identification in the nighttime) is a very important and challenging task for visual surveillance but it has not been thoroughly investigated. Under the low illumination condition, the performance of person Re-ID methods usually sharply deteriorates. To address the low illumination challenge in nighttime person Re-ID, this paper proposes an Illumination Distillati… ▽ More

    Submitted 31 August, 2023; originally announced August 2023.

    Comments: Accepted by TMM

  24. arXiv:2306.15123  [pdf, other]

    cs.CL cs.CE

    Investigating Cross-Domain Behaviors of BERT in Review Understanding

    Authors: Albert Lu, Meng Jiang

    Abstract: Review score prediction requires review text understanding, a critical real-world application of natural language processing. Due to dissimilar text domains in product reviews, a common practice is fine-tuning BERT models upon reviews of differing domains. However, there has not yet been an empirical study of cross-domain behaviors of BERT models in the various tasks of product review understandin… ▽ More

    Submitted 27 June, 2023; v1 submitted 26 June, 2023; originally announced June 2023.

    Comments: 9 pages, 1 figure, 2 tables

  25. arXiv:2305.00666  [pdf, other]

    cs.CV cs.AI

    Part Aware Contrastive Learning for Self-Supervised Action Recognition

    Authors: Yilei Hua, Wenhan Wu, Ce Zheng, Aidong Lu, Mengyuan Liu, Chen Chen, Shiqian Wu

    Abstract: In recent years, remarkable results have been achieved in self-supervised action recognition using skeleton sequences with contrastive learning. It has been observed that the semantic distinction of human action features is often represented by local body parts, such as legs or hands, which are advantageous for skeleton-based action recognition. This paper proposes an attention-based contrastive l… ▽ More

    Submitted 11 May, 2023; v1 submitted 1 May, 2023; originally announced May 2023.

    Comments: 7 pages, 4 figures, accepted by IJCAI 2023

  26. arXiv:2304.05934  [pdf, other]

    cs.CV cs.CL

    ASL Citizen: A Community-Sourced Dataset for Advancing Isolated Sign Language Recognition

    Authors: Aashaka Desai, Lauren Berger, Fyodor O. Minakov, Vanessa Milan, Chinmay Singh, Kriston Pumphrey, Richard E. Ladner, Hal Daumé III, Alex X. Lu, Naomi Caselli, Danielle Bragg

    Abstract: Sign languages are used as a primary language by approximately 70 million D/deaf people world-wide. However, most communication technologies operate in spoken and written languages, creating inequities in access. To help tackle this problem, we release ASL Citizen, the first crowdsourced Isolated Sign Language Recognition (ISLR) dataset, collected with consent and containing 83,399 videos for 2,73… ▽ More

    Submitted 19 June, 2023; v1 submitted 12 April, 2023; originally announced April 2023.

  27. arXiv:2304.03879  [pdf, other]

    cs.IR cs.LG

    GPT4Rec: A Generative Framework for Personalized Recommendation and User Interests Interpretation

    Authors: Jinming Li, Wentao Zhang, Tian Wang, Guanglei Xiong, Alan Lu, Gerard Medioni

    Abstract: Recent advancements in Natural Language Processing (NLP) have led to the development of NLP-based recommender systems that have shown superior performance. However, current models commonly treat items as mere IDs and adopt discriminative modeling, resulting in limitations of (1) fully leveraging the content information of items and the language modeling capabilities of NLP models; (2) interpreting… ▽ More

    Submitted 7 April, 2023; originally announced April 2023.

  28. Device Image-IV Mapping using Variational Autoencoder for Inverse Design and Forward Prediction

    Authors: Thomas Lu, Albert Lu, Hiu Yung Wong

    Abstract: This paper demonstrates the learning of the underlying device physics by mapping device structure images to their corresponding Current-Voltage (IV) characteristics using a novel framework based on variational autoencoders (VAE). Since VAE is used, domain expertise is not required and the framework can be quickly deployed on any new device and measurement. This is expected to be useful in the comp… ▽ More

    Submitted 3 April, 2023; originally announced April 2023.

    Comments: 5 pages 6 figures

    Journal ref: 2023 International Conference on Simulation of Semiconductor Processes and Devices (SISPAD), Kobe, Japan, 2023, pp. 161-164

  29. Precoder Design for Massive MIMO Downlink with Matrix Manifold Optimization

    Authors: Rui Sun, Chen Wang, An-An Lu, Xiqi Gao, Xiang-Gen Xia

    Abstract: We investigate the weighted sum-rate (WSR) maximization linear precoder design for massive multiple-input multiple-output (MIMO) downlink. We consider a single-cell system with multiple users and propose a unified matrix manifold optimization framework applicable to total power constraint (TPC), per-user power constraint (PUPC) and per-antenna power constraint (PAPC). We prove that the precoders u… ▽ More

    Submitted 10 April, 2024; v1 submitted 31 March, 2023; originally announced April 2023.

    Comments: 16 pages, 11 figures, journal

    Journal ref: IEEE Transactions on Signal Processing, vol. 72, pp. 1065-1080, 2024

  30. arXiv:2303.02241  [pdf, other]

    cs.CV cs.LG

    Domain adaptation using optimal transport for invariant learning using histopathology datasets

    Authors: Kianoush Falahkheirkhah, Alex Lu, David Alvarez-Melis, Grace Huynh

    Abstract: Histopathology is critical for the diagnosis of many diseases, including cancer. These protocols typically require pathologists to manually evaluate slides under a microscope, which is time-consuming and subjective, leading to interest in machine learning to automate analysis. However, computational techniques are limited by batch effects, where technical factors like differences in preparation pr… ▽ More

    Submitted 3 March, 2023; originally announced March 2023.

  31. arXiv:2302.09185  [pdf, other]

    cs.CL cs.AI cs.LG

    Bounding the Capabilities of Large Language Models in Open Text Generation with Prompt Constraints

    Authors: Albert Lu, Hongxin Zhang, Yanzhe Zhang, Xuezhi Wang, Diyi Yang

    Abstract: The limits of open-ended generative models are unclear, yet increasingly important. What causes them to succeed and what causes them to fail? In this paper, we take a prompt-centric approach to analyzing and bounding the abilities of open-ended generative models. We present a generic methodology of analysis with two challenging prompt constraint types: structural and stylistic. These constraint ty… ▽ More

    Submitted 17 February, 2023; originally announced February 2023.

    Comments: 27 pages, 13 figures, 11 tables, to be published in EACL 2023 Findings

  32. arXiv:2301.03410  [pdf, other]

    cs.CV

    In Defense of Structural Symbolic Representation for Video Event-Relation Prediction

    Authors: Andrew Lu, Xudong Lin, Yulei Niu, Shih-Fu Chang

    Abstract: Understanding event relationships in videos requires a model to understand the underlying structures of events (i.e. the event type, the associated argument roles, and corresponding entities) and factual knowledge for reasoning. Structural symbolic representation (SSR) based methods directly take event types and associated argument roles/entities as inputs to perform reasoning. However, the state-… ▽ More

    Submitted 12 April, 2023; v1 submitted 6 January, 2023; originally announced January 2023.

    Comments: CVPRW 23, Learning with Limited Labelled Data

  33. arXiv:2211.08110  [pdf, other]

    cs.AR cs.AI cs.CV

    HeatViT: Hardware-Efficient Adaptive Token Pruning for Vision Transformers

    Authors: Peiyan Dong, Mengshu Sun, Alec Lu, Yanyue Xie, Kenneth Liu, Zhenglun Kong, Xin Meng, Zhengang Li, Xue Lin, Zhenman Fang, Yanzhi Wang

    Abstract: While vision transformers (ViTs) have continuously achieved new milestones in the field of computer vision, their sophisticated network architectures with high computation and memory costs have impeded their deployment on resource-limited edge devices. In this paper, we propose a hardware-efficient image-adaptive token pruning framework called HeatViT for efficient yet accurate ViT acceleration on… ▽ More

    Submitted 24 February, 2023; v1 submitted 15 November, 2022; originally announced November 2022.

    Comments: HPCA 2023

  34. arXiv:2211.05256  [pdf, other]

    eess.IV cs.CV

    Power Efficient Video Super-Resolution on Mobile NPUs with Deep Learning, Mobile AI & AIM 2022 challenge: Report

    Authors: Andrey Ignatov, Radu Timofte, Cheng-Ming Chiang, Hsien-Kai Kuo, Yu-Syuan Xu, Man-Yu Lee, Allen Lu, Chia-Ming Cheng, Chih-Cheng Chen, Jia-Ying Yong, Hong-Han Shuai, Wen-Huang Cheng, Zhuang Jia, Tianyu Xu, Yijian Zhang, Long Bao, Heng Sun, Diankai Zhang, Si Gao, Shaoli Liu, Biao Wu, Xiaofeng Zhang, Chengjian Zheng, Kaidi Lu, Ning Wang, et al. (29 additional authors not shown)

    Abstract: Video super-resolution is one of the most popular tasks on mobile devices, being widely used for an automatic improvement of low-bitrate and low-resolution video streams. While numerous solutions have been proposed for this problem, they are usually quite computationally demanding, demonstrating low FPS rates and power efficiency on mobile devices. In this Mobile AI challenge, we address this prob… ▽ More

    Submitted 7 November, 2022; originally announced November 2022.

    Comments: arXiv admin note: text overlap with arXiv:2105.08826, arXiv:2105.07809, arXiv:2211.04470, arXiv:2211.03885

  35. arXiv:2209.15611  [pdf, other]

    q-bio.BM cs.AI

    Protein structure generation via folding diffusion

    Authors: Kevin E. Wu, Kevin K. Yang, Rianne van den Berg, James Y. Zou, Alex X. Lu, Ava P. Amini

    Abstract: The ability to computationally generate novel yet physically foldable protein structures could lead to new biological discoveries and new treatments targeting yet incurable diseases. Despite recent advances in protein structure prediction, directly generating diverse, novel protein structures from neural networks remains difficult. In this work, we present a new diffusion-based generative model th… ▽ More

    Submitted 23 November, 2022; v1 submitted 30 September, 2022; originally announced September 2022.

    ACM Class: I.2.0; J.3

  36. arXiv:2209.02399  [pdf, other]

    cs.CV

    SkeletonMAE: Spatial-Temporal Masked Autoencoders for Self-supervised Skeleton Action Recognition

    Authors: Wenhan Wu, Yilei Hua, Ce Zheng, Shiqian Wu, Chen Chen, Aidong Lu

    Abstract: Fully supervised skeleton-based action recognition has achieved great progress with the blooming of deep learning techniques. However, these methods require sufficient labeled data which is not easy to obtain. In contrast, self-supervised skeleton-based action recognition has attracted more attention. With utilizing the unlabeled data, more generalizable features can be learned to alleviate the ov… ▽ More

    Submitted 9 May, 2023; v1 submitted 1 September, 2022; originally announced September 2022.

    Comments: ICME 2023 Workshop

  37. arXiv:2208.10770  [pdf, other]

    cs.AR

    SASA: A Scalable and Automatic Stencil Acceleration Framework for Optimized Hybrid Spatial and Temporal Parallelism on HBM-based FPGAs

    Authors: Xingyu Tian, Zhifan Ye, Alec Lu, Licheng Guo, Yuze Chi, Zhenman Fang

    Abstract: Stencil computation is one of the fundamental computing patterns in many application domains such as scientific computing and image processing. While there are promising studies that accelerate stencils on FPGAs, there lacks an automated acceleration framework to systematically explore both spatial and temporal parallelisms for iterative stencils that could be either computation-bound or memory-bo… ▽ More

    Submitted 23 August, 2022; originally announced August 2022.

    ACM Class: C.3

  38. arXiv:2208.05163  [pdf, other]

    cs.CV cs.LG eess.IV

    Auto-ViT-Acc: An FPGA-Aware Automatic Acceleration Framework for Vision Transformer with Mixed-Scheme Quantization

    Authors: Zhengang Li, Mengshu Sun, Alec Lu, Haoyu Ma, Geng Yuan, Yanyue Xie, Hao Tang, Yanyu Li, Miriam Leeser, Zhangyang Wang, Xue Lin, Zhenman Fang

    Abstract: Vision transformers (ViTs) are emerging with significantly improved accuracy in computer vision tasks. However, their complex architecture and enormous computation/storage demand impose urgent needs for new hardware accelerator design methodology. This work proposes an FPGA-aware automatic ViT acceleration framework based on the proposed mixed-scheme quantization. To the best of our knowledge, thi… ▽ More

    Submitted 10 August, 2022; originally announced August 2022.

    Comments: Published in FPL2022

  39. Vertical GaN Diode BV Maximization through Rapid TCAD Simulation and ML-enabled Surrogate Model

    Authors: Albert Lu, Jordan Marshall, Yifan Wang, Ming Xiao, Yuhao Zhang, Hiu Yung Wong

    Abstract: In this paper, two methodologies are used to speed up the maximization of the breakdown voltage (BV) of a vertical GaN diode that has a theoretical maximum BV of ~2100V. Firstly, we demonstrated a 5X faster accurate simulation method in Technology Computer-Aided-Design (TCAD). This allows us to find 50% more numbers of high BV (>1400V) designs at a given simulation time. Secondly, a machine learn… ▽ More

    Submitted 18 July, 2022; originally announced August 2022.

    Comments: 4 pages, 7 figures

  40. arXiv:2207.12033  [pdf, other]

    cs.IR

    Contrastive Learning for Interactive Recommendation in Fashion

    Authors: Karin Sevegnani, Arjun Seshadri, Tian Wang, Anurag Beniwal, Julian McAuley, Alan Lu, Gerard Medioni

    Abstract: Recommender systems and search are both indispensable in facilitating personalization and ease of browsing in online fashion platforms. However, the two tools often operate independently, failing to combine the strengths of recommender systems to accurately capture user tastes with search systems' ability to process user queries. We propose a novel remedy to this problem by automatically recommend… ▽ More

    Submitted 25 July, 2022; originally announced July 2022.

  41. arXiv:2207.04695  [pdf, ps, other]

    cs.IT

    2D Beam Domain Statistical CSI Estimation for Massive MIMO Uplink

    Authors: An-An Lu, Yan Chen, Xiqi Gao

    Abstract: In this paper, we investigate the beam domain statistical channel state information (CSI) estimation for the two dimensional (2D) beam based statistical channel model (BSCM) in massive MIMO systems. The problem is to estimate the beam domain channel power matrices (BDCPMs) based on multiple receive pilot signals. A receive model shows the relation between the statistical property of the receive pil… ▽ More

    Submitted 11 July, 2022; originally announced July 2022.

    Comments: 28 pages, 7 figures, submitted to IEEE Trans. Wirel. Commun

  42. Serving Hybrid-Cloud SQL Interactive Queries at Twitter

    Authors: Chunxu Tang, Beinan Wang, Huijun Wu, Zhenzhao Wang, Yao Li, Vrushali Channapattan, Zhenxiao Luo, Ruchin Kabra, Mainak Ghosh, Nikhil Kantibhai Navadiya, Prachi Mishra, Prateek Mukhedkar, Anneliese Lu

    Abstract: The demand for data analytics has been consistently increasing in the past years at Twitter. In order to fulfill the requirements and provide a highly scalable and available query experience, a large-scale in-house SQL system is heavily relied on. Recently, we evolved the SQL system into a hybrid-cloud SQL federation system, compliant with Twitter's Partly Cloudy strategy. The hybrid-cloud SQL fed…

    Submitted 9 July, 2022; originally announced July 2022.

    Comments: Submitted to ECSA 2021 post-proceedings

  43. arXiv:2204.11338  [pdf, other

    cs.DB cs.DC

    Taming Hybrid-Cloud Fast and Scalable Graph Analytics at Twitter

    Authors: Chunxu Tang, Yao Li, Zhenxiao Luo, Mainak Ghosh, Huijun Wu, Lu Zhang, Anneliese Lu, Ruchin Kabra, Nikhil Kantibhai Navadiya, Prachi Mishra, Prateek Mukhedkar, Vrushali Channapattan

    Abstract: We have witnessed a boosted demand for graph analytics at Twitter in recent years, and graph analytics has become one of the key parts of Twitter's large-scale data analytics and machine learning for driving engagement, serving the most relevant content, and promoting healthier conversations. However, infrastructure for graph analytics has historically not been an area of investment at Twitter, re…

    Submitted 25 August, 2022; v1 submitted 24 April, 2022; originally announced April 2022.

    Comments: 6 pages, 7 figures, accepted at IEEE GLOBECOM 2022

  44. arXiv:2204.04812  [pdf, other

    cs.CV cs.AI cs.IR cs.LG

    OutfitTransformer: Learning Outfit Representations for Fashion Recommendation

    Authors: Rohan Sarkar, Navaneeth Bodla, Mariya I. Vasileva, Yen-Liang Lin, Anurag Beniwal, Alan Lu, Gerard Medioni

    Abstract: Learning an effective outfit-level representation is critical for predicting the compatibility of items in an outfit, and retrieving complementary items for a partial outfit. We present a framework, OutfitTransformer, that uses the proposed task-specific tokens and leverages the self-attention mechanism to learn effective outfit-level representations encoding the compatibility relationships betwee…

    Submitted 15 April, 2022; v1 submitted 10 April, 2022; originally announced April 2022.

  45. A Neural Network Solves, Explains, and Generates University Math Problems by Program Synthesis and Few-Shot Learning at Human Level

    Authors: Iddo Drori, Sarah Zhang, Reece Shuttleworth, Leonard Tang, Albert Lu, Elizabeth Ke, Kevin Liu, Linda Chen, Sunny Tran, Newman Cheng, Roman Wang, Nikhil Singh, Taylor L. Patti, Jayson Lynch, Avi Shporer, Nakul Verma, Eugene Wu, Gilbert Strang

    Abstract: We demonstrate that a neural network pre-trained on text and fine-tuned on code solves mathematics course problems, explains solutions, and generates new questions at a human level. We automatically synthesize programs using few-shot learning and OpenAI's Codex transformer and execute them to solve course problems at 81% automatic accuracy. We curate a new dataset of questions from MIT's largest m…

    Submitted 30 May, 2022; v1 submitted 31 December, 2021; originally announced December 2021.

    Comments: 181 pages, 8 figures, 280 tables

  46. arXiv:2111.12696  [pdf, other

    cs.CV cs.AI cs.HC

    A Lightweight Graph Transformer Network for Human Mesh Reconstruction from 2D Human Pose

    Authors: Ce Zheng, Matias Mendieta, Pu Wang, Aidong Lu, Chen Chen

    Abstract: Existing deep learning-based human mesh reconstruction approaches have a tendency to build larger networks in order to achieve higher accuracy. Computational complexity and model size are often neglected, despite being key characteristics for practical use of human mesh reconstruction models (e.g. virtual try-on systems). In this paper, we present GTRS, a lightweight pose-based method that can rec…

    Submitted 16 July, 2022; v1 submitted 24 November, 2021; originally announced November 2021.

    Comments: ACM Multimedia 2022

  47. arXiv:2111.11646  [pdf, other

    cs.CV cs.AI q-bio.QM

    CytoImageNet: A large-scale pretraining dataset for bioimage transfer learning

    Authors: Stanley Bryan Z. Hua, Alex X. Lu, Alan M. Moses

    Abstract: Motivation: In recent years, image-based biological assays have steadily become high-throughput, sparking a need for fast automated methods to extract biologically-meaningful information from hundreds of thousands of images. Taking inspiration from the success of ImageNet, we curate CytoImageNet, a large-scale dataset of openly-sourced and weakly-labeled microscopy images (890K images, 894 classes…

    Submitted 23 November, 2021; v1 submitted 22 November, 2021; originally announced November 2021.

    Comments: Accepted paper at NeurIPS 2021 Learning Meaningful Representations for Life (LMRL) Workshop

  48. arXiv:2111.00228  [pdf, other

    cs.CV cs.IR cs.MM

    whu-nercms at trecvid2021:instance search task

    Authors: Yanrui Niu, Jingyao Yang, Ankang Lu, Baojin Huang, Yue Zhang, Ji Huang, Shishi Wen, Dongshu Xu, Chao Liang, Zhongyuan Wang, Jun Chen

    Abstract: We will make a brief introduction of the experimental methods and results of the WHU-NERCMS in the TRECVID2021 in the paper. This year we participate in the automatic and interactive tasks of Instance Search (INS). For the automatic task, the retrieval target is divided into two parts, person retrieval, and action retrieval. We adopt a two-stage method including face detection and face recognition…

    Submitted 17 June, 2022; v1 submitted 30 October, 2021; originally announced November 2021.

    Comments: 9 pages, 4 figures

  49. arXiv:2110.10714  [pdf, other

    cs.GT eess.SY

    Auction Design through Multi-Agent Learning in Peer-to-Peer Energy Trading

    Authors: Zibo Zhao, Chen Feng, Andrew L. Lu

    Abstract: Distributed energy resources (DERs), such as rooftop solar panels, are growing rapidly and are reshaping power systems. To promote DERs, feed-in-tariff (FIT) is usually adopted by utilities to pay DER owners certain fixed rates for supplying energy to the grid. An alternative to FIT is a market-based approach; that is, consumers and DER owners trade energy in an auction-based peer-to-peer (P2P) ma…

    Submitted 20 October, 2021; originally announced October 2021.

  50. arXiv:2110.08932  [pdf, other

    cs.LG cs.AI cs.CR cs.CY

    Poisoning Attacks on Fair Machine Learning

    Authors: Minh-Hao Van, Wei Du, Xintao Wu, Aidong Lu

    Abstract: Both fair machine learning and adversarial learning have been extensively studied. However, attacking fair machine learning models has received less attention. In this paper, we present a framework that seeks to effectively generate poisoning samples to attack both model accuracy and algorithmic fairness. Our attacking framework can target fair machine learning models trained with a variety of gro…

    Submitted 17 October, 2021; originally announced October 2021.