Showing 1–50 of 62 results for author: Huang, N

Searching in archive cs.
  1. arXiv:2406.07049  [pdf, other]

    cs.NE cs.LG

    GridPE: Unifying Positional Encoding in Transformers with a Grid Cell-Inspired Framework

    Authors: Boyang Li, Yulin Wu, Nuoxian Huang

    Abstract: Understanding spatial location and relationships is a fundamental capability for modern artificial intelligence systems. Insights from human spatial cognition provide valuable guidance in this domain. Recent neuroscientific discoveries have highlighted the role of grid cells as a fundamental neural component for spatial representation, including distance computation, path integration, and scale di…

    Submitted 11 June, 2024; originally announced June 2024.

  2. arXiv:2405.20323  [pdf, other]

    cs.CV cs.AI

    $\textit{S}^3$Gaussian: Self-Supervised Street Gaussians for Autonomous Driving

    Authors: Nan Huang, Xiaobao Wei, Wenzhao Zheng, Pengju An, Ming Lu, Wei Zhan, Masayoshi Tomizuka, Kurt Keutzer, Shanghang Zhang

    Abstract: Photorealistic 3D reconstruction of street scenes is a critical technique for developing real-world simulators for autonomous driving. Despite the efficacy of Neural Radiance Fields (NeRF) for driving scenes, 3D Gaussian Splatting (3DGS) emerges as a promising direction due to its faster speed and more explicit representation. However, most existing street 3DGS methods require tracked 3D vehicle b…

    Submitted 30 May, 2024; originally announced May 2024.

    Comments: Code is available at: https://github.com/nnanhuang/S3Gaussian/

  3. arXiv:2405.19363  [pdf, other]

    eess.SP cs.AI cs.LG

    Medformer: A Multi-Granularity Patching Transformer for Medical Time-Series Classification

    Authors: Yihe Wang, Nan Huang, Taida Li, Yujun Yan, Xiang Zhang

    Abstract: Medical time series data, such as Electroencephalography (EEG) and Electrocardiography (ECG), play a crucial role in healthcare, such as diagnosing brain and heart diseases. Existing methods for medical time series classification primarily rely on handcrafted biomarkers extraction and CNN-based models, with limited exploration of transformers tailored for medical time series. In this paper, we int…

    Submitted 24 May, 2024; originally announced May 2024.

    Comments: 20 pages (14 pages main paper + 6 pages supplementary materials)

  4. arXiv:2405.15903  [pdf, other]

    cs.LG

    UnitNorm: Rethinking Normalization for Transformers in Time Series

    Authors: Nan Huang, Christian Kümmerle, Xiang Zhang

    Abstract: Normalization techniques are crucial for enhancing Transformer models' performance and stability in time series analysis tasks, yet traditional methods like batch and layer normalization often lead to issues such as token shift, attention shift, and sparse attention. We propose UnitNorm, a novel approach that scales input vectors by their norms and modulates attention patterns, effectively circumv…

    Submitted 24 May, 2024; originally announced May 2024.

  5. arXiv:2405.08097  [pdf, other]

    cs.LG math.AC

    Learning functions on symmetric matrices and point clouds via lightweight invariant features

    Authors: Ben Blum-Smith, Ningyuan Huang, Marco Cuturi, Soledad Villar

    Abstract: In this work, we present a mathematical formulation for machine learning of (1) functions on symmetric matrices that are invariant with respect to the action of permutations by conjugation, and (2) functions on point clouds that are invariant with respect to rotations, reflections, and permutations of the points. To achieve this, we construct $O(n^2)$ invariant features derived from generators for…

    Submitted 15 May, 2024; v1 submitted 13 May, 2024; originally announced May 2024.

    Comments: 28 pages, 2 figures, 2 tables

    MSC Class: 68P01; 13A50

  6. arXiv:2405.03352  [pdf, other]

    cs.CV

    Salient Object Detection From Arbitrary Modalities

    Authors: Nianchang Huang, Yang Yang, Ruida Xi, Qiang Zhang, Jungong Han, Jin Huang

    Abstract: Toward desirable saliency prediction, the types and numbers of inputs for a salient object detection (SOD) algorithm may dynamically change in many real-life applications. However, existing SOD algorithms are mainly designed or trained for one particular type of input and fail to generalize to other types of inputs. Consequently, more types of SOD algorithms need to be prepared in advance…

    Submitted 9 May, 2024; v1 submitted 6 May, 2024; originally announced May 2024.

    Comments: 15 Pages, 7 Figures, 8 Tables

  7. arXiv:2405.03351  [pdf, other]

    cs.CV

    Modality Prompts for Arbitrary Modality Salient Object Detection

    Authors: Nianchang Huang, Yang Yang, Qiang Zhang, Jungong Han, Jin Huang

    Abstract: This paper delves into the task of arbitrary modality salient object detection (AM SOD), aiming to detect salient objects from arbitrary modalities, e.g., RGB images, RGB-D images, and RGB-D-T images. A novel modality-adaptive Transformer (MAT) will be proposed to investigate two fundamental challenges of AM SOD, i.e., more diverse modality discrepancies caused by varying modality types that need to be…

    Submitted 6 May, 2024; originally announced May 2024.

    Comments: 13 pages, 7 Figures, 3 Tables

  8. An inexact augmented Lagrangian algorithm for unsymmetric saddle-point systems

    Authors: N. Huang, Y. -H. Dai, D. Orban, M. A. Saunders

    Abstract: Augmented Lagrangian (AL) methods are a well known class of algorithms for solving constrained optimization problems. They have been extended to the solution of saddle-point systems of linear equations. We study an AL (SPAL) algorithm for unsymmetric saddle-point systems and derive convergence and semi-convergence properties, even when the system is singular. At each step, our SPAL requires the ex…

    Submitted 22 April, 2024; originally announced April 2024.

    Report number: G-2024-30 MSC Class: 65F10; 65F50

  9. arXiv:2402.19273  [pdf, other]

    cs.CL

    PlanGPT: Enhancing Urban Planning with Tailored Language Model and Efficient Retrieval

    Authors: He Zhu, Wenjia Zhang, Nuoxian Huang, Boyang Li, Luyao Niu, Zipei Fan, Tianle Lun, Yicheng Tao, Junyou Su, Zhaoya Gong, Chenyu Fang, Xing Liu

    Abstract: In the field of urban planning, general-purpose large language models often struggle to meet the specific needs of planners. Tasks like generating urban planning texts, retrieving related information, and evaluating planning documents pose unique challenges. To enhance the efficiency of urban professionals and overcome these obstacles, we introduce PlanGPT, the first specialized Large Language Mod…

    Submitted 29 February, 2024; originally announced February 2024.

  10. arXiv:2402.00322  [pdf, other]

    cs.CL

    Bias in Opinion Summarisation from Pre-training to Adaptation: A Case Study in Political Bias

    Authors: Nannan Huang, Haytham Fayek, Xiuzhen Zhang

    Abstract: Opinion summarisation aims to summarise the salient information and opinions presented in documents such as product reviews, discussion forums, and social media texts into short summaries that enable users to effectively understand the opinions therein. Generating biased summaries has the risk of potentially swaying public opinion. Previous studies focused on studying bias in opinion summarisation…

    Submitted 31 January, 2024; originally announced February 2024.

    Comments: 15 pages, 1 figure, 6 tables, Accepted to EACL 2024

  11. arXiv:2401.14066  [pdf, other]

    cs.CV cs.AI

    CreativeSynth: Creative Blending and Synthesis of Visual Arts based on Multimodal Diffusion

    Authors: Nisha Huang, Weiming Dong, Yuxin Zhang, Fan Tang, Ronghui Li, Chongyang Ma, Xiu Li, Changsheng Xu

    Abstract: Large-scale text-to-image generative models have made impressive strides, showcasing their ability to synthesize a vast array of high-quality images. However, adapting these models for artistic image editing presents two significant challenges. Firstly, users struggle to craft textual prompts that meticulously detail visual elements of the input image. Secondly, prevalent models, when effecting mo…

    Submitted 30 January, 2024; v1 submitted 25 January, 2024; originally announced January 2024.

  12. arXiv:2312.12065  [pdf, other]

    cs.LG cs.AI

    PPO-Clip Attains Global Optimality: Towards Deeper Understandings of Clipping

    Authors: Nai-Chieh Huang, Ping-Chun Hsieh, Kuo-Hao Ho, I-Chen Wu

    Abstract: Proximal Policy Optimization algorithm employing a clipped surrogate objective (PPO-Clip) is a prominent exemplar of the policy optimization methods. However, despite its remarkable empirical success, PPO-Clip lacks theoretical substantiation to date. In this paper, we contribute to the field by establishing the first global convergence results of a PPO-Clip variant in both tabular and neural func…

    Submitted 19 February, 2024; v1 submitted 19 December, 2023; originally announced December 2023.

  13. arXiv:2312.11535  [pdf, other]

    cs.CV cs.AI

    Customize-It-3D: High-Quality 3D Creation from A Single Image Using Subject-Specific Knowledge Prior

    Authors: Nan Huang, Ting Zhang, Yuhui Yuan, Dong Chen, Shanghang Zhang

    Abstract: In this paper, we present a novel two-stage approach that fully utilizes the information provided by the reference image to establish a customized knowledge prior for image-to-3D generation. While previous approaches primarily rely on a general diffusion prior, which struggles to yield consistent results with the reference image, we propose a subject-specific and multi-modal diffusion model. This…

    Submitted 9 January, 2024; v1 submitted 15 December, 2023; originally announced December 2023.

    Comments: Project Page: https://nnanhuang.github.io/projects/customize-it-3d/

  14. arXiv:2312.05288  [pdf, other]

    cs.CV

    MotionCrafter: One-Shot Motion Customization of Diffusion Models

    Authors: Yuxin Zhang, Fan Tang, Nisha Huang, Haibin Huang, Chongyang Ma, Weiming Dong, Changsheng Xu

    Abstract: The essence of a video lies in its dynamic motions, including character actions, object movements, and camera movements. While text-to-video generative diffusion models have recently advanced in creating diverse contents, controlling specific motions through text prompts remains a significant challenge. A primary issue is the coupling of appearance and motion, often leading to overfitting on appea…

    Submitted 2 January, 2024; v1 submitted 8 December, 2023; originally announced December 2023.

  15. arXiv:2311.03912  [pdf, other]

    cs.CV cs.LG

    FLORA: Fine-grained Low-Rank Architecture Search for Vision Transformer

    Authors: Chi-Chih Chang, Yuan-Yao Sung, Shixing Yu, Ning-Chi Huang, Diana Marculescu, Kai-Chiang Wu

    Abstract: Vision Transformers (ViT) have recently demonstrated success across a myriad of computer vision tasks. However, their elevated computational demands pose significant challenges for real-world deployment. While low-rank approximation stands out as a renowned method to reduce computational loads, efficiently automating the target rank selection in ViT remains a challenge. Drawing from the notable si…

    Submitted 7 November, 2023; originally announced November 2023.

    Comments: Accepted by WACV 2024

  16. arXiv:2310.12911  [pdf, other]

    cs.CC cs.DS

    Tight approximability of MAX 2-SAT and relatives, under UGC

    Authors: Joshua Brakensiek, Neng Huang, Uri Zwick

    Abstract: Austrin showed that the approximation ratio $β\approx 0.94016567$ obtained by the MAX 2-SAT approximation algorithm of Lewin, Livnat and Zwick (LLZ) is optimal modulo the Unique Games Conjecture (UGC) and modulo a Simplicity Conjecture that states that the worst performance of the algorithm is obtained on so called simple configurations. We prove Austrin's conjecture, thereby showing the optimalit…

    Submitted 1 November, 2023; v1 submitted 19 October, 2023; originally announced October 2023.

    Comments: 32 pages, 1 figure; to appear in SODA 2024

  17. arXiv:2310.11897  [pdf, other]

    cs.LG

    Accelerated Policy Gradient: On the Convergence Rates of the Nesterov Momentum for Reinforcement Learning

    Authors: Yen-Ju Chen, Nai-Chieh Huang, Ching-Pei Lee, Ping-Chun Hsieh

    Abstract: Various acceleration approaches for Policy Gradient (PG) have been analyzed within the realm of Reinforcement Learning (RL). However, the theoretical understanding of the widely used momentum-based acceleration method on PG remains largely open. In response to this gap, we adapt the celebrated Nesterov's accelerated gradient (NAG) method to policy optimization in RL, termed Accelerated Pol…

    Submitted 6 June, 2024; v1 submitted 18 October, 2023; originally announced October 2023.

    Comments: 69 pages, 17 figures

  18. arXiv:2310.01563  [pdf, other]

    quant-ph cs.DM cs.DS math.PR

    Local algorithms and the failure of log-depth quantum advantage on sparse random CSPs

    Authors: Antares Chen, Neng Huang, Kunal Marwaha

    Abstract: We construct and analyze a message-passing algorithm for random constraint satisfaction problems (CSPs) at large clause density, generalizing work of El Alaoui, Montanari, and Sellke for Maximum Cut [arXiv:2111.06813] through a connection between random CSPs and mean-field Ising spin glasses. For CSPs with even predicates, the algorithm asymptotically solves a stochastic optimal control problem du…

    Submitted 2 October, 2023; originally announced October 2023.

    Comments: 29+13 pages, 3 figures, 1 table

  19. arXiv:2308.10436  [pdf, other]

    stat.ML cs.LG

    Approximately Equivariant Graph Networks

    Authors: Ningyuan Huang, Ron Levie, Soledad Villar

    Abstract: Graph neural networks (GNNs) are commonly described as being permutation equivariant with respect to node relabeling in the graph. This symmetry of GNNs is often compared to the translation equivariance of Euclidean convolution neural networks (CNNs). However, these two symmetries are fundamentally different: The translation equivariance of CNNs corresponds to symmetries of the fixed domain acting…

    Submitted 17 November, 2023; v1 submitted 20 August, 2023; originally announced August 2023.

    Comments: Accepted at NeurIPS 2023

  20. arXiv:2306.04424  [pdf, other]

    cs.CL

    Examining Bias in Opinion Summarisation Through the Perspective of Opinion Diversity

    Authors: Nannan Huang, Lin Tian, Haytham Fayek, Xiuzhen Zhang

    Abstract: Opinion summarisation is a task that aims to condense the information presented in the source documents while retaining the core message and opinions. A summary that only represents the majority opinions will leave the minority opinions unrepresented in the summary. In this paper, we use the stance towards a certain target as an opinion. We study bias in opinion summarisation from the perspective…

    Submitted 7 June, 2023; originally announced June 2023.

    Comments: 9 pages, 3 figures, accepted at WASSA, ACL 2023

  21. arXiv:2306.03698  [pdf, other]

    cs.LG cs.DM cs.NE

    Fine-grained Expressivity of Graph Neural Networks

    Authors: Jan Böker, Ron Levie, Ningyuan Huang, Soledad Villar, Christopher Morris

    Abstract: Numerous recent works have analyzed the expressive power of message-passing graph neural networks (MPNNs), primarily utilizing combinatorial techniques such as the $1$-dimensional Weisfeiler-Leman test ($1$-WL) for the graph isomorphism problem. However, the graph isomorphism objective is inherently binary, not giving insights into the degree of similarity between two given graphs. This work resol…

    Submitted 2 November, 2023; v1 submitted 6 June, 2023; originally announced June 2023.

    Comments: Accepted at NeurIPS 2023

  22. arXiv:2305.16225  [pdf, other]

    cs.GR cs.CV

    ProSpect: Prompt Spectrum for Attribute-Aware Personalization of Diffusion Models

    Authors: Yuxin Zhang, Weiming Dong, Fan Tang, Nisha Huang, Haibin Huang, Chongyang Ma, Tong-Yee Lee, Oliver Deussen, Changsheng Xu

    Abstract: Personalizing generative models offers a way to guide image generation with user-provided references. Current personalization methods can invert an object or concept into the textual conditioning space and compose new natural sentences for text-to-image diffusion models. However, representing and editing specific visual attributes such as material, style, and layout remains a challenge, leading to…

    Submitted 7 December, 2023; v1 submitted 25 May, 2023; originally announced May 2023.

  23. arXiv:2305.11081  [pdf, other]

    cs.IR

    Contrastive State Augmentations for Reinforcement Learning-Based Recommender Systems

    Authors: Zhaochun Ren, Na Huang, Yidan Wang, Pengjie Ren, Jun Ma, Jiahuan Lei, Xinlei Shi, Hengliang Luo, Joemon M Jose, Xin Xin

    Abstract: Learning reinforcement learning (RL)-based recommenders from historical user-item interaction sequences is vital to generate high-reward recommendations and improve long-term cumulative benefits. However, existing RL recommendation methods encounter difficulties (i) to estimate the value functions for states which are not contained in the offline training data, and (ii) to learn effective state re…

    Submitted 18 May, 2023; originally announced May 2023.

  24. arXiv:2305.09893  [pdf, other]

    cs.CV

    Integrating Multiple Sources Knowledge for Class Asymmetry Domain Adaptation Segmentation of Remote Sensing Images

    Authors: Kuiliang Gao, Anzhu Yu, Xiong You, Wenyue Guo, Ke Li, Ningbo Huang

    Abstract: In existing unsupervised domain adaptation (UDA) methods for semantic segmentation of remote sensing images (RSIs), class symmetry is a widely followed ideal assumption, where the source and target RSIs have exactly the same class space. In practice, however, it is often very difficult to find a source RSI with exactly the same classes as the target RSI. More commonly, there are multiple source…

    Submitted 16 May, 2023; originally announced May 2023.

    Comments: 17 pages, 10 figures

  25. arXiv:2305.05464  [pdf, other]

    cs.CV cs.MM

    Style-A-Video: Agile Diffusion for Arbitrary Text-based Video Style Transfer

    Authors: Nisha Huang, Yuxin Zhang, Weiming Dong

    Abstract: Large-scale text-to-video diffusion models have demonstrated an exceptional ability to synthesize diverse videos. However, due to the lack of extensive text-to-video datasets and the necessary computational resources for training, directly applying these models for video stylization remains difficult. Also, given that the noise addition process on the input content is random and destructive, fulfi…

    Submitted 9 May, 2023; originally announced May 2023.

  26. arXiv:2302.11797  [pdf, other]

    cs.CV cs.GR cs.MM

    Region-Aware Diffusion for Zero-shot Text-driven Image Editing

    Authors: Nisha Huang, Fan Tang, Weiming Dong, Tong-Yee Lee, Changsheng Xu

    Abstract: Image manipulation under the guidance of textual descriptions has recently received a broad range of attention. In this study, we focus on the regional editing of images with the guidance of given text prompts. Different from current mask-based image editing methods, we propose a novel region-aware diffusion model (RDM) for entity-level image editing, which could automatically locate the region of…

    Submitted 23 February, 2023; originally announced February 2023.

  27. arXiv:2212.11191  [pdf, other]

    cs.CC cs.DS math.NA

    Separating MAX 2-AND, MAX DI-CUT and MAX CUT

    Authors: Joshua Brakensiek, Neng Huang, Aaron Potechin, Uri Zwick

    Abstract: Assuming the Unique Games Conjecture (UGC), the best approximation ratio that can be obtained in polynomial time for the MAX CUT problem is $α_{\text{CUT}}\simeq 0.87856$, obtained by the celebrated SDP-based approximation algorithm of Goemans and Williamson. The currently best approximation algorithm for MAX DI-CUT, i.e., the MAX CUT problem in directed graphs, achieves a ratio of about…

    Submitted 12 April, 2023; v1 submitted 21 December, 2022; originally announced December 2022.

    Comments: 39 pages, 5 figures, 7 tables

  28. arXiv:2211.13203  [pdf, other]

    cs.CV cs.GR

    Inversion-Based Style Transfer with Diffusion Models

    Authors: Yuxin Zhang, Nisha Huang, Fan Tang, Haibin Huang, Chongyang Ma, Weiming Dong, Changsheng Xu

    Abstract: The artistic style within a painting is the means of expression, which includes not only the painting material, colors, and brushstrokes, but also the high-level attributes including semantic elements, object shapes, etc. Previous arbitrary example-guided artistic image generation methods often fail to control shape changes or convey elements. The pre-trained text-to-image synthesis diffusion prob…

    Submitted 20 March, 2023; v1 submitted 23 November, 2022; originally announced November 2022.

    Comments: accepted by CVPR 2023

  29. arXiv:2211.10682  [pdf, other]

    cs.CV cs.GR

    DiffStyler: Controllable Dual Diffusion for Text-Driven Image Stylization

    Authors: Nisha Huang, Yuxin Zhang, Fan Tang, Chongyang Ma, Haibin Huang, Yong Zhang, Weiming Dong, Changsheng Xu

    Abstract: Despite the impressive results of arbitrary image-guided style transfer methods, text-driven image stylization has recently been proposed for transferring a natural image into a stylized one according to textual descriptions of the target style provided by the user. Unlike the previous image-to-image transfer approaches, text-guided stylization progress provides users with a more precise and intui…

    Submitted 18 December, 2023; v1 submitted 19 November, 2022; originally announced November 2022.

  30. arXiv:2211.03231  [pdf, other]

    cs.SI cs.LG eess.SP

    A Spectral Analysis of Graph Neural Networks on Dense and Sparse Graphs

    Authors: Luana Ruiz, Ningyuan Huang, Soledad Villar

    Abstract: In this work we propose a random graph model that can produce graphs at different levels of sparsity. We analyze how sparsity affects the graph spectra, and thus the performance of graph neural networks (GNNs) in node classification on dense and sparse graphs. We compare GNNs with spectral methods known to provide consistent estimators for community detection on dense graphs, a closely related tas…

    Submitted 13 September, 2023; v1 submitted 6 November, 2022; originally announced November 2022.

    Comments: Extended version of ICASSP 2024 submission

  31. arXiv:2210.15083  [pdf, other]

    stat.ML cs.LG

    Deep Learning is Provably Robust to Symmetric Label Noise

    Authors: Carey E. Priebe, Ningyuan Huang, Soledad Villar, Cong Mu, Li Chen

    Abstract: Deep neural networks (DNNs) are capable of perfectly fitting the training data, including memorizing noisy data. It is commonly believed that memorization hurts generalization. Therefore, many recent works propose mitigation strategies to avoid noisy data or correct memorization. In this work, we step back and ask the question: Can deep learning be robust against massive label noise without any mi…

    Submitted 26 October, 2022; originally announced October 2022.

  32. arXiv:2209.13360  [pdf, other]

    cs.CV

    Draw Your Art Dream: Diverse Digital Art Synthesis with Multimodal Guided Diffusion

    Authors: Nisha Huang, Fan Tang, Weiming Dong, Changsheng Xu

    Abstract: Digital art synthesis is receiving increasing attention in the multimedia community because it engages the public with art effectively. Current digital art synthesis methods usually use single-modality inputs as guidance, thereby limiting the expressiveness of the model and the diversity of generated results. To solve this problem, we propose the multimodal guided artwork diffusion (MGAD) model,…

    Submitted 28 September, 2022; v1 submitted 27 September, 2022; originally announced September 2022.

    Comments: Accepted by ACM MM 2022

  33. arXiv:2209.12054  [pdf, other]

    stat.ML cs.LG

    From Local to Global: Spectral-Inspired Graph Neural Networks

    Authors: Ningyuan Huang, Soledad Villar, Carey E. Priebe, Da Zheng, Chengyue Huang, Lin Yang, Vladimir Braverman

    Abstract: Graph Neural Networks (GNNs) are powerful deep learning methods for Non-Euclidean data. Popular GNNs are message-passing algorithms (MPNNs) that aggregate and combine signals in a local graph neighborhood. However, shallow MPNNs tend to miss long-range signals and perform poorly on some heterophilous graphs, while deep MPNNs can suffer from issues like over-smoothing or over-squashing. To mitigate…

    Submitted 4 November, 2022; v1 submitted 24 September, 2022; originally announced September 2022.

    Comments: Accepted for publication at the NeurIPS 2022 GLFrontiers Workshop

  34. arXiv:2206.13163  [pdf, other]

    cs.CL cs.AI

    Endowing Language Models with Multimodal Knowledge Graph Representations

    Authors: Ningyuan Huang, Yash R. Deshpande, Yibo Liu, Houda Alberts, Kyunghyun Cho, Clara Vania, Iacer Calixto

    Abstract: We propose a method to make natural language understanding models more parameter efficient by storing knowledge in an external knowledge graph (KG) and retrieving from this KG using a dense index. Given (possibly multilingual) downstream task data, e.g., sentences in German, we retrieve entities from the KG and use their multimodal representations to improve downstream task performance. We use the…

    Submitted 27 June, 2022; originally announced June 2022.

    Comments: 14 pages with appendix, 2 figures, 15 tables

    MSC Class: 68T50 ACM Class: I.2.7; I.2.10; I.2.4

  35. arXiv:2206.12401  [pdf, other]

    cs.IR cs.CR cs.LG

    Debiasing Learning for Membership Inference Attacks Against Recommender Systems

    Authors: Zihan Wang, Na Huang, Fei Sun, Pengjie Ren, Zhumin Chen, Hengliang Luo, Maarten de Rijke, Zhaochun Ren

    Abstract: Learned recommender systems may inadvertently leak information about their training data, leading to privacy violations. We investigate privacy threats faced by recommender systems through the lens of membership inference. In such attacks, an adversary aims to infer whether a user's data is used to train the target recommender. To achieve this, previous work has used a shadow recommender to derive…

    Submitted 28 June, 2022; v1 submitted 24 June, 2022; originally announced June 2022.

    Comments: Accepted by KDD 2022

  36. A semi-conjugate gradient method for solving unsymmetric positive definite linear systems

    Authors: Na Huang, Yu-Hong Dai, Dominique Orban, Michael A Saunders

    Abstract: The conjugate gradient (CG) method is a classic Krylov subspace method for solving symmetric positive definite linear systems. We introduce an analogous semi-conjugate gradient (SCG) method for unsymmetric positive definite linear systems. Unlike CG, SCG requires the solution of a lower triangular linear system to produce each semi-conjugate direction. We prove that SCG is theoretically equivalent…

    Submitted 8 June, 2022; v1 submitted 6 June, 2022; originally announced June 2022.

    Report number: G-2022-25 MSC Class: 15A06; 65F10; 65F25; 65F50

  37. arXiv:2205.14299  [pdf, other]

    cs.LG cs.CV

    Deep Learning with Label Noise: A Hierarchical Approach

    Authors: Li Chen, Ningyuan Huang, Cong Mu, Hayden S. Helm, Kate Lytvynets, Weiwei Yang, Carey E. Priebe

    Abstract: Deep neural networks are susceptible to label noise. Existing methods to improve robustness, such as meta-learning and regularization, usually require significant change to the network architecture or careful tuning of the optimization procedure. In this work, we propose a simple hierarchical approach that incorporates a label hierarchy when training the deep learning models. Our approach requires…

    Submitted 27 May, 2022; originally announced May 2022.

    Comments: 8 pages, 7 figures

  38. arXiv:2202.04405  [pdf, other]

    eess.SP cs.SD eess.AS

    Time-Frequency Mask Aware Bi-directional LSTM: A Deep Learning Approach for Underwater Acoustic Signal Separation

    Authors: Jie Chen, Chang Liu, Jiawu Xie, Jie An, Nan Huang

    Abstract: Underwater acoustic signal separation is a key technique for underwater communications. Existing methods are mostly model-based and cannot accurately characterise the practical underwater acoustic communication environment. They are suitable only for binary signal separation and cannot handle multivariate signal separation. On the other hand, the recurrent neural network (RNN) sh…

    Submitted 9 February, 2022; originally announced February 2022.

    Comments: 28 pages, 14 figures

  39. A Short Tutorial on The Weisfeiler-Lehman Test And Its Variants

    Authors: Ningyuan Huang, Soledad Villar

    Abstract: Graph neural networks are designed to learn functions on graphs. Typically, the relevant target functions are invariant with respect to actions by permutations. Therefore the design of some graph neural network architectures has been inspired by graph-isomorphism algorithms. The classical Weisfeiler-Lehman algorithm (WL) -- a graph-isomorphism test based on color refinement -- became relevant to t…

    Submitted 1 November, 2022; v1 submitted 18 January, 2022; originally announced January 2022.

    Journal ref: IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2021

  40. arXiv:2201.06995  [pdf, ps, other]

    cs.IT

    Improved Receivers for Optical Wireless OFDM: An Information Theoretic Perspective

    Authors: Xiaozhen Liu, Jing Zhou, Nuo Huang, Wenyi Zhang

    Abstract: We consider performance enhancement of asymmetrically-clipped optical orthogonal frequency division multiplexing (ACO-OFDM) and related optical OFDM schemes, which are variations of OFDM in intensity-modulated optical wireless communications. Unlike most existing studies on specific designs of improved receivers, this paper investigates information theoretic limits of all possible receivers. For i…

    Submitted 4 May, 2022; v1 submitted 18 January, 2022; originally announced January 2022.

    Comments: 15 pages, 17 figures. To appear in IEEE Transactions on Communications

  41. arXiv:2201.03859  [pdf, other]

    cs.CV

    On Exploring Pose Estimation as an Auxiliary Learning Task for Visible-Infrared Person Re-identification

    Authors: Yunqi Miao, Nianchang Huang, Xiao Ma, Qiang Zhang, Jungong Han

    Abstract: Visible-infrared person re-identification (VI-ReID) has been challenging due to the existence of large discrepancies between visible and infrared modalities. Most pioneering approaches reduce intra-class variations and inter-modality discrepancies by learning modality-shared and ID-related features. However, an explicit modality-shared cue, i.e., body keypoints, has not been fully exploited in VI-…

    Submitted 23 February, 2022; v1 submitted 11 January, 2022; originally announced January 2022.

  42. arXiv:2111.02921  [pdf, ps, other]

    cs.IT eess.SP

    Map-Assisted Constellation Design for mmWave WDM with OAM in Short-Range LOS Environment

    Authors: Yuan Wang, Chen Gong, Nuo Huang, Zhengyuan Xu

    Abstract: We consider a system that integrates positioning and single-user millimeter wave (mmWave) communication, where the communication part adopts wavelength division multiplexing (WDM) and orbital angular momentum (OAM). This paper addresses the multi-dimensional constellation design in short-range line-of-sight (LOS) environment, with stable communication links. We propose a map-assisted method to quan…

    Submitted 11 October, 2022; v1 submitted 4 November, 2021; originally announced November 2021.

  43. arXiv:2110.13799  [pdf, other]

    cs.LG

    Neural PPO-Clip Attains Global Optimality: A Hinge Loss Perspective

    Authors: Nai-Chieh Huang, Ping-Chun Hsieh, Kuo-Hao Ho, Hsuan-Yu Yao, Kai-Chun Hu, Liang-Chun Ouyang, I-Chen Wu

    Abstract: Policy optimization is a fundamental principle for designing reinforcement learning algorithms, and one example is the proximal policy optimization algorithm with a clipped surrogate objective (PPO-Clip), which has been popularly used in deep reinforcement learning due to its simplicity and effectiveness. Despite its superior empirical performance, PPO-Clip has not been justified via theoretical p…

    Submitted 31 August, 2022; v1 submitted 26 October, 2021; originally announced October 2021.

    Comments: 33 pages, 1 figure

  44. Pulmonary Vessel Segmentation based on Orthogonal Fused U-Net++ of Chest CT Images

    Authors: Hejie Cui, Xinglong Liu, Ning Huang

    Abstract: Pulmonary vessel segmentation is important for clinical diagnosis of pulmonary diseases, while is also challenging due to the complicated structure. In this work, we present an effective framework and refinement process of pulmonary vessel segmentation from chest computed tomographic (CT) images. The key to our approach is a 2.5D segmentation network applied from three orthogonal axes, which prese…

    Submitted 3 July, 2021; originally announced July 2021.

    Comments: Published in Medical Image Computing and Computer Assisted Intervention (MICCAI 2019)

    MSC Class: 68T45; 68T07 ACM Class: I.2.10; J.3

  45. arXiv:2107.00719  [pdf, other]

    q-bio.BM cs.LG q-bio.QM

    Toward Drug-Target Interaction Prediction via Ensemble Modeling and Transfer Learning

    Authors: Po-Yu Kao, Shu-Min Kao, Nan-Lan Huang, Yen-Chu Lin

    Abstract: Drug-target interaction (DTI) prediction plays a crucial role in drug discovery, and deep learning approaches have achieved state-of-the-art performance in this field. We introduce an ensemble of deep learning models (EnsembleDLM) for DTI prediction. EnsembleDLM only uses the sequence information of chemical compounds and proteins, and it aggregates the predictions from multiple deep neural networ…

    Submitted 18 November, 2021; v1 submitted 2 July, 2021; originally announced July 2021.

    Comments: 8 pages, 1 figure, 10 tables

  46. Middle-level Fusion for Lightweight RGB-D Salient Object Detection

    Authors: Nianchang Huang, Qiang Zhang, Jungong Han

    Abstract: Most existing lightweight RGB-D salient object detection (SOD) models are based on two-stream structure or single-stream structure. The former one first uses two sub-networks to extract unimodal features from RGB and depth images, respectively, and then fuses them for SOD. While, the latter one directly extracts multi-modal features from the input RGB-D images and then focuses on exploiting cross-…

    Submitted 5 June, 2021; v1 submitted 23 April, 2021; originally announced April 2021.

    Comments: 11 pages, 6 figures

  47. arXiv:2104.11539  [pdf, other]

    cs.CV

    Exploring Modality-shared Appearance Features and Modality-invariant Relation Features for Cross-modality Person Re-Identification

    Authors: Nianchang Huang, Jianan Liu, Qiang Zhang, Jungong Han

    Abstract: Most existing cross-modality person re-identification works rely on discriminative modality-shared features for reducing cross-modality variations and intra-modality variations. Despite some initial success, such modality-shared appearance features cannot capture enough modality-invariant discriminative information due to a massive discrepancy between RGB and infrared images. To address this issue…

    Submitted 23 April, 2021; originally announced April 2021.

    Comments: 13 pages, 8 figures, submitted to TIP

  48. arXiv:2102.13123  [pdf, other]

    astro-ph.CO cs.CV cs.LG

    DeepSZ: Identification of Sunyaev-Zel'dovich Galaxy Clusters using Deep Learning

    Authors: Zhen Lin, Nicholas Huang, Camille Avestruz, W. L. Kimmy Wu, Shubhendu Trivedi, João Caldeira, Brian Nord

    Abstract: Galaxy clusters identified from the Sunyaev Zel'dovich (SZ) effect are a key ingredient in multi-wavelength cluster-based cosmology. We present a comparison between two methods of cluster identification: the standard Matched Filter (MF) method in SZ cluster finding and a method using Convolutional Neural Networks (CNN). We further implement and show results for a 'combined' identifier. We apply th…

    Submitted 8 March, 2021; v1 submitted 25 February, 2021; originally announced February 2021.

    Report number: FERMILAB-PUB-21-077-SCD

  49. arXiv:2011.11477  [pdf, other]

    stat.ML cs.LG

    Dimensionality reduction, regularization, and generalization in overparameterized regressions

    Authors: Ningyuan Huang, David W. Hogg, Soledad Villar

    Abstract: Overparameterization in deep learning is powerful: Very large models fit the training data perfectly and yet often generalize well. This realization brought back the study of linear models for regression, including ordinary least squares (OLS), which, like deep learning, shows a "double-descent" behavior: (1) The risk (expected out-of-sample prediction error) can grow arbitrarily when the number o…

    Submitted 19 October, 2021; v1 submitted 23 November, 2020; originally announced November 2020.

    Journal ref: SIAM Journal on Mathematics of Data Science Vol.4 Iss.1, 2022

  50. A Simple Spectral Failure Mode for Graph Convolutional Networks

    Authors: Carey E. Priebe, Cencheng Shen, Ningyuan Huang, Tianyi Chen

    Abstract: Neural networks have achieved remarkable successes in machine learning tasks. This has recently been extended to graph learning using neural networks. However, there is limited theoretical work in understanding how and when they perform well, especially relative to established statistical learning techniques such as spectral embedding. In this short paper, we present a simple generative model wher…

    Submitted 11 August, 2021; v1 submitted 25 October, 2020; originally announced October 2020.

    Journal ref: IEEE Transactions on Pattern Analysis and Machine Intelligence 44(11), 8689-8693, 2022