Showing 1–49 of 49 results for author: Nie, S

Searching in archive cs.
  1. arXiv:2407.21783  [pdf, other]

    cs.AI cs.CL cs.CV

    The Llama 3 Herd of Models

    Authors: Abhimanyu Dubey, Abhinav Jauhri, Abhinav Pandey, Abhishek Kadian, Ahmad Al-Dahle, Aiesha Letman, Akhil Mathur, Alan Schelten, Amy Yang, Angela Fan, Anirudh Goyal, Anthony Hartshorn, Aobo Yang, Archi Mitra, Archie Sravankumar, Artem Korenev, Arthur Hinsvark, Arun Rao, Aston Zhang, Aurelien Rodriguez, Austen Gregerson, Ava Spataru, Baptiste Roziere, Bethany Biron, Binh Tang, et al. (508 additional authors not shown)

    Abstract: Modern artificial intelligence (AI) systems are powered by foundation models. This paper presents a new set of foundation models, called Llama 3. It is a herd of language models that natively support multilinguality, coding, reasoning, and tool usage. Our largest model is a dense Transformer with 405B parameters and a context window of up to 128K tokens. This paper presents an extensive empirical…

    Submitted 31 July, 2024; originally announced July 2024.

  2. arXiv:2407.05740  [pdf, other]

    cs.CL

    Do Multilingual Large Language Models Mitigate Stereotype Bias?

    Authors: Shangrui Nie, Michael Fromm, Charles Welch, Rebekka Görge, Akbar Karimi, Joan Plepi, Nazia Afsan Mowmita, Nicolas Flores-Herr, Mehdi Ali, Lucie Flek

    Abstract: While preliminary findings indicate that multilingual LLMs exhibit reduced bias compared to monolingual ones, a comprehensive understanding of the effect of multilingual training on bias mitigation, is lacking. This study addresses this gap by systematically training six LLMs of identical size (2.6B parameters) and architecture: five monolingual models (English, German, French, Italian, and Spanis…

    Submitted 9 July, 2024; v1 submitted 8 July, 2024; originally announced July 2024.

    Comments: 19 pages, 8 figures, C3NLP 2024

  3. arXiv:2406.03736  [pdf, other]

    cs.LG cs.CL

    Your Absorbing Discrete Diffusion Secretly Models the Conditional Distributions of Clean Data

    Authors: Jingyang Ou, Shen Nie, Kaiwen Xue, Fengqi Zhu, Jiacheng Sun, Zhenguo Li, Chongxuan Li

    Abstract: Discrete diffusion models with absorbing processes have shown promise in language modeling. The key quantities to be estimated are the ratios between the marginal probabilities of two transitive states at all timesteps, called the concrete score. In this paper, we reveal that the concrete score in absorbing diffusion can be expressed as conditional probabilities of clean data, multiplied by a time…

    Submitted 6 July, 2024; v1 submitted 6 June, 2024; originally announced June 2024.

  4. arXiv:2405.05741  [pdf, ps, other]

    cs.CL cs.AI

    Can large language models understand uncommon meanings of common words?

    Authors: Jinyang Wu, Feihu Che, Xinxin Zheng, Shuai Zhang, Ruihan Jin, Shuai Nie, Pengpeng Shao, Jianhua Tao

    Abstract: Large language models (LLMs) like ChatGPT have shown significant advancements across diverse natural language understanding (NLU) tasks, including intelligent dialogue and autonomous agents. Yet, lacking widely acknowledged testing mechanisms, answering `whether LLMs are stochastic parrots or genuinely comprehend the world' remains unclear, fostering numerous studies and sparking heated debates. P…

    Submitted 9 May, 2024; originally announced May 2024.

  5. arXiv:2404.15766  [pdf, other]

    cs.LG cs.AI

    Unifying Bayesian Flow Networks and Diffusion Models through Stochastic Differential Equations

    Authors: Kaiwen Xue, Yuhao Zhou, Shen Nie, Xu Min, Xiaolu Zhang, Jun Zhou, Chongxuan Li

    Abstract: Bayesian flow networks (BFNs) iteratively refine the parameters, instead of the samples in diffusion models (DMs), of distributions at various noise levels through Bayesian inference. Owing to its differentiable nature, BFNs are promising in modeling both continuous and discrete data, while simultaneously maintaining fast sampling capabilities. This paper aims to understand and enhance BFNs by con…

    Submitted 2 June, 2024; v1 submitted 24 April, 2024; originally announced April 2024.

    Comments: Published as a conference paper at ICML 2024

  6. arXiv:2404.15660  [pdf, other]

    cs.CL

    KS-LLM: Knowledge Selection of Large Language Models with Evidence Document for Question Answering

    Authors: Xinxin Zheng, Feihu Che, Jinyang Wu, Shuai Zhang, Shuai Nie, Kang Liu, Jianhua Tao

    Abstract: Large language models (LLMs) suffer from the hallucination problem and face significant challenges when applied to knowledge-intensive tasks. A promising approach is to leverage evidence documents as extra supporting knowledge, which can be obtained through retrieval or generation. However, existing methods directly leverage the entire contents of the evidence document, which may introduce noise i…

    Submitted 24 April, 2024; originally announced April 2024.

  7. arXiv:2404.12980  [pdf, other]

    cs.HC

    Ring-a-Pose: A Ring for Continuous Hand Pose Tracking

    Authors: Tianhong Catherine Yu, Guilin Hu, Ruidong Zhang, Hyunchul Lim, Saif Mahmud, Chi-Jung Lee, Ke Li, Devansh Agarwal, Shuyang Nie, Jinseok Oh, François Guimbretière, Cheng Zhang

    Abstract: We present Ring-a-Pose, a single untethered ring that tracks continuous 3D hand poses. Located in the center of the hand, the ring emits an inaudible acoustic signal that each hand pose reflects differently. Ring-a-Pose imposes minimal obtrusions on the hand, unlike multi-ring or glove systems. It is not affected by the choice of clothing that may cover wrist-worn systems. In a series of three use…

    Submitted 19 April, 2024; originally announced April 2024.

  8. arXiv:2401.11161  [pdf, other]

    cs.SE

    BinaryAI: Binary Software Composition Analysis via Intelligent Binary Source Code Matching

    Authors: Ling Jiang, Junwen An, Huihui Huang, Qiyi Tang, Sen Nie, Shi Wu, Yuqun Zhang

    Abstract: While third-party libraries are extensively reused to enhance productivity during software development, they can also introduce potential security risks such as vulnerability propagation. Software composition analysis, proposed to identify reused TPLs for reducing such risks, has become an essential procedure within modern DevSecOps. As one of the mainstream SCA techniques, binary-to-source SCA id…

    Submitted 23 January, 2024; v1 submitted 20 January, 2024; originally announced January 2024.

    Comments: In Proceedings of the 46th International Conference on Software Engineering (ICSE'24)

  9. arXiv:2311.01410  [pdf, other]

    cs.CV cs.LG

    The Blessing of Randomness: SDE Beats ODE in General Diffusion-based Image Editing

    Authors: Shen Nie, Hanzhong Allan Guo, Cheng Lu, Yuhao Zhou, Chenyu Zheng, Chongxuan Li

    Abstract: We present a unified probabilistic formulation for diffusion-based image editing, where a latent variable is edited in a task-specific manner and generally deviates from the corresponding marginal distribution induced by the original stochastic or ordinary differential equation (SDE or ODE). Instead, it defines a corresponding SDE or ODE for editing. In the formulation, we prove that the Kullback-…

    Submitted 29 February, 2024; v1 submitted 2 November, 2023; originally announced November 2023.

  10. arXiv:2310.11738  [pdf, other]

    cs.SE

    Unleashing the Power of Clippy in Real-World Rust Projects

    Authors: Chunmiao Li, Yijun Yu, Haitao Wu, Luca Carlig, Shijie Nie, Lingxiao Jiang

    Abstract: Clippy lints are considered as essential tools for Rust developers, as they can be configured as gate-keeping rules for a Rust project during continuous integration. Despite their availability, little was known about practical application and cost-effectiveness of the lints in reducing code quality issues. In this study, we embark on a comprehensive analysis to unveil the true impact of Clippy lin…

    Submitted 18 October, 2023; originally announced October 2023.

  11. arXiv:2310.06530  [pdf, other]

    cs.SE

    Refining Decompiled C Code with Large Language Models

    Authors: Wai Kin Wong, Huaijin Wang, Zongjie Li, Zhibo Liu, Shuai Wang, Qiyi Tang, Sen Nie, Shi Wu

    Abstract: A C decompiler converts an executable into source code. The recovered C source code, once re-compiled, is expected to produce an executable with the same functionality as the original executable. With over twenty years of development, C decompilers have been widely used in production to support reverse engineering applications. Despite the prosperous development of C decompilers, it is widely ackn…

    Submitted 28 November, 2023; v1 submitted 10 October, 2023; originally announced October 2023.

  12. arXiv:2310.00183  [pdf, other]

    cs.LG cs.AI

    On the Equivalence of Graph Convolution and Mixup

    Authors: Xiaotian Han, Hanqing Zeng, Yu Chen, Shaoliang Nie, Jingzhou Liu, Kanika Narang, Zahra Shakeri, Karthik Abinav Sankararaman, Song Jiang, Madian Khabsa, Qifan Wang, Xia Hu

    Abstract: This paper investigates the relationship between graph convolution and Mixup techniques. Graph convolution in a graph neural network involves aggregating features from neighboring samples to learn representative features for a specific node or sample. On the other hand, Mixup is a data augmentation technique that generates new examples by averaging features and one-hot labels from multiple samples…

    Submitted 29 September, 2023; originally announced October 2023.

  13. arXiv:2305.13774  [pdf, other]

    cs.SD eess.AS

    ADD 2023: the Second Audio Deepfake Detection Challenge

    Authors: Jiangyan Yi, Jianhua Tao, Ruibo Fu, Xinrui Yan, Chenglong Wang, Tao Wang, Chu Yuan Zhang, Xiaohui Zhang, Yan Zhao, Yong Ren, Le Xu, Junzuo Zhou, Hao Gu, Zhengqi Wen, Shan Liang, Zheng Lian, Shuai Nie, Haizhou Li

    Abstract: Audio deepfake detection is an emerging topic in the artificial intelligence community. The second Audio Deepfake Detection Challenge (ADD 2023) aims to spur researchers around the world to build new innovative technologies that can further accelerate and foster research on detecting and analyzing deepfake speech utterances. Different from previous challenges (e.g. ADD 2022), ADD 2023 focuses on s…

    Submitted 23 May, 2023; originally announced May 2023.

  14. arXiv:2305.07095  [pdf, other]

    cs.CL cs.AI cs.LG

    Are Machine Rationales (Not) Useful to Humans? Measuring and Improving Human Utility of Free-Text Rationales

    Authors: Brihi Joshi, Ziyi Liu, Sahana Ramnath, Aaron Chan, Zhewei Tong, Shaoliang Nie, Qifan Wang, Yejin Choi, Xiang Ren

    Abstract: Among the remarkable emergent capabilities of large language models (LMs) is free-text rationalization; beyond a certain scale, large LMs are capable of generating seemingly useful rationalizations, which in turn, can dramatically enhance their performances on leaderboards. This phenomenon raises a question: can machine generated rationales also be useful for humans, especially when lay humans try…

    Submitted 11 May, 2023; originally announced May 2023.

    Comments: Accepted at ACL 2023

  15. arXiv:2304.02838  [pdf, other]

    cs.CR cs.AI cs.LG

    TBDetector: Transformer-Based Detector for Advanced Persistent Threats with Provenance Graph

    Authors: Nan Wang, Xuezhi Wen, Dalin Zhang, Xibin Zhao, Jiahui Ma, Mengxia Luo, Sen Nie, Shi Wu, Jiqiang Liu

    Abstract: APT detection is difficult to detect due to the long-term latency, covert and slow multistage attack patterns of Advanced Persistent Threat (APT). To tackle these issues, we propose TBDetector, a transformer-based advanced persistent threat detection method for APT attack detection. Considering that provenance graphs provide rich historical information and have the powerful attacks historic correl…

    Submitted 5 April, 2023; originally announced April 2023.

    Comments: 10 pages, 7 figures

  16. arXiv:2303.06555  [pdf, other]

    cs.LG cs.CV

    One Transformer Fits All Distributions in Multi-Modal Diffusion at Scale

    Authors: Fan Bao, Shen Nie, Kaiwen Xue, Chongxuan Li, Shi Pu, Yaole Wang, Gang Yue, Yue Cao, Hang Su, Jun Zhu

    Abstract: This paper proposes a unified diffusion framework (dubbed UniDiffuser) to fit all distributions relevant to a set of multi-modal data in one model. Our key insight is -- learning diffusion models for marginal, conditional, and joint distributions can be unified as predicting the noise in the perturbed data, where the perturbation levels (i.e. timesteps) can be different for different modalities. I…

    Submitted 30 May, 2023; v1 submitted 11 March, 2023; originally announced March 2023.

    Comments: Accepted to ICML2023

  17. arXiv:2302.12247  [pdf, other]

    cs.LG cs.AI cs.CL cs.CV cs.IT

    Quantifying & Modeling Multimodal Interactions: An Information Decomposition Framework

    Authors: Paul Pu Liang, Yun Cheng, Xiang Fan, Chun Kai Ling, Suzanne Nie, Richard Chen, Zihao Deng, Nicholas Allen, Randy Auerbach, Faisal Mahmood, Ruslan Salakhutdinov, Louis-Philippe Morency

    Abstract: The recent explosion of interest in multimodal applications has resulted in a wide selection of datasets and methods for representing and integrating information from different modalities. Despite these empirical advances, there remain fundamental research questions: How can we quantify the interactions that are necessary to solve a multimodal task? Subsequently, what are the most suitable multimo…

    Submitted 10 December, 2023; v1 submitted 23 February, 2023; originally announced February 2023.

    Comments: NeurIPS 2023. Code available at: https://github.com/pliang279/PID

  18. arXiv:2210.15500  [pdf, other]

    cs.CL cs.CY cs.IR cs.LG

    COFFEE: Counterfactual Fairness for Personalized Text Generation in Explainable Recommendation

    Authors: Nan Wang, Qifan Wang, Yi-Chia Wang, Maziar Sanjabi, Jingzhou Liu, Hamed Firooz, Hongning Wang, Shaoliang Nie

    Abstract: As language models become increasingly integrated into our digital lives, Personalized Text Generation (PTG) has emerged as a pivotal component with a wide range of applications. However, the bias inherent in user written text, often used for PTG model training, can inadvertently associate different levels of linguistic quality with users' protected attributes. The model can inherit the bias and p…

    Submitted 22 October, 2023; v1 submitted 13 October, 2022; originally announced October 2022.

    Comments: This is a long paper accepted by the Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP 2023)

  19. arXiv:2210.15159  [pdf, other]

    cs.SE

    Comparing One with Many -- Solving Binary2source Function Matching Under Function Inlining

    Authors: Ang Jia, Ming Fan, Xi Xu, Wuxia Jin, Haijun Wang, Qiyi Tang, Sen Nie, Shi Wu, Ting Liu

    Abstract: Binary2source function matching is a fundamental task for many security applications, including Software Component Analysis (SCA). The "1-to-1" mechanism has been applied in existing binary2source matching works, in which one binary function is matched against one source function. However, we discovered that such mapping could be "1-to-n" (one query binary function maps multiple source functions),…

    Submitted 26 October, 2022; originally announced October 2022.

  20. arXiv:2210.05883  [pdf, other]

    cs.CL

    AD-DROP: Attribution-Driven Dropout for Robust Language Model Fine-Tuning

    Authors: Tao Yang, Jinghao Deng, Xiaojun Quan, Qifan Wang, Shaoliang Nie

    Abstract: Fine-tuning large pre-trained language models on downstream tasks is apt to suffer from overfitting when limited training data is available. While dropout proves to be an effective antidote by randomly dropping a proportion of units, existing research has not examined its effect on the self-attention mechanism. In this paper, we investigate this problem through self-attention attribution and find…

    Submitted 11 October, 2022; originally announced October 2022.

    Comments: Accepted to NeurIPS 2022

  21. arXiv:2209.12152  [pdf, other]

    cs.CV cs.AI cs.LG

    All are Worth Words: A ViT Backbone for Diffusion Models

    Authors: Fan Bao, Shen Nie, Kaiwen Xue, Yue Cao, Chongxuan Li, Hang Su, Jun Zhu

    Abstract: Vision transformers (ViT) have shown promise in various vision tasks while the U-Net based on a convolutional neural network (CNN) remains dominant in diffusion models. We design a simple and general ViT-based architecture (named U-ViT) for image generation with diffusion models. U-ViT is characterized by treating all inputs including the time, condition and noisy image patches as tokens and emplo…

    Submitted 25 March, 2023; v1 submitted 25 September, 2022; originally announced September 2022.

    Comments: Accepted to CVPR 2023

  22. arXiv:2207.00779  [pdf, other]

    cs.CL cs.AI cs.LG

    FRAME: Evaluating Rationale-Label Consistency Metrics for Free-Text Rationales

    Authors: Aaron Chan, Shaoliang Nie, Liang Tan, Xiaochang Peng, Hamed Firooz, Maziar Sanjabi, Xiang Ren

    Abstract: Following how humans communicate, free-text rationales aim to use natural language to explain neural language model (LM) behavior. However, free-text rationales' unconstrained nature makes them prone to hallucination, so it is important to have metrics for free-text rationale quality. Existing free-text rationale metrics measure how consistent the rationale is with the LM's predicted label, but th…

    Submitted 2 December, 2022; v1 submitted 2 July, 2022; originally announced July 2022.

    Comments: BlackboxNLP Workshop at EMNLP 2022

  23. arXiv:2205.12542  [pdf, other]

    cs.CL

    ER-Test: Evaluating Explanation Regularization Methods for Language Models

    Authors: Brihi Joshi, Aaron Chan, Ziyi Liu, Shaoliang Nie, Maziar Sanjabi, Hamed Firooz, Xiang Ren

    Abstract: By explaining how humans would solve a given task, human rationales can provide strong learning signal for neural language models (LMs). Explanation regularization (ER) aims to improve LM generalization by pushing the LM's machine rationales (Which input tokens did the LM focus on?) to align with human rationales (Which input tokens would humans focus on?). Though prior works primarily study ER vi…

    Submitted 27 February, 2023; v1 submitted 25 May, 2022; originally announced May 2022.

    Comments: Findings of EMNLP 2022

  24. arXiv:2204.09191  [pdf, other]

    cs.SE

    Unleashing the Power of Compiler Intermediate Representation to Enhance Neural Program Embeddings

    Authors: Zongjie Li, Pingchuan Ma, Huaijin Wang, Shuai Wang, Qiyi Tang, Sen Nie, Shi Wu

    Abstract: Neural program embeddings have demonstrated considerable promise in a range of program analysis tasks, including clone identification, program repair, code completion, and program synthesis. However, most existing methods generate neural program embeddings directly from the program source codes, by learning from features such as tokens, abstract syntax trees, and control flow graphs. This paper ta…

    Submitted 19 April, 2022; originally announced April 2022.

  25. arXiv:2204.05990  [pdf, other]

    cs.CL

    Detection, Disambiguation, Re-ranking: Autoregressive Entity Linking as a Multi-Task Problem

    Authors: Khalil Mrini, Shaoliang Nie, Jiatao Gu, Sinong Wang, Maziar Sanjabi, Hamed Firooz

    Abstract: We propose an autoregressive entity linking model, that is trained with two auxiliary tasks, and learns to re-rank generated samples at inference time. Our proposed novelties address two weaknesses in the literature. First, a recent method proposes to learn mention detection and then entity candidate selection, but relies on predefined sets of candidates. We use encoder-decoder autoregressive enti…

    Submitted 12 April, 2022; originally announced April 2022.

    Comments: Long paper accepted to ACL 2022 Findings

  26. arXiv:2202.08433  [pdf, ps, other]

    cs.SD cs.LG eess.AS

    ADD 2022: the First Audio Deep Synthesis Detection Challenge

    Authors: Jiangyan Yi, Ruibo Fu, Jianhua Tao, Shuai Nie, Haoxin Ma, Chenglong Wang, Tao Wang, Zhengkun Tian, Xiaohui Zhang, Ye Bai, Cunhang Fan, Shan Liang, Shiming Wang, Shuai Zhang, Xinrui Yan, Le Xu, Zhengqi Wen, Haizhou Li, Zheng Lian, Bin Liu

    Abstract: Audio deepfake detection is an emerging topic, which was included in the ASVspoof 2021. However, the recent shared tasks have not covered many real-life and challenging scenarios. The first Audio Deep synthesis Detection challenge (ADD) was motivated to fill in the gap. The ADD 2022 includes three tracks: low-quality fake audio detection (LF), partially fake audio detection (PF) and audio fake gam…

    Submitted 2 July, 2024; v1 submitted 16 February, 2022; originally announced February 2022.

    Comments: Accepted by ICASSP 2022

  27. arXiv:2201.00072  [pdf, other]

    cs.LG

    BARACK: Partially Supervised Group Robustness With Guarantees

    Authors: Nimit S. Sohoni, Maziar Sanjabi, Nicolas Ballas, Aditya Grover, Shaoliang Nie, Hamed Firooz, Christopher Ré

    Abstract: While neural networks have shown remarkable success on classification tasks in terms of average-case performance, they often fail to perform well on certain groups of the data. Such group information may be expensive to obtain; thus, recent works in robustness and fairness have proposed ways to improve worst-group performance even when group labels are unavailable for the training data. However, t…

    Submitted 10 April, 2022; v1 submitted 31 December, 2021; originally announced January 2022.

    Comments: 26 pages

  28. arXiv:2112.12928  [pdf, other]

    cs.SE

    1-to-1 or 1-to-n? Investigating the effect of function inlining on binary similarity analysis

    Authors: Ang Jia, Ming Fan, Wuxia Jin, Xi Xu, Zhaohui Zhou, Qiyi Tang, Sen Nie, Shi Wu, Ting Liu

    Abstract: Binary similarity analysis is critical to many code-reuse-related issues and "1-to-1" mechanism is widely applied, where one function in a binary file is matched against one function in a source file or binary file. However, we discover that function mapping is a more complex problem of "1-to-n" or even "n-to-n" due to the existence of function inlining. In this paper, we investigate the effect…

    Submitted 5 May, 2022; v1 submitted 23 December, 2021; originally announced December 2021.

  29. arXiv:2112.08802  [pdf, other]

    cs.CL cs.AI cs.LG

    UNIREX: A Unified Learning Framework for Language Model Rationale Extraction

    Authors: Aaron Chan, Maziar Sanjabi, Lambert Mathias, Liang Tan, Shaoliang Nie, Xiaochang Peng, Xiang Ren, Hamed Firooz

    Abstract: An extractive rationale explains a language model's (LM's) prediction on a given task instance by highlighting the text inputs that most influenced the prediction. Ideally, rationale extraction should be faithful (reflective of LM's actual behavior) and plausible (convincing to humans), without compromising the LM's (i.e., task model's) task performance. Although attribution algorithms and select-…

    Submitted 26 February, 2023; v1 submitted 16 December, 2021; originally announced December 2021.

    Comments: ICML 2022

  30. arXiv:2104.00226  [pdf, other]

    cs.CV

    DF^2AM: Dual-level Feature Fusion and Affinity Modeling for RGB-Infrared Cross-modality Person Re-identification

    Authors: Junhui Yin, Zhanyu Ma, Jiyang Xie, Shibo Nie, Kongming Liang, Jun Guo

    Abstract: RGB-infrared person re-identification is a challenging task due to the intra-class variations and cross-modality discrepancy. Existing works mainly focus on learning modality-shared global representations by aligning image styles or feature distributions across modalities, while local feature from body part and relationships between person images are largely neglected. In this paper, we propose a…

    Submitted 31 March, 2021; originally announced April 2021.

  31. arXiv:2101.01881  [pdf, other]

    cs.CV cs.LG

    MSD: Saliency-aware Knowledge Distillation for Multimodal Understanding

    Authors: Woojeong Jin, Maziar Sanjabi, Shaoliang Nie, Liang Tan, Xiang Ren, Hamed Firooz

    Abstract: To reduce a model size but retain performance, we often rely on knowledge distillation (KD) which transfers knowledge from a large "teacher" model to a smaller "student" model. However, KD on multimodal datasets such as vision-language tasks is relatively unexplored, and digesting multimodal information is challenging since different modalities present different types of information. In this paper…

    Submitted 21 October, 2021; v1 submitted 6 January, 2021; originally announced January 2021.

    Comments: Accepted to EMNLP 2021 Findings

  32. arXiv:2008.07742  [pdf, other]

    eess.IV cs.CV

    UDC 2020 Challenge on Image Restoration of Under-Display Camera: Methods and Results

    Authors: Yuqian Zhou, Michael Kwan, Kyle Tolentino, Neil Emerton, Sehoon Lim, Tim Large, Lijiang Fu, Zhihong Pan, Baopu Li, Qirui Yang, Yihao Liu, Jigang Tang, Tao Ku, Shibin Ma, Bingnan Hu, Jiarong Wang, Densen Puthussery, Hrishikesh P S, Melvin Kuriakose, Jiji C V, Varun Sundar, Sumanth Hegde, Divya Kothandaraman, Kaushik Mitra, Akashdeep Jassal, et al. (20 additional authors not shown)

    Abstract: This paper is the report of the first Under-Display Camera (UDC) image restoration challenge in conjunction with the RLQ workshop at ECCV 2020. The challenge is based on a newly-collected database of Under-Display Camera. The challenge tracks correspond to two types of display: a 4k Transparent OLED (T-OLED) and a phone Pentile OLED (P-OLED). Along with about 150 teams registered the challenge, ei…

    Submitted 18 August, 2020; originally announced August 2020.

    Comments: 15 pages

  33. arXiv:2003.10270  [pdf, other]

    cs.ET cs.NI eess.SP

    Mobility-aware Beam Steering in Metasurface-based Programmable Wireless Environments

    Authors: Christos Liaskos, Shuai Nie, Ageliki Tsioliaridou, Andreas Pitsillides, Sotiris Ioannidis, Ian Akyildiz

    Abstract: Programmable wireless environments (PWEs) utilize electromagnetic metasurfaces to transform wireless propagation into a software-controlled resource. In this work we study the effects of user device mobility on the efficiency of PWEs. An analytical model is proposed, which describes the potential misalignment between user-emitted waves and the active PWE configuration, and can constitute the basis…

    Submitted 23 March, 2020; originally announced March 2020.

    Comments: In proceedings of IEEE ICASSP 2020. This work was funded by the European Union via the Horizon 2020: Future Emerging Topics call (FETOPEN-RIA), grant EU736876, project VISORSURF (http://visorsurf.eu)

  34. arXiv:1907.00037  [pdf, other]

    eess.SP cs.IT

    3D Channel Modeling and Characterization for Hypersurface Empowered Indoor Environment at 60 GHz Millimeter-Wave Band

    Authors: Rashi Mehrotra, Rafay Iqbal Ansari, Alexandros Pitilakis, Shuai Nie, Christos Liaskos, Nikolaos V. Kantartzis, Andreas Pitsillides

    Abstract: This paper proposes a three-dimensional (3D) communication channel model for an indoor environment considering the effect of the Hypersurface. The Hypersurface is a software controlled intelligent metasurface, which can be used to manipulate electromagnetic waves, as for example for non-specular reflection and full absorption. Thus it can control the impinging rays from a transmitter towards a rec…

    Submitted 28 June, 2019; originally announced July 2019.

    Comments: Accepted

  35. arXiv:1905.02495  [pdf, other]

    cs.ET cs.LG cs.NI

    An Interpretable Neural Network for Configuring Programmable Wireless Environments

    Authors: Christos Liaskos, Ageliki Tsioliaridou, Shuai Nie, Andreas Pitsillides, Sotiris Ioannidis, Ian Akyildiz

    Abstract: Software-defined metasurfaces (SDMs) comprise a dense topology of basic elements called meta-atoms, exerting the highest degree of control over surface currents among intelligent panel technologies. As such, they can transform impinging electromagnetic (EM) waves in complex ways, modifying their direction, power, frequency spectrum, polarity and phase. A well-defined software interface allows for…

    Submitted 7 May, 2019; originally announced May 2019.

    Comments: In proceedings of IEEE SPAWC 2019 - Special Session on Signal Processing Advances for Emerging Transceiver Hardware. This work was funded by the European Union via the Horizon 2020: Future Emerging Topics call (FETOPEN), grant EU736876, project VISORSURF (http://www.visorsurf.eu)

  36. Modeling, Simulating and Configuring Programmable Wireless Environments for Multi-User Multi-Objective Networking

    Authors: Christos Liaskos, Ageliki Tsioliaridou, Shuai Nie, Andreas Pitsillides, Sotiris Ioannidis, Ian Akyildiz

    Abstract: Programmable wireless environments enable the software-defined propagation of waves within them, yielding exceptional performance potential. Several building-block technologies have been implemented and evaluated at the physical layer. The present work contributes a network-layer scheme to configure such environments for multiple users and objectives, and for any physical-layer technology. Support…

    Submitted 29 December, 2018; originally announced December 2018.

    Comments: This work is part of project VISORSURF: A HyperVisor for Metasurface Functionalities (www.visorsurf.eu). Funded by the European Union Horizon 2020, under the Future Emerging Technologies - Research and Innovation Actions call (Grant Agreement EU 736876)

    Report number: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=90

    Journal ref: IEEE/ACM Transactions on Networking, volume: 27 , issue: 4 , pp. 1696-1713, Aug. 2019

  37. A Novel Communication Paradigm for High Capacity and Security via Programmable Indoor Wireless Environments in Next Generation Wireless Systems

    Authors: Christos Liaskos, Shuai Nie, Ageliki Tsioliaridou, Andreas Pitsillides, Sotiris Ioannidis, Ian Akyildiz

    Abstract: Wireless communication environments comprise passive objects that cause performance degradation and eavesdropping concerns due to anomalous scattering. This paper proposes a new paradigm, where scattering becomes software-defined and, subsequently, optimizable across wide frequency ranges. Through the proposed programmable wireless environments, the path loss, multi-path fading and interference ef…

    Submitted 8 November, 2018; originally announced December 2018.

    Comments: This work was partially funded by the European Union via the Horizon 2020: Future Emerging Topics call (FETOPEN), grant EU736876, project VISORSURF. admin note: significant overlap with arXiv:1805.06677

  38. arXiv:1811.00883  [pdf, other]

    eess.AS cs.LG cs.SD stat.ML

    Deep Segment Attentive Embedding for Duration Robust Speaker Verification

    Authors: Bin Liu, Shuai Nie, Yaping Zhang, Shan Liang, Wenju Liu

    Abstract: LSTM-based speaker verification usually uses a fixed-length local segment randomly truncated from an utterance to learn the utterance-level speaker embedding, while using the average embedding of all segments of a test utterance to verify the speaker, which results in a critical mismatch between testing and training. This mismatch degrades the performance of speaker verification, especially when t…

    Submitted 31 October, 2018; originally announced November 2018.

  39. A Swarming Approach to Optimize the One-hop Delay in Smart Driving Inter-platoon Communications

    Authors: Qiong Wu, Shuzhen Nie, Pingyi Fan, Hanxu Liu, Fan Qiang, Zhengquan Li

    Abstract: In this paper, we propose a swarming approach and optimize the one-hop delay for interplatoon communications through adjusting the minimum contention window size of each backbone vehicle in two steps. In the first step, we first set a small enough average one-hop delay as the initial optimization goal and then propose a swarming approach to find a minimum average one-hop delay for inter-platoon co…

    Submitted 2 November, 2020; v1 submitted 19 July, 2018; originally announced July 2018.

    Comments: Published in Sensors. Simulation code is available online at https://codeocean.com/2018/06/28/code-for-colon-a-swarming-approach-to

    Journal ref: Sensors 2018, 18(10), 3307

  40. arXiv:1806.01792  [pdf, other]

    eess.SP cs.ET cs.NI eess.SY

    A New Wireless Communication Paradigm through Software-controlled Metasurfaces

    Authors: Christos Liaskos, Shuai Nie, Ageliki Tsioliaridou, Andreas Pitsillides, Sotiris Ioannidis, Ian Akyildiz

    Abstract: Electromagnetic waves undergo multiple uncontrollable alterations as they propagate within a wireless environment. Free space path loss, signal absorption, as well as reflections, refractions and diffractions caused by physical objects within the environment highly affect the performance of wireless communications. Currently, such effects are intractable to account for and are treated as probabili…

    Submitted 4 June, 2018; originally announced June 2018.

    Comments: Paper accepted for publication at the IEEE Communications Magazine. This work was funded by the European Union via the Horizon 2020: Future Emerging Topics call (FETOPEN-RIA), grant EU736876, project VISORSURF: HyperSurfaces-A Hardware Platform for Software-driven Functional Metasurfaces (http://www.visorsurf.eu/)

  41. arXiv:1805.06677  [pdf, other]

    cs.ET cs.NI eess.SY

    Realizing Wireless Communication through Software-defined HyperSurface Environments

    Authors: Christos Liaskos, Shuai Nie, Ageliki Tsioliaridou, Andreas Pitsillides, Sotiris Ioannidis, Ian Akyildiz

    Abstract: Wireless communication environments are unaware of the ongoing data exchange efforts within them. Moreover, their effect on the communication quality is intractable in all but the simplest cases. The present work proposes a new paradigm, where indoor scattering becomes software-defined and, subsequently, optimizable across wide frequency ranges. Moreover, the controlled scattering can surpass natu…

    Submitted 17 May, 2018; originally announced May 2018.

    Comments: This paper appears at the 19TH IEEE WOWMOM 2018, JUNE 12-15, 2018. (Technical program: http://it.murdoch.edu.au/wowmom2018/technical_program.html) This work was funded by the European Union via the Horizon 2020: Future Emerging Topics call (FETOPEN-RIA), grant EU736876, project VISORSURF (http://www.visorsurf.eu) : HyperSurfaces-A Hardware Platform for Software-driven Functional Metasurfaces

  42. arXiv:1805.01357  [pdf, ps, other]

    cs.SD cs.LG eess.AS

    Boosting Noise Robustness of Acoustic Model via Deep Adversarial Training

    Authors: Bin Liu, Shuai Nie, Yaping Zhang, Dengfeng Ke, Shan Liang, Wenju Liu

    Abstract: In realistic environments, speech is usually interfered by various noise and reverberation, which dramatically degrades the performance of automatic speech recognition (ASR) systems. To alleviate this issue, the commonest way is to use a well-designed speech enhancement approach as the front-end of ASR. However, more complex pipelines, more computations and even higher hardware costs (microphone a…

    Submitted 2 May, 2018; originally announced May 2018.

  43. arXiv:1803.00757  [pdf, other]

    cs.HC cs.RO

    Gesture-based Piloting of an Aerial Robot using Monocular Vision

    Authors: Ting Sun, Shengyi Nie, Dit-Yan Yeung, Shaojie Shen

    Abstract: Aerial robots are becoming popular among general public, and with the development of artificial intelligence (AI), there is a trend to equip aerial robots with a natural user interface (NUI). Hand/arm gestures are an intuitive way to communicate for humans, and various research works have focused on controlling an aerial robot with natural gestures. However, the techniques in this area are still f…

    Submitted 2 March, 2018; originally announced March 2018.

  44. arXiv:1801.07632  [pdf, other]

    cs.CV cs.GR

    High Resolution Face Completion with Multiple Controllable Attributes via Fully End-to-End Progressive Generative Adversarial Networks

    Authors: Zeyuan Chen, Shaoliang Nie, Tianfu Wu, Christopher G. Healey

    Abstract: We present a deep learning approach for high resolution face completion with multiple controllable attributes (e.g., male and smiling) under arbitrary masks. Face completion entails understanding both structural meaningfulness and appearance consistency locally and globally to fill in "holes" whose content do not appear elsewhere in an input image. It is a challenging task with the difficulty leve…

    Submitted 23 January, 2018; originally announced January 2018.

  45. A Generative Restricted Boltzmann Machine Based Method for High-Dimensional Motion Data Modeling

    Authors: Siqi Nie, Ziheng Wang, Qiang Ji

    Abstract: Many computer vision applications involve modeling complex spatio-temporal patterns in high-dimensional motion data. Recently, restricted Boltzmann machines (RBMs) have been widely used to capture and represent spatial patterns in a single image or temporal patterns in several time slices. To model global dynamics and local spatial interactions, we propose to theoretically extend the conventional…

    Submitted 21 October, 2017; originally announced October 2017.

    Journal ref: Computer Vision and Image Understanding 136 (2015): 14-22

  46. arXiv:1710.04809  [pdf, other]

    cs.LG

    Deep Regression Bayesian Network and Its Applications

    Authors: Siqi Nie, Meng Zheng, Qiang Ji

    Abstract: Deep directed generative models have attracted much attention recently due to their generative modeling nature and powerful data representation ability. In this paper, we review different structures of deep directed generative models and the learning and inference algorithms associated with the structures. We focus on a specific structure that consists of layers of Bayesian Networks due to the pro…

    Submitted 13 October, 2017; originally announced October 2017.

    Comments: Accepted to IEEE Signal Processing Magazine

  47. arXiv:1610.07090  [pdf, other]

    cs.SI

    STEPS: Predicting place attributes via spatio-temporal analysis

    Authors: Shuxin Nie, Abhimanyu Das, Evgeniy Gabrilovich, Wei-Lwun Lu, Boris Mazniker, Chris Schilling

    Abstract: In recent years, a vast amount of research has been conducted on learning people's interests from their actions. Yet their collective actions also allow us to learn something about the world, in particular, infer attributes of places people visit or interact with. Imagine classifying whether a hotel has a gym or a swimming pool, or whether a restaurant has a romantic atmosphere without ever asking…

    Submitted 22 October, 2016; originally announced October 2016.

  48. arXiv:1506.04720  [pdf, other]

    cs.LG

    Latent Regression Bayesian Network for Data Representation

    Authors: Siqi Nie, Qiang Ji

    Abstract: Deep directed generative models have attracted much attention recently due to their expressive representation power and the ability of ancestral sampling. One major difficulty of learning directed models with many latent variables is the intractable inference. To address this problem, most existing algorithms make assumptions to render the latent variables independent of each other, either by desi…

    Submitted 15 June, 2015; originally announced June 2015.

  49. arXiv:1406.1411  [pdf, other]

    cs.AI cs.LG stat.ML

    Advances in Learning Bayesian Networks of Bounded Treewidth

    Authors: Siqi Nie, Denis Deratani Maua, Cassio Polpo de Campos, Qiang Ji

    Abstract: This work presents novel algorithms for learning Bayesian network structures with bounded treewidth. Both exact and approximate methods are developed. The exact method combines mixed-integer linear programming formulations for structure learning and treewidth computation. The approximate method consists in uniformly sampling $k$-trees (maximal graphs of treewidth $k$), and subsequently selecting,…

    Submitted 6 June, 2014; v1 submitted 5 June, 2014; originally announced June 2014.

    Comments: 23 pages, 2 figures, 3 tables

    MSC Class: 68T37