
Showing 1–50 of 395 results for author: Liang, P

Searching in archive cs.
  1. arXiv:2406.09246  [pdf, other]

    cs.RO cs.LG

    OpenVLA: An Open-Source Vision-Language-Action Model

    Authors: Moo Jin Kim, Karl Pertsch, Siddharth Karamcheti, Ted Xiao, Ashwin Balakrishna, Suraj Nair, Rafael Rafailov, Ethan Foster, Grace Lam, Pannag Sanketi, Quan Vuong, Thomas Kollar, Benjamin Burchfiel, Russ Tedrake, Dorsa Sadigh, Sergey Levine, Percy Liang, Chelsea Finn

    Abstract: Large policies pretrained on a combination of Internet-scale vision-language data and diverse robot demonstrations have the potential to change how we teach robots new skills: rather than training new behaviors from scratch, we can fine-tune such vision-language-action (VLA) models to obtain robust, generalizable policies for visuomotor control. Yet, widespread adoption of VLAs for robotics has be…

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: Website: https://openvla.github.io/

  2. arXiv:2406.05514  [pdf, other]

    cs.SE

    RAG-Enhanced Commit Message Generation

    Authors: Linghao Zhang, Hongyi Zhang, Chong Wang, Peng Liang

    Abstract: Commit message is one of the most important textual information in software development and maintenance. However, it is time-consuming and labor-intensive to write commit messages manually. Commit Message Generation (CMG) has become a research hotspot in automated software engineering. Researchers have proposed several methods for CMG and achieved great results. In recent years, CodeBERT, CodeT5,…

    Submitted 14 June, 2024; v1 submitted 8 June, 2024; originally announced June 2024.

  3. arXiv:2406.02348  [pdf]

    cs.LG

    AMOSL: Adaptive Modality-wise Structure Learning in Multi-view Graph Neural Networks For Enhanced Unified Representation

    Authors: Peiyu Liang, Hongchang Gao, Xubin He

    Abstract: While Multi-view Graph Neural Networks (MVGNNs) excel at leveraging diverse modalities for learning object representation, existing methods assume identical local topology structures across modalities that overlook real-world discrepancies. This leads MVGNNs straggles in modality fusion and representations denoising. To address these issues, we propose adaptive modality-wise structure learning (AM…

    Submitted 4 June, 2024; originally announced June 2024.

    Journal ref: 13th International Conference on Soft Computing, Artificial Intelligence and Applications (SAI 2024)

  4. arXiv:2405.17631  [pdf, other]

    cs.AI cs.CE cs.MA

    BioDiscoveryAgent: An AI Agent for Designing Genetic Perturbation Experiments

    Authors: Yusuf Roohani, Jian Vora, Qian Huang, Zachary Steinhart, Alexander Marson, Percy Liang, Jure Leskovec

    Abstract: Agents based on large language models have shown great potential in accelerating scientific discovery by leveraging their rich background knowledge and reasoning capabilities. Here, we develop BioDiscoveryAgent, an agent that designs new experiments, reasons about their outcomes, and efficiently navigates the hypothesis space to reach desired solutions. We demonstrate our agent on the problem of d…

    Submitted 27 May, 2024; originally announced May 2024.

  5. arXiv:2405.09819  [pdf]

    cs.SE cs.LG

    Automating the Training and Deployment of Models in MLOps by Integrating Systems with Machine Learning

    Authors: Penghao Liang, Bo Song, Xiaoan Zhan, Zhou Chen, Jiaqiang Yuan

    Abstract: This article introduces the importance of machine learning in real-world applications and explores the rise of MLOps (Machine Learning Operations) and its importance for solving challenges such as model deployment and performance monitoring. By reviewing the evolution of MLOps and its relationship to traditional software development methods, the paper proposes ways to integrate the system into mac…

    Submitted 16 May, 2024; originally announced May 2024.

  6. arXiv:2405.04602  [pdf, other]

    cs.SE

    An Empirical Study of Kotlin-Java Interactions

    Authors: Qiong Feng, Huan Ji, Xiaotian Ma, Peng Liang

    Abstract: Background: Since Google introduced Kotlin as an official programming language for developing Android apps in 2017, Kotlin has gained widespread adoption in Android development. The interoperability of Java and Kotlin's design nature allows them to coexist and interact with each other smoothly within a project. Aims: However, there is limited research on how Java and Kotlin interact with each othe…

    Submitted 7 May, 2024; originally announced May 2024.

  7. arXiv:2404.18976  [pdf, other]

    cs.LG cs.AI cs.CL cs.CV cs.MM

    Foundations of Multisensory Artificial Intelligence

    Authors: Paul Pu Liang

    Abstract: Building multisensory AI systems that learn from multiple sensory inputs such as text, speech, video, real-world sensors, wearable devices, and medical data holds great promise for impact in many scientific areas with practical benefits, such as in supporting human health and well-being, enabling multimedia content processing, and enhancing real-world autonomous agents. By synthesizing a range of…

    Submitted 29 April, 2024; originally announced April 2024.

    Comments: CMU Machine Learning Department PhD Thesis

  8. arXiv:2404.17739  [pdf, other]

    cs.SE

    How LLMs Aid in UML Modeling: An Exploratory Study with Novice Analysts

    Authors: Beian Wang, Chong Wang, Peng Liang, Bing Li, Cheng Zeng

    Abstract: Since the emergence of GPT-3, Large Language Models (LLMs) have caught the eyes of researchers, practitioners, and educators in the field of software engineering. However, there has been relatively little investigation regarding the performance of LLMs in assisting with requirements analysis and UML modeling. This paper explores how LLMs can assist novice analysts in creating three types of typica…

    Submitted 10 June, 2024; v1 submitted 26 April, 2024; originally announced April 2024.

    Comments: The 21st IEEE International Conference on Software Services Engineering (SSE)

  9. arXiv:2404.13362  [pdf, other]

    cs.CL cs.AI cs.LG eess.AS

    Semantically Corrected Amharic Automatic Speech Recognition

    Authors: Samuael Adnew, Paul Pu Liang

    Abstract: Automatic Speech Recognition (ASR) can play a crucial role in enhancing the accessibility of spoken languages worldwide. In this paper, we build a set of ASR tools for Amharic, a language spoken by more than 50 million people primarily in eastern Africa. Amharic is written in the Ge'ez script, a sequence of graphemes with spacings denoting word boundaries. This makes computational processing of Am…

    Submitted 20 April, 2024; originally announced April 2024.

  10. arXiv:2404.12241  [pdf, other]

    cs.CL cs.AI

    Introducing v0.5 of the AI Safety Benchmark from MLCommons

    Authors: Bertie Vidgen, Adarsh Agrawal, Ahmed M. Ahmed, Victor Akinwande, Namir Al-Nuaimi, Najla Alfaraj, Elie Alhajjar, Lora Aroyo, Trupti Bavalatti, Max Bartolo, Borhane Blili-Hamelin, Kurt Bollacker, Rishi Bomassani, Marisa Ferrara Boston, Siméon Campos, Kal Chakra, Canyu Chen, Cody Coleman, Zacharie Delpierre Coudert, Leon Derczynski, Debojyoti Dutta, Ian Eisenberg, James Ezick, Heather Frase, Brian Fuller, et al. (75 additional authors not shown)

    Abstract: This paper introduces v0.5 of the AI Safety Benchmark, which has been created by the MLCommons AI Safety Working Group. The AI Safety Benchmark has been designed to assess the safety risks of AI systems that use chat-tuned language models. We introduce a principled approach to specifying and constructing the benchmark, which for v0.5 covers only a single use case (an adult chatting to a general-pu…

    Submitted 13 May, 2024; v1 submitted 18 April, 2024; originally announced April 2024.

  11. arXiv:2404.11023  [pdf, other]

    cs.HC cs.CL cs.LG

    Advancing Social Intelligence in AI Agents: Technical Challenges and Open Questions

    Authors: Leena Mathur, Paul Pu Liang, Louis-Philippe Morency

    Abstract: Building socially-intelligent AI agents (Social-AI) is a multidisciplinary, multimodal research goal that involves creating agents that can sense, perceive, reason about, learn from, and respond to affect, behavior, and cognition of other agents (human or artificial). Progress towards Social-AI has accelerated in the past decade across several computing communities, including natural language proc…

    Submitted 16 April, 2024; originally announced April 2024.

    Comments: Position Paper, Under Review, 19 pages, 2 figures

  12. arXiv:2404.11016  [pdf, other]

    cs.CV cs.AI

    MaeFuse: Transferring Omni Features with Pretrained Masked Autoencoders for Infrared and Visible Image Fusion via Guided Training

    Authors: Jiayang Li, Junjun Jiang, Pengwei Liang, Jiayi Ma

    Abstract: In this research, we introduce MaeFuse, a novel autoencoder model designed for infrared and visible image fusion (IVIF). The existing approaches for image fusion often rely on training combined with downstream tasks to obtain high-level visual information, which is effective in emphasizing target objects and delivering impressive results in visual quality and task-specific applications. MaeFuse, h…

    Submitted 16 April, 2024; originally announced April 2024.

  13. arXiv:2404.07942  [pdf, other]

    cs.SE cs.AI

    On Unified Prompt Tuning for Request Quality Assurance in Public Code Review

    Authors: Xinyu Chen, Lin Li, Rui Zhang, Peng Liang

    Abstract: Public Code Review (PCR) can be implemented through a Software Question Answering (SQA) community, which facilitates high knowledge dissemination. Current methods mainly focus on the reviewer's perspective, including finding a capable reviewer, predicting comment quality, and recommending/generating review comments. Our intuition is that satisfying review necessity requests can increase their visi…

    Submitted 17 April, 2024; v1 submitted 11 April, 2024; originally announced April 2024.

    Comments: The 29th International Conference on Database Systems for Advanced Applications (DASFAA)

  14. arXiv:2404.05041  [pdf, other]

    cs.SE

    How Do OSS Developers Utilize Architectural Solutions from Q&A Sites: An Empirical Study

    Authors: Musengamana Jean de Dieu, Peng Liang, Mojtaba Shahin

    Abstract: Developers utilize programming-related knowledge (e.g., code snippets) on Q&A sites (e.g., Stack Overflow) that functionally matches the programming problems they encounter in their development. Despite extensive research on Q&A sites, being a high-level and important type of development-related knowledge, architectural solutions (e.g., architecture tactics) and their utilization are rarely explor…

    Submitted 7 April, 2024; originally announced April 2024.

  15. arXiv:2404.04492  [pdf]

    cs.RO cs.AI cs.CV

    Automated Lane Change Behavior Prediction and Environmental Perception Based on SLAM Technology

    Authors: Han Lei, Baoming Wang, Zuwei Shui, Peiyuan Yang, Penghao Liang

    Abstract: In addition to environmental perception sensors such as cameras, radars, etc. in the automatic driving system, the external environment of the vehicle is perceived, in fact, there is also a perception sensor that has been silently dedicated in the system, that is, the positioning module. This paper explores the application of SLAM (Simultaneous Localization and Mapping) technology in the context o…

    Submitted 5 April, 2024; originally announced April 2024.

  16. arXiv:2404.04475  [pdf, other]

    cs.LG cs.AI cs.CL stat.ML

    Length-Controlled AlpacaEval: A Simple Way to Debias Automatic Evaluators

    Authors: Yann Dubois, Balázs Galambosi, Percy Liang, Tatsunori B. Hashimoto

    Abstract: LLM-based auto-annotators have become a key component of the LLM development process due to their cost-effectiveness and scalability compared to human-based evaluation. However, these auto-annotators can introduce complex biases that are hard to remove. Even simple, known confounders such as preference for longer outputs remain in existing automated evaluation metrics. We propose a simple regressi…

    Submitted 5 April, 2024; originally announced April 2024.

  17. arXiv:2404.02127  [pdf, other]

    cs.CL cs.AI cs.LG

    FLawN-T5: An Empirical Examination of Effective Instruction-Tuning Data Mixtures for Legal Reasoning

    Authors: Joel Niklaus, Lucia Zheng, Arya D. McCarthy, Christopher Hahn, Brian M. Rosen, Peter Henderson, Daniel E. Ho, Garrett Honke, Percy Liang, Christopher Manning

    Abstract: Instruction tuning is an important step in making language models useful for direct user interaction. However, many legal tasks remain out of reach for most open LLMs and there do not yet exist any large scale instruction datasets for the domain. This critically limits research in this application area. In this work, we curate LawInstruct, a large legal instruction dataset, covering 17 jurisdictio…

    Submitted 2 April, 2024; originally announced April 2024.

    MSC Class: 68T50 ACM Class: I.2

  18. arXiv:2404.00903  [pdf]

    cs.IR cs.AI

    Maximizing User Experience with LLMOps-Driven Personalized Recommendation Systems

    Authors: Chenxi Shi, Penghao Liang, Yichao Wu, Tong Zhan, Zhengyu Jin

    Abstract: The integration of LLMOps into personalized recommendation systems marks a significant advancement in managing LLM-driven applications. This innovation presents both opportunities and challenges for enterprises, requiring specialized teams to navigate the complexity of engineering technology while prioritizing data security and model interpretability. By leveraging LLMOps, enterprises can enhance…

    Submitted 1 April, 2024; originally announced April 2024.

  19. arXiv:2403.20035  [pdf, other]

    eess.IV cs.CV

    UltraLight VM-UNet: Parallel Vision Mamba Significantly Reduces Parameters for Skin Lesion Segmentation

    Authors: Renkai Wu, Yinghao Liu, Pengchen Liang, Qing Chang

    Abstract: Traditionally for improving the segmentation performance of models, most approaches prefer to use adding more complex modules. And this is not suitable for the medical field, especially for mobile medical devices, where computationally loaded models are not suitable for real clinical environments due to computational resource constraints. Recently, state-space models (SSMs), represented by Mamba,…

    Submitted 24 April, 2024; v1 submitted 29 March, 2024; originally announced March 2024.

  20. arXiv:2403.18421  [pdf, other]

    cs.CL cs.AI

    BioMedLM: A 2.7B Parameter Language Model Trained On Biomedical Text

    Authors: Elliot Bolton, Abhinav Venigalla, Michihiro Yasunaga, David Hall, Betty Xiong, Tony Lee, Roxana Daneshjou, Jonathan Frankle, Percy Liang, Michael Carbin, Christopher D. Manning

    Abstract: Models such as GPT-4 and Med-PaLM 2 have demonstrated impressive performance on a wide variety of biomedical NLP tasks. However, these models have hundreds of billions of parameters, are computationally expensive to run, require users to send their input data over the internet, and are trained on unknown data sources. Can smaller, more targeted models compete? To address this question, we build an…

    Submitted 27 March, 2024; originally announced March 2024.

    Comments: 23 pages

  21. arXiv:2403.13642  [pdf, other]

    cs.CV

    H-vmunet: High-order Vision Mamba UNet for Medical Image Segmentation

    Authors: Renkai Wu, Yinghao Liu, Pengchen Liang, Qing Chang

    Abstract: In the field of medical image segmentation, variant models based on Convolutional Neural Networks (CNNs) and Visual Transformers (ViTs) as the base modules have been very widely developed and applied. However, CNNs are often limited in their ability to deal with long sequences of information, while the low sensitivity of ViTs to local feature information and the problem of secondary computational…

    Submitted 20 March, 2024; originally announced March 2024.

  22. arXiv:2403.12980  [pdf, other]

    cs.DC

    Containerization in Multi-Cloud Environment: Roles, Strategies, Challenges, and Solutions for Effective Implementation

    Authors: Muhammad Waseem, Aakash Ahmad, Peng Liang, Muhammad Azeem Akbar, Arif Ali Khan, Iftikhar Ahmad, Manu Setälä, Tommi Mikkonen

    Abstract: Containerization in a multi-cloud environment facilitates workload portability and optimized resource utilization. Containerization in multi-cloud environments has received significant attention in recent years both from academic research and industrial development perspectives. However, there exists no effort to systematically investigate the state of research on this topic. The aim of this resea…

    Submitted 1 March, 2024; originally announced March 2024.

    Comments: 59 pages, 4 images, 16 tables, Manuscript submitted to a Journal (2024)

  23. arXiv:2403.08822  [pdf]

    cs.LG cs.CL

    LoRA-SP: Streamlined Partial Parameter Adaptation for Resource-Efficient Fine-Tuning of Large Language Models

    Authors: Yichao Wu, Yafei Xiang, Shuning Huo, Yulu Gong, Penghao Liang

    Abstract: In addressing the computational and memory demands of fine-tuning Large Language Models (LLMs), we propose LoRA-SP (Streamlined Partial Parameter Adaptation), a novel approach utilizing randomized half-selective parameter freezing within the Low-Rank Adaptation (LoRA) framework. This method efficiently balances pre-trained knowledge retention and adaptability for task-specific optimizations. Through a…

    Submitted 28 February, 2024; originally announced March 2024.

  24. arXiv:2403.08217  [pdf]

    cs.CL cs.LG

    Research on the Application of Deep Learning-based BERT Model in Sentiment Analysis

    Authors: Yichao Wu, Zhengyu Jin, Chenxi Shi, Penghao Liang, Tong Zhan

    Abstract: This paper explores the application of deep learning techniques, particularly focusing on BERT models, in sentiment analysis. It begins by introducing the fundamental concept of sentiment analysis and how deep learning methods are utilized in this domain. Subsequently, it delves into the architecture and characteristics of BERT models. Through detailed explanation, it elucidates the application ef…

    Submitted 12 March, 2024; originally announced March 2024.

  25. arXiv:2403.07918  [pdf, other]

    cs.CY cs.AI cs.LG

    On the Societal Impact of Open Foundation Models

    Authors: Sayash Kapoor, Rishi Bommasani, Kevin Klyman, Shayne Longpre, Ashwin Ramaswami, Peter Cihon, Aspen Hopkins, Kevin Bankston, Stella Biderman, Miranda Bogen, Rumman Chowdhury, Alex Engler, Peter Henderson, Yacine Jernite, Seth Lazar, Stefano Maffulli, Alondra Nelson, Joelle Pineau, Aviya Skowron, Dawn Song, Victor Storchan, Daniel Zhang, Daniel E. Ho, Percy Liang, Arvind Narayanan

    Abstract: Foundation models are powerful technologies: how they are released publicly directly shapes their societal impact. In this position paper, we focus on open foundation models, defined here as those with broadly available model weights (e.g. Llama 2, Stable Diffusion XL). We identify five distinctive properties (e.g. greater customizability, poor monitoring) of open foundation models that lead to bo…

    Submitted 27 February, 2024; originally announced March 2024.

  26. arXiv:2403.06093  [pdf, other]

    cs.CV

    Enhancing 3D Object Detection with 2D Detection-Guided Query Anchors

    Authors: Haoxuanye Ji, Pengpeng Liang, Erkang Cheng

    Abstract: Multi-camera-based 3D object detection has made notable progress in the past several years. However, we observe that there are cases (e.g. faraway regions) in which popular 2D object detectors are more reliable than state-of-the-art 3D detectors. In this paper, to improve the performance of query-based 3D object detectors, we present a novel query generating approach termed QAF2D, which infers 3D…

    Submitted 9 March, 2024; originally announced March 2024.

    Comments: Accepted by CVPR 2024

  27. arXiv:2403.05059  [pdf, other]

    cs.SE

    Bug Priority Change: An Empirical Study on Apache Projects

    Authors: Zengyang Li, Guangzong Cai, Qinyi Yu, Peng Liang, Ran Mo, Hui Liu

    Abstract: In issue tracking systems, each bug is assigned a priority level (e.g., Blocker, Critical, Major, Minor, or Trivial in JIRA from highest to lowest), which indicates the urgency level of the bug. In this sense, understanding bug priority changes helps to arrange the work schedule of participants reasonably, and facilitates a better analysis and resolution of bugs. According to the data extracted fr…

    Submitted 8 March, 2024; originally announced March 2024.

    Comments: Preprint accepted for publication in Journal of Systems and Software, 2024

  28. arXiv:2403.04893  [pdf, other]

    cs.AI

    A Safe Harbor for AI Evaluation and Red Teaming

    Authors: Shayne Longpre, Sayash Kapoor, Kevin Klyman, Ashwin Ramaswami, Rishi Bommasani, Borhane Blili-Hamelin, Yangsibo Huang, Aviya Skowron, Zheng-Xin Yong, Suhas Kotha, Yi Zeng, Weiyan Shi, Xianjun Yang, Reid Southen, Alexander Robey, Patrick Chao, Diyi Yang, Ruoxi Jia, Daniel Kang, Sandy Pentland, Arvind Narayanan, Percy Liang, Peter Henderson

    Abstract: Independent evaluation and red teaming are critical for identifying the risks posed by generative AI systems. However, the terms of service and enforcement strategies used by prominent AI companies to deter model misuse have disincentives on good faith safety evaluations. This causes some researchers to fear that conducting such research or releasing their findings will result in account suspensio…

    Submitted 7 March, 2024; originally announced March 2024.

  29. arXiv:2403.02760  [pdf]

    cs.AI

    Emerging Synergies Between Large Language Models and Machine Learning in Ecommerce Recommendations

    Authors: Xiaonan Xu, Yichao Wu, Penghao Liang, Yuhang He, Han Wang

    Abstract: With the boom of e-commerce and web applications, recommender systems have become an important part of our daily lives, providing personalized recommendations based on the user's preferences. Although deep neural networks (DNNs) have made significant progress in improving recommendation systems by simulating the interaction between users and items and incorporating their textual information, these…

    Submitted 12 March, 2024; v1 submitted 5 March, 2024; originally announced March 2024.

  30. arXiv:2402.16268  [pdf, other]

    cs.LG cs.AI cs.CY

    Foundation Model Transparency Reports

    Authors: Rishi Bommasani, Kevin Klyman, Shayne Longpre, Betty Xiong, Sayash Kapoor, Nestor Maslej, Arvind Narayanan, Percy Liang

    Abstract: Foundation models are critical digital technologies with sweeping societal impact that necessitates transparency. To codify how foundation model developers should provide transparency about the development and deployment of their models, we propose Foundation Model Transparency Reports, drawing upon the transparency reporting practices in social media. While external documentation of societal harm…

    Submitted 25 February, 2024; originally announced February 2024.

  31. arXiv:2402.07865  [pdf, other]

    cs.CV cs.AI cs.CL cs.LG

    Prismatic VLMs: Investigating the Design Space of Visually-Conditioned Language Models

    Authors: Siddharth Karamcheti, Suraj Nair, Ashwin Balakrishna, Percy Liang, Thomas Kollar, Dorsa Sadigh

    Abstract: Visually-conditioned language models (VLMs) have seen growing adoption in applications such as visual dialogue, scene understanding, and robotic task planning; adoption that has fueled a wealth of new models such as LLaVa, InstructBLIP, and PaLI-3. Despite the volume of new releases, key design decisions around image preprocessing, architecture, and optimization are under-explored, making it chall…

    Submitted 30 May, 2024; v1 submitted 12 February, 2024; originally announced February 2024.

    Comments: Published at ICML 2024. 22 pages, 11 figures. Training code and models: https://github.com/TRI-ML/prismatic-vlms. Evaluation code: https://github.com/TRI-ML/vlm-evaluation

  32. arXiv:2402.06423  [pdf, other]

    cs.CV

    CurveFormer++: 3D Lane Detection by Curve Propagation with Temporal Curve Queries and Attention

    Authors: Yifeng Bai, Zhirong Chen, Pengpeng Liang, Erkang Cheng

    Abstract: In autonomous driving, 3D lane detection using monocular cameras is an important task for various downstream planning and control tasks. Recent CNN and Transformer approaches usually apply a two-stage scheme in the model design. The first stage transforms the image feature from a front image into a bird's-eye-view (BEV) representation. Subsequently, a sub-network processes the BEV feature map to g…

    Submitted 9 February, 2024; originally announced February 2024.

    Comments: arXiv admin note: text overlap with arXiv:2209.07989

  33. arXiv:2402.06155  [pdf, other]

    cs.CL

    Model Editing with Canonical Examples

    Authors: John Hewitt, Sarah Chen, Lanruo Lora Xie, Edward Adams, Percy Liang, Christopher D. Manning

    Abstract: We introduce model editing with canonical examples, a setting in which (1) a single learning example is provided per desired behavior, (2) evaluation is performed exclusively out-of-distribution, and (3) deviation from an initial model is strictly limited. A canonical example is a simple instance of good behavior (e.g., The capital of Mauritius is Port Louis) or bad behavior (e.g., An aspect of re…

    Submitted 8 February, 2024; originally announced February 2024.

  34. arXiv:2402.03697  [pdf, other]

    cs.CV

    SHMC-Net: A Mask-guided Feature Fusion Network for Sperm Head Morphology Classification

    Authors: Nishchal Sapkota, Yejia Zhang, Sirui Li, Peixian Liang, Zhuo Zhao, Jingjing Zhang, Xiaomin Zha, Yiru Zhou, Yunxia Cao, Danny Z Chen

    Abstract: Male infertility accounts for about one-third of global infertility cases. Manual assessment of sperm abnormalities through head morphology analysis encounters issues of observer variability and diagnostic discrepancies among experts. Its alternative, Computer-Assisted Semen Analysis (CASA), suffers from low-quality sperm images, small datasets, and noisy class labels. We propose a new approach fo…

    Submitted 5 March, 2024; v1 submitted 5 February, 2024; originally announced February 2024.

    Comments: Published on ISBI 2024

  35. arXiv:2402.02401  [pdf, other]

    cs.CV cs.AI

    AI-Generated Content Enhanced Computer-Aided Diagnosis Model for Thyroid Nodules: A ChatGPT-Style Assistant

    Authors: Jincao Yao, Yunpeng Wang, Zhikai Lei, Kai Wang, Xiaoxian Li, Jianhua Zhou, Xiang Hao, Jiafei Shen, Zhenping Wang, Rongrong Ru, Yaqing Chen, Yahan Zhou, Chen Chen, Yanming Zhang, Ping Liang, Dong Xu

    Abstract: An artificial intelligence-generated content-enhanced computer-aided diagnosis (AIGC-CAD) model, designated as ThyGPT, has been developed. This model, inspired by the architecture of ChatGPT, could assist radiologists in assessing the risk of thyroid nodules through semantic-level human-machine interaction. A dataset comprising 19,165 thyroid nodule ultrasound cases from Zhejiang Cancer Hospital w…

    Submitted 4 February, 2024; originally announced February 2024.

  36. arXiv:2402.00462  [pdf, other]

    cs.SE

    Data Management Challenges in Agile Software Projects: A Systematic Literature Review

    Authors: Ahmed Fawzy, Amjed Tahir, Matthias Galster, Peng Liang

    Abstract: Agile software development follows an adaptive and iterative approach. However, the management of data (e.g., development data or product data) can pose significant challenges for projects and agile teams. We aim to identify and characterize key challenges faced in data management within agile projects and to examine potential solutions proposed in the literature. We used a Systematic Literature R…

    Submitted 1 February, 2024; originally announced February 2024.

    Comments: 32 pages, 3 images, 6 tables, Manuscript submitted to a Journal (2024)

  37. arXiv:2401.16865  [pdf, other]

    cs.SE

    Depends-Kotlin: A Cross-Language Kotlin Dependency Extractor

    Authors: Qiong Feng, Xiaotian Ma, Huan Ji, Peng Liang

    Abstract: Since Google introduced Kotlin as an official programming language for developing Android apps in 2017, Kotlin has gained widespread adoption in Android development. However, compared to Java, there is limited support for Kotlin code dependency analysis, which is the foundation to software analysis. To bridge this gap, we developed Depends-Kotlin to extract entities and their dependencies in Kotli…

    Submitted 30 January, 2024; originally announced January 2024.

  38. arXiv:2401.16310  [pdf, other]

    cs.SE cs.AI

    Security Code Review by Large Language Models

    Authors: Jiaxin Yu, Peng Liang, Yujia Fu, Amjed Tahir, Mojtaba Shahin, Chong Wang, Yangxiao Cai

    Abstract: Security code review, as a time-consuming and labour-intensive process, typically requires integration with automated security defect detection tools to ensure code security. Despite the emergence of numerous security analysis tools, those tools face challenges in terms of their poor generalization, high false positive rates, and coarse detection granularity. A recent development with Large Langua…

    Submitted 8 June, 2024; v1 submitted 29 January, 2024; originally announced January 2024.

  39. arXiv:2401.14176  [pdf, other]

    cs.SE cs.AI

    Copilot Refinement: Addressing Code Smells in Copilot-Generated Python Code

    Authors: Beiqi Zhang, Peng Liang, Qiong Feng, Yujia Fu, Zengyang Li

    Abstract: As one of the most popular dynamic languages, Python experiences a decrease in readability and maintainability when code smells are present. Recent advancements in Large Language Models have sparked growing interest in AI-enabled tools for both code generation and refactoring. GitHub Copilot is one such tool that has gained widespread usage. Copilot Chat, released on September 2023, functions as a…

    Submitted 25 January, 2024; originally announced January 2024.

  40. arXiv:2401.10755  [pdf, other]

    cs.SE

    Code Reviewer Recommendation Based on a Hypergraph with Multiplex Relationships

    Authors: Yu Qiao, Jian Wang, Can Cheng, Wei Tang, Peng Liang, Yuqi Zhao, Bing Li

    Abstract: Code review is an essential component of software development, playing a vital role in ensuring a comprehensive check of code changes. However, the continuous influx of pull requests and the limited pool of available reviewer candidates pose a significant challenge to the review process, making the task of assigning suitable reviewers to each review request increasingly difficult. To tackle this i…

    Submitted 19 January, 2024; originally announced January 2024.

    Comments: The 31st IEEE International Conference on Software Analysis, Evolution, and Reengineering (SANER)

  41. arXiv:2401.08097  [pdf, other]

    cs.SE cs.AI cs.CY

    Fairness Concerns in App Reviews: A Study on AI-based Mobile Apps

    Authors: Ali Rezaei Nasab, Maedeh Dashti, Mojtaba Shahin, Mansooreh Zahedi, Hourieh Khalajzadeh, Chetan Arora, Peng Liang

    Abstract: Fairness is one of the socio-technical concerns that must be addressed in AI-based systems. Unfair AI-based systems, particularly unfair AI-based mobile apps, can pose difficulties for a significant proportion of the global population. This paper aims to analyze fairness concerns in AI-based app reviews. We first manually constructed a ground-truth dataset, including 1,132 fairness and 1,473 non-f…

    Submitted 20 June, 2024; v1 submitted 15 January, 2024; originally announced January 2024.

    Comments: 30 pages, 5 images, 6 tables, Manuscript submitted to a Journal (2024)

  42. arXiv:2401.07378  [pdf, other]

    cs.CV cs.AI

    Efficient approximation of Earth Mover's Distance Based on Nearest Neighbor Search

    Authors: Guangyu Meng, Ruyu Zhou, Liu Liu, Peixian Liang, Fang Liu, Danny Chen, Michael Niemier, X. Sharon Hu

    Abstract: Earth Mover's Distance (EMD) is an important similarity measure between two distributions, used in computer vision and many other application domains. However, its exact calculation is computationally and memory intensive, which hinders its scalability and applicability for large-scale problems. Various approximate EMD algorithms have been proposed to reduce computational costs, but they suffer lo…

    Submitted 19 January, 2024; v1 submitted 14 January, 2024; originally announced January 2024.

  43. arXiv:2401.05926  [pdf, other]

    cs.SE

    Using Large Language Models for Commit Message Generation: A Preliminary Study

    Authors: Linghao Zhang, Jingshu Zhao, Chong Wang, Peng Liang

    Abstract: A commit message is a textual description of the code changes in a commit, which is a key part of the Git version control system (VCS). It captures the essence of software updating. Therefore, it can help developers understand code evolution and facilitate efficient collaboration between developers. However, it is time-consuming and labor-intensive to write good and valuable commit messages. Some…

    Submitted 13 January, 2024; v1 submitted 11 January, 2024; originally announced January 2024.

    Comments: The 31st IEEE International Conference on Software Analysis, Evolution, and Reengineering (SANER)

  44. arXiv:2401.03653  [pdf, other]

    cs.SE cs.LG

    An Exploratory Study on Automatic Identification of Assumptions in the Development of Deep Learning Frameworks

    Authors: Chen Yang, Peng Liang, Zinan Ma

    Abstract: Stakeholders constantly make assumptions in the development of deep learning (DL) frameworks. These assumptions are related to various types of software artifacts (e.g., requirements, design decisions, and technical debt) and can turn out to be invalid, leading to system failures. Existing approaches and tools for assumption management usually depend on manual identification of assumptions. Howeve…

    Submitted 20 March, 2024; v1 submitted 7 January, 2024; originally announced January 2024.

    Comments: 28 pages, 15 images, 10 tables, Manuscript submitted to a Journal (2024)

  45. Multi-Correlation Siamese Transformer Network with Dense Connection for 3D Single Object Tracking

    Authors: Shihao Feng, Pengpeng Liang, Jin Gao, Erkang Cheng

    Abstract: Point cloud-based 3D object tracking is an important task in autonomous driving. Though great advances regarding Siamese-based 3D tracking have been made recently, it remains challenging to learn the correlation between the template and search branches effectively with the sparse LIDAR point cloud data. Instead of performing correlation of the two branches at just one point in the network, in this…

    Submitted 18 December, 2023; originally announced December 2023.

    Comments: Preprint version for IEEE Robotics and Automation Letters (RAL)

    Journal ref: IEEE Robotics and Automation Letters (RAL), vol. 8, no. 12, pp. 8066-8073, 2023

  46. arXiv:2312.05421  [pdf, other]

    cs.SE

    Architecture Decisions in Quantum Software Systems: An Empirical Study on Stack Exchange and GitHub

    Authors: Mst Shamima Aktar, Peng Liang, Muhammad Waseem, Amjed Tahir, Aakash Ahmad, Beiqi Zhang, Zengyang Li

    Abstract: Quantum computing provides a new dimension in computation, utilizing the principles of quantum mechanics to potentially solve complex problems that are currently intractable for classical computers. However, little research has been conducted about the architecture decisions made in quantum software development, which have a significant influence on the functionality, performance, scalability, and…

    Submitted 8 December, 2023; originally announced December 2023.

    Comments: 33 pages, 3 images, 10 tables, Manuscript submitted to a Journal (2023)

  47. arXiv:2312.04837  [pdf, other]

    cs.AI cs.CL cs.CV

    Localized Symbolic Knowledge Distillation for Visual Commonsense Models

    Authors: Jae Sung Park, Jack Hessel, Khyathi Raghavi Chandu, Paul Pu Liang, Ximing Lu, Peter West, Youngjae Yu, Qiuyuan Huang, Jianfeng Gao, Ali Farhadi, Yejin Choi

    Abstract: Instruction following vision-language (VL) models offer a flexible interface that supports a broad range of multimodal tasks in a zero-shot fashion. However, interfaces that operate on full images do not directly enable the user to "point to" and access specific regions within images. This capability is important not only to support reference-grounded VL benchmarks, but also, for practical applica…

    Submitted 12 December, 2023; v1 submitted 8 December, 2023; originally announced December 2023.

    Comments: NeurIPS 2023

  48. arXiv:2312.04469  [pdf, other]

    cs.LG cs.CL cs.CR

    On the Learnability of Watermarks for Language Models

    Authors: Chenchen Gu, Xiang Lisa Li, Percy Liang, Tatsunori Hashimoto

    Abstract: Watermarking of language model outputs enables statistical detection of model-generated text, which can mitigate harms and misuses of language models. Existing watermarking strategies operate by altering the decoder of an existing language model. In this paper, we ask whether language models can directly learn to generate watermarked text, which would have significant implications for the real-wor…

    Submitted 2 May, 2024; v1 submitted 7 December, 2023; originally announced December 2023.

    Comments: Accepted at ICLR 2024

  49. arXiv:2311.15625  [pdf, other]

    cs.CV

    Only Positive Cases: 5-fold High-order Attention Interaction Model for Skin Segmentation Derived Classification

    Authors: Renkai Wu, Yinghao Liu, Pengchen Liang, Qing Chang

    Abstract: Computer-aided diagnosis of skin diseases is an important tool. However, the interpretability of computer-aided diagnosis is currently poor. Dermatologists and patients cannot intuitively understand the learning and prediction process of neural networks, which will lead to a decrease in the credibility of computer-aided diagnosis. In addition, traditional methods need to be trained using negative…

    Submitted 27 November, 2023; originally announced November 2023.

  50. arXiv:2311.10227  [pdf, other]

    cs.AI cs.CL

    Think Twice: Perspective-Taking Improves Large Language Models' Theory-of-Mind Capabilities

    Authors: Alex Wilf, Sihyun Shawn Lee, Paul Pu Liang, Louis-Philippe Morency

    Abstract: Human interactions are deeply rooted in the interplay of thoughts, beliefs, and desires made possible by Theory of Mind (ToM): our cognitive ability to understand the mental states of ourselves and others. Although ToM may come naturally to us, emulating it presents a challenge to even the most advanced Large Language Models (LLMs). Recent improvements to LLMs' reasoning capabilities from simple y…

    Submitted 16 November, 2023; originally announced November 2023.