Showing 101–150 of 481 results for author: Liang, P

  1. arXiv:2306.16413  [pdf, other]

    cs.LG cs.AI cs.CL cs.CV cs.MM

    MultiZoo & MultiBench: A Standardized Toolkit for Multimodal Deep Learning

    Authors: Paul Pu Liang, Yiwei Lyu, Xiang Fan, Arav Agarwal, Yun Cheng, Louis-Philippe Morency, Ruslan Salakhutdinov

    Abstract: Learning multimodal representations involves integrating information from multiple heterogeneous sources of data. In order to accelerate progress towards understudied modalities and tasks while ensuring real-world robustness, we release MultiZoo, a public toolkit consisting of standardized implementations of > 20 core multimodal algorithms and MultiBench, a large-scale benchmark spanning 15 datase…

    Submitted 28 June, 2023; originally announced June 2023.

    Comments: JMLR Open Source Software 2023, Code available at https://github.com/pliang279/MultiBench

  2. arXiv:2306.15700  [pdf, other]

    cs.RO cs.LG

    Imitation with Spatial-Temporal Heatmap: 2nd Place Solution for NuPlan Challenge

    Authors: Yihan Hu, Kun Li, Pingyuan Liang, Jingyu Qian, Zhening Yang, Haichao Zhang, Wenxin Shao, Zhuangzhuang Ding, Wei Xu, Qiang Liu

    Abstract: This paper presents our 2nd place solution for the NuPlan Challenge 2023. Autonomous driving in real-world scenarios is highly complex and uncertain. Achieving safe planning in the complex multimodal scenarios is a highly challenging task. Our approach, Imitation with Spatial-Temporal Heatmap, adopts the learning form of behavior cloning, innovatively predicts the future multimodal states with a h…

    Submitted 26 June, 2023; originally announced June 2023.

  3. arXiv:2306.10599  [pdf, ps, other]

    cs.SE

    An Empirical Study of Untangling Patterns of Two-Class Dependency Cycles

    Authors: Qiong Feng, Shuwen Liu, Huan Ji, Xiaotian Ma, Peng Liang

    Abstract: Dependency cycles pose a significant challenge to software quality and maintainability. However, there is limited understanding of how practitioners resolve dependency cycles in real-world scenarios. This paper presents an empirical study investigating the recurring patterns employed by software developers to resolve dependency cycles between two classes in practice. We analyzed the data from 38 o…

    Submitted 17 December, 2023; v1 submitted 18 June, 2023; originally announced June 2023.

    Comments: Preprint accepted for publication in Empirical Software Engineering, 2023

  4. arXiv:2306.10015  [pdf, other]

    cs.LG cs.CL cs.DC

    Just One Byte (per gradient): A Note on Low-Bandwidth Decentralized Language Model Finetuning Using Shared Randomness

    Authors: Eric Zelikman, Qian Huang, Percy Liang, Nick Haber, Noah D. Goodman

    Abstract: Language model training in distributed settings is limited by the communication cost of gradient exchanges. In this short note, we extend recent work from Malladi et al. (2023), using shared randomness to perform distributed fine-tuning with low bandwidth. The method is a natural decentralized extension of memory-efficient Simultaneous Perturbation Stochastic Approximation (SPSA). Each iteration,…

    Submitted 16 June, 2023; originally announced June 2023.

  5. arXiv:2306.08620  [pdf, other]

    cs.SD cs.LG eess.AS stat.ML

    Anticipatory Music Transformer

    Authors: John Thickstun, David Hall, Chris Donahue, Percy Liang

    Abstract: We introduce anticipation: a method for constructing a controllable generative model of a temporal point process (the event process) conditioned asynchronously on realizations of a second, correlated process (the control process). We achieve this by interleaving sequences of events and controls, such that controls appear following stopping times in the event sequence. This work is motivated by pro…

    Submitted 25 July, 2024; v1 submitted 14 June, 2023; originally announced June 2023.

    Comments: TMLR accepted version

  6. arXiv:2306.08616  [pdf, other]

    cs.SE

    Towards Automated Identification of Violation Symptoms of Architecture Erosion

    Authors: Ruiyin Li, Peng Liang, Paris Avgeriou

    Abstract: Architecture erosion has a detrimental effect on maintenance and evolution, as the implementation drifts away from the intended architecture. To prevent this, development teams need to understand early enough the symptoms of erosion, and particularly violations of the intended architecture. One way to achieve this is through the automated identification of architecture violations from textual art…

    Submitted 29 April, 2024; v1 submitted 14 June, 2023; originally announced June 2023.

    Comments: 21 pages, 4 images, 7 tables, Revision submitted to TSE (2024)

  7. arXiv:2306.07557  [pdf, other]

    cs.SE

    Ethical Aspects of ChatGPT in Software Engineering Research

    Authors: Muhammad Azeem Akbar, Arif Ali Khan, Peng Liang

    Abstract: ChatGPT can improve Software Engineering (SE) research practices by offering efficient, accessible information analysis and synthesis based on natural language interactions. However, ChatGPT could bring ethical challenges, encompassing plagiarism, privacy, data security, and the risk of generating biased or potentially detrimental data. This research aims to fill the given gap by elaborating on th…

    Submitted 13 August, 2023; v1 submitted 13 June, 2023; originally announced June 2023.

    Comments: Preprint accepted for publication in IEEE Transactions on Artificial Intelligence, 2023

  8. arXiv:2306.05268  [pdf, other]

    cs.LG cs.AI cs.CL cs.CV cs.MM

    Factorized Contrastive Learning: Going Beyond Multi-view Redundancy

    Authors: Paul Pu Liang, Zihao Deng, Martin Ma, James Zou, Louis-Philippe Morency, Ruslan Salakhutdinov

    Abstract: In a wide range of multimodal tasks, contrastive learning has become a particularly appealing approach since it can successfully learn representations from abundant unlabeled data with only pairing information (e.g., image-caption or video-audio pairs). Underpinning these approaches is the assumption of multi-view redundancy - that shared information between modalities is necessary and sufficient…

    Submitted 30 October, 2023; v1 submitted 8 June, 2023; originally announced June 2023.

    Comments: NeurIPS 2023. Code available at: https://github.com/pliang279/FactorCL

  9. arXiv:2306.04597  [pdf, other]

    cs.CL cs.LG

    Language Models Get a Gender Makeover: Mitigating Gender Bias with Few-Shot Data Interventions

    Authors: Himanshu Thakur, Atishay Jain, Praneetha Vaddamanu, Paul Pu Liang, Louis-Philippe Morency

    Abstract: Societal biases present in pre-trained large language models are a critical issue as these models have been shown to propagate biases in countless downstream applications, rendering them unfair towards specific groups of people. Since large-scale retraining of these models from scratch is both time and compute-expensive, a variety of approaches have been previously proposed that de-bias a pre-trai…

    Submitted 7 June, 2023; originally announced June 2023.

    Comments: Accepted to ACL 2023 Main Conference

  10. arXiv:2306.04539  [pdf, other]

    cs.LG cs.CL cs.CV cs.IT stat.ML

    Multimodal Learning Without Labeled Multimodal Data: Guarantees and Applications

    Authors: Paul Pu Liang, Chun Kai Ling, Yun Cheng, Alex Obolenskiy, Yudong Liu, Rohan Pandey, Alex Wilf, Louis-Philippe Morency, Ruslan Salakhutdinov

    Abstract: In many machine learning systems that jointly learn from multiple modalities, a core research question is to understand the nature of multimodal interactions: how modalities combine to provide new task-relevant information that was not present in either alone. We study this challenge of interaction quantification in a semi-supervised setting with only labeled unimodal data and naturally co-occurri…

    Submitted 13 June, 2024; v1 submitted 7 June, 2023; originally announced June 2023.

    Comments: ICLR 2024, Code available at: https://github.com/pliang279/PID

  11. arXiv:2306.04125  [pdf, other]

    cs.LG cs.CL cs.HC

    Multimodal Fusion Interactions: A Study of Human and Automatic Quantification

    Authors: Paul Pu Liang, Yun Cheng, Ruslan Salakhutdinov, Louis-Philippe Morency

    Abstract: In order to perform multimodal fusion of heterogeneous signals, we need to understand their interactions: how each modality individually provides information useful for a task and how this information changes in the presence of other modalities. In this paper, we perform a comparative study of how humans annotate two categorizations of multimodal interactions: (1) partial labels, where different a…

    Submitted 30 October, 2023; v1 submitted 6 June, 2023; originally announced June 2023.

    Comments: International Conference on Multimodal Interaction (ICMI '23), Code available at: https://github.com/pliang279/PID. arXiv admin note: text overlap with arXiv:2302.12247

  12. arXiv:2306.04049  [pdf, other]

    cs.LG cs.DS stat.ML

    One-sided Matrix Completion from Two Observations Per Row

    Authors: Steven Cao, Percy Liang, Gregory Valiant

    Abstract: Given only a few observed entries from a low-rank matrix $X$, matrix completion is the problem of imputing the missing entries, and it formalizes a wide range of real-world settings that involve estimating missing data. However, when there are too few observed entries to complete the matrix, what other aspects of the underlying matrix can be reliably recovered? We study one such problem setting, t…

    Submitted 6 June, 2023; originally announced June 2023.

    Comments: ICML 2023

  13. arXiv:2306.03262  [pdf, other]

    cs.LG cs.DL

    Has the Machine Learning Review Process Become More Arbitrary as the Field Has Grown? The NeurIPS 2021 Consistency Experiment

    Authors: Alina Beygelzimer, Yann N. Dauphin, Percy Liang, Jennifer Wortman Vaughan

    Abstract: We present the NeurIPS 2021 consistency experiment, a larger-scale variant of the 2014 NeurIPS experiment in which 10% of conference submissions were reviewed by two independent committees to quantify the randomness in the review process. We observe that the two committees disagree on their accept/reject recommendations for 23% of the papers and that, consistent with the results from 2014, approxi…

    Submitted 5 June, 2023; originally announced June 2023.

  14. arXiv:2305.17311  [pdf, other]

    cs.CL cs.AI cs.LG

    Beyond Positive Scaling: How Negation Impacts Scaling Trends of Language Models

    Authors: Yuhui Zhang, Michihiro Yasunaga, Zhengping Zhou, Jeff Z. HaoChen, James Zou, Percy Liang, Serena Yeung

    Abstract: Language models have been shown to exhibit positive scaling, where performance improves as models are scaled up in terms of size, compute, or data. In this work, we introduce NeQA, a dataset consisting of questions with negation in which language models do not exhibit straightforward positive scaling. We show that this task can exhibit inverse scaling, U-shaped scaling, or positive scaling, and th…

    Submitted 26 May, 2023; originally announced May 2023.

    Comments: Published at ACL 2023 Findings

  15. arXiv:2305.16765  [pdf, other]

    cs.CL

    Backpack Language Models

    Authors: John Hewitt, John Thickstun, Christopher D. Manning, Percy Liang

    Abstract: We present Backpacks: a new neural architecture that marries strong modeling performance with an interface for interpretability and control. Backpacks learn multiple non-contextual sense vectors for each word in a vocabulary, and represent a word in a sequence as a context-dependent, non-negative linear combination of sense vectors in this sequence. We find that, after training, sense vectors spec…

    Submitted 26 May, 2023; originally announced May 2023.

    Comments: ACL 2023 Camera-Ready

  16. arXiv:2305.16349  [pdf, other]

    cs.CL cs.AI cs.LG

    Lexinvariant Language Models

    Authors: Qian Huang, Eric Zelikman, Sarah Li Chen, Yuhuai Wu, Gregory Valiant, Percy Liang

    Abstract: Token embeddings, a mapping from discrete lexical symbols to continuous vectors, are at the heart of any language model (LM). However, lexical symbol meanings can also be determined and even redefined by their structural role in a long context. In this paper, we ask: is it possible for a language model to be performant without any fixed token embeddings? Such a language model would have to…

    Submitted 24 May, 2023; originally announced May 2023.

  17. arXiv:2305.14577  [pdf, other]

    cs.LG cs.CL

    Difference-Masking: Choosing What to Mask in Continued Pretraining

    Authors: Alex Wilf, Syeda Nahida Akter, Leena Mathur, Paul Pu Liang, Sheryl Mathew, Mengrou Shou, Eric Nyberg, Louis-Philippe Morency

    Abstract: The self-supervised objective of masking-and-predicting has led to promising performance gains on a variety of downstream tasks. However, while most approaches randomly mask tokens, there is strong intuition that deciding what to mask can substantially improve learning outcomes. We investigate this in continued pretraining setting in which pretrained models continue to pretrain on domain-specific…

    Submitted 17 October, 2023; v1 submitted 23 May, 2023; originally announced May 2023.

  18. arXiv:2305.14387  [pdf, other]

    cs.LG cs.AI cs.CL

    AlpacaFarm: A Simulation Framework for Methods that Learn from Human Feedback

    Authors: Yann Dubois, Xuechen Li, Rohan Taori, Tianyi Zhang, Ishaan Gulrajani, Jimmy Ba, Carlos Guestrin, Percy Liang, Tatsunori B. Hashimoto

    Abstract: Large language models (LLMs) such as ChatGPT have seen widespread adoption due to their strong instruction-following abilities. Developing these LLMs involves a complex yet poorly understood workflow requiring training with human feedback. Replicating and understanding this instruction-following requires tackling three major challenges: the high cost of data collection, the lack of trustworthy eva…

    Submitted 7 January, 2024; v1 submitted 22 May, 2023; originally announced May 2023.

    Comments: Spotlight at NeurIPS 2023

  19. arXiv:2305.14342  [pdf, other]

    cs.LG cs.CL math.OC

    Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training

    Authors: Hong Liu, Zhiyuan Li, David Hall, Percy Liang, Tengyu Ma

    Abstract: Given the massive cost of language model pre-training, a non-trivial improvement of the optimization algorithm would lead to a material reduction on the time and cost of training. Adam and its variants have been state-of-the-art for years, and more sophisticated second-order (Hessian-based) optimizers often incur too much per-step overhead. In this paper, we propose Sophia, Second-order Clipped St…

    Submitted 5 March, 2024; v1 submitted 23 May, 2023; originally announced May 2023.

  20. arXiv:2305.13583  [pdf, other]

    cs.CL cs.MM eess.AS eess.IV

    Cross-Attention is Not Enough: Incongruity-Aware Dynamic Hierarchical Fusion for Multimodal Affect Recognition

    Authors: Yaoting Wang, Yuanchao Li, Paul Pu Liang, Louis-Philippe Morency, Peter Bell, Catherine Lai

    Abstract: Fusing multiple modalities has proven effective for multimodal information processing. However, the incongruity between modalities poses a challenge for multimodal fusion, especially in affect recognition. In this study, we first analyze how the salient affective information in one modality can be affected by the other, and demonstrate that inter-modal incongruity exists latently in crossmodal att…

    Submitted 12 November, 2023; v1 submitted 22 May, 2023; originally announced May 2023.

    Comments: *First two authors contributed equally

  21. arXiv:2305.12600  [pdf, other]

    cs.LG cs.AI

    PRODIGY: Enabling In-context Learning Over Graphs

    Authors: Qian Huang, Hongyu Ren, Peng Chen, Gregor Kržmanc, Daniel Zeng, Percy Liang, Jure Leskovec

    Abstract: In-context learning is the ability of a pretrained model to adapt to novel and diverse downstream tasks by conditioning on prompt examples, without optimizing any parameters. While large language models have demonstrated this ability, how in-context learning could be performed over graphs is unexplored. In this paper, we develop Pretraining Over Diverse In-Conte…

    Submitted 21 May, 2023; originally announced May 2023.

  22. arXiv:2305.12369  [pdf, other]

    cs.CV cs.AI cs.LG

    HIINT: Historical, Intra- and Inter- personal Dynamics Modeling with Cross-person Memory Transformer

    Authors: Yubin Kim, Dong Won Lee, Paul Pu Liang, Sharifa Algohwinem, Cynthia Breazeal, Hae Won Park

    Abstract: Accurately modeling affect dynamics, which refers to the changes and fluctuations in emotions and affective displays during human conversations, is crucial for understanding human interactions. By analyzing affect dynamics, we can gain insights into how people communicate, respond to different situations, and form relationships. However, modeling affect dynamics is challenging due to contextual fa…

    Submitted 21 May, 2023; originally announced May 2023.

  23. arXiv:2305.10429  [pdf, other]

    cs.CL cs.LG

    DoReMi: Optimizing Data Mixtures Speeds Up Language Model Pretraining

    Authors: Sang Michael Xie, Hieu Pham, Xuanyi Dong, Nan Du, Hanxiao Liu, Yifeng Lu, Percy Liang, Quoc V. Le, Tengyu Ma, Adams Wei Yu

    Abstract: The mixture proportions of pretraining data domains (e.g., Wikipedia, books, web text) greatly affect language model (LM) performance. In this paper, we propose Domain Reweighting with Minimax Optimization (DoReMi), which first trains a small proxy model using group distributionally robust optimization (Group DRO) over domains to produce domain weights (mixture proportions) without knowledge of do…

    Submitted 20 November, 2023; v1 submitted 17 May, 2023; originally announced May 2023.

    Comments: NeurIPS 2023

  24. arXiv:2305.06884  [pdf, ps, other]

    stat.ME cs.AI cs.LG math.ST stat.AP stat.ML

    Risk-limiting Financial Audits via Weighted Sampling without Replacement

    Authors: Shubhanshu Shekhar, Ziyu Xu, Zachary C. Lipton, Pierre J. Liang, Aaditya Ramdas

    Abstract: We introduce the notion of a risk-limiting financial auditing (RLFA): given $N$ transactions, the goal is to estimate the total misstated monetary fraction ($m^*$) to a given accuracy $ε$, with confidence $1-δ$. We do this by constructing new confidence sequences (CSs) for the weighted average of $N$ unknown values, based on samples drawn without replacement according to a (randomized) weighted sa…

    Submitted 8 May, 2023; originally announced May 2023.

    Comments: 23 pages, 8 figures, to appear in the Proceedings of Uncertainty in Artificial Intelligence (UAI) 2023

  25. arXiv:2305.02440  [pdf, other]

    cs.LG

    Cheaply Evaluating Inference Efficiency Metrics for Autoregressive Transformer APIs

    Authors: Deepak Narayanan, Keshav Santhanam, Peter Henderson, Rishi Bommasani, Tony Lee, Percy Liang

    Abstract: Large language models (LLMs) power many state-of-the-art systems in natural language processing. However, these models are extremely computationally expensive, even at inference time, raising the natural question: when is the extra cost of deploying a larger model worth the anticipated boost in capabilities? Better understanding this tradeoff fundamentally could benefit from an inference efficienc…

    Submitted 3 May, 2023; originally announced May 2023.

  26. Stark tuning of telecom single-photon emitters based on a single Er$^{3+}$

    Authors: Jian-Yin Huang, Peng-Jun Liang, Liang Zheng, Pei-Yun Li, You-Zhi Ma, Duan-Chen Liu, Zong-Quan Zhou, Chuan-Feng Li, Guang-Can Guo

    Abstract: The implementation of scalable quantum networks requires photons at the telecom band and long-lived spin coherence. The single Er$^{3+}$ in solid-state hosts is an important candidate that fulfills these critical requirements simultaneously. However, to entangle distant Er$^{3+}$ ions through photonic connections, the emission frequency of individual Er$^{3+}$ in solid-state matrix must be the sam…

    Submitted 27 June, 2023; v1 submitted 2 May, 2023; originally announced May 2023.

    Journal ref: Chinese Phys. Lett. 40 070301 (2023)

  27. arXiv:2304.11332  [pdf, other]

    cs.CV cs.AI cs.LG

    Input Augmentation with SAM: Boosting Medical Image Segmentation with Segmentation Foundation Model

    Authors: Yizhe Zhang, Tao Zhou, Shuo Wang, Peixian Liang, Danny Z. Chen

    Abstract: The Segment Anything Model (SAM) is a recently developed large model for general-purpose segmentation for computer vision tasks. SAM was trained using 11 million images with over 1 billion masks and can produce segmentation results for a wide range of objects in natural scene images. SAM can be viewed as a general perception model for segmentation (partitioning images into semantically meaningful…

    Submitted 21 June, 2023; v1 submitted 22 April, 2023; originally announced April 2023.

    Comments: GitHub: https://github.com/yizhezhang2000/SAMAug. Comments and questions are welcome

  28. arXiv:2304.09848  [pdf, other]

    cs.CL cs.IR

    Evaluating Verifiability in Generative Search Engines

    Authors: Nelson F. Liu, Tianyi Zhang, Percy Liang

    Abstract: Generative search engines directly generate responses to user queries, along with in-line citations. A prerequisite trait of a trustworthy generative search engine is verifiability, i.e., systems should cite comprehensively (high citation recall; all statements are fully supported by citations) and accurately (high citation precision; every cite supports its associated statement). We conduct human…

    Submitted 23 October, 2023; v1 submitted 19 April, 2023; originally announced April 2023.

    Comments: 25 pages, 12 figures; to appear in Findings of EMNLP 2023

  29. arXiv:2304.06819  [pdf, other]

    cs.CV cs.AI q-bio.GN q-bio.QM q-bio.TO

    Modeling Dense Multimodal Interactions Between Biological Pathways and Histology for Survival Prediction

    Authors: Guillaume Jaume, Anurag Vaidya, Richard Chen, Drew Williamson, Paul Liang, Faisal Mahmood

    Abstract: Integrating whole-slide images (WSIs) and bulk transcriptomics for predicting patient survival can improve our understanding of patient prognosis. However, this multimodal task is particularly challenging due to the different nature of these data: WSIs represent a very high-dimensional spatial description of a tumor, while bulk transcriptomics represent a global description of gene expression leve…

    Submitted 15 April, 2024; v1 submitted 13 April, 2023; originally announced April 2023.

    Comments: Accepted to CVPR 2024

  30. arXiv:2304.03442  [pdf, other]

    cs.HC cs.AI cs.LG

    Generative Agents: Interactive Simulacra of Human Behavior

    Authors: Joon Sung Park, Joseph C. O'Brien, Carrie J. Cai, Meredith Ringel Morris, Percy Liang, Michael S. Bernstein

    Abstract: Believable proxies of human behavior can empower interactive applications ranging from immersive environments to rehearsal spaces for interpersonal communication to prototyping tools. In this paper, we introduce generative agents--computational software agents that simulate believable human behavior. Generative agents wake up, cook breakfast, and head to work; artists paint, while authors write; t…

    Submitted 5 August, 2023; v1 submitted 6 April, 2023; originally announced April 2023.

  31. arXiv:2304.01620  [pdf, other]

    cs.CV eess.IV

    Image Blind Denoising Using Dual Convolutional Neural Network with Skip Connection

    Authors: Wencong Wu, Shicheng Liao, Guannan Lv, Peng Liang, Yungang Zhang

    Abstract: In recent years, deep convolutional neural networks have shown fascinating performance in the field of image denoising. However, deeper network architectures are often accompanied with large numbers of model parameters, leading to high training cost and long inference time, which limits their application in practical denoising tasks. In this paper, we propose a novel dual convolutional blind denoi…

    Submitted 4 April, 2023; originally announced April 2023.

  32. arXiv:2304.01498  [pdf, other]

    cs.CV eess.IV

    DCANet: Dual Convolutional Neural Network with Attention for Image Blind Denoising

    Authors: Wencong Wu, Guannan Lv, Yingying Duan, Peng Liang, Yungang Zhang, Yuelong Xia

    Abstract: Noise removal of images is an essential preprocessing procedure for many computer vision tasks. Currently, many denoising models based on deep neural networks can perform well in removing the noise with known distributions (i.e. the additive Gaussian white noise). However eliminating real noise is still a very challenging task, since real-world noise often does not simply follow one single type of…

    Submitted 16 June, 2023; v1 submitted 3 April, 2023; originally announced April 2023.

  33. A Diverse Population of z ~ 2 ULIRGs Revealed by JWST Imaging

    Authors: J. -S. Huang, Zi-Jian Li, Cheng Cheng, Meicun Hou, Haojing Yan, S. P. Willner, Y. -S. Dai, X. Z. Zheng, J. Pan, D. Rigopoulou, T. Wang, Zhiyuan Li, Piaoran Liang, A. Esamdin, G. G. Fazio

    Abstract: Four ultra-luminous infrared galaxies (ULIRGs) observed with JWST/NIRcam in the Cosmos Evolution Early Release Science program offer an unbiased preview of the $z\approx2$ ULIRG population. The objects were originally selected at 24 $μ$m and have strong polycyclic aromatic hydrocarbon emission features observed with Spitzer/IRS. The four objects have similar stellar masses of ${\sim}10^{11}$ M…

    Submitted 6 April, 2023; v1 submitted 3 April, 2023; originally announced April 2023.

    Comments: 9 pages, 3 figures, 1 table, accepted for publication in ApJ. V2 updates author affiliations and acknowledgments, not scientific content

  34. arXiv:2304.00179  [pdf]

    physics.app-ph

    Shedding Light on Rechargeable Na/Cl$_2$ Battery

    Authors: Guanzhou Zhu, Peng Liang, Cheng-Liang Huang, Shu-Chi Wu, Cheng-Chia Huang, Yuan-Yao Li, Shi-Kai Jiang, Wei-Hsiang Huang, Jiachen Li, Feifei Wang, Bing-Joe Hwang, Hongjie Dai

    Abstract: Advancing new ideas of rechargeable batteries represents an important path to meeting the ever increasing energy storage needs. Recently we showed rechargeable sodium/chlorine (Na/Cl$_2$) (or lithium/chlorine Li/Cl$_2$) batteries that used a Na (or Li) metal negative electrode, a microporous amorphous carbon nanosphere (aCNS) positive electrode and an electrolyte containing dissolved AlCl$_3$ and…

    Submitted 31 March, 2023; originally announced April 2023.

    Comments: 30 pages, 9 figures

  35. Code Reviewer Recommendation for Architecture Violations: An Exploratory Study

    Authors: Ruiyin Li, Peng Liang, Paris Avgeriou

    Abstract: Code review is a common practice in software development and often conducted before code changes are merged into the code repository. A number of approaches for automatically recommending appropriate reviewers have been proposed to match such code changes to pertinent reviewers. However, such approaches are generic, i.e., they do not focus on specific types of issues during code reviews. In this p…

    Submitted 24 April, 2023; v1 submitted 31 March, 2023; originally announced March 2023.

    Comments: The 27th International Conference on Evaluation and Assessment in Software Engineering (EASE)

  36. arXiv:2303.17548  [pdf, other]

    cs.CL cs.AI cs.CY cs.LG

    Whose Opinions Do Language Models Reflect?

    Authors: Shibani Santurkar, Esin Durmus, Faisal Ladhak, Cinoo Lee, Percy Liang, Tatsunori Hashimoto

    Abstract: Language models (LMs) are increasingly being used in open-ended contexts, where the opinions reflected by LMs in response to subjective queries can have a profound impact, both on user satisfaction, as well as shaping the views of society at large. In this work, we put forth a quantitative framework to investigate the opinions reflected by LMs -- by leveraging high-quality public opinion polls and…

    Submitted 30 March, 2023; originally announced March 2023.

  37. arXiv:2303.15772  [pdf, other]

    cs.LG cs.AI cs.CY

    Ecosystem Graphs: The Social Footprint of Foundation Models

    Authors: Rishi Bommasani, Dilara Soylu, Thomas I. Liao, Kathleen A. Creel, Percy Liang

    Abstract: Foundation models (e.g. ChatGPT, StableDiffusion) pervasively influence society, warranting immediate social attention. While the models themselves garner much attention, to accurately characterize their impact, we must consider the broader sociotechnical ecosystem. We propose Ecosystem Graphs as a documentation framework to transparently centralize knowledge of this ecosystem. Ecosystem Graphs is…

    Submitted 28 March, 2023; originally announced March 2023.

    Comments: Authored by the Center for Research on Foundation Models (CRFM) at the Stanford Institute for Human-Centered Artificial Intelligence (HAI). Ecosystem Graphs available at https://crfm.stanford.edu/ecosystem-graphs/

    Journal ref: Published in AIES 2024

  38. arXiv:2303.15715  [pdf, other]

    cs.CY cs.AI cs.LG

    Foundation Models and Fair Use

    Authors: Peter Henderson, Xuechen Li, Dan Jurafsky, Tatsunori Hashimoto, Mark A. Lemley, Percy Liang

    Abstract: Existing foundation models are trained on copyrighted material. Deploying these models can pose both legal and ethical risks when data creators fail to receive appropriate attribution or compensation. In the United States and several other countries, copyrighted content may be used to build foundation models without incurring liability due to the fair use doctrine. However, there is a caveat: If t…

    Submitted 27 March, 2023; originally announced March 2023.

  39. arXiv:2303.15198  [pdf, other]

    cs.CV

    Image Deblurring by Exploring In-depth Properties of Transformer

    Authors: Pengwei Liang, Junjun Jiang, Xianming Liu, Jiayi Ma

    Abstract: Image deblurring continues to achieve impressive performance with the development of generative models. Nonetheless, there still remains a displeasing problem if one wants to improve perceptual quality and quantitative scores of recovered image at the same time. In this study, drawing inspiration from the research of transformer properties, we introduce the pretrained transformers to address this…

    Submitted 27 January, 2024; v1 submitted 24 March, 2023; originally announced March 2023.

    Comments: Accepted by IEEE Transactions on Neural Networks and Learning Systems

    Journal ref: IEEE Transactions on Neural Networks and Learning Systems 2024

  40. arXiv:2303.14713  [pdf, other]

    cs.SE

    Engineering Software Systems for Quantum Computing as a Service: A Mapping Study

    Authors: Aakash Ahmad, Muhammad Waseem, Peng Liang, Mahdi Fehmideh, Arif Ali Khan, David Georg Reichelt, Tommi Mikkonen

    Abstract: Quantum systems have started to emerge as a disruptive technology and enabling platforms - exploiting the principles of quantum mechanics - to achieve quantum supremacy in computing. Academic research, industrial projects (e.g., Amazon Braket), and consortiums like 'Quantum Flagship' are striving to develop practically capable and commercially viable quantum computing (QC) systems and technologies…

    Submitted 26 March, 2023; originally announced March 2023.

  41. arXiv:2303.11758  [pdf, other]

    quant-ph

    The Closed and Open Unbalanced Dicke Trimer Model: Critical Properties and Nonlinear Semiclassical Dynamics

    Authors: Cheng Zhang, Pengfei Liang, Neill Lambert, Mauro Cirio

    Abstract: We study a generalization of a recently introduced Dicke trimer model [Phys. Rev. Lett. 128, 163601, Phys. Rev. Research 5, L042016], which allows for cavity losses and unbalanced light-matter interactions (in which rotating and counter-rotating terms can be tuned independently). We find that in the extreme unbalanced limit, the $U(1)$ symmetry of the Tavis-Cummings model is restored, qualitativel…

    Submitted 2 November, 2023; v1 submitted 21 March, 2023; originally announced March 2023.

    Comments: 22 pages, 13 figures

  42. Practices and Challenges of Using GitHub Copilot: An Empirical Study

    Authors: Beiqi Zhang, Peng Liang, Xiyu Zhou, Aakash Ahmad, Muhammad Waseem

    Abstract: With the advances in machine learning, there is a growing interest in AI-enabled tools for autocompleting source code. GitHub Copilot, also referred to as the "AI Pair Programmer", has been trained on billions of lines of open source GitHub code, and is one such tool that has been increasingly used since its launch in June 2021. However, little effort has been devoted to understanding the prac…

    Submitted 27 April, 2023; v1 submitted 15 March, 2023; originally announced March 2023.

    Comments: The 35th International Conference on Software Engineering and Knowledge Engineering (SEKE)

  43. arXiv:2303.06865  [pdf, other]

    cs.LG cs.AI cs.PF

    FlexGen: High-Throughput Generative Inference of Large Language Models with a Single GPU

    Authors: Ying Sheng, Lianmin Zheng, Binhang Yuan, Zhuohan Li, Max Ryabinin, Daniel Y. Fu, Zhiqiang Xie, Beidi Chen, Clark Barrett, Joseph E. Gonzalez, Percy Liang, Christopher Ré, Ion Stoica, Ce Zhang

    Abstract: The high computational and memory requirements of large language model (LLM) inference make it feasible only with multiple high-end accelerators. Motivated by the emerging demand for latency-insensitive tasks with batched processing, this paper initiates the study of high-throughput LLM inference using limited resources, such as a single commodity GPU. We present FlexGen, a high-throughput generat…

    Submitted 12 June, 2023; v1 submitted 13 March, 2023; originally announced March 2023.

  44. Automatic Identification and Extraction of Assumptions on GitHub

    Authors: Chen Yang, Zinan Ma, Peng Liang, Xiaohua Liu

    Abstract: In software development, due to the lack of knowledge or information, time pressure, complex context, and many other factors, various uncertainties emerge during the development process, leading to assumptions scattered in projects. Being unaware of certain assumptions can result in critical problems (e.g., system vulnerability and failures). The prerequisite of analyzing and understanding assumpt…

    Submitted 25 April, 2023; v1 submitted 12 March, 2023; originally announced March 2023.

    Comments: The 29th International DMS Conference on Visualization and Visual Languages (DMSVIVA)

  45. arXiv:2303.02695  [pdf, other]

    cs.SE cs.LG

    Understanding Bugs in Multi-Language Deep Learning Frameworks

    Authors: Zengyang Li, Sicheng Wang, Wenshuo Wang, Peng Liang, Ran Mo, Bing Li

    Abstract: Deep learning frameworks (DLFs) have been playing an increasingly important role in this intelligence age since they act as a basic infrastructure for an increasingly wide range of AI-based applications. Meanwhile, as multi-programming-language (MPL) software systems, DLFs are inevitably suffering from bugs caused by the use of multiple programming languages (PLs). Hence, it is of paramount signifi…

    Submitted 5 March, 2023; originally announced March 2023.

    Comments: The 31st IEEE/ACM International Conference on Program Comprehension (ICPC)

  46. arXiv:2302.14600  [pdf, other]

    cs.SE cs.AI

    Towards Human-Bot Collaborative Software Architecting with ChatGPT

    Authors: Aakash Ahmad, Muhammad Waseem, Peng Liang, Mahdi Fehmideh, Mst Shamima Aktar, Tommi Mikkonen

    Abstract: Architecting software-intensive systems can be a complex process. It deals with the daunting tasks of unifying stakeholders' perspectives, designers' intellect, tool-based automation, pattern-driven reuse, and so on, to sketch a blueprint that guides software implementation and evaluation. Despite its benefits, architecture-centric software engineering (ACSE) inherits a multitude of challenges. AC…

    Submitted 26 February, 2023; originally announced February 2023.

    Comments: 7 pages, 6 images, Manuscript submitted to a Conference (2023)

  47. arXiv:2302.13289  [pdf, other]

    cs.LG

    Improving Representational Continuity via Continued Pretraining

    Authors: Michael Sun, Ananya Kumar, Divyam Madaan, Percy Liang

    Abstract: We consider the continual representation learning setting: sequentially pretrain a model $M'$ on tasks $T_1, \ldots, T_T$, and then adapt $M'$ on a small amount of data from each task $T_i$ to check if it has forgotten information from old tasks. Under a kNN adaptation protocol, prior work shows that continual learning methods improve forgetting over naive training (SGD). In reality, practitioners…

    Submitted 26 February, 2023; originally announced February 2023.

  48. arXiv:2302.12766  [pdf, other]

    cs.RO cs.AI cs.CL cs.CV cs.LG

    Language-Driven Representation Learning for Robotics

    Authors: Siddharth Karamcheti, Suraj Nair, Annie S. Chen, Thomas Kollar, Chelsea Finn, Dorsa Sadigh, Percy Liang

    Abstract: Recent work in visual representation learning for robotics demonstrates the viability of learning from large video datasets of humans performing everyday tasks. Leveraging methods such as masked autoencoding and contrastive learning, these representations exhibit strong transfer to policy learning for visuomotor control. But, robot learning encompasses a diverse set of problems beyond control incl…

    Submitted 24 February, 2023; originally announced February 2023.

    Comments: 30 Pages, 15 Figures

  49. arXiv:2302.12247  [pdf, other]

    cs.LG cs.AI cs.CL cs.CV cs.IT

    Quantifying & Modeling Multimodal Interactions: An Information Decomposition Framework

    Authors: Paul Pu Liang, Yun Cheng, Xiang Fan, Chun Kai Ling, Suzanne Nie, Richard Chen, Zihao Deng, Nicholas Allen, Randy Auerbach, Faisal Mahmood, Ruslan Salakhutdinov, Louis-Philippe Morency

    Abstract: The recent explosion of interest in multimodal applications has resulted in a wide selection of datasets and methods for representing and integrating information from different modalities. Despite these empirical advances, there remain fundamental research questions: How can we quantify the interactions that are necessary to solve a multimodal task? Subsequently, what are the most suitable multimo…

    Submitted 10 December, 2023; v1 submitted 23 February, 2023; originally announced February 2023.

    Comments: NeurIPS 2023. Code available at: https://github.com/pliang279/PID

  50. arXiv:2302.11861  [pdf, other]

    cs.LG cs.CV

    Out-of-Domain Robustness via Targeted Augmentations

    Authors: Irena Gao, Shiori Sagawa, Pang Wei Koh, Tatsunori Hashimoto, Percy Liang

    Abstract: Models trained on one set of domains often suffer performance drops on unseen domains, e.g., when wildlife monitoring models are deployed in new camera locations. In this work, we study principles for designing data augmentations for out-of-domain (OOD) generalization. In particular, we focus on real-world scenarios in which some domain-dependent features are robust, i.e., some features that vary…

    Submitted 6 February, 2024; v1 submitted 23 February, 2023; originally announced February 2023.