
Showing 1–50 of 97 results for author: Shi, K

Searching in archive cs.
  1. arXiv:2406.13179  [pdf, other]

    cs.SD cs.AI cs.NE eess.AS

    Global-Local Convolution with Spiking Neural Networks for Energy-efficient Keyword Spotting

    Authors: Shuai Wang, Dehao Zhang, Kexin Shi, Yuchen Wang, Wenjie Wei, Jibin Wu, Malu Zhang

    Abstract: Thanks to Deep Neural Networks (DNNs), the accuracy of Keyword Spotting (KWS) has made substantial progress. However, as KWS systems are usually implemented on edge devices, energy efficiency becomes a critical requirement besides performance. Here, we take advantage of spiking neural networks' energy efficiency and propose an end-to-end lightweight KWS model. The model consists of two innovative…

    Submitted 18 June, 2024; originally announced June 2024.

  2. arXiv:2406.07835  [pdf, other]

    cs.CL cs.AI

    SciRIFF: A Resource to Enhance Language Model Instruction-Following over Scientific Literature

    Authors: David Wadden, Kejian Shi, Jacob Morrison, Aakanksha Naik, Shruti Singh, Nitzan Barzilay, Kyle Lo, Tom Hope, Luca Soldaini, Shannon Zejiang Shen, Doug Downey, Hannaneh Hajishirzi, Arman Cohan

    Abstract: We present SciRIFF (Scientific Resource for Instruction-Following and Finetuning), a dataset of 137K instruction-following demonstrations for 54 tasks covering five essential scientific literature understanding capabilities: information extraction, summarization, question answering, claim verification, and classification. SciRIFF demonstrations are notable for their long input contexts, detailed t…

    Submitted 18 June, 2024; v1 submitted 10 June, 2024; originally announced June 2024.

    Comments: Submitted to NeurIPS Datasets and Benchmarks 2024

  3. arXiv:2405.20215  [pdf, other]

    cs.CL

    TS-Align: A Teacher-Student Collaborative Framework for Scalable Iterative Finetuning of Large Language Models

    Authors: Chen Zhang, Chengguang Tang, Dading Chong, Ke Shi, Guohua Tang, Feng Jiang, Haizhou Li

    Abstract: Mainstream approaches to aligning large language models (LLMs) heavily rely on human preference data, particularly when models require periodic updates. The standard process for iterative alignment of LLMs involves collecting new human feedback for each update. However, the data collection process is costly and challenging to scale. To address this issue, we introduce the "TS-Align" framework, whi…

    Submitted 14 June, 2024; v1 submitted 30 May, 2024; originally announced May 2024.

  4. arXiv:2405.19299  [pdf, other]

    cs.CL

    Expert-Guided Extinction of Toxic Tokens for Debiased Generation

    Authors: Xueyao Sun, Kaize Shi, Haoran Tang, Guandong Xu, Qing Li

    Abstract: Large language models (LLMs) can elicit social bias during generations, especially when inference with toxic prompts. Controlling the sensitive attributes in generation encounters challenges in data distribution, generalizability, and efficiency. Specifically, fine-tuning and retrieval demand extensive unbiased corpus, while direct prompting requires meticulously curated instructions for correctin…

    Submitted 29 May, 2024; originally announced May 2024.

  5. arXiv:2405.17659  [pdf, other]

    eess.IV cs.CV

    Enhancing Global Sensitivity and Uncertainty Quantification in Medical Image Reconstruction with Monte Carlo Arbitrary-Masked Mamba

    Authors: Jiahao Huang, Liutao Yang, Fanwen Wang, Yang Nan, Weiwen Wu, Chengyan Wang, Kuangyu Shi, Angelica I. Aviles-Rivero, Carola-Bibiane Schönlieb, Daoqiang Zhang, Guang Yang

    Abstract: Deep learning has been extensively applied in medical image reconstruction, where Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs) represent the predominant paradigms, each possessing distinct advantages and inherent limitations: CNNs exhibit linear complexity with local sensitivity, whereas ViTs demonstrate quadratic complexity with global sensitivity. The emerging Mamba has sh…

    Submitted 25 June, 2024; v1 submitted 27 May, 2024; originally announced May 2024.

  6. arXiv:2405.03085  [pdf, other]

    cs.CL

    Compressing Long Context for Enhancing RAG with AMR-based Concept Distillation

    Authors: Kaize Shi, Xueyao Sun, Qing Li, Guandong Xu

    Abstract: Large Language Models (LLMs) have made significant strides in information acquisition. However, their overreliance on potentially flawed parametric knowledge leads to hallucinations and inaccuracies, particularly when handling long-tail, domain-specific queries. Retrieval Augmented Generation (RAG) addresses this limitation by incorporating external, non-parametric knowledge. Nevertheless, the ret…

    Submitted 5 May, 2024; originally announced May 2024.

  7. arXiv:2405.01109  [pdf, other]

    math.NA cs.LG math.AP

    Hypergraph $p$-Laplacian regularization on point clouds for data interpolation

    Authors: Kehan Shi, Martin Burger

    Abstract: As a generalization of graphs, hypergraphs are widely used to model higher-order relations in data. This paper explores the benefit of the hypergraph structure for the interpolation of point cloud data that contain no explicit structural information. We define the $\varepsilon_n$-ball hypergraph and the $k_n$-nearest neighbor hypergraph on a point cloud and study the $p$-Laplacian regularization o…

    Submitted 2 May, 2024; originally announced May 2024.

    Comments: 33 pages

    MSC Class: 49J55; 35J20; 65N12

  8. arXiv:2404.19689  [pdf, ps, other]

    math.AP cs.LG math.NA

    Continuum limit of $p$-biharmonic equations on graphs

    Authors: Kehan Shi, Martin Burger

    Abstract: This paper studies the $p$-biharmonic equation on graphs, which arises in point cloud processing and can be interpreted as a natural extension of the graph $p$-Laplacian from the perspective of hypergraph. The asymptotic behavior of the solution is investigated when the random geometric graph is considered and the number of data points goes to infinity. We show that the continuum limit is an appro…

    Submitted 30 April, 2024; originally announced April 2024.

    Comments: 20 pages

    MSC Class: 35R02; 35J30; 65N12

  9. arXiv:2404.14662  [pdf, other]

    cs.LG cs.CL cs.PL cs.SE

    NExT: Teaching Large Language Models to Reason about Code Execution

    Authors: Ansong Ni, Miltiadis Allamanis, Arman Cohan, Yinlin Deng, Kensen Shi, Charles Sutton, Pengcheng Yin

    Abstract: A fundamental skill among human developers is the ability to understand and reason about program execution. As an example, a programmer can mentally simulate code execution in natural language to debug and repair code (aka. rubber duck debugging). However, large language models (LLMs) of code are typically trained on the surface textual form of programs, thus may lack a semantic understanding of h…

    Submitted 22 April, 2024; originally announced April 2024.

    Comments: 35 pages

  10. arXiv:2404.06851  [pdf, other]

    cs.CV

    UDiFF: Generating Conditional Unsigned Distance Fields with Optimal Wavelet Diffusion

    Authors: Junsheng Zhou, Weiqi Zhang, Baorui Ma, Kanle Shi, Yu-Shen Liu, Zhizhong Han

    Abstract: Diffusion models have shown remarkable results for image generation, editing and inpainting. Recent works explore diffusion models for 3D shape generation with neural implicit functions, i.e., signed distance function and occupancy function. However, they are limited to shapes with closed surfaces, which prevents them from generating diverse 3D real-world contents containing open surfaces. In this…

    Submitted 10 April, 2024; originally announced April 2024.

    Comments: To appear at CVPR2024. Project page: https://weiqi-zhang.github.io/UDiFF

  11. arXiv:2403.19276  [pdf, ps, other]

    cs.IR

    Enhanced Bayesian Personalized Ranking for Robust Hard Negative Sampling in Recommender Systems

    Authors: Kexin Shi, Jing Zhang, Linjiajie Fang, Wenjia Wang, Bingyi Jing

    Abstract: In implicit collaborative filtering, hard negative mining techniques are developed to accelerate and enhance the recommendation model learning. However, the inadvertent selection of false negatives remains a major concern in hard negative sampling, as these false negatives can provide incorrect information and mislead the model learning. To date, only a small number of studies have been committed…

    Submitted 28 March, 2024; originally announced March 2024.

    Comments: 9 pages

  12. arXiv:2403.03346  [pdf, other]

    cs.CV

    Enhancing Vision-Language Pre-training with Rich Supervisions

    Authors: Yuan Gao, Kunyu Shi, Pengkai Zhu, Edouard Belval, Oren Nuriel, Srikar Appalaraju, Shabnam Ghadar, Vijay Mahadevan, Zhuowen Tu, Stefano Soatto

    Abstract: We propose Strongly Supervised pre-training with ScreenShots (S4) - a novel pre-training paradigm for Vision-Language Models using data from large-scale web screenshot rendering. Using web screenshots unlocks a treasure trove of visual and textual cues that are not present in using image-text pairs. In S4, we leverage the inherent tree-structured hierarchy of HTML elements and the spatial localiza…

    Submitted 5 March, 2024; originally announced March 2024.

    Comments: Accepted to CVPR 2024

  13. arXiv:2403.02249  [pdf, other]

    cs.CV cs.AI

    Non-autoregressive Sequence-to-Sequence Vision-Language Models

    Authors: Kunyu Shi, Qi Dong, Luis Goncalves, Zhuowen Tu, Stefano Soatto

    Abstract: Sequence-to-sequence vision-language models are showing promise, but their applicability is limited by their inference latency due to their autoregressive way of generating predictions. We propose a parallel decoding sequence-to-sequence vision-language model, trained with a Query-CTC loss, that marginalizes over multiple inference paths in the decoder. This allows us to model the joint distributi…

    Submitted 4 March, 2024; originally announced March 2024.

    Comments: Accepted to CVPR 2024

  14. arXiv:2402.15134  [pdf, other]

    cs.LG cs.AI

    Deep Coupling Network For Multivariate Time Series Forecasting

    Authors: Kun Yi, Qi Zhang, Hui He, Kaize Shi, Liang Hu, Ning An, Zhendong Niu

    Abstract: Multivariate time series (MTS) forecasting is crucial in many real-world applications. To achieve accurate MTS forecasting, it is essential to simultaneously consider both intra- and inter-series relationships among time series data. However, previous work has typically modeled intra- and inter-series relationships separately and has disregarded multi-order interactions present within and between…

    Submitted 23 February, 2024; originally announced February 2024.

  15. arXiv:2402.11558  [pdf, other]

    cs.LG

    A Temporally Disentangled Contrastive Diffusion Model for Spatiotemporal Imputation

    Authors: Yakun Chen, Kaize Shi, Zhangkai Wu, Juan Chen, Xianzhi Wang, Julian McAuley, Guandong Xu, Shui Yu

    Abstract: Spatiotemporal data analysis is pivotal across various domains, such as transportation, meteorology, and healthcare. The data collected in real-world scenarios are often incomplete due to device malfunctions and network errors. Spatiotemporal imputation aims to predict missing values by exploiting the spatial and temporal dependencies in the observed data. Traditional imputation approaches based o…

    Submitted 22 March, 2024; v1 submitted 18 February, 2024; originally announced February 2024.

  16. arXiv:2402.08073  [pdf, other]

    cs.LG cs.PL cs.SE

    Grounding Data Science Code Generation with Input-Output Specifications

    Authors: Yeming Wen, Pengcheng Yin, Kensen Shi, Henryk Michalewski, Swarat Chaudhuri, Alex Polozov

    Abstract: Large language models (LLMs) have recently demonstrated a remarkable ability to generate code from natural language (NL) prompts. However, in the real world, NL is often too ambiguous to capture the true intent behind programming problems, requiring additional input-output (I/O) specifications. Unfortunately, LLMs can have difficulty aligning their outputs with both the NL prompt and the I/O speci…

    Submitted 14 March, 2024; v1 submitted 12 February, 2024; originally announced February 2024.

  17. arXiv:2401.08224  [pdf, other]

    stat.ME cs.CR cs.LG

    Privacy Preserving Adaptive Experiment Design

    Authors: Jiachun Li, Kaining Shi, David Simchi-Levi

    Abstract: Adaptive experiment is widely adopted to estimate conditional average treatment effect (CATE) in clinical trials and many other scenarios. While the primary goal in experiment is to maximize estimation accuracy, due to the imperative of social welfare, it's also crucial to provide treatment with superior outcomes to patients, which is measured by regret in contextual bandit framework. These two ob…

    Submitted 5 February, 2024; v1 submitted 16 January, 2024; originally announced January 2024.

    Comments: Add a table

  18. arXiv:2401.07402  [pdf, other]

    cs.CV

    Improved Implicit Neural Representation with Fourier Bases Reparameterized Training

    Authors: Kexuan Shi, Xingyu Zhou, Shuhang Gu

    Abstract: Implicit Neural Representation (INR) as a mighty representation paradigm has achieved success in various computer vision tasks recently. Due to the low-frequency bias issue of vanilla multi-layer perceptron (MLP), existing methods have investigated advanced techniques, such as positional encoding and periodic activation function, to improve the accuracy of INR. In this paper, we connect the networ…

    Submitted 5 February, 2024; v1 submitted 14 January, 2024; originally announced January 2024.

  19. arXiv:2401.06827  [pdf, other]

    cs.CV cs.AI cs.CL

    APLe: Token-Wise Adaptive for Multi-Modal Prompt Learning

    Authors: Guiming Cao, Kaize Shi, Hong Fu, Huaiwen Zhang, Guandong Xu

    Abstract: Pre-trained Vision-Language (V-L) models set the benchmark for generalization to downstream tasks among the noteworthy contenders. Many characteristics of the V-L model have been explored in existing research including the challenge of the sensitivity to text input and the tuning process across multi-modal prompts. With the advanced utilization of the V-L model like CLIP, recent approaches deploy…

    Submitted 23 January, 2024; v1 submitted 11 January, 2024; originally announced January 2024.

    Comments: 7 pages,3 figures

  20. arXiv:2311.16592  [pdf, other]

    cs.RO

    RGBGrasp: Image-based Object Grasping by Capturing Multiple Views during Robot Arm Movement with Neural Radiance Fields

    Authors: Chang Liu, Kejian Shi, Kaichen Zhou, Haoxiao Wang, Jiyao Zhang, Hao Dong

    Abstract: Robotic research encounters a significant hurdle when it comes to the intricate task of grasping objects that come in various shapes, materials, and textures. Unlike many prior investigations that heavily leaned on specialized point-cloud cameras or abundant RGB visual data to gather 3D insights for object-grasping missions, this paper introduces a pioneering approach called RGBGrasp. This method…

    Submitted 14 March, 2024; v1 submitted 28 November, 2023; originally announced November 2023.

  21. arXiv:2311.16542  [pdf, other]

    cs.CV

    Agents meet OKR: An Object and Key Results Driven Agent System with Hierarchical Self-Collaboration and Self-Evaluation

    Authors: Yi Zheng, Chongyang Ma, Kanle Shi, Haibin Huang

    Abstract: In this study, we introduce the concept of OKR-Agent designed to enhance the capabilities of Large Language Models (LLMs) in task-solving. Our approach utilizes both self-collaboration and self-correction mechanism, facilitated by hierarchical agents, to address the inherent complexities in task-solving. Our key observations are two-fold: first, effective task-solving demands in-depth domain knowl…

    Submitted 28 November, 2023; originally announced November 2023.

  22. arXiv:2311.00389  [pdf, other]

    cs.CV

    NeuralGF: Unsupervised Point Normal Estimation by Learning Neural Gradient Function

    Authors: Qing Li, Huifang Feng, Kanle Shi, Yue Gao, Yi Fang, Yu-Shen Liu, Zhizhong Han

    Abstract: Normal estimation for 3D point clouds is a fundamental task in 3D geometry processing. The state-of-the-art methods rely on priors of fitting local surfaces learned from normal supervision. However, normal supervision in benchmarks comes from synthetic shapes and is usually not available from real scans, thereby limiting the learned priors of these methods. In addition, normal orientation consiste…

    Submitted 1 November, 2023; originally announced November 2023.

    Comments: Accepted by NeurIPS 2023

  23. arXiv:2310.11191  [pdf, other]

    cs.CL cs.AI

    Medical Text Simplification: Optimizing for Readability with Unlikelihood Training and Reranked Beam Search Decoding

    Authors: Lorenzo Jaime Yu Flores, Heyuan Huang, Kejian Shi, Sophie Chheang, Arman Cohan

    Abstract: Text simplification has emerged as an increasingly useful application of AI for bridging the communication gap in specialized fields such as medicine, where the lexicon is often dominated by technical jargon and complex constructs. Despite notable progress, methods in medical simplification sometimes result in the generated text having lower quality and diversity. In this work, we explore ways to…

    Submitted 25 October, 2023; v1 submitted 17 October, 2023; originally announced October 2023.

    Comments: EMNLP 2023 Findings

  24. arXiv:2310.08958  [pdf, other]

    cs.CL

    xDial-Eval: A Multilingual Open-Domain Dialogue Evaluation Benchmark

    Authors: Chen Zhang, Luis Fernando D'Haro, Chengguang Tang, Ke Shi, Guohua Tang, Haizhou Li

    Abstract: Recent advancements in reference-free learned metrics for open-domain dialogue evaluation have been driven by the progress in pre-trained language models and the availability of dialogue data with high-quality human annotations. However, current studies predominantly concentrate on English dialogues, and the generalization of these metrics to other languages has not been fully examined. This is la…

    Submitted 13 October, 2023; originally announced October 2023.

    Comments: Accepted to EMNLP-2023 Findings

  25. arXiv:2309.14341  [pdf, other]

    cs.RO cs.AI cs.CV cs.LG eess.SY

    Extreme Parkour with Legged Robots

    Authors: Xuxin Cheng, Kexin Shi, Ananye Agarwal, Deepak Pathak

    Abstract: Humans can perform parkour by traversing obstacles in a highly dynamic fashion requiring precise eye-muscle coordination and movement. Getting robots to do the same task requires overcoming similar challenges. Classically, this is done by independently engineering perception, actuation, and control systems to very low tolerances. This restricts them to tightly controlled settings such as a predete…

    Submitted 25 September, 2023; originally announced September 2023.

    Comments: Website and videos at https://extreme-parkour.github.io/

  26. arXiv:2309.09211  [pdf, other]

    cs.CV

    Neural Gradient Learning and Optimization for Oriented Point Normal Estimation

    Authors: Qing Li, Huifang Feng, Kanle Shi, Yi Fang, Yu-Shen Liu, Zhizhong Han

    Abstract: We propose Neural Gradient Learning (NGL), a deep learning approach to learn gradient vectors with consistent orientation from 3D point clouds for normal estimation. It has excellent gradient approximation properties for the underlying geometry of the data. We utilize a simple neural network to parameterize the objective function to produce gradients at points using a global implicit representatio…

    Submitted 17 September, 2023; originally announced September 2023.

    Comments: accepted by SIGGRAPH Asia 2023

  27. arXiv:2309.08960  [pdf, other]

    cs.CL

    ODSum: New Benchmarks for Open Domain Multi-Document Summarization

    Authors: Yijie Zhou, Kejian Shi, Wencai Zhang, Yixin Liu, Yilun Zhao, Arman Cohan

    Abstract: Open-domain Multi-Document Summarization (ODMDS) is a critical tool for condensing vast arrays of documents into coherent, concise summaries. With a more inter-related document set, there does not necessarily exist a correct answer for the retrieval, making it hard to measure the retrieving performance. We propose a rule-based method to process query-based document summarization datasets into ODMD…

    Submitted 16 September, 2023; originally announced September 2023.

  28. arXiv:2308.04913  [pdf, other]

    cs.CL cs.AI cs.IR

    LLaMA-E: Empowering E-commerce Authoring with Object-Interleaved Instruction Following

    Authors: Kaize Shi, Xueyao Sun, Dingxian Wang, Yinlin Fu, Guandong Xu, Qing Li

    Abstract: E-commerce authoring entails creating engaging, diverse, and targeted content to enhance preference elicitation and retrieval experience. While Large Language Models (LLMs) have revolutionized content generation, they often fall short in e-commerce applications due to their limited memorization of domain-specific features. This paper proposes LLaMA-E, the unified e-commerce authoring models that a…

    Submitted 10 June, 2024; v1 submitted 9 August, 2023; originally announced August 2023.

  29. arXiv:2307.13883  [pdf, other]

    cs.LG cs.PL

    ExeDec: Execution Decomposition for Compositional Generalization in Neural Program Synthesis

    Authors: Kensen Shi, Joey Hong, Yinlin Deng, Pengcheng Yin, Manzil Zaheer, Charles Sutton

    Abstract: When writing programs, people have the ability to tackle a new complex task by decomposing it into smaller and more familiar subtasks. While it is difficult to measure whether neural program synthesis methods have similar capabilities, we can measure whether they compositionally generalize, that is, whether a model that has been trained on the simpler subtasks is subsequently able to solve more co…

    Submitted 6 May, 2024; v1 submitted 25 July, 2023; originally announced July 2023.

    Comments: ICLR 2024

  30. arXiv:2307.12187  [pdf, other]

    cs.PL cs.AI cs.LG

    Monadic Deep Learning

    Authors: Bo Yang, Zhihao Zhang Kirisame Marisa, Kai Shi

    Abstract: The Java and Scala community has built a very successful big data ecosystem. However, most of neural networks running on it are modeled in dynamically typed programming languages. These dynamically typed deep learning frameworks treat neural networks as differentiable expressions that contain many trainable variable, and perform automatic differentiation on those expressions when training them.…

    Submitted 22 July, 2023; originally announced July 2023.

    Comments: 27 pages, 7 figures, 3 tables

  31. arXiv:2306.13729  [pdf, other]

    quant-ph cs.CR

    On the Two-sided Permutation Inversion Problem

    Authors: Gorjan Alagic, Chen Bai, Alexander Poremba, Kaiyan Shi

    Abstract: In the permutation inversion problem, the task is to find the preimage of some challenge value, given oracle access to the permutation. This is a fundamental problem in query complexity, and appears in many contexts, particularly cryptography. In this work, we examine the setting in which the oracle allows for quantum queries to both the forward and the inverse direction of the permutation -- exce…

    Submitted 21 April, 2024; v1 submitted 23 June, 2023; originally announced June 2023.

    Comments: 32 pages. Published in Communications in Cryptology

    Journal ref: IACR Communications in Cryptology, Vol. 1, no. 1, Apr 09, 2024

  32. arXiv:2306.12794  [pdf, other]

    cs.CL

    Overview of Robust and Multilingual Automatic Evaluation Metrics for Open-Domain Dialogue Systems at DSTC 11 Track 4

    Authors: Mario Rodríguez-Cantelar, Chen Zhang, Chengguang Tang, Ke Shi, Sarik Ghazarian, João Sedoc, Luis Fernando D'Haro, Alexander Rudnicky

    Abstract: The advent and fast development of neural networks have revolutionized the research on dialogue systems and subsequently have triggered various challenges regarding their automatic evaluation. Automatic evaluation of open-domain dialogue systems as an open challenge has been the center of the attention of many researchers. Despite the consistent efforts to improve automatic metrics' correlations w…

    Submitted 13 September, 2023; v1 submitted 22 June, 2023; originally announced June 2023.

  33. arXiv:2306.02049  [pdf, other]

    cs.LG cs.PL

    LambdaBeam: Neural Program Search with Higher-Order Functions and Lambdas

    Authors: Kensen Shi, Hanjun Dai, Wen-Ding Li, Kevin Ellis, Charles Sutton

    Abstract: Search is an important technique in program synthesis that allows for adaptive strategies such as focusing on particular search directions based on execution results. Several prior works have demonstrated that neural models are effective at guiding program synthesis searches. However, a common drawback of those approaches is the inability to handle iterative loops, higher-order functions, or lambd…

    Submitted 28 October, 2023; v1 submitted 3 June, 2023; originally announced June 2023.

  34. arXiv:2305.14239  [pdf, other]

    cs.CL

    On Learning to Summarize with Large Language Models as References

    Authors: Yixin Liu, Kejian Shi, Katherine S He, Longtian Ye, Alexander R. Fabbri, Pengfei Liu, Dragomir Radev, Arman Cohan

    Abstract: Recent studies have found that summaries generated by large language models (LLMs) are favored by human annotators over the original reference summaries in commonly used summarization datasets. Therefore, we investigate a new learning setting of text summarization models that considers the LLMs as the reference or the gold-standard oracle on these datasets. To examine the standard practices that a…

    Submitted 16 November, 2023; v1 submitted 23 May, 2023; originally announced May 2023.

    Comments: GitHub Repo: https://github.com/yixinL7/SumLLM

  35. arXiv:2305.10569  [pdf, other]

    eess.IV cs.LG

    Self-Supervised Learning for Physiologically-Based Pharmacokinetic Modeling in Dynamic PET

    Authors: Francesca De Benetti, Walter Simson, Magdalini Paschali, Hasan Sari, Axel Romiger, Kuangyu Shi, Nassir Navab, Thomas Wendler

    Abstract: Dynamic positron emission tomography imaging (dPET) provides temporally resolved images of a tracer enabling a quantitative measure of physiological processes. Voxel-wise physiologically-based pharmacokinetic (PBPK) modeling of the time activity curves (TAC) can provide relevant diagnostic information for clinical workflow. Conventional fitting strategies for TACs are slow and ignore the spatial r…

    Submitted 17 May, 2023; originally announced May 2023.

  36. arXiv:2305.07019  [pdf, other]

    cs.CV cs.AI cs.CL

    Musketeer: Joint Training for Multi-task Vision Language Model with Task Explanation Prompts

    Authors: Zhaoyang Zhang, Yantao Shen, Kunyu Shi, Zhaowei Cai, Jun Fang, Siqi Deng, Hao Yang, Davide Modolo, Zhuowen Tu, Stefano Soatto

    Abstract: We present a vision-language model whose parameters are jointly trained on all tasks and fully shared among multiple heterogeneous tasks which may interfere with each other, resulting in a single model which we named Musketeer. The integration of knowledge across heterogeneous tasks is enabled by a novel feature called Task Explanation Prompt (TEP). With rich and structured information such as tas…

    Submitted 14 March, 2024; v1 submitted 11 May, 2023; originally announced May 2023.

  37. arXiv:2305.05873  [pdf, other]

    cs.CV

    SHS-Net: Learning Signed Hyper Surfaces for Oriented Normal Estimation of Point Clouds

    Authors: Qing Li, Huifang Feng, Kanle Shi, Yue Gao, Yi Fang, Yu-Shen Liu, Zhizhong Han

    Abstract: We propose a novel method called SHS-Net for oriented normal estimation of point clouds by learning signed hyper surfaces, which can accurately predict normals with global consistent orientation from various point clouds. Almost all existing methods estimate oriented normals through a two-stage pipeline, i.e., unoriented normal estimation and normal orientation, and each step is implemented by a s…

    Submitted 9 May, 2023; originally announced May 2023.

    Comments: Accepted by CVPR 2023

  38. arXiv:2304.12035  [pdf, other]

    cs.CV cs.MM

    GRIG: Few-Shot Generative Residual Image Inpainting

    Authors: Wanglong Lu, Xianta Jiang, Xiaogang Jin, Yong-Liang Yang, Minglun Gong, Tao Wang, Kaijie Shi, Hanli Zhao

    Abstract: Image inpainting is the task of filling in missing or masked region of an image with semantically meaningful contents. Recent methods have shown significant improvement in dealing with large-scale missing regions. However, these methods usually require large training datasets to achieve satisfactory results and there has been limited research into training these models on a small number of samples…

    Submitted 24 April, 2023; originally announced April 2023.

    Comments: There are 12 pages and 10 figures in this paper

    ACM Class: I.4.4; I.4.5; I.4.9

  39. arXiv:2304.00570  [pdf, other]

    eess.IV cs.CV cs.LG

    FedFTN: Personalized Federated Learning with Deep Feature Transformation Network for Multi-institutional Low-count PET Denoising

    Authors: Bo Zhou, Huidong Xie, Qiong Liu, Xiongchao Chen, Xueqi Guo, Zhicheng Feng, Jun Hou, S. Kevin Zhou, Biao Li, Axel Rominger, Kuangyu Shi, James S. Duncan, Chi Liu

    Abstract: Low-count PET is an efficient way to reduce radiation exposure and acquisition time, but the reconstructed images often suffer from low signal-to-noise ratio (SNR), thus affecting diagnosis and other downstream tasks. Recent advances in deep learning have shown great potential in improving low-count PET image quality, but acquiring a large, centralized, and diverse dataset from multiple institutio…

    Submitted 6 October, 2023; v1 submitted 2 April, 2023; originally announced April 2023.

    Comments: 13 pages, 6 figures, Accepted at Medical Image Analysis Journal (MedIA)

  40. arXiv:2302.08582  [pdf, other]

    cs.CL cs.LG

    Pretraining Language Models with Human Preferences

    Authors: Tomasz Korbak, Kejian Shi, Angelica Chen, Rasika Bhalerao, Christopher L. Buckley, Jason Phang, Samuel R. Bowman, Ethan Perez

    Abstract: Language models (LMs) are pretrained to imitate internet text, including content that would violate human preferences if generated by an LM: falsehoods, offensive comments, personally identifiable information, low-quality or buggy code, and more. Here, we explore alternative objectives for pretraining LMs in a way that also guides them to generate text aligned with human preferences. We benchmark…

    Submitted 14 June, 2023; v1 submitted 16 February, 2023; originally announced February 2023.

    Comments: ICML 2023

  41. arXiv:2302.04260  [pdf, other]

    stat.ME cs.CR cs.LG

    The Test of Tests: A Framework For Differentially Private Hypothesis Testing

    Authors: Zeki Kazan, Kaiyan Shi, Adam Groce, Andrew Bray

    Abstract: We present a generic framework for creating differentially private versions of any hypothesis test in a black-box way. We analyze the resulting tests analytically and experimentally. Most crucially, we show good practical performance for small data sets, showing that at epsilon = 1 we only need 5-6 times as much data as in the fully public setting. We compare our work to the one existing framework…

    Submitted 8 February, 2023; originally announced February 2023.

    Comments: The main text is 14 pages and 4 figures. Appendices are 10 pages and 12 figures

  42. arXiv:2301.07527  [pdf, other]

    cs.DC

    Evaluating Permissioned Blockchain Using Stochastic Modeling and Chaos Engineering

    Authors: Shiv Sondhi, Sherif Saad, Kevin Shi, Mohammad Mamun, Issa Traore

    Abstract: Blockchain and distributed ledger technologies rely on distributed consensus algorithms. In recent years many consensus algorithms and protocols have been proposed; most of them are for permissioned blockchain networks. However, the performance of these algorithms is not well understood. This paper introduces an approach to evaluating consensus algorithms and blockchain platforms in a hostile netw…

    Submitted 14 January, 2023; originally announced January 2023.

    Comments: 21. arXiv admin note: text overlap with arXiv:2108.08441

  43. arXiv:2212.09248  [pdf, other]

    cs.CL cs.SE

    Natural Language to Code Generation in Interactive Data Science Notebooks

    Authors: Pengcheng Yin, Wen-Ding Li, Kefan Xiao, Abhishek Rao, Yeming Wen, Kensen Shi, Joshua Howland, Paige Bailey, Michele Catasta, Henryk Michalewski, Alex Polozov, Charles Sutton

    Abstract: Computational notebooks, such as Jupyter notebooks, are interactive computing environments that are ubiquitous among data scientists to perform data wrangling and analytic tasks. To measure the performance of AI pair programmers that automatically synthesize programs for those tasks given natural language (NL) intents from users, we build ARCADE, a benchmark of 1082 code generation problems using…

    Submitted 19 December, 2022; originally announced December 2022.

    Comments: 46 pages. 32 figures

  44. arXiv:2211.13912  [pdf, ps, other]

    cs.IR

    Enhancing Recommender Systems: A Strategy to Mitigate False Negative Impact

    Authors: Kexin Shi, Yun Zhang, Bingyi Jing, Wenjia Wang

    Abstract: In implicit collaborative filtering (CF) task of recommender systems, recent works mainly focus on model structure design with promising techniques like graph neural networks (GNNs). Effective and efficient negative sampling methods that suit these models, however, remain underdeveloped. One challenge is that existing hard negative samplers tend to suffer from severer over-fitting in model trainin…

    Submitted 28 March, 2024; v1 submitted 25 November, 2022; originally announced November 2022.

    Comments: 9 pages, 16 figures

  45. arXiv:2211.06598  [pdf, ps, other]

    cs.NI

    Enhancing Resource Utilization of Non-terrestrial Networks Using Temporal Graph-based Deterministic Routing

    Authors: Keyi Shi, Jingchao Wang, Hongyan Li, Kan Wang

    Abstract: Deterministic routing has emerged as a promising technology for future non-terrestrial networks (NTNs), offering the potential to enhance service performance and optimize resource utilization. However, the dynamic nature of network topology and resources poses challenges in establishing deterministic routing. These challenges encompass the intricacy of jointly scheduling transmission links and cyc…

    Submitted 22 January, 2024; v1 submitted 12 November, 2022; originally announced November 2022.

  46. arXiv:2211.03885  [pdf, other]

    cs.CV eess.IV

    Learned Smartphone ISP on Mobile GPUs with Deep Learning, Mobile AI & AIM 2022 Challenge: Report

    Authors: Andrey Ignatov, Radu Timofte, Shuai Liu, Chaoyu Feng, Furui Bai, Xiaotao Wang, Lei Lei, Ziyao Yi, Yan Xiang, Zibin Liu, Shaoqing Li, Keming Shi, Dehui Kong, Ke Xu, Minsu Kwon, Yaqi Wu, Jiesi Zheng, Zhihao Fan, Xun Wu, Feng Zhang, Albert No, Minhyeok Cho, Zewen Chen, Xiaze Zhang, Ran Li, et al. (13 additional authors not shown)

    Abstract: The role of mobile cameras increased dramatically over the past few years, leading to more and more research in automatic image quality enhancement and RAW photo processing. In this Mobile AI challenge, the target was to develop an efficient end-to-end AI-based image signal processing (ISP) pipeline replacing the standard mobile ISPs that can run on modern smartphone GPUs using TensorFlow Lite. Th…

    Submitted 7 November, 2022; originally announced November 2022.

  47. arXiv:2211.00312  [pdf, other]

    cs.CV cs.LG

    HDNet: Hierarchical Dynamic Network for Gait Recognition using Millimeter-Wave Radar

    Authors: Yanyan Huang, Yong Wang, Kun Shi, Chaojie Gu, Yu Fu, Cheng Zhuo, Zhiguo Shi

    Abstract: Gait recognition is widely used in diversified practical applications. Currently, the most prevalent approach is to recognize human gait from RGB images, owing to the progress of computer vision technologies. Nevertheless, the perception capability of RGB cameras deteriorates in rough circumstances, and visual surveillance may cause privacy invasion. Due to the robustness and non-invasive feature…

    Submitted 1 November, 2022; originally announced November 2022.

  48. arXiv:2210.01841  [pdf, other]

    cs.RO cs.AI

    Learning Perception-Aware Agile Flight in Cluttered Environments

    Authors: Yunlong Song, Kexin Shi, Robert Penicka, Davide Scaramuzza

    Abstract: Recently, neural control policies have outperformed existing model-based planning-and-control methods for autonomously navigating quadrotors through cluttered environments in minimum time. However, they are not perception aware, a crucial requirement in vision-based navigation due to the camera's limited field of view and the underactuated nature of a quadrotor. We propose a learning-based system…

    Submitted 3 March, 2023; v1 submitted 4 October, 2022; originally announced October 2022.

    Journal ref: 2023 IEEE International Conference on Robotics and Automation (ICRA)

  49. ImmFusion: Robust mmWave-RGB Fusion for 3D Human Body Reconstruction in All Weather Conditions

    Authors: Anjun Chen, Xiangyu Wang, Kun Shi, Shaohao Zhu, Bin Fang, Yingfeng Chen, Jiming Chen, Yuchi Huo, Qi Ye

    Abstract: 3D human reconstruction from RGB images achieves decent results in good weather conditions but degrades dramatically in rough weather. Complementary, mmWave radars have been employed to reconstruct 3D human joints and meshes in rough weather. However, combining RGB and mmWave signals for robust all-weather 3D human reconstruction is still an open challenge, given the sparse nature of mmWave and th…

    Submitted 20 September, 2023; v1 submitted 3 October, 2022; originally announced October 2022.

    Comments: Accepted to ICRA2023, Project Page: https://chen3110.github.io/ImmFusion/index.html

  50. arXiv:2210.00227   

    eess.IV cs.CV cs.LG

    Attention Augmented ConvNeXt UNet For Rectal Tumour Segmentation

    Authors: Hongwei Wu, Junlin Wang, Xin Wang, Hui Nan, Yaxin Wang, Haonan Jing, Kaixuan Shi

    Abstract: It is a challenge to segment the location and size of rectal cancer tumours through deep learning. In this paper, in order to improve the ability of extracting sufficient feature information in rectal tumour segmentation, attention enlarged ConvNeXt UNet (AACN-UNet), is proposed. The network mainly includes two improvements: 1) the encoder stage of UNet is changed to ConvNeXt structure for encodi…

    Submitted 26 October, 2022; v1 submitted 1 October, 2022; originally announced October 2022.

    Comments: I plan to replace this article, and supplement and confirm the structure and experimental content of this article