CellAgent: An LLM-driven Multi-Agent Framework for Automated Single-cell Data Analysis

Authors: Yihang Xiao, Jinyi Liu, Yan Zheng, Xiaohan Xie, Jianye Hao, Mingzhi Li, Ruitao Wang, Fei Ni, Yuxiao Li, Jintian Luo, Shaoqing Jiao, Jiajie Peng

Abstract: Single-cell RNA sequencing (scRNA-seq) data analysis is crucial for biological research, as it enables the precise characterization of cellular heterogeneity. However, manual manipulation of various tools to achieve desired outcomes can be labor-intensive for researchers. To address this, we introduce CellAgent (http://cell.agent4science.cn/), an LLM-driven multi-agent framework, specifically designed for the automatic processing and execution of scRNA-seq data analysis tasks, providing high-quality results with no human intervention. Firstly, to adapt general LLMs to the biological field, CellAgent constructs LLM-driven biological expert roles - planner, executor, and evaluator - each with specific responsibilities. Then, CellAgent introduces a hierarchical decision-making mechanism to coordinate these biological experts, effectively driving the planning and step-by-step execution of complex data analysis tasks. Furthermore, we propose a self-iterative optimization mechanism, enabling CellAgent to autonomously evaluate and optimize solutions, thereby guaranteeing output quality. We evaluate CellAgent on a comprehensive benchmark dataset encompassing dozens of tissues and hundreds of distinct cell types. Evaluation results consistently show that CellAgent effectively identifies the most suitable tools and hyperparameters for single-cell analysis tasks, achieving optimal performance. This automated framework dramatically reduces the workload for science data analyses, bringing us into the "Agent for Science" era.

Submitted 13 July, 2024; originally announced July 2024.

arXiv:2406.09509 [pdf, other]

CleanDiffuser: An Easy-to-use Modularized Library for Diffusion Models in Decision Making

Authors: Zibin Dong, Yifu Yuan, Jianye Hao, Fei Ni, Yi Ma, Pengyi Li, Yan Zheng

Abstract: Leveraging the powerful generative capability of diffusion models (DMs) to build decision-making agents has achieved extensive success. However, there is still a demand for an easy-to-use and modularized open-source library that offers customized and efficient development for DM-based decision-making algorithms. In this work, we introduce CleanDiffuser, the first DM library specifically designed for decision-making algorithms. By revisiting the roles of DMs in the decision-making domain, we identify a set of essential sub-modules that constitute the core of CleanDiffuser, allowing for the implementation of various DM algorithms with simple and flexible building blocks. To demonstrate the reliability and flexibility of CleanDiffuser, we conduct comprehensive evaluations of various DM algorithms implemented with CleanDiffuser across an extensive range of tasks. The analytical experiments provide a wealth of valuable design choices and insights, reveal opportunities and challenges, and lay a solid groundwork for future research. CleanDiffuser will provide long-term support to the decision-making community, enhancing reproducibility and fostering the development of more robust solutions. The code and documentation of CleanDiffuser are open-sourced at https://github.com/CleanDiffuserTeam/CleanDiffuser.

Submitted 13 June, 2024; originally announced June 2024.

Comments: The first two authors contribute equally to this work. Code and documentation: https://github.com/CleanDiffuserTeam/CleanDiffuser

arXiv:2403.16206 [pdf]

Rumor Detection with a novel graph neural network approach

Authors: Tianrui Liu, Qi Cai, Changxin Xu, Bo Hong, Fanghao Ni, Yuxin Qiao, Tsungwei Yang

Abstract: The wide spread of rumors on social media has caused a negative impact on people's daily life, leading to potential panic, fear, and mental health problems for the public. How to debunk rumors as early as possible remains a challenging problem. Existing studies mainly leverage information propagation structure to detect rumors, while very few works focus on correlation among users, who may coordinate to spread rumors in order to gain large popularity. In this paper, we propose a new detection model that jointly learns both the representations of user correlation and information propagation to detect rumors on social media. Specifically, we leverage graph neural networks to learn the representations of user correlation from a bipartite graph that describes the correlations between users and source tweets, and the representations of information propagation with a tree structure. Then we combine the learned representations from these two modules to classify the rumors. Since malicious users intend to subvert our model after deployment, we further develop a greedy attack scheme to analyze the cost of three adversarial attacks: graph attack, comment attack, and joint attack. Evaluation results on two public datasets illustrate that the proposed model outperforms the state-of-the-art rumor detection models. We also demonstrate our method performs well for early rumor detection. Moreover, the proposed detection method is more robust to adversarial attacks compared to the best existing method. Importantly, we show that it requires a high cost for attackers to subvert user correlation pattern, demonstrating the importance of considering user correlation for rumor detection.

Submitted 1 April, 2024; v1 submitted 24 March, 2024; originally announced March 2024.

Comments: 10 pages, 5 figures

arXiv:2403.11542 [pdf, ps, other]

Topology Data Analysis-based Error Detection for Semantic Image Transmission with Incremental Knowledge-based HARQ

Authors: Fei Ni, Rongpeng Li, Zhifeng Zhao, Honggang Zhang

Abstract: Semantic communication (SemCom) aims to achieve high-fidelity information delivery under low communication consumption by only guaranteeing semantic accuracy. Nevertheless, semantic communication still suffers from unexpected channel volatility and thus developing a re-transmission mechanism (e.g., hybrid automatic repeat request [HARQ]) is indispensable. In that regard, instead of discarding previously transmitted information, the incremental knowledge-based HARQ (IK-HARQ) is deemed a more effective mechanism that could sufficiently utilize the information semantics. However, considering the possible existence of semantic ambiguity in image transmission, a simple bit-level cyclic redundancy check (CRC) might compromise the performance of IK-HARQ. Therefore, there is a strong incentive to revolutionize the CRC mechanism, so as to reap the benefits of both SemCom and HARQ. In this paper, built on top of Swin Transformer-based joint source-channel coding (JSCC) and IK-HARQ, we propose a semantic image transmission framework SC-TDA-HARQ. In particular, different from the conventional CRC, we introduce a topological data analysis (TDA)-based error detection method, which capably digs out the inner topological and geometric information of images, so as to capture semantic information and determine the necessity for re-transmission. Extensive numerical results validate the effectiveness and efficiency of the proposed SC-TDA-HARQ framework, especially under the limited bandwidth condition, and manifest the superiority of TDA-based error detection method in image transmission.

Submitted 23 March, 2024; v1 submitted 18 March, 2024; originally announced March 2024.

arXiv:2403.03636 [pdf, other]

SheetAgent: A Generalist Agent for Spreadsheet Reasoning and Manipulation via Large Language Models

Authors: Yibin Chen, Yifu Yuan, Zeyu Zhang, Yan Zheng, Jinyi Liu, Fei Ni, Jianye Hao

Abstract: Spreadsheet manipulation is ubiquitous in daily work and significantly improves working efficiency. Large language models (LLMs) have recently been applied to automatic spreadsheet manipulation but have not yet been investigated in complicated and realistic tasks where reasoning challenges exist (e.g., long horizon manipulation with multi-step reasoning and ambiguous requirements). To bridge the gap with the real-world requirements, we introduce SheetRM, a benchmark featuring long-horizon and multi-category tasks with reasoning-dependent manipulation caused by real-life challenges. To mitigate the above challenges, we further propose SheetAgent, a novel autonomous agent that utilizes the power of LLMs. SheetAgent consists of three collaborative modules: Planner, Informer, and Retriever, achieving both advanced reasoning and accurate manipulation over spreadsheets without human interaction through iterative task reasoning and reflection. Extensive experiments demonstrate that SheetAgent delivers 20-30% pass rate improvements on multiple benchmarks over baselines, achieving enhanced precision in spreadsheet manipulation and demonstrating superior table reasoning abilities. More details and visualizations are available at https://sheetagent.github.io.

Submitted 6 March, 2024; originally announced March 2024.

Comments: 24 pages, 14 figures

arXiv:2402.14245 [pdf, other]

Enhancing Robotic Manipulation with AI Feedback from Multimodal Large Language Models

Authors: Jinyi Liu, Yifu Yuan, Jianye Hao, Fei Ni, Lingzhi Fu, Yibin Chen, Yan Zheng

Abstract: Recently, there has been considerable attention towards leveraging large language models (LLMs) to enhance decision-making processes. However, aligning the natural language text instructions generated by LLMs with the vectorized operations required for execution presents a significant challenge, often necessitating task-specific details. To circumvent the need for such task-specific granularity, inspired by preference-based policy learning approaches, we investigate the utilization of multimodal LLMs to provide automated preference feedback solely from image inputs to guide decision-making. In this study, we train a multimodal LLM, termed CriticGPT, capable of understanding trajectory videos in robot manipulation tasks, serving as a critic to offer analysis and preference feedback. Subsequently, we validate the effectiveness of preference labels generated by CriticGPT from a reward modeling perspective. Experimental evaluation of the algorithm's preference accuracy demonstrates its effective generalization ability to new tasks. Furthermore, performance on Meta-World tasks reveals that CriticGPT's reward model efficiently guides policy learning, surpassing rewards based on state-of-the-art pre-trained representation models.

Submitted 21 February, 2024; originally announced February 2024.

Comments: Presented at AAAI 2024 RL+LLMs Workshop

arXiv:2402.10991 [pdf]

Enhancing Convergence in Federated Learning: A Contribution-Aware Asynchronous Approach

Authors: Changxin Xu, Yuxin Qiao, Zhanxin Zhou, Fanghao Ni, Jize Xiong

Abstract: Federated Learning (FL) is a distributed machine learning paradigm that allows clients to train models on their data while preserving their privacy. FL algorithms, such as Federated Averaging (FedAvg) and its variants, have been shown to converge well in many scenarios. However, these methods require clients to upload their local updates to the server in a synchronous manner, which can be slow and unreliable in realistic FL settings. To address this issue, researchers have developed asynchronous FL methods that allow clients to continue training on their local data using a stale global model. However, most of these methods simply aggregate all of the received updates without considering their relative contributions, which can slow down convergence. In this paper, we propose a contribution-aware asynchronous FL method that takes into account the staleness and statistical heterogeneity of the received updates. Our method dynamically adjusts the contribution of each update based on these factors, which can speed up convergence compared to existing methods.

Submitted 3 March, 2024; v1 submitted 16 February, 2024; originally announced February 2024.

Comments: 5 pages, 1 figure

arXiv:2402.03339 [pdf, other]

Interplay of Semantic Communication and Knowledge Learning

Authors: Fei Ni, Bingyan Wang, Rongpeng Li, Zhifeng Zhao, Honggang Zhang

Abstract: In the swiftly advancing realm of communication technologies, Semantic Communication (SemCom), which emphasizes knowledge understanding and processing, has emerged as a hot topic. By integrating artificial intelligence technologies, SemCom facilitates a profound understanding, analysis and transmission of communication content. In this chapter, we clarify the means of knowledge learning in SemCom with a particular focus on the utilization of Knowledge Graphs (KGs). Specifically, we first review existing efforts that combine SemCom with knowledge learning. Subsequently, we introduce a KG-enhanced SemCom system, wherein the receiver is carefully calibrated to leverage knowledge from its static knowledge base for ameliorating the decoding performance. Contingent upon this framework, we further explore potential approaches that can empower the system to operate in an evolving knowledge base more effectively. Furthermore, we investigate the possibility of integration with Large Language Models (LLMs) for data augmentation, offering additional perspective into the potential implementation means of SemCom. Extensive numerical results demonstrate that the proposed framework yields superior performance on top of the KG-enhanced decoding and manifests its versatility under different scenarios.

Submitted 18 January, 2024; originally announced February 2024.

Comments: Contributing to a Wiley book; copyright might be transferred without further notice. The paper "Knowledge Enhanced Semantic Communication Receiver" (available at arXiv:2302.07727) constitutes a segment of this work

arXiv:2401.15443 [pdf, other]

DiffuserLite: Towards Real-time Diffusion Planning

Authors: Zibin Dong, Jianye Hao, Yifu Yuan, Fei Ni, Yitian Wang, Pengyi Li, Yan Zheng

Abstract: Diffusion planning has been recognized as an effective decision-making paradigm in various domains. The capability of conditionally generating high-quality long-horizon trajectories makes it a promising research direction. However, existing diffusion planning methods suffer from low decision-making frequencies due to the expensive iterative sampling cost. To address this issue, we introduce DiffuserLite, a super fast and lightweight diffusion planning framework. DiffuserLite employs a planning refinement process (PRP) to generate coarse-to-fine-grained trajectories, significantly reducing the modeling of redundant information and leading to notable increases in decision-making frequency. Our experimental results demonstrate that DiffuserLite achieves a decision-making frequency of 122 Hz (112.7x faster than previous mainstream frameworks) and reaches state-of-the-art performance on D4RL benchmarks. In addition, our neat DiffuserLite framework can serve as a flexible plugin to enhance decision frequency in other diffusion planning algorithms, providing a structural design reference for future works. More details and visualizations are available at https://diffuserlite.github.io/.

Submitted 2 February, 2024; v1 submitted 27 January, 2024; originally announced January 2024.

arXiv:2312.06677 [pdf, other]

Intelligent Virtual Assistants with LLM-based Process Automation

Authors: Yanchu Guan, Dong Wang, Zhixuan Chu, Shiyu Wang, Feiyue Ni, Ruihua Song, Longfei Li, Jinjie Gu, Chenyi Zhuang

Abstract: While intelligent virtual assistants like Siri, Alexa, and Google Assistant have become ubiquitous in modern life, they still face limitations in their ability to follow multi-step instructions and accomplish complex goals articulated in natural language. However, recent breakthroughs in large language models (LLMs) show promise for overcoming existing barriers by enhancing natural language processing and reasoning capabilities. Though promising, applying LLMs to create more advanced virtual assistants still faces challenges like ensuring robust performance and handling variability in real-world user commands. This paper proposes a novel LLM-based virtual assistant that can automatically perform multi-step operations within mobile apps based on high-level user requests. The system represents an advance in assistants by providing an end-to-end solution for parsing instructions, reasoning about goals, and executing actions. LLM-based Process Automation (LLMPA) has modules for decomposing instructions, generating descriptions, detecting interface elements, predicting next actions, and error checking. Experiments demonstrate the system completing complex mobile operation tasks in Alipay based on natural language instructions. This showcases how large language models can enable automated assistants to accomplish real-world tasks. The main contributions are the novel LLMPA architecture optimized for app process automation, the methodology for applying LLMs to mobile apps, and demonstrations of multi-step task completion in a real-world environment. Notably, this work represents the first real-world deployment and extensive evaluation of a large language model-based virtual assistant in a widely used mobile application with an enormous user base numbering in the hundreds of millions.

Submitted 4 December, 2023; originally announced December 2023.

arXiv:2310.02054 [pdf, other]

AlignDiff: Aligning Diverse Human Preferences via Behavior-Customisable Diffusion Model

Authors: Zibin Dong, Yifu Yuan, Jianye Hao, Fei Ni, Yao Mu, Yan Zheng, Yujing Hu, Tangjie Lv, Changjie Fan, Zhipeng Hu

Abstract: Aligning agent behaviors with diverse human preferences remains a challenging problem in reinforcement learning (RL), owing to the inherent abstractness and mutability of human preferences. To address these issues, we propose AlignDiff, a novel framework that leverages RL from Human Feedback (RLHF) to quantify human preferences, covering abstractness, and utilizes them to guide diffusion planning for zero-shot behavior customizing, covering mutability. AlignDiff can accurately match user-customized behaviors and efficiently switch from one to another. To build the framework, we first establish the multi-perspective human feedback datasets, which contain comparisons for the attributes of diverse behaviors, and then train an attribute strength model to predict quantified relative strengths. After relabeling behavioral datasets with relative strengths, we proceed to train an attribute-conditioned diffusion model, which serves as a planner with the attribute strength model as a director for preference aligning at the inference phase. We evaluate AlignDiff on various locomotion tasks and demonstrate its superior performance on preference matching, switching, and covering compared to other baselines. Its capability of completing unseen downstream tasks under human instructions also showcases the promising potential for human-AI collaboration. More visualization videos are released on https://aligndiff.github.io/.

Submitted 4 February, 2024; v1 submitted 3 October, 2023; originally announced October 2023.

arXiv:2307.07184 [pdf, other]

TVPR: Text-to-Video Person Retrieval and a New Benchmark

Authors: Fan Ni, Xu Zhang, Jianhui Wu, Guan-Nan Dong, Aichun Zhu, Hui Liu, Yue Zhang

Abstract: Most existing methods for text-based person retrieval focus on text-to-image person retrieval. Nevertheless, due to the lack of dynamic information provided by isolated frames, the performance is hampered when the person is obscured in isolated frames or variable motion details are given in the textual description. In this paper, we propose a new task called Text-to-Video Person Retrieval (TVPR), which aims to effectively overcome the limitations of isolated frames. Since there is no dataset or benchmark that describes person videos with natural language, we construct a large-scale cross-modal person video dataset containing detailed natural language annotations, such as person's appearance, actions and interactions with environment, etc., termed as Text-to-Video Person Re-identification (TVPReid) dataset, which will be publicly available. To this end, a Text-to-Video Person Retrieval Network (TVPRN) is proposed. Specifically, TVPRN acquires video representations by fusing visual and motion representations of person videos, which can deal with temporal occlusion and the absence of variable motion details in isolated frames. Meanwhile, we employ the pre-trained BERT to obtain caption representations and the relationship between caption and video representations to reveal the most relevant person videos. To evaluate the effectiveness of the proposed TVPRN, extensive experiments have been conducted on TVPReid dataset. To the best of our knowledge, TVPRN is the first successful attempt to use video for text-based person retrieval task and has achieved state-of-the-art performance on TVPReid dataset. The TVPReid dataset will be publicly available to benefit future research.

Submitted 2 February, 2024; v1 submitted 14 July, 2023; originally announced July 2023.

arXiv:2305.19923 [pdf, other]

MetaDiffuser: Diffusion Model as Conditional Planner for Offline Meta-RL

Authors: Fei Ni, Jianye Hao, Yao Mu, Yifu Yuan, Yan Zheng, Bin Wang, Zhixuan Liang

Abstract: Recently, the diffusion model has emerged as a promising backbone for the sequence modeling paradigm in offline reinforcement learning (RL). However, these works mostly lack the generalization ability across tasks with reward or dynamics change. To tackle this challenge, in this paper we propose a task-oriented conditioned diffusion planner for offline meta-RL (MetaDiffuser), which considers the generalization problem as a conditional trajectory generation task with contextual representation. The key is to learn a context conditioned diffusion model which can generate task-oriented trajectories for planning across diverse tasks. To enhance the dynamics consistency of the generated trajectories while encouraging trajectories to achieve high returns, we further design a dual-guided module in the sampling process of the diffusion model. The proposed framework is robust to the quality of collected warm-start data from the testing task and flexible enough to incorporate different task representation methods. The experiment results on MuJoCo benchmarks show that MetaDiffuser outperforms other strong offline meta-RL baselines, demonstrating the outstanding conditional generation ability of diffusion architecture.

Submitted 31 May, 2023; originally announced May 2023.

Comments: 19 pages, 4 figures, accepted by ICML 2023

arXiv:2302.01877 [pdf, other]

AdaptDiffuser: Diffusion Models as Adaptive Self-evolving Planners

Authors: Zhixuan Liang, Yao Mu, Mingyu Ding, Fei Ni, Masayoshi Tomizuka, Ping Luo

Abstract: Diffusion models have demonstrated their powerful generative capability in many tasks, with great potential to serve as a paradigm for offline reinforcement learning. However, the quality of the diffusion model is limited by the insufficient diversity of training data, which hinders the performance of planning and the generalizability to new tasks. This paper introduces AdaptDiffuser, an evolutionary planning method with diffusion that can self-evolve to improve the diffusion model and hence yield a better planner, not only for seen tasks but also for unseen tasks. AdaptDiffuser enables the generation of rich synthetic expert data for goal-conditioned tasks using guidance from reward gradients. It then selects high-quality data via a discriminator to finetune the diffusion model, which improves the generalization ability to unseen tasks. Empirical experiments on two benchmark environments and two carefully designed unseen tasks in KUKA industrial robot arm and Maze2D environments demonstrate the effectiveness of AdaptDiffuser. For example, AdaptDiffuser not only outperforms the previous art Diffuser by 20.8% on Maze2D and 7.5% on MuJoCo locomotion, but also adapts better to new tasks, e.g., KUKA pick-and-place, by 27.9% without requiring additional expert data. More visualization results and demo videos can be found on our project page.

Submitted 12 May, 2023; v1 submitted 3 February, 2023; originally announced February 2023.

Comments: Accepted by ICML 2023. Project page: https://adaptdiffuser.github.io

arXiv:2301.00135 [pdf, other]

TeViS: Translating Text Synopses to Video Storyboards

Authors: Xu Gu, Yuchong Sun, Feiyue Ni, Shizhe Chen, Xihua Wang, Ruihua Song, Boyuan Li, Xiang Cao

Abstract: A video storyboard is a roadmap for video creation which consists of shot-by-shot images to visualize key plots in a text synopsis. Creating video storyboards, however, remains challenging, as it not only requires cross-modal association between high-level texts and images but also demands long-term reasoning to make transitions smooth across shots. In this paper, we propose a new task called Text synopsis to Video Storyboard (TeViS) which aims to retrieve an ordered sequence of images as the video storyboard to visualize the text synopsis. We construct a MovieNet-TeViS dataset based on the public MovieNet dataset. It contains 10K text synopses each paired with keyframes manually selected from corresponding movies by considering both relevance and cinematic coherence. To benchmark the task, we present strong CLIP-based baselines and a novel VQ-Trans. VQ-Trans first encodes text synopsis and images into a joint embedding space and uses vector quantization (VQ) to improve the visual representation. Then, it auto-regressively generates a sequence of visual features for retrieval and ordering. Experimental results demonstrate that VQ-Trans significantly outperforms prior methods and the CLIP-based baselines. Nevertheless, there is still a large gap compared to human performance suggesting room for promising future work. The code and data are available at: https://ruc-aimind.github.io/projects/TeViS/

Submitted 29 August, 2023; v1 submitted 31 December, 2022; originally announced January 2023.

Comments: Accepted to ACM Multimedia 2023
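
The vector quantization step mentioned in the TeViS abstract can be illustrated with a generic nearest-codeword lookup. This is a minimal sketch of vector quantization in general, not the VQ-Trans implementation; the function name `quantize`, the toy codebook, and the squared Euclidean distance are all assumptions made for illustration.

```python
def quantize(vector, codebook):
    """Map a feature vector to its nearest codeword (generic VQ, illustrative only).

    Returns (index, codeword), using squared Euclidean distance.
    """
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    index = min(range(len(codebook)), key=lambda i: sq_dist(vector, codebook[i]))
    return index, codebook[index]


# Hypothetical 2-D codebook with three codewords.
codebook = [[0.0, 0.0], [1.0, 1.0], [4.0, 4.0]]
index, codeword = quantize([0.9, 1.2], codebook)
```

In a learned model the codebook would be trained together with the encoder; here it is fixed by hand only to show the lookup.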

arXiv:2210.04209 [pdf, other]

Decomposed Mutual Information Optimization for Generalized Context in Meta-Reinforcement Learning

Authors: Yao Mu, Yuzheng Zhuang, Fei Ni, Bin Wang, Jianyu Chen, Jianye Hao, Ping Luo

Abstract: Adapting to the changes in transition dynamics is essential in robotic applications. By learning a conditional policy with a compact context, context-aware meta-reinforcement learning provides a flexible way to adjust behavior according to dynamics changes. However, in real-world applications, the agent may encounter complex dynamics changes. Multiple confounders can influence the transition dynamics, making it challenging to infer accurate context for decision-making. This paper addresses such a challenge by Decomposed Mutual INformation Optimization (DOMINO) for context learning, which explicitly learns a disentangled context to maximize the mutual information between the context and historical trajectories, while minimizing the state transition prediction error. Our theoretical analysis shows that DOMINO can overcome the underestimation of the mutual information caused by multi-confounded challenges via learning disentangled context and reduce the demand for the number of samples collected in various environments. Extensive experiments show that the context learned by DOMINO benefits both model-based and model-free reinforcement learning algorithms for dynamics generalization in terms of sample efficiency and performance in unseen environments.

Submitted 9 October, 2022; originally announced October 2022.

Comments: NeurIPS 2022

arXiv:2210.00498 [pdf, other]

EUCLID: Towards Efficient Unsupervised Reinforcement Learning with Multi-choice Dynamics Model

Authors: Yifu Yuan, Jianye Hao, Fei Ni, Yao Mu, Yan Zheng, Yujing Hu, Jinyi Liu, Yingfeng Chen, Changjie Fan

Abstract: Unsupervised reinforcement learning (URL) poses a promising paradigm to learn useful behaviors in a task-agnostic environment without the guidance of extrinsic rewards to facilitate the fast adaptation of various downstream tasks. Previous works focused on the pre-training in a model-free manner while lacking the study of transition dynamics modeling that leaves a large space for the improvement of sample efficiency in downstream tasks. To this end, we propose an Efficient Unsupervised Reinforcement Learning Framework with Multi-choice Dynamics model (EUCLID), which introduces a novel model-fused paradigm to jointly pre-train the dynamics model and unsupervised exploration policy in the pre-training phase, thus better leveraging the environmental samples and improving the downstream task sampling efficiency. However, constructing a generalizable model which captures the local dynamics under different behaviors remains a challenging problem. We introduce the multi-choice dynamics model that covers different local dynamics under different behaviors concurrently, which uses different heads to learn the state transition under different behaviors during unsupervised pre-training and selects the most appropriate head for prediction in the downstream task. Experimental results in the manipulation and locomotion domains demonstrate that EUCLID achieves state-of-the-art performance with high sample efficiency, basically solving the state-based URLB benchmark and reaching a mean normalized score of $104.0 \pm 1.2\%$ in downstream tasks with 100k fine-tuning steps, which is equivalent to DDPG's performance at 2M interactive steps with 20x more data.

Submitted 21 February, 2023; v1 submitted 2 October, 2022; originally announced October 2022.

Comments: Published as a conference paper at ICLR 2023

arXiv:2207.04858 [pdf, other]

LaT: Latent Translation with Cycle-Consistency for Video-Text Retrieval

Authors: Jinbin Bai, Chunhui Liu, Feiyue Ni, Haofan Wang, Mengying Hu, Xiaofeng Guo, Lele Cheng

Abstract: Video-text retrieval is a class of cross-modal representation learning problems, where the goal is to select the video which corresponds to the text query between a given text query and a pool of candidate videos. The contrastive paradigm of vision-language pretraining has shown promising success with large-scale datasets and unified transformer architecture, and demonstrated the power of a joint latent space. Despite this, the intrinsic divergence between the visual domain and textual domain is still far from being eliminated, and projecting different modalities into a joint latent space might result in the distorting of the information inside the single modality. To overcome the above issue, we present a novel mechanism for learning the translation relationship from a source modality space $\mathcal{S}$ to a target modality space $\mathcal{T}$ without the need for a joint latent space, which bridges the gap between visual and textual domains. Furthermore, to keep cycle consistency between translations, we adopt a cycle loss involving both forward translations from $\mathcal{S}$ to the predicted target space $\mathcal{T'}$, and backward translations from $\mathcal{T'}$ back to $\mathcal{S}$. Extensive experiments conducted on MSR-VTT, MSVD, and DiDeMo datasets demonstrate the superiority and effectiveness of our LaT approach compared with vanilla state-of-the-art methods.

Submitted 13 February, 2023; v1 submitted 11 July, 2022; originally announced July 2022.

arXiv:2202.12466 [pdf, ps, other]

A Data-Driven Column Generation Algorithm For Bin Packing Problem in Manufacturing Industry

Authors: Jiahui Duan, Xialiang Tong, Fei Ni, Zhenan He, Lei Chen, Mingxuan Yuan

Abstract: The bin packing problem exists widely in real logistic scenarios (e.g., packing pipeline, express delivery), with its goal to improve the packing efficiency and reduce the transportation cost. In this NP-hard combinatorial optimization problem, the position and quantity of each item in the box are strictly restricted by complex constraints and special customer requirements. Existing approaches struggle to obtain the optimal solution since rigorous constraints cannot be handled within a reasonable computation load. In this paper, to handle this difficulty, the packing knowledge is extracted from historical data collected from the packing pipeline of Huawei. First, by fully exploiting the relationship between historical packing records and input orders (orders to be packed), the problem is reformulated as a set cover problem. Then, two novel strategies, the constraint handling and process acceleration strategies, are applied to the classic column generation approach to solve this set cover problem. The cost of solving the pricing problem for generating new columns is high due to the complex constraints and customer requirements. The proposed constraint handling strategy exploits the historical packing records with the most negative value of the reduced cost. Those constraints have been implicitly satisfied in these historical packing records so that there is no need to conduct further evaluation on constraints, thus the computational load is saved. To further eliminate the iteration process of the column generation algorithm and accelerate the optimization process, a Learning to Price approach called Modified Pointer Network is proposed, by which we can determine which historical packing records should be selected directly. Through experiments on real-world datasets, we show our proposed method can improve the packing success rate and decrease the computation time simultaneously.

Submitted 24 February, 2022; originally announced February 2022.

arXiv:2108.03203 [pdf, other]

Adaptive Simulated Annealing with Greedy Search for the Circle Bin Packing Problem

Authors: Yong Yuan, Kevin Tole, Fei Ni, Kun He, Zhengda Xiong, Jinfa Liu

Abstract: We introduce a new bin packing problem, termed the circle bin packing problem with circular items (CBPP-CI). The problem involves packing all the circular items into multiple identical circle bins as compactly as possible with the objective of minimizing the number of used bins. We first define the tangent occupying action (TOA) and propose a constructive greedy algorithm that sequentially packs the items into places tangent to the packed items or the bin boundaries. Moreover, to avoid falling into a local minimum trap and efficiently judge whether an optimal solution has been established, we continue to present the adaptive simulated annealing with greedy search (ASA-GS) algorithm that explores and exploits the search space efficiently. Specifically, we offer two novel local perturbation strategies to jump out of the local optimum and incorporate the greedy search to achieve faster convergence. The parameters of ASA-GS are adaptive according to the number of items so that they can be size-agnostic across the problem scale. We design two sets of new benchmark instances, and the empirical results show that ASA-GS completely outperforms the constructive greedy algorithm. Moreover, the packing density of ASA-GS on the top few dense bins is much higher than that of the state-of-the-art algorithm for the single circle packing problem, implying the high quality of the packing solutions for CBPP-CI.

Submitted 6 August, 2021; originally announced August 2021.

Comments: 14 pages, 8 figures

arXiv:2003.00410 [pdf, other]

PF-Net: Point Fractal Network for 3D Point Cloud Completion

Authors: Zitian Huang, Yikuan Yu, Jiawen Xu, Feng Ni, Xinyi Le

Abstract: In this paper, we propose a Point Fractal Network (PF-Net), a novel learning-based approach for precise and high-fidelity point cloud completion. Unlike existing point cloud completion networks, which generate the overall shape of the point cloud from the incomplete point cloud and always change existing points and encounter noise and geometrical loss, PF-Net preserves the spatial arrangements of the incomplete point cloud and can figure out the detailed geometrical structure of the missing region(s) in the prediction. To succeed at this task, PF-Net estimates the missing point cloud hierarchically by utilizing a feature-points-based multi-scale generating network. Further, we add up multi-stage completion loss and adversarial loss to generate more realistic missing region(s). The adversarial loss can better tackle multiple modes in the prediction. Our experiments demonstrate the effectiveness of our method for several challenging point cloud completion tasks.

Submitted 1 March, 2020; originally announced March 2020.

arXiv:2001.07709 [pdf, other]

Adaptive Large Neighborhood Search for Circle Bin Packing Problem

Authors: Kun He, Kevin Tole, Fei Ni, Yong Yuan, Linyun Liao

Abstract: We address a new variant of packing problem called the circle bin packing problem (CBPP), which is to find a dense packing of circle items to multiple square bins so as to minimize the number of used bins. To this end, we propose an adaptive large neighborhood search (ALNS) algorithm, which uses our Greedy Algorithm with Corner Occupying Action (GACOA) to construct an initial layout. The greedy solution is usually in a local optimum trap, and ALNS enables multiple neighborhood search that depends on the stochastic annealing schedule to avoid getting stuck in local minimum traps. Specifically, ALNS perturbs the current layout to jump out of a local optimum by iteratively reassigning some circles and accepts the new layout with some probability during the search. The acceptance probability is adjusted adaptively using simulated annealing that fine-tunes the search direction in order to reach the global optimum. We benchmark computational results against GACOA in heterogeneous instances. ALNS always outperforms GACOA in improving the objective function, and in several cases, there is a significant reduction in the number of bins used in the packing.

Submitted 20 January, 2020; originally announced January 2020.

Comments: 13 pages, 6 figures, 6 tables
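
The ALNS abstract describes accepting a perturbed layout "with some probability" under a simulated annealing schedule. A minimal sketch of the textbook Metropolis acceptance rule is given below; the abstract does not state the exact rule or cooling schedule the authors use, so the formula, the function names, and the cost and temperature values here are assumptions for illustration only.

```python
import math
import random


def acceptance_probability(current_cost, candidate_cost, temperature):
    """Standard Metropolis rule (assumed, not taken from the paper).

    Improvements are always accepted; a worse candidate is accepted with
    probability exp(-delta / temperature), where temperature must be > 0.
    """
    if candidate_cost <= current_cost:
        return 1.0
    return math.exp(-(candidate_cost - current_cost) / temperature)


def accept(current_cost, candidate_cost, temperature, rng):
    """Draw a random decision according to the acceptance probability."""
    return rng.random() < acceptance_probability(
        current_cost, candidate_cost, temperature
    )


# Hypothetical costs: the same worsening move becomes less likely as the
# temperature is lowered.
hot = acceptance_probability(10.0, 11.0, temperature=2.0)
cold = acceptance_probability(10.0, 11.0, temperature=0.5)
decision = accept(10.0, 11.0, 2.0, random.Random(0))
```

Lowering the temperature over the run shifts the search from exploration toward greedy improvement, which is the general behaviour the abstract attributes to the annealing schedule.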

arXiv:1910.11764 [pdf, other]

ClsGAN: Selective Attribute Editing Model Based On Classification Adversarial Network

Authors: Liu Ying, Heng Fan, Fuchuan Ni, Jinhai Xiang

Abstract: Attribute editing has achieved remarkable progress in recent years owing to the encoder-decoder structure and generative adversarial network (GAN). However, it remains challenging to generate high-quality images with accurate attribute transformation. To address these problems, this work proposes a novel selective attribute editing model based on classification adversarial network (referred to as ClsGAN) that shows good balance between attribute transfer accuracy and photo-realistic images. Considering that the edited images are prone to be affected by the original attributes due to skip-connections in the encoder-decoder structure, an upper convolution residual network (referred to as Tr-resnet) is presented to selectively extract information from the source image and target label. In addition, to further improve the transfer accuracy of generated images, an attribute adversarial classifier (referred to as Atta-cls) is introduced to guide the generator from the perspective of attribute through learning the defects of attribute transfer images. Experimental results on CelebA demonstrate that our ClsGAN performs favorably against state-of-the-art approaches in image quality and transfer accuracy. Moreover, ablation studies are also designed to verify the great performance of Tr-resnet and Atta-cls.

Submitted 29 July, 2020; v1 submitted 25 October, 2019; originally announced October 2019.

arXiv:1909.01537 [pdf, ps, other]

doi 10.7566/JPSCP.32.010024

Nuclear structure and reaction with quantum shape fluctuation

Authors: Takashi Nakatsukasa, Yu Kashiwaba, Fang Ni, Kouhei Washiyama, Kai Wen, Nobuo Hinohara

Abstract: We present recent results in theoretical studies on nuclear structure and reaction beyond mean field, using the adiabatic self-consistent collective coordinate method and its extension. We also present new results with the finite-temperature Hartree-Fock-Bogoliubov calculation with the three-dimensional-coordinate-space representation.

Submitted 30 September, 2019; v1 submitted 3 September, 2019; originally announced September 2019.

Comments: 6 pages, 3 figures, proceedings of NN2018

arXiv:1905.00719 [pdf, other]

Internet of Intelligence: The Collective Advantage for Advancing Communications and Intelligence

Authors: Rongpeng Li, Zhifeng Zhao, Xing Xu, Fei Ni, Honggang Zhang

Abstract: The fifth-generation cellular network (5G) has boosted the unprecedented convergence between the information world and physical world. On the other hand, empowered with the enormous amount of data and information, artificial intelligence (AI) has been universally applied and pervasive AI is believed to be an integral part of the sixth-generation cellular network (6G). Consequently, benefiting from the advancement in communication technology and AI, we boldly argue that the conditions for collective intelligence (CI) will be mature in the 6G era and CI will emerge among the widely connected beings and things. Afterwards, we highlight the potential huge impact of CI on both communications and intelligence. In particular, we introduce a regular language (i.e., the information economy metalanguage) supporting the future collective communications to augment human intelligence and explain its potential applications in naming Internet information and pushing information centric networks forward. Meanwhile, we propose a stigmergy-based federated collective intelligence and demonstrate its achievement in a simulated scenario where the agents collectively work together to form a pattern through simple indirect communications. In a word, CI could advance both communications and intelligence.

Submitted 18 April, 2020; v1 submitted 26 April, 2019; originally announced May 2019.

Comments: 6 figures; accepted by IEEE Wireless Commun with the title "The Collective Advantage for Advancing Communications and Intelligence"

arXiv:1903.00384 [pdf, other]

Data-driven Approach for Quality Evaluation on Knowledge Sharing Platform

Authors: Lu Xu, Jinhai Xiang, Yating Wang, Fuchuan Ni

Abstract: In recent years, voice knowledge sharing and question answering (Q&A) platforms have attracted much attention, as they greatly facilitate knowledge acquisition. However, little research has addressed quality evaluation of voice knowledge sharing. This paper presents a data-driven approach to automatically evaluate the quality of a specific Q&A platform (Zhihu Live). Extensive experiments demonstrate the effectiveness of the proposed method. Furthermore, we introduce a dataset of Zhihu Live as an open resource for researchers in related areas. This dataset will facilitate the development of new methods for quality evaluation of knowledge sharing services.

Submitted 1 March, 2019; originally announced March 2019.

arXiv:1811.02352 [pdf, ps, other]

doi 10.1103/PhysRevC.98.064327

Low-lying collective excited states in non-integrable pairing models based on stationary-phase approximation to path integral

Authors: Fang Ni, Nobuo Hinohara, Takashi Nakatsukasa

Abstract: For a description of large-amplitude collective motion associated with nuclear pairing, requantization of time-dependent mean-field dynamics is performed using the stationary-phase approximation (SPA) to the path integral. We overcome the difficulty of the SPA, which is known to be applicable to integrable systems only, by developing a requantization approach combining the SPA with the adiabatic self-consistent collective coordinate method (ASCC+SPA). We apply the ASCC+SPA to multi-level pairing models, which are non-integrable systems, to study the nuclear pairing dynamics. The ASCC+SPA gives a reasonable description of low-lying excited $0^+$ states in non-integrable pairing systems.

Submitted 6 November, 2018; originally announced November 2018.

Comments: 13 pages, 12 figures

Journal ref: Phys. Rev. C 98, 064327 (2018)

arXiv:1807.11886 [pdf, other]

Deep Dual Pyramid Network for Barcode Segmentation using Barcode-30k Database

Authors: Qijie Zhao, Feng Ni, Yang Song, Yongtao Wang, Zhi Tang

Abstract: Digital signs (such as barcodes or QR codes) are widely used in our daily life, and for many applications, we need to localize them on images. However, difficult cases such as targets with small scales, half-occlusion, shape deformation and large illumination changes cause challenges for conventional methods. In this paper, we address this problem by producing a large-scale dataset and adopting a deep learning based semantic segmentation approach. Specifically, a synthesizing method is proposed to generate well-annotated images containing barcode and QR code labels, which greatly reduces the annotation time. Through the synthesis strategy, we introduce a dataset that contains 30000 images with barcodes and QR codes, named Barcode-30k. Moreover, we further propose a dual pyramid structure based segmentation network, BarcodeNet, which is mainly formed with two novel modules, Prior Pyramid Pooling Module (P3M) and Pyramid Refine Module (PRM). We validate the effectiveness of BarcodeNet on the proposed synthetic dataset, and it yields an mIoU accuracy of 95.36% on the validation set. Additional segmentation results on real images show that accurate segmentation performance is achieved.

Submitted 31 July, 2018; originally announced July 2018.

arXiv:1806.09790 [pdf, other]

CFENet: An Accurate and Efficient Single-Shot Object Detector for Autonomous Driving

Authors: Qijie Zhao, Tao Sheng, Yongtao Wang, Feng Ni, Ling Cai

Abstract: The ability to detect small objects and the speed of the object detector are very important for the application of autonomous driving, and in this paper, we propose an effective yet efficient one-stage detector, which gained the second place in the Road Object Detection competition of CVPR2018 workshop - Workshop of Autonomous Driving (WAD). The proposed detector inherits the architecture of SSD and introduces a novel Comprehensive Feature Enhancement (CFE) module into it. Experimental results on this competition dataset as well as the MSCOCO dataset demonstrate that the proposed detector (named CFENet) performs much better than the original SSD and the state-of-the-art method RefineDet especially for small objects, while keeping high efficiency close to the original SSD. Specifically, the single scale version of the proposed detector can run at the speed of 21 fps, while the multi-scale version with larger input size achieves the mAP 29.69, ranking second on the leaderboard.

Submitted 10 October, 2018; v1 submitted 26 June, 2018; originally announced June 2018.

Comments: 5 pages, 4 figures, CVPR2018, Workshop of Autonomous Driving (WAD)

arXiv:1806.08466 [pdf]

doi 10.1103/PhysRevB.98.064418

Observation of a pressure-induced transition from interlayer ferromagnetism to intralayer antiferromagnetism in Sr4Ru3O10

Authors: H. Zheng, W. H. Song, J. Terzic, H. D. Zhao, Y. Zhang, Y. F. Ni, L. E. DeLong, P. Schlottmann, G. Cao

Abstract: Sr4Ru3O10 is a Ruddlesden-Popper compound with triple Ru-O perovskite layers separated by Sr-O alkali layers. This compound presents a rare coexistence of interlayer (c-axis) ferromagnetism and intralayer (basal-plane) metamagnetism at ambient pressure. Here we report the observation of pressure-induced, intralayer itinerant antiferromagnetism arising from the interlayer ferromagnetism. The application of modest hydrostatic pressure generates an anisotropy that causes a flattening and a tilting of RuO6 octahedra. All magnetic and transport results from this study indicate these lattice distortions diminish the c-axis ferromagnetism and basal-plane metamagnetism, and induce a basal-plane antiferromagnetic state. The unusually large magnetoelastic coupling and pressure tunability of Sr4Ru3O10 make it a unique model system for studies of itinerant magnetism.

Submitted 21 June, 2018; originally announced June 2018.

Comments: 6 figures

Journal ref: Phys. Rev. B 98, 064418 (2018)

arXiv:1805.02200 [pdf, other]

doi 10.1145/1810479.1810540

Wormhole: A Fast Ordered Index for In-memory Data Management

Authors: Xingbo Wu, Fan Ni, Song Jiang

Abstract: In-memory data management systems, such as key-value stores, have become an essential infrastructure in today's big-data processing and cloud computing. They rely on efficient index structures to access data. While unordered indexes, such as hash tables, can perform point search with O(1) time, they cannot be used in many scenarios where range queries must be supported. Many ordered indexes, such as B+ tree and skip list, have an O(log N) lookup cost, where N is the number of keys in an index. For an ordered index hosting billions of keys, it may take more than 30 key-comparisons in a lookup, which is an order of magnitude more expensive than that on a hash table. With the availability of large memory and fast network in today's data centers, this O(log N) time is taking a heavy toll on applications that rely on ordered indexes. In this paper we introduce a new ordered index structure, named Wormhole, that takes O(log L) worst-case time for looking up a key with a length of L. The low cost is achieved by simultaneously leveraging strengths of three indexing structures, namely hash table, prefix tree, and B+ tree, to orchestrate a single fast ordered index. Wormhole's range operations can be performed by a linear scan of a list after an initial lookup. This improvement of access efficiency does not come at a price of compromised space efficiency. Instead, Wormhole's index space is comparable to those of B+ tree and skip list. Experiment results show that Wormhole outperforms skip list, B+ tree, ART, and Masstree by up to 8.4x, 4.9x, 4.3x, and 6.6x in terms of key lookup throughput, respectively.

Submitted 15 February, 2019; v1 submitted 6 May, 2018; originally announced May 2018.

Comments: 15 pages; 18 figures; 1 table

ACM Class: E.1
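
The "more than 30 key-comparisons" figure in the Wormhole abstract matches what a plain binary search over a sorted array would need at the billion-key scale. The sketch below is not Wormhole and not a B+ tree; it is a generic lower-bound binary search, with the hypothetical helper names `worst_case_steps` and `count_comparisons`, used only to make the O(log N) arithmetic concrete.

```python
def worst_case_steps(n_keys):
    """Worst-case loop iterations of the lower-bound search below.

    Each iteration leaves at most floor(n / 2) candidates, so the count is
    floor(log2(n)) + 1, which equals n.bit_length() for n >= 1.
    """
    return n_keys.bit_length()


def count_comparisons(sorted_keys, target):
    """Lower-bound binary search that also reports key comparisons made."""
    lo, hi = 0, len(sorted_keys)
    steps = 0
    while lo < hi:
        mid = (lo + hi) // 2
        steps += 1  # one key comparison per iteration
        if sorted_keys[mid] < target:
            lo = mid + 1
        else:
            hi = mid
    found = lo < len(sorted_keys) and sorted_keys[lo] == target
    return found, steps


billion = worst_case_steps(10**9)
found, steps = count_comparisons(list(range(1000)), 0)
```

Under these assumptions, one billion keys already require 30 comparisons in the worst case, and the count grows by one each time the key count doubles, whereas the abstract's O(log L) bound depends on the key length L rather than on N.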

arXiv:1712.05518 [pdf]

doi 10.1088/1361-648X/aac23d

Decoupling of magnetism and electric transport in single-crystal (Sr1-xAx)2IrO4 (A = Ca or Ba)

Authors: H. D. Zhao, J. Terzic, H. Zheng, Y. F. Ni, Y. Zhang, Feng Ye, P. Schlottmann, G. Cao

Abstract: We report a systematic structural, transport and magnetic study of Ca or Ba doped Sr2IrO4 single crystals. Isoelectronically substituting Ca2+ (up to 15%) or Ba2+ (up to 4%) ion for the Sr2+ ion provides no additional charge carriers but effectively changes the lattice parameters in Sr2IrO4. In particular, 15% Ca doping considerably reduces the c-axis and the unit cell by nearly 0.45% and 1.00%, respectively. These significant, anisotropic compressions in the lattice parameters conspicuously cause no change in the Néel temperature which remains at 240 K, but drastically reduce the electrical resistivity by up to five orders of magnitude or even precipitate a sharp insulator-to-metal transition at lower temperatures, i.e. the vanishing insulating state accompanies an unchanged Néel temperature in (Sr1-xAx)2IrO4. This observation brings to light an intriguing difference between chemical pressure and applied pressure, the latter of which does suppress the long-range magnetic order in Sr2IrO4. This difference reveals the importance of the Ir1-O2-Ir1 bond angle and homogeneous volume compression in determining the magnetic ground state. All results, along with a comparison drawn with results of Tb and La doped Sr2IrO4, underscore that the magnetic transition plays a nonessential role in the formation of the charge gap in the spin-orbit-tuned iridate.

Submitted 3 May, 2018; v1 submitted 14 December, 2017; originally announced December 2017.

Comments: 6 figures

Journal ref: J. Phys.: Condens. Matter 2018

arXiv:1712.02954 [pdf, ps, other]

doi 10.1103/PhysRevC.97.044310

Comparative study of the requantization of the time-dependent mean field for the dynamics of nuclear pairing

Authors: F. Ni, T. Nakatsukasa

Abstract: To describe quantal collective phenomena, it is useful to requantize the time-dependent mean-field dynamics. We study the time-dependent Hartree-Fock-Bogoliubov (TDHFB) theory for the two-level pairing Hamiltonian, and compare results of different quantization methods. The one constructing microscopic wave functions, using the TDHFB trajectories fulfilling the Einstein-Brillouin-Keller quantization condition, turns out to be the most accurate. The method is based on the stationary-phase approximation to the path integral. We also examine the performance of the collective model which assumes that the pairing gap parameter is the collective coordinate. The applicability of the collective model is limited for the nuclear pairing with a small number of single-particle levels, because the pairing gap parameter represents only a half of the pairing collective space.

Submitted 18 April, 2018; v1 submitted 8 December, 2017; originally announced December 2017.

Comments: 28 pages, 13 figures

Journal ref: Phys. Rev. C 97, 044310 (2018)

arXiv:1507.02367 [pdf]

doi 10.1088/1674-1137/40/4/047003

Beam transport experiment with a new kicker control system on the HIRFL-CSR

Authors: Yan-Yu Wang, De-Tai Zhou, Jin-Fu Luo, Jian-Chuan Zhang, Wen-Xiong Zhou, Fa-Fu Ni, Jun Yin, Jia Yin, You-Jin Yuan, Jin-Bin Shang-Guan

Abstract: The kicker control system was used for beam extraction and injection between two cooling storage rings (CSRs) at the Heavy Ion Research Facility in Lanzhou (HIRFL). To meet the requirements of special physics experiments, the kicker controller was upgraded. The new controller was designed based on ARM+DSP+FPGA technology and monolithic circuit architecture, which can achieve a precision time delay of 2.5 ns. In September 2014, the new kicker control system was installed in the kicker field, and the test experiment using the system was completed. In addition, a pre-trigger signal was provided by the controller, which was designed to synchronize the beam diagnostic system and physics experiments. Experimental results indicate that the phenomena of "missed kick" and "inefficient kick" were not observed, and the multichannel trigger signals' delay could be adjusted individually for kick power supplies in digitization; thus, the beam transport efficiency was improved compared with that of the original system. The fast extraction and injection experiment was successfully completed based on the new kicker control systems for HIRFL-CSR.

Submitted 9 July, 2015; originally announced July 2015.

arXiv:cond-mat/0608274 [pdf]

doi 10.1063/1.2386950

Illumination and annealing characteristics of two-dimensional electron gas systems in metal-organic vapor-phase epitaxy grown AlGaN/AlN/GaN heterostructures

Authors: N. Biyikli, U. Ozgur, X. F. Ni, Y. Fu, H. Morkoc, C. Kurdak

Abstract: We studied the persistent photoconductivity (PPC) effect in AlGaN/AlN/GaN heterostructures with two different Al-compositions (x=0.15 and x=0.25). The two-dimensional electron gas formed at the AlN/GaN heterointerface was characterized by Shubnikov-de Haas and Hall measurements. Using optical illumination, we were able to increase the carrier density of the Al0.15Ga0.85N/AlN/GaN sample from 1.6x10^{12} cm^{-2} to 5.9x10^{12} cm^{-2}, while the electron mobility was enhanced from 9540 cm^{2}/Vs to 21400 cm^{2}/Vs at T = 1.6 K. The persistent photocurrent in both samples exhibited a strong dependence on illumination wavelength, being highest close to the bandgap and decreasing at longer wavelengths. The PPC effect became fairly weak for illumination wavelengths longer than 530 nm and showed a more complex response with an initial negative photoconductivity in the infrared region of the spectrum (>700 nm). The maximum PPC-efficiency for 390 nm illumination was 0.011% and 0.005% for Al0.25Ga0.75N/AlN/GaN and Al0.15Ga0.85N/AlN/GaN samples, respectively. After illumination, the carrier density could be reduced by annealing the sample. Annealing characteristics of the PPC effect were studied in the 20-280 K temperature range. We found that annealing at 280 K was not sufficient for full recovery of the carrier density. In fact, the PPC effect occurs in these samples even at room temperature. Comparing the measurement results of two samples, the Al0.25Ga0.75N/AlN/GaN sample had a larger response to illumination and displayed a smaller recovery with thermal annealing. This result suggests that the energy scales of the defect configuration-coordinate diagrams for these samples are different, depending on their Al-composition.

Submitted 11 August, 2006; originally announced August 2006.

Comments: 27 pages, 8 figures

Showing 1–35 of 35 results for author: Ni, F