Search | arXiv e-print repository

Transferable speech-to-text large language model alignment module

Abstract: By leveraging the power of Large Language Models (LLMs) and speech foundation models, state-of-the-art speech-text bimodal works can achieve challenging tasks like spoken translation (ST) and question answering (SQA) altogether with much simpler architectures. In this paper, we utilize the capability of the Whisper encoder and pre-trained Yi-6B. Empirical results reveal that modal alignment can be achieved with a one-layer module and a hundred hours of speech-text multitask corpus. We further swap Yi-6B with its human-preference-aligned version, Yi-6B-Chat, during inference, and discover that the alignment capability is applicable as well. In addition, the alignment subspace revealed by singular value decomposition (SVD) also implies that the linear alignment subspace is sparse, which leaves the possibility to concatenate other features like voice-print or video to expand modality.

Submitted 19 June, 2024; originally announced June 2024.

Comments: Accepted by InterSpeech 2024; 5 pages, 2 figures

arXiv:2403.06397 [pdf, other]

DeepSafeMPC: Deep Learning-Based Model Predictive Control for Safe Multi-Agent Reinforcement Learning

Authors: Xuefeng Wang, Henglin Pu, Hyung Jun Kim, Husheng Li

Abstract: Safe multi-agent reinforcement learning (safe MARL) has increasingly gained attention in recent years, emphasizing the need for agents to not only optimize the global return but also adhere to safety requirements through behavioral constraints. Some recent work has integrated control theory with multi-agent reinforcement learning to address the challenge of ensuring safety. However, there have been only very limited applications of Model Predictive Control (MPC) methods in this domain, primarily due to the complex and implicit dynamics characteristic of multi-agent environments. To bridge this gap, we propose a novel method called Deep Learning-Based Model Predictive Control for Safe Multi-Agent Reinforcement Learning (DeepSafeMPC). The key insight of DeepSafeMPC is leveraging a centralized deep learning model to accurately predict environmental dynamics. Our method applies MARL principles to search for optimal solutions. Through the employment of MPC, the actions of agents can be restricted within safe states concurrently. We demonstrate the effectiveness of our approach using the Safe Multi-agent MuJoCo environment, showcasing significant advancements in addressing safety concerns in MARL.

Submitted 11 March, 2024; v1 submitted 10 March, 2024; originally announced March 2024.

Comments: 8 pages, 5 figures

arXiv:2304.06512 [pdf, other]

Investigating Skin Temperature-Based Overheating in mmWave Smartphones: Power and Thermal Models for Optimal Non-Throttling Performance

Authors: Henglin Pu, Xingqi Wu

Abstract: 5G mmWave, as a revolutionary cellular technology, holds monumental potential for innovations in many academic and industrial areas. However, widespread adoption of this technology is hindered by the severe overheating issues experienced by current Commercial Off-The-Shelf (COTS) mmWave smartphones. This study aims to identify the root causes of device skin-temperature-related throttling during 5G transmission, and to quantify the power reduction required to prevent such throttling at a given ambient temperature. The key insight of our paper is leveraging the power model and thermal model of a mmWave smartphone to acquire the quantitative relationship among power consumption, ambient temperature and device skin temperature. This approach allows us to determine the extent of power reduction required to prevent throttling under specific ambient temperature conditions.

Submitted 25 March, 2023; originally announced April 2023.

Showing 1–3 of 3 results for author: Pu, H