- short-paper, June 2024
WiP: Efficient LLM Prefilling with Mobile NPU
EdgeFM '24: Proceedings of the Workshop on Edge and Mobile Foundation Models, June 2024, Pages 33–35, https://doi.org/10.1145/3662006.3662066
Large language models (LLMs) play a crucial role in various Natural Language Processing (NLP) tasks, prompting their deployment on mobile devices for inference. However, a significant challenge arises due to high waiting latency, especially for long ...
- research-article, June 2024
Large Language Models on Mobile Devices: Measurements, Analysis, and Insights
EdgeFM '24: Proceedings of the Workshop on Edge and Mobile Foundation Models, June 2024, Pages 1–6, https://doi.org/10.1145/3662006.3662059
Deploying large language models (LLMs) inference into mobile devices is cost-efficient for companies, and well addresses the privacy concern of users. However, the limited computation capacity and memory constraints of mobile devices hinder their ...
- short-paper, June 2024
Poster: Efficient and Accurate Mobile Task Automation through Learning from Code
MOBISYS '24: Proceedings of the 22nd Annual International Conference on Mobile Systems, Applications and Services, June 2024, Pages 638–639, https://doi.org/10.1145/3643832.3661397
With the emergence and continuous prosperity of large language models (LLMs), artificial intelligence (AI) agents have experienced rapid advancements. Most mobile AI agents merely imitate human operations, executing actions based on the human user ...
- research-article, May 2024
Deciphering the Enigma of Satellite Computing with COTS Devices: Measurement and Analysis
ACM MobiCom '24: Proceedings of the 30th Annual International Conference on Mobile Computing and Networking, May 2024, Pages 420–435, https://doi.org/10.1145/3636534.3649371
In the wake of the rapid deployment of large-scale low-Earth orbit satellite constellations, exploiting the full computing potential of Commercial Off-The-Shelf (COTS) devices in these environments has become a pressing issue. However, understanding this ...
- research-article, May 2024
Mobile Foundation Model as Firmware
- Jinliang Yuan,
- Chen Yang,
- Dongqi Cai,
- Shihe Wang,
- Xin Yuan,
- Zeling Zhang,
- Xiang Li,
- Dingge Zhang,
- Hanzi Mei,
- Xianqing Jia,
- Shangguang Wang,
- Mengwei Xu
ACM MobiCom '24: Proceedings of the 30th Annual International Conference on Mobile Computing and Networking, May 2024, Pages 279–295, https://doi.org/10.1145/3636534.3649361
In the current AI era, mobile devices such as smartphones are tasked with executing a myriad of deep neural networks (DNNs) locally. It presents a complex landscape, as these models are highly fragmented in terms of architecture, operators, and ...
- research-article, May 2024
Towards Energy-efficient Federated Learning via INT8-based Training on Mobile DSPs
WWW '24: Proceedings of the ACM Web Conference 2024, May 2024, Pages 2786–2794, https://doi.org/10.1145/3589334.3645341
AI is making the Web an even cooler place, but also introduces serious privacy risks due to the extensive user data collection. Federated learning (FL), as a privacy-preserving machine learning paradigm, enables mobile devices to collaboratively learn a ...
- research-article, May 2024
Safeguard Privacy for Minimal Data Collection with Trustworthy Autonomous Agents
AAMAS '24: Proceedings of the 23rd International Conference on Autonomous Agents and Multiagent Systems, May 2024, Pages 1966–1974
Ensuring digital privacy necessitates users giving well-considered consent to online service providers for data usage, creating an unsustainable and error-prone decision load. Software privacy agents can help make data consent decisions on behalf of ...
- research-article, April 2024
FedRDMA: Communication-Efficient Cross-Silo Federated LLM via Chunked RDMA Transmission
EuroMLSys '24: Proceedings of the 4th Workshop on Machine Learning and Systems, April 2024, Pages 126–133, https://doi.org/10.1145/3642970.3655834
Communication overhead is a significant bottleneck in federated learning (FL), which has been exaggerated with the increasing size of AI models. In this paper, we propose FedRDMA, a communication-efficient cross-silo FL system that integrates RDMA into ...
SoCFlow: Efficient and Scalable DNN Training on SoC-Clustered Edge Servers
ASPLOS '24: Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 1, April 2024, Pages 368–385, https://doi.org/10.1145/3617232.3624847
SoC-Cluster, a novel server architecture composed of massive mobile system-on-chips (SoCs), is gaining popularity in industrial edge computing due to its energy efficiency and compatibility with existing mobile applications. However, we observe that the ...
- research-article, December 2023
Demystifying the QoS and QoE of Edge-hosted Video Streaming Applications in the Wild with SNESet
- Yanan Li,
- Guangqing Deng,
- Changming Bai,
- Jingyu Yang,
- Gang Wang,
- Hao Zhang,
- Jin Bai,
- Haitao Yuan,
- Mengwei Xu,
- Shangguang Wang
Proceedings of the ACM on Management of Data (PACMMOD), Volume 1, Issue 4, Article No.: 236, Pages 1–29, https://doi.org/10.1145/3626723
Video streaming applications (VSAs) are increasingly being deployed on large-scale edge platforms, which have the potential to significantly improve the quality of service (QoS) and end-user experience (QoE), ultimately maximizing business outcomes. ...
- Article, November 2023
CAN-verify: A Verification Tool For BDI Agents
CAN-verify is an automated tool that aids the development, verification, and analysis of BDI agents written in the Conceptual Agent Notation (Can) language. It does not require users to be familiar with verification techniques. CAN-verify supports ...
- research-article, November 2023
Seamless Cross-Edge Service Migration for Real-Time Rendering Applications
IEEE Transactions on Mobile Computing (ITMV), Volume 23, Issue 6, June 2024, Pages 7084–7098, https://doi.org/10.1109/TMC.2023.3331773
Seamless cross-edge migration for real-time rendering applications is challenging. The strong interactive nature of real-time rendering applications demands a downtime lower than 15 ms ...
Federated Few-Shot Learning for Mobile NLP
ACM MobiCom '23: Proceedings of the 29th Annual International Conference on Mobile Computing and Networking, October 2023, Article No.: 63, Pages 1–17, https://doi.org/10.1145/3570361.3613277
Natural language processing (NLP) sees rich mobile applications. To support various language understanding tasks, a foundation NLP model is often fine-tuned in a federated, privacy-preserving setting (FL). This process currently relies on at least ...
- research-article, October 2023
Efficient Federated Learning for Modern NLP
ACM MobiCom '23: Proceedings of the 29th Annual International Conference on Mobile Computing and Networking, October 2023, Article No.: 37, Pages 1–16, https://doi.org/10.1145/3570361.3592505
Transformer-based pre-trained models have revolutionized NLP for superior performance and generality. Fine-tuning pre-trained models for downstream tasks often requires private data, for which federated learning is the de-facto approach (i.e., FedNLP)...
- research-article, August 2023
Quantitative modelling and analysis of BDI agents
Software and Systems Modeling (SoSyM) (SPSSM), Volume 23, Issue 2, Apr 2024, Pages 343–367, https://doi.org/10.1007/s10270-023-01121-5
Belief–desire–intention (BDI) agents are a popular agent architecture. We extend conceptual agent notation (Can)—a BDI programming language with advanced features such as failure recovery and declarative goals—to include probabilistic action ...
- research-article, September 2023
Tango: Harmonious Management and Scheduling for Mixed Services Co-located among Distributed Edge-Clouds
ICPP '23: Proceedings of the 52nd International Conference on Parallel Processing, August 2023, Pages 595–604, https://doi.org/10.1145/3605573.3605589
Co-locating Latency-Critical (LC) and Best-Effort (BE) services in edge-clouds is expected to enhance resource utilization. However, this mixed deployment encounters unique challenges. Edge-clouds are heterogeneous, distributed, and resource-constrained, ...
- research-article, August 2023
A large-scale holistic measurement of crowdsourced edge cloud platform
World Wide Web (WWWJ), Volume 26, Issue 5, Sep 2023, Pages 3561–3584, https://doi.org/10.1007/s11280-023-01201-y
Edge clouds have become a de-facto paradigm to deliver low and stable networks to delay-critical applications such as Web services and AR/VR. A unique form of edge clouds is those crowdsourced from third parties, e.g., idle PCs or workstations. ...
- research-article, August 2023
A Comprehensive Deep Learning Library Benchmark and Optimal Library Selection
- Qiyang Zhang,
- Xiangying Che,
- Yijie Chen,
- Xiao Ma,
- Mengwei Xu,
- Schahram Dustdar,
- Xuanzhe Liu,
- Shangguang Wang
IEEE Transactions on Mobile Computing (ITMV), Volume 23, Issue 5, May 2024, Pages 5069–5082, https://doi.org/10.1109/TMC.2023.3301973
Deploying deep learning (DL) on mobile devices has been a notable trend in recent years. To support fast inference of on-device DL, DL libraries play a critical role as algorithms and hardware do. Unfortunately, no prior work ever dives deep into the ...
- poster, September 2023
FedAdapter: Efficient Federated Learning for Mobile NLP
ACM TURC '23: Proceedings of the ACM Turing Award Celebration Conference - China 2023, July 2023, Pages 27–28, https://doi.org/10.1145/3603165.3607380
Fine-tuning pre-trained models for downstream tasks often requires private data, for which federated learning is the de-facto approach (i.e., FedNLP). However, FedNLP is prohibitively slow due to the large model sizes and the resultant high network/...