DOI: 10.1145/3638550.3641130
Research article

Penetrative AI: Making LLMs Comprehend the Physical World

Published: 28 February 2024

Abstract

Recent developments in Large Language Models (LLMs) have demonstrated their remarkable capabilities across a range of tasks. Questions persist, however, about the nature of LLMs and their potential to integrate common-sense human knowledge when performing tasks that involve information about the real physical world. This paper delves into these questions by exploring how LLMs can be extended to interact with and reason about the physical world through IoT sensors and actuators, a concept that we term "Penetrative AI". The paper explores such an extension at two levels of LLMs' ability to penetrate into the physical world via the processing of sensory signals. Our preliminary findings indicate that LLMs, with ChatGPT being the representative example in our exploration, have considerable and unique proficiency in employing their embedded world knowledge to interpret IoT sensor data and to reason over it about tasks in the physical realm. Not only does this open up new applications for LLMs beyond traditional text-based tasks, but it also enables new ways of incorporating human knowledge in cyber-physical systems.
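To make the abstract's core idea concrete, the sketch below serializes a few smartphone sensor readings into a natural-language prompt that a text-only LLM could reason over. This is a minimal illustration of the concept, not code from the paper: the function name, parameter names, sensor values, and prompt wording are all illustrative assumptions.

```python
from textwrap import dedent

def build_sensor_prompt(accel_mag: float, satellite_count: int, wifi_ap_count: int) -> str:
    """Serialize smartphone sensor readings into a natural-language prompt.

    All names and wording here are hypothetical; the paper's actual
    prompts are not reproduced. The idea is only that digitized sensor
    signals become text an LLM can apply world knowledge to.
    """
    return dedent(f"""\
        You are analyzing smartphone sensor data to infer the user's context.
        - Accelerometer magnitude (m/s^2, gravity removed): {accel_mag}
        - Visible GNSS satellites: {satellite_count}
        - Wi-Fi access points in latest scan: {wifi_ap_count}
        Question: Is the user likely indoors or outdoors, and moving or still?
        Answer briefly with a justification.""")

# Few visible satellites plus many Wi-Fi APs would hint at an indoor setting.
prompt = build_sensor_prompt(accel_mag=0.3, satellite_count=2, wifi_ap_count=14)
print(prompt)
```

The prompt string would then be sent to an LLM chat endpoint; that call is omitted here since it depends on the chosen provider and API.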





Published In

HOTMOBILE '24: Proceedings of the 25th International Workshop on Mobile Computing Systems and Applications
February 2024
167 pages
ISBN: 9798400704970
DOI: 10.1145/3638550
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].


Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. LLM
  2. CPS
  3. IoT
  4. penetrative AI

Qualifiers

  • Research-article

Funding Sources

  • Global STEM Professorship Scheme of Hong Kong and HKUST start-up grant
  • Singapore NRF Investigatorship

Conference

HOTMOBILE '24

Acceptance Rates

Overall acceptance rate: 96 of 345 submissions (28%)

Article Metrics

  • Downloads (last 12 months): 990
  • Downloads (last 6 weeks): 88
Reflects downloads up to 27 Jan 2025

Cited By

  • (2024) Generative KI zur No-/Low-Code-Wissensverarbeitung. Zeitschrift für wirtschaftlichen Fabrikbetrieb 119, 11 (840-844). DOI: 10.1515/zwf-2024-1155. Published: 18-Nov-2024.
  • (2024) DrHouse: An LLM-empowered Diagnostic Reasoning System through Harnessing Outcomes from Sensor Data and Expert Knowledge. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 8, 4 (1-29). DOI: 10.1145/3699765. Published: 21-Nov-2024.
  • (2024) PrISM-Q&A: Step-Aware Voice Assistant on a Smartwatch Enabled by Multimodal Procedure Tracking and Large Language Models. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 8, 4 (1-26). DOI: 10.1145/3699759. Published: 21-Nov-2024.
  • (2024) Artificial Intelligence of Things: A Survey. ACM Transactions on Sensor Networks. DOI: 10.1145/3690639. Published: 30-Aug-2024.
  • (2024) Enabling On-Device LLMs Personalization with Smartphone Sensing. Companion of the 2024 ACM International Joint Conference on Pervasive and Ubiquitous Computing (186-190). DOI: 10.1145/3675094.3677545. Published: 5-Oct-2024.
  • (2024) Integration of LLMs and the Physical World: Research and Application. Proceedings of the ACM Turing Award Celebration Conference - China 2024 (1-5). DOI: 10.1145/3674399.3674402. Published: 5-Jul-2024.
  • (2024) Improving On-Device LLMs' Sensory Understanding with Embedding Interpolations. Proceedings of the 30th Annual International Conference on Mobile Computing and Networking (1674-1676). DOI: 10.1145/3636534.3697456. Published: 4-Dec-2024.
  • (2024) MELTing Point: Mobile Evaluation of Language Transformers. Proceedings of the 30th Annual International Conference on Mobile Computing and Networking (890-907). DOI: 10.1145/3636534.3690668. Published: 4-Dec-2024.
  • (2024) Learning Domain-Invariant Model for WiFi-Based Indoor Localization. IEEE Transactions on Mobile Computing 23, 12 (13898-13913). DOI: 10.1109/TMC.2024.3438454. Published: Dec-2024.
  • (2024) LLMSense: Harnessing LLMs for High-level Reasoning Over Spatiotemporal Sensor Traces. 2024 IEEE 3rd Workshop on Machine Learning on Edge in Sensor Systems (SenSys-ML) (9-14). DOI: 10.1109/SenSys-ML62579.2024.00007. Published: 13-May-2024.
