DOI: 10.1145/3643489.3661128
Research Article | Open Access

MyEachtraX: Lifelog Question Answering on Mobile

Published: 18 June 2024

Abstract

Your whole life in your pocket. That is the premise of lifelogging, a technology that captures and stores every moment of your life in digital form. Built on top of MyEachtra and the lifelog question-answering pipeline, MyEachtraX is a mobile application that addresses the lack of attention mobile platforms have received in this area. Leveraging recent advances in natural language processing, namely Large Language Models (LLMs) and Multimodal Large Language Models (MLLMs), the system enhances the query-parsing, post-processing, and question-answering stages of lifelog retrieval. Official lifelog questions from previous Lifelog Search Challenges were used to evaluate the system, which achieved an accuracy of 72.2%. We identify the retrieval component as the main bottleneck of the pipeline and propose future work to improve the system.
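
To make the query-parsing stage described in the abstract concrete, the sketch below shows one way an LLM could turn a natural-language lifelog question into structured retrieval filters. This is a minimal sketch, not the authors' implementation: the names (LifelogQuery, parse_lifelog_question, call_llm) and the JSON schema (visual_concepts, time_range, location) are assumptions, and the LLM call is left vendor-agnostic.

```python
import json
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical structured query; field names are assumptions, not MyEachtraX's schema.
@dataclass
class LifelogQuery:
    visual_concepts: list[str]      # objects/scenes to match against lifelog images
    time_range: Optional[str]       # e.g. "2019-08-01..2019-08-31"
    location: Optional[str]         # e.g. "Dublin"
    question: str                   # original question, kept for the answering stage

PARSE_PROMPT = """Extract a structured lifelog query from the question below.
Reply with JSON only, using the keys: visual_concepts, time_range, location.
Question: {question}"""

def parse_lifelog_question(question: str, call_llm: Callable[[str], str]) -> LifelogQuery:
    """Ask an LLM to parse a lifelog question into retrieval filters.

    `call_llm` is any function that sends a prompt to a chat model and
    returns its text reply (kept vendor-agnostic on purpose).
    """
    raw = call_llm(PARSE_PROMPT.format(question=question))
    fields = json.loads(raw)  # assumes the model followed the JSON-only instruction
    return LifelogQuery(
        visual_concepts=fields.get("visual_concepts", []),
        time_range=fields.get("time_range"),
        location=fields.get("location"),
        question=question,
    )
```

In the pipeline the abstract outlines, a retrieval step would then use these filters to rank lifelog events, and an MLLM could inspect the top-ranked images to produce the final answer; both of those steps are outside this sketch.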


Cited By

  • (2024) Introduction to the Seventh Annual Lifelog Search Challenge, LSC'24. In Proceedings of the 2024 International Conference on Multimedia Retrieval, 1334-1335. https://doi.org/10.1145/3652583.3658891. Online publication date: 30 May 2024.

Published In

LSC '24: Proceedings of the 7th Annual ACM Workshop on the Lifelog Search Challenge
June 2024, 128 pages
ISBN: 9798400705502
DOI: 10.1145/3643489
This work is licensed under a Creative Commons Attribution 4.0 International License.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. lifelog
  2. mobile
  3. retrieval
  4. question answering

Qualifiers

  • Research-article

Conference

LSC '24
