research-article

BNoteHelper: A Note-based Outline Generation Tool for Structured Learning on Video-sharing Platforms

Authors:

Tun Lu, and

Ning Gu

ACM Transactions on the Web, Volume 18, Issue 2

Article No.: 28, Pages 1 - 30

https://doi.org/10.1145/3638775

Published: 12 March 2024

Abstract

Videos on video-sharing platforms are usually created by ordinary users and are often not designed for learning, so most are not structured well enough to support it, even though they are increasingly used for that purpose. Most existing studies attempt to structure videos using video summarization techniques. However, these methods focus on extracting information from within the video and are aimed at consuming the video itself. In this article, we design and implement BNoteHelper, a note-based video outline prototype that generates outline titles from user-generated notes on Bilibili, using a BART model fine-tuned on a dataset we built. As a browser plugin, BNoteHelper provides users with a video overview, navigation, and a note-taking template through two main features: an outline table and a navigation marker. We evaluate the model and the prototype through automatic and human evaluations. The automatic evaluation shows that, both before and after fine-tuning, the BART model outperforms T5-Pegasus on the BLEU and perplexity metrics. User feedback shows that users prefer the outline generated from notes over the one generated from video captions because it is more concise, clear, and accurate, although it is sometimes too general, with less detail and diversity. The two features of the video outline are also found to have their own advantages, particularly in holistic and fine-grained aspects, respectively. Based on these results, we propose insights into designing video summaries from the perspective of user-generated creations, customizing them by video type, and strengthening the advantages of their different visual styles on video-sharing platforms.
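For readers unfamiliar with the two automatic metrics named in the abstract, the following is a minimal Python sketch of what perplexity and BLEU measure. It is not the authors' evaluation code: the function names, the whitespace-free token lists, the single-reference setup, and the choice of `max_n=2` are all assumptions made for illustration, and the BLEU variant here is a simplified, unsmoothed form of the metric.

```python
import math
from collections import Counter


def perplexity(token_log_probs):
    """Exponential of the mean negative natural-log probability per token.

    Lower is better: the model is less "surprised" by the text.
    """
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


def ngrams(tokens, n):
    """All contiguous n-grams of a token list, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def bleu(candidate, reference, max_n=2):
    """Simplified single-reference BLEU (hypothetical helper, no smoothing).

    Geometric mean of clipped n-gram precisions for n = 1..max_n,
    multiplied by a brevity penalty for candidates shorter than the reference.
    """
    if not candidate:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = Counter(ngrams(candidate, n))
        ref_counts = Counter(ngrams(reference, n))
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        total = sum(cand_counts.values())
        if overlap == 0 or total == 0:
            return 0.0
        log_precisions.append(math.log(overlap / total))
    if len(candidate) > len(reference):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(reference) / len(candidate))
    return brevity_penalty * math.exp(sum(log_precisions) / max_n)


if __name__ == "__main__":
    # Made-up example data, not taken from the article.
    generated_title = ["intro", "to", "bart"]
    note_title = ["intro", "to", "bart", "model"]
    print(bleu(generated_title, note_title))
    print(perplexity([math.log(0.25)] * 4))
```

In this sketch a higher BLEU means the generated outline title shares more n-grams with the reference title, while a lower perplexity means the model assigns higher probability to the reference text. A production evaluation would normally rely on an established implementation with smoothing and proper tokenization for Chinese text.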

Index Terms

BNoteHelper: A Note-based Outline Generation Tool for Structured Learning on Video-sharing Platforms
1. Computer systems organization
  1. Dependable and fault-tolerant systems and networks
    1. Redundancy

Published In

ACM Transactions on the Web Volume 18, Issue 2

May 2024

378 pages

ISSN:1559-1131

EISSN:1559-114X

DOI:10.1145/3613666

Editor:
Ryen White
Microsoft Research, USA

Publication rights licensed to ACM. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of a national government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 12 March 2024

Online AM: 27 December 2023

Accepted: 23 December 2023

Revised: 26 November 2023

Received: 19 June 2023

Published in TWEB Volume 18, Issue 2

Qualifiers

Research-article

Funding Sources

National Natural Science Foundation of China (NSFC)
