DOI: 10.1145/2678025.2701408
research-article

Content-driven Multi-modal Techniques for Non-linear Video Navigation

Published: 18 March 2015

Abstract

Massive Open Online Courses (MOOCs) have grown remarkably in the last few years. A significant share of MOOC content is video, and participants often navigate non-linearly when browsing a video. This paper proposes the design of a system that supports non-linear navigation in educational videos using features derived from both the audio and the visual content of a video. It offers three ways to navigate quickly to a point of interest: a customized, dynamic, time-aware word-cloud; video pages; and a 2-D timeline. In the word-cloud, the relative placement of words indicates their temporal order in the video, and color codes represent acoustic stress. When the user clicks a keyword or concept in the word-cloud, the 2-D timeline shows its multiple occurrences in the video. Additionally, visual content is analyzed to identify the frames with "maximum written content", known as video pages. We conducted a user study with 20 users to evaluate the proposed system and compared it with the transcription-based interfaces used by major MOOC providers. Our findings suggest that the proposed system yields statistically significant savings in navigation time, especially on multimodal navigation tasks.
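
As a rough illustration of the data that a time-aware word-cloud and an occurrence timeline would need, the following minimal sketch indexes keywords by the times at which they are spoken and orders them by first occurrence. It is not the authors' implementation: the function names, the `(word, start_seconds)` token format, and the sample transcript are all hypothetical, and acoustic stress and video pages are not modeled.

```python
from collections import defaultdict


def build_keyword_index(tokens, keywords):
    """Map each keyword to the sorted times (in seconds) at which it is spoken.

    `tokens` is an assumed transcript format: a list of (word, start_seconds).
    """
    wanted = {k.lower() for k in keywords}
    index = defaultdict(list)
    for word, start in tokens:
        w = word.lower()
        if w in wanted:
            index[w].append(start)
    return {w: sorted(times) for w, times in index.items()}


def temporal_order(index):
    """Order keywords by first occurrence, a stand-in for word-cloud placement."""
    return sorted(index, key=lambda w: index[w][0])


# Hypothetical transcript fragment with word-level timestamps.
tokens = [
    ("gradient", 12.0),
    ("descent", 12.4),
    ("loss", 30.5),
    ("gradient", 95.2),
    ("momentum", 140.0),
]

index = build_keyword_index(tokens, ["gradient", "loss", "momentum"])
order = temporal_order(index)
```

In this sketch, `order` would drive the left-to-right placement of words, and `index["gradient"]` would supply the seek positions shown when that word is clicked.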

    Published In
    IUI '15: Proceedings of the 20th International Conference on Intelligent User Interfaces
    March 2015
    480 pages
    ISBN:9781450333061
    DOI:10.1145/2678025

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. acoustic stress
    2. education
    3. moocs
    4. video analysis
    5. video navigation
    6. word-cloud

    Qualifiers

    • Research-article

    Conference

    IUI'15
    Acceptance Rates

    IUI '15 Paper Acceptance Rate 47 of 205 submissions, 23%;
    Overall Acceptance Rate 746 of 2,811 submissions, 27%
