DOI: 10.1145/3640544.3645226
demonstration

ExpressEdit: Video Editing with Natural Language and Sketching

Published: 05 April 2024

Abstract

Informational videos serve as a crucial source of conceptual and procedural knowledge. When producing them, editors overlay text or images and trim footage to improve video quality and make the video more engaging. However, video editing can be difficult and time-consuming, especially for novice editors, who often struggle to express and implement their editing ideas. We present ExpressEdit, a system that enables video editing via natural language (NL) text and sketching on the video frame. Powered by a large language model (LLM) and vision models, the system interprets (1) temporal, (2) spatial, and (3) operational references in an NL command, as well as spatial references from sketching. This work offers insights into building multimodal interfaces for video editing.
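
To make the three reference types concrete, the following is a minimal sketch of how a parsed edit request might be represented. It is not taken from the ExpressEdit implementation: the names `SketchBox`, `ParsedCommand`, and `resolve_position` are hypothetical, and the rule that a sketch overrides textual spatial references is an assumption made only for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class SketchBox:
    """Hypothetical rectangle drawn on the video frame, in pixels."""
    x: float
    y: float
    width: float
    height: float


@dataclass
class ParsedCommand:
    """Hypothetical container for the references found in one edit request."""
    temporal: List[str] = field(default_factory=list)    # e.g. "when the chef adds salt"
    spatial: List[str] = field(default_factory=list)     # e.g. "top-left corner"
    operations: List[str] = field(default_factory=list)  # e.g. "add text"
    sketch: Optional[SketchBox] = None                   # spatial reference from sketching


def resolve_position(
    cmd: ParsedCommand, frame_size: Tuple[int, int]
) -> Tuple[float, float, float, float]:
    """Return an overlay region (x, y, width, height) normalized to the range 0..1.

    Assumption: a sketch, when present, takes priority over textual spatial
    references; without a sketch this toy version falls back to the full frame.
    """
    frame_w, frame_h = frame_size
    if cmd.sketch is not None:
        s = cmd.sketch
        return (s.x / frame_w, s.y / frame_h, s.width / frame_w, s.height / frame_h)
    return (0.0, 0.0, 1.0, 1.0)


cmd = ParsedCommand(
    temporal=["when the chef adds salt"],
    operations=["add text"],
    sketch=SketchBox(x=96, y=54, width=480, height=270),
)
region = resolve_position(cmd, frame_size=(1920, 1080))
```

Keeping the three reference types in separate fields lets each be resolved independently, for example temporal references against the video timeline and spatial references against a frame, before the operation is applied.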

Information

Published In

IUI '24 Companion: Companion Proceedings of the 29th International Conference on Intelligent User Interfaces
March 2024
182 pages
ISBN: 9798400705090
DOI: 10.1145/3640544
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Human-AI Interaction
  2. Large Language Models
  3. Multimodal Input
  4. Video Editing

Qualifiers

  • Demonstration
  • Research
  • Refereed limited

Funding Sources

  • Institute of Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (No. 2021-0-01347, Video Interaction Technologies Using Object-Oriented Video Modeling)

Conference

IUI '24
Acceptance Rates

Overall Acceptance Rate 746 of 2,811 submissions, 27%
