demonstration

ExpressEdit: Video Editing with Natural Language and Sketching

Authors: Bekzat Tilekbay, Saelyne Yang, Michal Adam Lewkowicz, Alex Suryapranata, Juho Kim

IUI '24 Companion: Companion Proceedings of the 29th International Conference on Intelligent User Interfaces

Pages 50 - 53

https://doi.org/10.1145/3640544.3645226

Published: 05 April 2024

Abstract

Informational videos serve as a crucial source of conceptual and procedural knowledge. When producing them, editors overlay text or images and trim footage to improve video quality and make the result more engaging. However, video editing can be difficult and time-consuming, especially for novice editors, who often struggle to express and implement their editing ideas. We present ExpressEdit, a system that enables editing videos via natural language (NL) text and sketching on the video frame. Powered by LLM and vision models, the system interprets (1) temporal, (2) spatial, and (3) operational references in an NL command, as well as spatial references from sketching. This work offers insights into building multimodal interfaces for video editing.
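
The three reference types named in the abstract can be illustrated with a small sketch. This is not the ExpressEdit pipeline, which the abstract says relies on LLM and vision models; it is a keyword-and-regex stand-in that only shows what a parsed command might look like. The names `ParsedCommand`, `parse_command`, `OPERATION_KEYWORDS`, `SPATIAL_PHRASES`, and the `sketch_box` tuple are all hypothetical and do not come from the paper.

```python
import re
from dataclasses import dataclass, field

# Hypothetical operation vocabulary (an assumption for illustration only);
# the abstract mentions text/image overlays and trimming.
OPERATION_KEYWORDS = {
    "text": ("text", "caption", "title"),
    "image": ("image", "picture", "logo"),
    "cut": ("trim", "cut", "remove"),
}

# Hypothetical spatial phrases a command might contain.
SPATIAL_PHRASES = ("top left", "top right", "bottom left", "bottom right", "center")


@dataclass
class ParsedCommand:
    """Holds the three kinds of references found in one editing command."""
    temporal: list = field(default_factory=list)    # e.g. ["0:15", "0:30"]
    spatial: list = field(default_factory=list)     # phrases and/or a sketched box
    operations: list = field(default_factory=list)  # e.g. ["text"]


def parse_command(command, sketch_box=None):
    """Toy parser: regex and keyword matching stand in for the LLM.

    sketch_box is an assumed (x, y, width, height) tuple representing a
    rectangle the user drew on the video frame.
    """
    lowered = command.lower()
    parsed = ParsedCommand()

    # Temporal references: timestamps such as 0:15 or 1:02:03.
    parsed.temporal = re.findall(r"\b\d{1,2}:\d{2}(?::\d{2})?\b", lowered)

    # Spatial references: phrases in the text, plus the sketch if one was drawn.
    parsed.spatial = [phrase for phrase in SPATIAL_PHRASES if phrase in lowered]
    if sketch_box is not None:
        parsed.spatial.append(sketch_box)

    # Operational references: which edit operations the command asks for.
    words = set(re.findall(r"[a-z]+", lowered))
    parsed.operations = [
        op for op, keywords in OPERATION_KEYWORDS.items() if words & set(keywords)
    ]
    return parsed


if __name__ == "__main__":
    result = parse_command(
        "Add a caption at the top left from 0:15 to 0:30",
        sketch_box=(10, 20, 200, 80),
    )
    print(result)
```

Keeping the sketched rectangle in the same `spatial` list as the textual phrases is one possible design choice: it lets a later step treat both input modalities uniformly when deciding where an overlay should go.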

Published In

IUI '24 Companion: Companion Proceedings of the 29th International Conference on Intelligent User Interfaces

March 2024

182 pages

ISBN: 9798400705090

DOI: 10.1145/3640544

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Demonstration
Research
Refereed limited

Funding Sources

Institute of Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (No. 2021-0-01347, Video Interaction Technologies Using Object-Oriented Video Modeling)

Conference

IUI '24

IUI '24: 29th International Conference on Intelligent User Interfaces

March 18 - 21, 2024

Greenville, SC, USA

Acceptance Rates

Overall Acceptance Rate 746 of 2,811 submissions, 27%
