Gaze-Driven Video Re-Editing

Authors:

Jessica Hodgins

ACM Transactions on Graphics (TOG), Volume 34, Issue 2

Article No.: 21, Pages 1 - 12

https://doi.org/10.1145/2699644

Published: 02 March 2015

Abstract

Given the current profusion of devices for viewing media, video content created at one aspect ratio is often viewed on displays with different aspect ratios. Many previous solutions address this problem by retargeting or resizing the video, but a more general solution would re-edit the video for the new display. Our method employs the three primary editing operations: pan, cut, and zoom. We let viewers implicitly reveal what is important in a video by tracking their gaze as they watch the video. We present an algorithm that optimizes the path of a cropping window based on the collected eyetracking data, finds places to cut, and computes the size of the cropping window. We present results on a variety of video clips, including close-up and distant shots, and stationary and moving cameras. We conduct two experiments to evaluate our results. First, we eyetrack viewers on the result videos generated by our algorithm, and second, we perform a subjective assessment of viewer preference. These experiments show that viewer gaze patterns are similar on our result videos and on the original video clips, and that viewers prefer our results to an optimized crop-and-warp algorithm.
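As a rough illustration of the cropping-window idea in the abstract, the sketch below turns per-frame gaze samples into a horizontal crop path with pans and cuts. It is not the authors' algorithm: the per-frame median, the moving-average smoothing, and the jump threshold used to place cuts are stand-in assumptions, and every function name and parameter here is hypothetical. Zoom is left out, so the crop width stays fixed.

```python
from statistics import median


def gaze_centers(gaze_by_frame):
    """Per-frame median of the viewers' horizontal gaze positions (pixels)."""
    return [median(xs) for xs in gaze_by_frame]


def find_cuts(centers, jump_threshold):
    """Frame indices where the gaze center jumps more than jump_threshold pixels."""
    return [i for i in range(1, len(centers))
            if abs(centers[i] - centers[i - 1]) > jump_threshold]


def smooth(values, radius):
    """Centered moving average, truncated at the ends of the segment."""
    out = []
    for i in range(len(values)):
        lo, hi = max(0, i - radius), min(len(values), i + radius + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out


def crop_path(gaze_by_frame, frame_width, crop_width,
              jump_threshold=200, radius=2):
    """Return (left_edges, cuts): crop window left edge per frame, and cut frames."""
    centers = gaze_centers(gaze_by_frame)
    cuts = find_cuts(centers, jump_threshold)
    bounds = [0] + cuts + [len(centers)]
    left_edges = []
    for start, end in zip(bounds, bounds[1:]):
        # Smooth within a segment only, so a cut stays an abrupt change
        # instead of being blurred into a fast pan.
        for c in smooth(centers[start:end], radius):
            left = c - crop_width / 2
            # Keep the window inside the original frame.
            left_edges.append(min(max(left, 0), frame_width - crop_width))
    return left_edges, cuts


# Hypothetical input: three viewers, eight frames, attention shifts at frame 4.
gaze = [[300, 320, 310]] * 4 + [[1500, 1480, 1520]] * 4
edges, cuts = crop_path(gaze, frame_width=1920, crop_width=640)
```

Smoothing each segment separately is one simple way to keep pans gentle while letting a large shift in viewer attention become a cut; a real system would likely replace both heuristics with a proper optimization over the window path.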




Index Terms

Gaze-Driven Video Re-Editing
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision problems
      2. Computer vision tasks
        Scene understanding
  2. Computer graphics
    1. Image manipulation
    2. Rendering


Published In

ACM Transactions on Graphics Volume 34, Issue 2

February 2015

136 pages

ISSN:0730-0301

EISSN:1557-7368

DOI:10.1145/2742222

Editor:
Holly Rushmeier
Yale University

Copyright © 2015 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 02 March 2015

Accepted: 01 September 2014

Revised: 01 September 2014

Received: 01 December 2012

Published in TOG Volume 34, Issue 2


Qualifiers

Research-article
Research
Refereed
