DOI: 10.1145/2380116.2380129

Waken: reverse engineering usage information and interface structure from software videos

Published: 07 October 2012


We present Waken, an application-independent system that recognizes UI components and activities from screen-captured videos, without any prior knowledge of the application. Waken can identify the cursors, icons, menus, and tooltips that an application contains, and when those items are used. Waken uses frame differencing to identify occurrences of behaviors that are common across graphical user interfaces. Candidate templates are built, and then other occurrences of those templates are identified using a multi-phase algorithm. An evaluation demonstrates that the system can successfully reconstruct many aspects of a UI without any prior application-dependent knowledge. To showcase the design opportunities that are introduced by having this additional metadata, we present the Waken Video Player, which allows users to directly interact with UI components that are displayed in the video.
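
To make the idea of frame differencing followed by template search concrete, here is a minimal sketch in plain Python. It is not Waken's implementation: the abstract does not describe the multi-phase algorithm in detail, so the exhaustive exact-match search below is a stand-in, and the function names (changed_region, crop, find_template) and the grayscale list-of-rows frame format are assumptions made for illustration.

```python
def changed_region(prev_frame, next_frame, threshold=0):
    """Bounding box (top, left, bottom, right) of the pixels that differ
    between two equally sized grayscale frames, or None if none differ.
    The box is half-open: bottom and right are exclusive."""
    height, width = len(prev_frame), len(prev_frame[0])
    # Rows and columns that contain at least one changed pixel.
    rows = [r for r in range(height)
            if any(abs(prev_frame[r][c] - next_frame[r][c]) > threshold
                   for c in range(width))]
    if not rows:
        return None
    cols = [c for c in range(width)
            if any(abs(prev_frame[r][c] - next_frame[r][c]) > threshold
                   for r in range(height))]
    return rows[0], cols[0], rows[-1] + 1, cols[-1] + 1


def crop(frame, box):
    """Cut a candidate template out of a frame."""
    top, left, bottom, right = box
    return [row[left:right] for row in frame[top:bottom]]


def find_template(frame, template):
    """Exhaustive exact-match search (an assumed simplification);
    returns the (top, left) position of every occurrence."""
    t_height, t_width = len(template), len(template[0])
    hits = []
    for top in range(len(frame) - t_height + 1):
        for left in range(len(frame[0]) - t_width + 1):
            if all(frame[top + r][left:left + t_width] == template[r]
                   for r in range(t_height)):
                hits.append((top, left))
    return hits


# Usage: an element appears between two frames, becomes a candidate
# template, and is then searched for in a later frame.
before = [[0] * 8 for _ in range(6)]
after = [row[:] for row in before]
after[2][3], after[2][4], after[3][3], after[3][4] = 9, 7, 5, 9
template = crop(after, changed_region(before, after))
```

Exact matching is brittle on real recordings, where compression noise changes pixel values slightly; a tolerance-based comparison would be a natural refinement of this sketch.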

Supplementary Material

JPG File (waken_uist12_correctedtitleslide.jpg)
suppl.mov (waken_uist12_correctedtitleslide.mp4)
Supplemental video




    Published In

    UIST '12: Proceedings of the 25th annual ACM symposium on User interface software and technology
    October 2012
    608 pages
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]



    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. pixel-based reverse engineering
    2. tutorials
    3. videos


    • Research-article


    UIST '12

    Acceptance Rates

    Overall Acceptance Rate 561 of 2,567 submissions, 22%




    Cited By

    • (2024) AQuA: Automated Question-Answering in Software Tutorial Videos with Visual Anchors. Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems, 1-19. DOI: 10.1145/3613904.3642752. Online publication date: 11-May-2024
    • (2023) Video2Action: Reducing Human Interactions in Action Annotation of App Tutorial Videos. Proceedings of the 36th Annual ACM Symposium on User Interface Software and Technology, 1-15. DOI: 10.1145/3586183.3606778. Online publication date: 29-Oct-2023
    • (2023) SmartRecorder: An IMU-based Video Tutorial Creation by Demonstration System for Smartphone Interaction Tasks. Proceedings of the 28th International Conference on Intelligent User Interfaces, 278-293. DOI: 10.1145/3581641.3584069. Online publication date: 27-Mar-2023
    • (2023) Screen Recognition: Creating Accessibility Metadata for Mobile Applications using View Type Detection. 2023 9th International Conference on Computer and Communications (ICCC), 1787-1793. DOI: 10.1109/ICCC59590.2023.10507590. Online publication date: 8-Dec-2023
    • (2023) Design Actions for the Design of Visualization Onboarding Methods. 2023 IEEE VIS Workshop on Visualization Education, Literacy, and Activities (EduVis), 1-10. DOI: 10.1109/EduVis60792.2023.00007. Online publication date: 22-Oct-2023
    • (2023) Visualization Onboarding Grounded in Educational Theories. Visualization Psychology, 139-164. DOI: 10.1007/978-3-031-34738-2_6. Online publication date: 7-Nov-2023
    • (2022) Contextual in situ help for visual data interfaces. Information Visualization 22:1, 69-84. DOI: 10.1177/14738716221120064. Online publication date: 9-Sep-2022
    • (2022) Describing UI Screenshots in Natural Language. ACM Transactions on Intelligent Systems and Technology 14:1, 1-28. DOI: 10.1145/3564702. Online publication date: 9-Nov-2022
    • (2022) PONI: A Personalized Onboarding Interface for Getting Inspiration and Learning About AR/VR Creation. Nordic Human-Computer Interaction Conference, 1-14. DOI: 10.1145/3546155.3546642. Online publication date: 8-Oct-2022
    • (2022) SoftVideo: Improving the Learning Experience of Software Tutorial Videos with Collective Interaction Data. Proceedings of the 27th International Conference on Intelligent User Interfaces, 646-660. DOI: 10.1145/3490099.3511106. Online publication date: 22-Mar-2022
