
Enhancing Deep Learning with Visual Interactions

Published: 01 March 2019

Abstract

Deep learning has emerged as a powerful tool for feature-driven labeling of datasets. However, for it to be effective, it requires a large and finely labeled training dataset. Precisely labeling a large training dataset is expensive, time-consuming, and error prone. In this article, we present a visually driven deep-learning approach that starts with a coarsely labeled training dataset and iteratively refines the labeling through intuitive interactions that leverage the latent structures of the dataset. Our approach can be used to (a) alleviate the burden of intensive manual labeling that captures the fine nuances in a high-dimensional dataset by simple visual interactions, (b) replace a complicated (and therefore difficult to design) labeling algorithm by a simpler (but coarse) labeling algorithm supplemented by user interaction to refine the labeling, or (c) use low-dimensional features (such as the RGB colors) for coarse labeling and turn to higher-dimensional latent structures that are progressively revealed by deep learning, for fine labeling. We validate our approach through use cases on three high-dimensional datasets and a user study.
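
The refinement loop described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: a nearest-centroid classifier stands in for the deep network, the user's visual interaction is simulated by a dictionary of corrected labels, and every name here (`fit_centroids`, `predict`, `refine`, `corrections`) is hypothetical. It only shows the overall shape of the idea: start from coarse labels, apply a few user corrections, retrain, and let the model propagate the refinement to similar points.

```python
from statistics import mean


def fit_centroids(points, labels):
    """Stand-in for model training: one mean feature vector per label."""
    groups = {}
    for point, label in zip(points, labels):
        groups.setdefault(label, []).append(point)
    return {label: tuple(mean(axis) for axis in zip(*members))
            for label, members in groups.items()}


def predict(centroids, point):
    """Return the label whose centroid is nearest (squared Euclidean distance)."""
    def sq_dist(label):
        return sum((a - b) ** 2 for a, b in zip(point, centroids[label]))
    return min(centroids, key=sq_dist)


def refine(points, coarse_labels, corrections, rounds=3):
    """Iteratively refine coarse labels using simulated user corrections.

    corrections maps a point index to the label a user would assign;
    in an interactive system these would come from visual selections.
    """
    labels = list(coarse_labels)
    for _ in range(rounds):
        # Apply the user's corrections, then retrain on the updated labels.
        for index, label in corrections.items():
            labels[index] = label
        centroids = fit_centroids(points, labels)
        # Relabel every point with the retrained model; corrected points stay fixed.
        labels = [corrections.get(i, predict(centroids, p))
                  for i, p in enumerate(points)]
    return labels


# Two well-separated clusters, all coarsely labeled "land".
points = [(0.0, 0.1), (0.1, 0.0), (0.2, 0.1), (5.0, 5.1), (5.1, 5.0), (5.2, 5.1)]
coarse = ["land"] * 6
# A single simulated user correction on one point of the second cluster.
refined = refine(points, coarse, corrections={3: "water"})
print(refined)
```

In this toy setting one corrected point is enough for the retrained model to relabel its neighbors, which is the kind of effort reduction the abstract argues for; a real system would replace the centroid model with a learned latent representation.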

Cited By

  • (2023) Explainable interactive projections of images. Machine Vision and Applications 34:6. DOI: 10.1007/s00138-023-01452-9. Online publication date: 13-Sep-2023
  • (2022) A survey of human-in-the-loop for machine learning. Future Generation Computer Systems 135 (364-381). DOI: 10.1016/j.future.2022.05.014. Online publication date: Oct-2022

        Published In

        ACM Transactions on Interactive Intelligent Systems  Volume 9, Issue 1
        March 2019
        168 pages
        ISSN:2160-6455
        EISSN:2160-6463
        DOI:10.1145/3312745
        © 2019 Association for Computing Machinery. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of the United States government. As such, the United States Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        Published: 01 March 2019
        Accepted: 01 July 2018
        Revised: 01 June 2018
        Received: 01 July 2017
        Published in TIIS Volume 9, Issue 1

        Author Tags

        1. Deep learning
        2. dimensionality reduction
        3. knowledge discovery
        4. semantic interaction
        5. visual interaction

        Qualifiers

        • Research-article
        • Research
        • Refereed

        Funding Sources

        • DOD contract
        • NSF
        • State of Maryland's MPower initiative
