DOI: 10.1145/2858036.2858199
research-article

Setwise Comparison: Consistent, Scalable, Continuum Labels for Computer Vision

Published: 07 May 2016

Abstract

A growing number of domains, including affect recognition and movement analysis, require a single, real-number ground truth label capturing some property of a video clip. We term this the provision of continuum labels. Unfortunately, with current tools there is often an unacceptable trade-off between label consistency and the efficiency of the labelling process. We present a novel interaction technique, setwise comparison, which leverages the intrinsic human capability for consistent relative judgements and the TrueSkill algorithm to solve this problem. We describe SorTable, a system demonstrating this technique. We conducted a real-world study where clinicians labelled videos of patients with multiple sclerosis for the ASSESS MS computer vision system. In assessing the efficiency-consistency trade-off of setwise versus pairwise comparison, we demonstrated that not only is setwise comparison more efficient, but it also elicits more consistent labels. We further consider how our findings relate to the interactive machine learning literature.
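To make the idea of turning relative judgements into continuum labels concrete, the following is a minimal sketch and not the SorTable implementation. It assumes that a rater's sorted set of videos can be broken into adjacent winner/loser pairs, and it applies the standard two-player, no-draw TrueSkill update to each pair using only the Python standard library. The names `Rating`, `update_pair`, and `update_from_sorted_set`, the video ids, and the default parameters (mean 25, standard deviation 25/3, performance noise 25/6) are illustrative assumptions. The full TrueSkill algorithm handles multi-item rankings jointly, so this pairwise decomposition is only an approximation.

```python
import math
from dataclasses import dataclass

BETA = 25.0 / 6.0  # assumed performance noise (a commonly used TrueSkill default)


@dataclass
class Rating:
    """Gaussian belief about one video's position on the continuum."""
    mu: float = 25.0
    sigma: float = 25.0 / 3.0


def _pdf(x: float) -> float:
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def _cdf(x: float) -> float:
    """Standard normal cumulative distribution."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def update_pair(winner: Rating, loser: Rating, beta: float = BETA):
    """Two-player, no-draw TrueSkill update; returns new (winner, loser)."""
    c2 = 2.0 * beta ** 2 + winner.sigma ** 2 + loser.sigma ** 2
    c = math.sqrt(c2)
    t = (winner.mu - loser.mu) / c
    v = _pdf(t) / _cdf(t)   # size of the mean shift
    w = v * (v + t)         # size of the variance reduction, in (0, 1)
    new_winner = Rating(
        winner.mu + winner.sigma ** 2 / c * v,
        winner.sigma * math.sqrt(1.0 - winner.sigma ** 2 / c2 * w),
    )
    new_loser = Rating(
        loser.mu - loser.sigma ** 2 / c * v,
        loser.sigma * math.sqrt(1.0 - loser.sigma ** 2 / c2 * w),
    )
    return new_winner, new_loser


def update_from_sorted_set(ratings: dict, ordered_ids: list) -> None:
    """Apply one sorted set (highest first) as a chain of adjacent pairs.

    Assumption: this decomposition stands in for the joint multi-item
    update that the full TrueSkill algorithm would perform.
    """
    for higher, lower in zip(ordered_ids, ordered_ids[1:]):
        ratings[higher], ratings[lower] = update_pair(ratings[higher], ratings[lower])


if __name__ == "__main__":
    # Hypothetical video ids; a rater sorted them from most to least severe.
    demo = {vid: Rating() for vid in ["clip_a", "clip_b", "clip_c", "clip_d"]}
    update_from_sorted_set(demo, ["clip_c", "clip_a", "clip_d", "clip_b"])
    for vid, r in sorted(demo.items(), key=lambda kv: -kv[1].mu):
        print(f"{vid}: mu={r.mu:.2f}, sigma={r.sigma:.2f}")
```

In this sketch, one sorted set of k videos yields k - 1 updates from a single rater action, and the resulting `mu` values could serve as the real-number labels, with `sigma` indicating how uncertain each label still is.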

Supplementary Material

suppl.mov (pn918.mp4)
Supplemental video




    Published In

    CHI '16: Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems
    May 2016
    6108 pages
    ISBN:9781450333627
    DOI:10.1145/2858036

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. computer vision
    2. continuum labels
    3. health
    4. interactive machine learning
    5. machine learning
    6. setwise comparison
    7. video media

    Qualifiers

    • Research-article

    Conference

    CHI'16: CHI Conference on Human Factors in Computing Systems
    May 7 - 12, 2016
    San Jose, California, USA

    Acceptance Rates

    CHI '16 Paper Acceptance Rate 565 of 2,435 submissions, 23%;
    Overall Acceptance Rate 6,199 of 26,314 submissions, 24%



    Article Metrics

    • Downloads (last 12 months): 22
    • Downloads (last 6 weeks): 1
    Reflects downloads up to 24 Dec 2024


    Cited By

    • (2023) Assessment of Multiple Aspects of Upper Extremity Function Independent From Ambulation in Patients With Multiple Sclerosis. International Journal of MS Care 25:5 (226-232). DOI: 10.7224/1537-2073.2021-069. Online publication date: 14-Sep-2023
    • (2021) Improving Funding Operations of Equity‐based Crowdfunding Platforms. Production and Operations Management 30:11 (4121-4139). DOI: 10.1111/poms.13505. Online publication date: 1-Nov-2021
    • (2021) A Survey of Human‐Centered Evaluations in Human‐Centered Machine Learning. Computer Graphics Forum 40:3 (543-568). DOI: 10.1111/cgf.14329. Online publication date: 29-Jun-2021
    • (2021) User‐guided global explanations for deep image recognition. Applied AI Letters 2:4. DOI: 10.1002/ail2.42. Online publication date: 6-Dec-2021
    • (2019) Using rating arrays to estimate score distributions for player-versus-level matchmaking. Proceedings of the 14th International Conference on the Foundations of Digital Games (1-8). DOI: 10.1145/3337722.3337758. Online publication date: 26-Aug-2019
    • (2019) Setwise comparison: efficient fine-grained rating of movement videos using algorithmic support – a proof of concept study. Disability and Rehabilitation (1-7). DOI: 10.1080/09638288.2018.1563832. Online publication date: 20-Feb-2019
    • (2018) Tasks of activities of daily living (ADL) are more valuable than the classical neurological examination to assess upper extremity function and mobility in multiple sclerosis. Multiple Sclerosis Journal 25:12 (1673-1681). DOI: 10.1177/1352458518796690. Online publication date: 31-Aug-2018
    • (2018) A Review of User Interface Design for Interactive Machine Learning. ACM Transactions on Interactive Intelligent Systems 8:2 (1-37). DOI: 10.1145/3185517. Online publication date: 13-Jun-2018
    • (2018) Visualizing Ubiquitously Sensed Measures of Motor Ability in Multiple Sclerosis. ACM Transactions on Interactive Intelligent Systems 8:2 (1-28). DOI: 10.1145/3181670. Online publication date: 14-Jul-2018
    • (2018) EASEL. Proceedings of the 23rd International Conference on Intelligent User Interfaces (595-599). DOI: 10.1145/3172944.3173003. Online publication date: 5-Mar-2018
