DOI: 10.1145/1660877.1660901

Evaluation of an integrated multi-task machine learning system with humans in the loop

Published: 28 August 2007

Abstract

The performance of RADAR, a cognitive personal assistant that combines multiple machine learning components, natural language processing, and optimization, was examined with a test developed explicitly to measure the impact of integrated machine learning when used by a human in a real-world setting. Three conditions (conventional tools, RADAR without learning, and RADAR with learning) were evaluated in a large-scale, between-subjects study. The study revealed that integrated machine learning does produce a positive impact on overall performance. This paper also discusses how specific machine learning components contributed to human-system performance.

Published In

PerMIS '07: Proceedings of the 2007 Workshop on Performance Metrics for Intelligent Systems
August 2007
293 pages
ISBN: 9781595938541
DOI: 10.1145/1660877
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Sponsors

  • NIST: National Institute of Standards and Technology

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. evaluation
  2. intelligent systems
  3. machine learning
  4. mixed-initiative assistants

Qualifiers

  • Research-article

Conference

PerMIS07
Sponsor:
  • NIST

Cited By

  • (2018) Extending Knowledge Graphs with Subjective Influence Networks for Personalized Fashion. Designing Cognitive Cities, pp. 203-233. DOI: 10.1007/978-3-030-00317-3_9. Online publication date: 19-Sep-2018
  • (2017) Evaluating intelligent knowledge systems. Knowledge and Information Systems 52:2, pp. 379-409. DOI: 10.1007/s10115-016-1011-3. Online publication date: 1-Aug-2017
  • (2012) Detection of imperative and declarative question-answer pairs in email conversations. AI Communications 25:4, pp. 271-283. DOI: 10.5555/2594622.2594623. Online publication date: 1-Oct-2012
  • (2011) PTIME. ACM Transactions on Intelligent Systems and Technology 2:4, pp. 1-22. DOI: 10.1145/1989734.1989744. Online publication date: 15-Jul-2011
  • (2010) Software engineering in an uncertain world. Proceedings of the FSE/SDP workshop on Future of software engineering research, pp. 125-128. DOI: 10.1145/1882362.1882389. Online publication date: 7-Nov-2010
  • (2010) Agent-assisted task management that reduces email overload. Proceedings of the 15th international conference on Intelligent user interfaces, pp. 61-70. DOI: 10.1145/1719970.1719980. Online publication date: 7-Feb-2010
  • (2008) A demonstration of the RADAR personal assistant. Proceedings of the 23rd national conference on Artificial intelligence - Volume 3, pp. 1876-1877. DOI: 10.5555/1620270.1620397. Online publication date: 13-Jul-2008
  • (2008) RADAR. Proceedings of the 23rd national conference on Artificial intelligence - Volume 3, pp. 1287-1293. DOI: 10.5555/1620270.1620274. Online publication date: 13-Jul-2008
  • (2008) What to do when search fails. Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pp. 999-1008. DOI: 10.1145/1357054.1357208. Online publication date: 6-Apr-2008
  • (2007) Survey measures for evaluation of cognitive assistants. Proceedings of the 2007 Workshop on Performance Metrics for Intelligent Systems, pp. 175-179. DOI: 10.1145/1660877.1660902. Online publication date: 28-Aug-2007
