research-article

Evaluation of an integrated multi-task machine learning system with humans in the loop

Authors:

Pablo-Alejandro Quinones,

Mark Drummond

PerMIS '07: Proceedings of the 2007 Workshop on Performance Metrics for Intelligent Systems

Pages 168 - 174

https://doi.org/10.1145/1660877.1660901

Published: 28 August 2007

Abstract

The performance of RADAR, a cognitive personal assistant that combines multiple machine learning components, natural language processing, and optimization, was examined with a test explicitly developed to measure the impact of integrated machine learning when used by a human user in a real-world setting. Three conditions (conventional tools, RADAR without learning, and RADAR with learning) were evaluated in a large-scale, between-subjects study. The study revealed that integrated machine learning does produce a positive impact on overall performance. This paper also discusses how specific machine learning components contributed to human-system performance.

Published In

PerMIS '07: Proceedings of the 2007 Workshop on Performance Metrics for Intelligent Systems

August 2007

293 pages

ISBN:9781595938541

DOI:10.1145/1660877

General Chair: Elena Messina, Intelligent Systems Division, NIST
Program Chair: Raj Madhavan, Oak Ridge National Laboratory/NIST

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

In-Cooperation

SIGAI: ACM Special Interest Group on Artificial Intelligence

Publisher

Association for Computing Machinery

New York, NY, United States

Funding Sources

Defense Advanced Research Projects Agency

Conference

PerMIS07

Sponsor:

NIST

PerMIS07: Performance Metrics for Intelligent Systems 2007

August 28 - 30, 2007

Washington, D.C.

Cited By

Bollacker K, Díaz-Rodríguez N, Li X (2018). Extending Knowledge Graphs with Subjective Influence Networks for Personalized Fashion. Designing Cognitive Cities, 203-233. Online publication date: 19-Sep-2018.
https://doi.org/10.1007/978-3-030-00317-3_9
Berry P, Donneau-Golencer T, Duong K, Gervasio M, Peintner B, Yorke-Smith N (2017). Evaluating intelligent knowledge systems. Knowledge and Information Systems 52:2, 379-409. Online publication date: 1-Aug-2017.
https://dl.acm.org/doi/10.1007/s10115-016-1011-3
Kwong H, Yorke-Smith N (2012). Detection of imperative and declarative question-answer pairs in email conversations. AI Communications 25:4, 271-283. Online publication date: 1-Oct-2012.
https://dl.acm.org/doi/10.5555/2594622.2594623
Berry P, Gervasio M, Peintner B, Yorke-Smith N (2011). PTIME. ACM Transactions on Intelligent Systems and Technology 2:4, 1-22. Online publication date: 15-Jul-2011.
https://dl.acm.org/doi/10.1145/1989734.1989744
Garlan D, Roman G, Sullivan K (2010). Software engineering in an uncertain world. Proceedings of the FSE/SDP workshop on Future of software engineering research, 125-128. Online publication date: 7-Nov-2010.
https://dl.acm.org/doi/10.1145/1882362.1882389
Faulring A, Myers B, Mohnkern K, Schmerl B, Steinfeld A, Zimmerman J, Smailagic A, Hansen J, Siewiorek D, Rich C, Yang Q, Cavazza M, Zhou M (2010). Agent-assisted task management that reduces email overload. Proceedings of the 15th international conference on Intelligent user interfaces, 61-70. Online publication date: 7-Feb-2010.
https://dl.acm.org/doi/10.1145/1719970.1719980
Faulring A, Myers B, Mohnkern K, Freed M (2008). A demonstration of the RADAR personal assistant. Proceedings of the 23rd national conference on Artificial intelligence - Volume 3, 1876-1877. Online publication date: 13-Jul-2008.
https://dl.acm.org/doi/10.5555/1620270.1620397
Freed M, Carbonell J, Gordon G, Hayes J, Myers B, Siewiorek D, Smith S, Steinfeld A, Tomasic A (2008). RADAR. Proceedings of the 23rd national conference on Artificial intelligence - Volume 3, 1287-1293. Online publication date: 13-Jul-2008.
https://dl.acm.org/doi/10.5555/1620270.1620274
Chau D, Myers B, Faulring A, Czerwinski M, Lund A, Tan D (2008). What to do when search fails. Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, 999-1008. Online publication date: 6-Apr-2008.
https://dl.acm.org/doi/10.1145/1357054.1357208
Steinfeld A, Quinones P, Zimmerman J, Bennett S, Siewiorek D, Messina E, Madhavan R (2007). Survey measures for evaluation of cognitive assistants. Proceedings of the 2007 Workshop on Performance Metrics for Intelligent Systems, 175-179. Online publication date: 28-Aug-2007.
https://dl.acm.org/doi/10.1145/1660877.1660902
