
Instance spaces for machine learning classification

Authors:

Mario A. Muñoz,

Laura Villanova,

Davaatseren Baatar,

Kate Smith-Miles

Machine Learning, Volume 107, Issue 1

Pages 109 - 147

https://doi.org/10.1007/s10994-017-5629-5

Published: 01 January 2018

Abstract

This paper tackles the issue of objective performance evaluation of machine learning classifiers, and the impact of the choice of test instances. Given that statistical properties or features of a dataset affect the difficulty of an instance for particular classification algorithms, we examine the diversity and quality of the UCI repository of test instances used by most machine learning researchers. We show how an instance space can be visualized, with each classification dataset represented as a point in the space. The instance space is constructed to reveal pockets of hard and easy instances, and enables the strengths and weaknesses of individual classifiers to be identified. Finally, we propose a methodology to generate new test instances with the aim of enriching the diversity of the instance space, enabling potentially greater insights than can be afforded by the current UCI repository.
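
To make the idea of "each classification dataset represented as a point in the space" concrete, here is a minimal sketch. It assumes each dataset has already been summarized by a few numeric meta-features, and it uses plain PCA as a stand-in projection; the abstract does not specify the projection, so this is not the authors' method. The feature values, error rates, the 0.25 hardness threshold, and the function name `project_to_instance_space` are all hypothetical.

```python
import numpy as np

def project_to_instance_space(features):
    """Project an (n_datasets, n_features) meta-feature matrix to 2-D.

    PCA is an assumed stand-in for the projection, not the paper's method.
    """
    X = np.asarray(features, dtype=float)
    # Standardize each meta-feature so no single scale dominates.
    std = X.std(axis=0)
    std[std == 0] = 1.0
    Z = (X - X.mean(axis=0)) / std
    # PCA via SVD: the rows of Vt are the principal directions.
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    return Z @ Vt[:2].T

# Hypothetical meta-features for five datasets:
# number of instances, number of attributes, class entropy.
meta_features = np.array([
    [150.0, 4.0, 1.58],
    [1000.0, 20.0, 0.95],
    [300.0, 60.0, 1.00],
    [5000.0, 10.0, 0.40],
    [200.0, 100.0, 2.30],
])

# Hypothetical error rate of one classifier on each dataset.
error_rate = np.array([0.04, 0.12, 0.31, 0.08, 0.45])

coords = project_to_instance_space(meta_features)
hard = error_rate > 0.25  # assumed threshold for a "hard" dataset

for (x, y), is_hard in zip(coords, hard):
    print(f"({x:+.2f}, {y:+.2f})  {'hard' if is_hard else 'easy'}")
```

Plotting `coords` colored by `hard` would give a rough picture of where a classifier struggles; sparsely populated regions of the plot are the kind of gap that newly generated test instances could fill.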

Cited By

Neelofar N, Aleti A (2024). Identifying and Explaining Safety-critical Scenarios for Autonomous Vehicles via Key Features. ACM Transactions on Software Engineering and Methodology 33:4 (1-32). DOI: 10.1145/3640335. Online publication date: 11-Jan-2024.
https://dl.acm.org/doi/10.1145/3640335
Valeriano M, Pereira J, Veiga Kiffer C, Lorena A, Li X, Handl J (2024). Explaining instances in the health domain based on the exploration of a dataset's hardness embedding. Proceedings of the Genetic and Evolutionary Computation Conference Companion (1598-1606). DOI: 10.1145/3638530.3664113. Online publication date: 14-Jul-2024.
https://dl.acm.org/doi/10.1145/3638530.3664113
Rasulo A, Smith-Miles K, Muñoz M, Handl J, López-Ibáñez M, Li X, Handl J (2024). Extending Instance Space Analysis to Algorithm Configuration Spaces. Proceedings of the Genetic and Evolutionary Computation Conference Companion (147-150). DOI: 10.1145/3638530.3654264. Online publication date: 14-Jul-2024.
https://dl.acm.org/doi/10.1145/3638530.3654264

Index Terms

Instance spaces for machine learning classification
1. Computing methodologies
  1. Machine learning
    1. Learning paradigms
      1. Supervised learning
        Supervised learning by classification
    2. Machine learning approaches
      1. Classification and regression trees

Index terms have been assigned to the content through auto-classification.

Recommendations

Generating new test instances by evolving in instance space

Our confidence in the future performance of any algorithm, including optimization algorithms, depends on how carefully we select test instances so that the generalization of algorithm performance on future instances can be inferred. In recent work, we ...

Instance space analysis for a personnel scheduling problem

This paper considers the Rotating Workforce Scheduling Problem, and shows how the strengths and weaknesses of various solution methods can be understood by the in-depth evaluation offered by a recently developed methodology known as Instance Space ...

Instance Space Analysis for Algorithm Testing: Methodology and Software Tools

Instance Space Analysis (ISA) is a recently developed methodology to (a) support objective testing of algorithms and (b) assess the diversity of test instances. Representing test instances as feature vectors, the ISA methodology extends Rice’s 1976 ...

Information & Contributors

Information

Published In

Machine Learning, Volume 107, Issue 1

January 2018

307 pages

ISSN:0885-6125

Publisher

Kluwer Academic Publishers

United States

Publication History

Published: 01 January 2018

Qualifiers

Article

Bibliometrics & Citations

Bibliometrics

Article Metrics

Total Citations: 46
Total Downloads: 0
Downloads (Last 12 months): 0
Downloads (Last 6 weeks): 0

Reflects downloads up to 13 Aug 2024

Citations

Cited By

Lorena A, Paiva P, Prudêncio R (2024). Trusting My Predictions: On the Value of Instance-Level Analysis. ACM Computing Surveys 56:7 (1-28). DOI: 10.1145/3615354. Online publication date: 9-Apr-2024.
https://dl.acm.org/doi/10.1145/3615354
Neelofar N, Aleti A, Roychoudhury A, Paiva A, Abreu R, Storey M (2024). Towards Reliable AI: Adequacy Metrics for Ensuring the Quality of System-level Testing of Autonomous Vehicles. Proceedings of the IEEE/ACM 46th International Conference on Software Engineering (1-12). DOI: 10.1145/3597503.3623314. Online publication date: 20-May-2024.
https://dl.acm.org/doi/10.1145/3597503.3623314
Alipour H, Muñoz M, Smith-Miles K (2024). On the impact of initialisation strategies on Maximum Flow algorithm performance. Computers and Operations Research 163:C. DOI: 10.1016/j.cor.2023.106492. Online publication date: 1-Mar-2024.
https://dl.acm.org/doi/10.1016/j.cor.2023.106492
Scherer M, Hill R, Lunday B, Cox B, White E (2024). Verifying new instances of the multidemand multidimensional knapsack problem with instance space analysis. Computers and Operations Research 162:C. DOI: 10.1016/j.cor.2023.106477. Online publication date: 4-Mar-2024.
https://dl.acm.org/doi/10.1016/j.cor.2023.106477
Pereira J, Smith-Miles K, Muñoz M, Lorena A (2024). Optimal selection of benchmarking datasets for unbiased machine learning algorithm evaluation. Data Mining and Knowledge Discovery 38:2 (461-500). DOI: 10.1007/s10618-023-00957-1. Online publication date: 1-Mar-2024.
https://dl.acm.org/doi/10.1007/s10618-023-00957-1
Neelofar N, Smith-Miles K, Muñoz M, Aleti A (2023). Instance Space Analysis of Search-Based Software Testing. IEEE Transactions on Software Engineering 49:4 (2642-2660). DOI: 10.1109/TSE.2022.3228334. Online publication date: 1-Apr-2023.
https://dl.acm.org/doi/10.1109/TSE.2022.3228334
Santos M, Abreu P, Japkowicz N, Fernández A, Santos J (2023). A unifying view of class overlap and imbalance. Information Fusion 89:C (228-253). DOI: 10.1016/j.inffus.2022.08.017. Online publication date: 1-Jan-2023.
https://dl.acm.org/doi/10.1016/j.inffus.2022.08.017
