
Instance spaces for machine learning classification

Published: 01 January 2018
  • Abstract

    This paper tackles the issue of objective performance evaluation of machine learning classifiers, and the impact of the choice of test instances. Given that statistical properties or features of a dataset affect the difficulty of an instance for particular classification algorithms, we examine the diversity and quality of the UCI repository of test instances used by most machine learning researchers. We show how an instance space can be visualized, with each classification dataset represented as a point in the space. The instance space is constructed to reveal pockets of hard and easy instances, and enables the strengths and weaknesses of individual classifiers to be identified. Finally, we propose a methodology to generate new test instances with the aim of enriching the diversity of the instance space, enabling potentially greater insights than can be afforded by the current UCI repository.
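
    As a rough illustration of the idea of placing each dataset as a point in an instance space, the sketch below summarizes a few synthetic classification datasets with meta-features and projects them to two dimensions. It is not the authors' method: the four meta-features, the helper names `meta_features` and `project_2d`, the synthetic data, and the use of plain PCA as the projection are all assumptions made for this example.

```python
import numpy as np


def meta_features(X, y):
    """Hypothetical meta-features for one classification dataset (assumed, not from the paper)."""
    n, d = X.shape
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()            # observed class proportions
    return np.array([
        np.log(n),                       # log number of instances
        np.log(d),                       # log number of attributes
        -(p * np.log(p)).sum(),          # class entropy
        p.min(),                         # share of the rarest class
    ])


def project_2d(F):
    """Standardize the meta-feature matrix and project it with PCA (a stand-in projection)."""
    Z = (F - F.mean(axis=0)) / (F.std(axis=0) + 1e-12)
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    return Z @ Vt[:2].T                  # coordinates on the first two principal axes


# Synthetic stand-ins for benchmark datasets: (instances, attributes, minority share).
specs = [(50, 3, 0.5), (200, 10, 0.3), (1000, 5, 0.1),
         (80, 20, 0.45), (500, 2, 0.2), (300, 8, 0.05)]

rng = np.random.default_rng(0)
datasets = []
for n, d, minority in specs:
    X = rng.normal(size=(n, d))
    y = (rng.random(n) < minority).astype(int)
    datasets.append((X, y))

F = np.vstack([meta_features(X, y) for X, y in datasets])
coords = project_2d(F)                   # one 2-D point per dataset

for (n, d, minority), (z1, z2) in zip(specs, coords):
    print(f"n={n:5d} d={d:3d} minority={minority:.2f} -> ({z1:+.2f}, {z2:+.2f})")
```

    Each row of `coords` is one dataset's position in the 2-D space. Coloring those points by a classifier's measured accuracy on each dataset would be one way to look for the regions of easy and hard instances the abstract describes.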

        Published In

        Machine Learning, Volume 107, Issue 1
        January 2018
        307 pages

        Publisher

        Kluwer Academic Publishers

        United States

        Author Tags

        1. Algorithm footprints
        2. Classification
        3. Instance difficulty
        4. Instance space
        5. Meta-learning
        6. Performance evaluation
        7. Test data
        8. Test instance generation

        Cited By

        • (2024) Identifying and Explaining Safety-critical Scenarios for Autonomous Vehicles via Key Features. ACM Transactions on Software Engineering and Methodology, 33:4 (1-32). doi:10.1145/3640335. Online publication date: 11-Jan-2024
        • (2024) Explaining instances in the health domain based on the exploration of a dataset's hardness embedding. Proceedings of the Genetic and Evolutionary Computation Conference Companion (1598-1606). doi:10.1145/3638530.3664113. Online publication date: 14-Jul-2024
        • (2024) Extending Instance Space Analysis to Algorithm Configuration Spaces. Proceedings of the Genetic and Evolutionary Computation Conference Companion (147-150). doi:10.1145/3638530.3654264. Online publication date: 14-Jul-2024
        • (2024) Trusting My Predictions: On the Value of Instance-Level Analysis. ACM Computing Surveys, 56:7 (1-28). doi:10.1145/3615354. Online publication date: 9-Apr-2024
        • (2024) Towards Reliable AI: Adequacy Metrics for Ensuring the Quality of System-level Testing of Autonomous Vehicles. Proceedings of the IEEE/ACM 46th International Conference on Software Engineering (1-12). doi:10.1145/3597503.3623314. Online publication date: 20-May-2024
        • (2024) On the impact of initialisation strategies on Maximum Flow algorithm performance. Computers and Operations Research, 163:C. doi:10.1016/j.cor.2023.106492. Online publication date: 1-Mar-2024
        • (2024) Verifying new instances of the multidemand multidimensional knapsack problem with instance space analysis. Computers and Operations Research, 162:C. doi:10.1016/j.cor.2023.106477. Online publication date: 4-Mar-2024
        • (2024) Optimal selection of benchmarking datasets for unbiased machine learning algorithm evaluation. Data Mining and Knowledge Discovery, 38:2 (461-500). doi:10.1007/s10618-023-00957-1. Online publication date: 1-Mar-2024
        • (2023) Instance Space Analysis of Search-Based Software Testing. IEEE Transactions on Software Engineering, 49:4 (2642-2660). doi:10.1109/TSE.2022.3228334. Online publication date: 1-Apr-2023
        • (2023) A unifying view of class overlap and imbalance. Information Fusion, 89:C (228-253). doi:10.1016/j.inffus.2022.08.017. Online publication date: 1-Jan-2023
