Beyond Homemade Artificial Data Sets

Macià, Núria; Orriols-Puig, Albert; Bernadó-Mansilla, Ester

doi:10.1007/978-3-642-02319-4_73

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5572)

Included in the following conference series:

International Conference on Hybrid Artificial Intelligence Systems

Abstract

One of the most important challenges in supervised learning is how to evaluate the quality of the models evolved by different machine learning techniques. Up to now, we have relied on measures obtained by running the methods on a wide test bed composed of real-world problems. Nevertheless, the unknown inherent characteristics of these problems and the bias of learners may lead to inconclusive results. This paper discusses the need to work under a controlled scenario and argues for artificial data set generation. It provides a list of ingredients and some ideas about how to guide such generation, and presents promising results of an evolutionary multi-objective approach that incorporates data complexity estimates.
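
The abstract does not say which data complexity estimates the approach uses. As a purely illustrative sketch, the Python below computes one well-known estimate from the data complexity literature, the maximum Fisher's discriminant ratio, for a two-class data set. The function name `fisher_ratio` and the toy data are assumptions made for this example; they are not taken from the paper, and this is not the authors' implementation.

```python
from statistics import mean, pvariance


def fisher_ratio(X, y):
    """Maximum Fisher's discriminant ratio over all features (two classes only).

    For each feature the ratio is (mean_a - mean_b)^2 / (var_a + var_b).
    Higher values suggest the classes are easier to separate on that feature.
    """
    classes = sorted(set(y))
    if len(classes) != 2:
        raise ValueError("this sketch handles exactly two classes")

    # Split the rows by class label.
    rows_a = [row for row, label in zip(X, y) if label == classes[0]]
    rows_b = [row for row, label in zip(X, y) if label == classes[1]]

    best = 0.0
    for j in range(len(X[0])):
        col_a = [row[j] for row in rows_a]
        col_b = [row[j] for row in rows_b]
        gap = (mean(col_a) - mean(col_b)) ** 2
        spread = pvariance(col_a) + pvariance(col_b)
        if spread == 0:
            # No within-class variance: perfectly separable if the means differ.
            ratio = float("inf") if gap > 0 else 0.0
        else:
            ratio = gap / spread
        best = max(best, ratio)
    return best


# Hypothetical toy data: feature 0 separates the classes, feature 1 does not.
X = [[0.0, 5.0], [2.0, 6.0], [10.0, 5.0], [12.0, 6.0]]
y = [0, 0, 1, 1]
print(fisher_ratio(X, y))
```

A generator working under a controlled scenario could, in principle, treat an estimate like this as one of several objectives to steer toward target values. How the paper actually combines its estimates is not stated in the abstract.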

Author information

Authors and Affiliations

Grup de Recerca en Sistemes Intel·ligents La Salle, Universitat Ramon Llull, C/ Quatre Camins 2, 08022, Barcelona, Spain
Núria Macià, Albert Orriols-Puig & Ester Bernadó-Mansilla

Editor information

Editors and Affiliations

Grupo de Investigación GICAP, Área de Lenguajes Higher Polytechnic School, Universidad de Burgos, Burgos, Spain
Emilio Corchado
Department of Computer Science, University of Vermont, 33 Colchester Avenue, Burlington, VT, USA
Xindong Wu
Computer and Information Science, Helsinki University of Technology, P.O. Box 5400, 02015 HUT, Finland
Erkki Oja
Grupo de Investigación GICAP, Área de Lenguajes y Sistemas Informáticos, Departamento de Ingeniería Civil, Escuela Politécnica Superior, Universidad de Burgos, Campus Vena, Francisco de Vitoria, 09006, Burgos, Spain
Álvaro Herrero & Bruno Baruque

About this paper

Cite this paper

Macià, N., Orriols-Puig, A., Bernadó-Mansilla, E. (2009). Beyond Homemade Artificial Data Sets. In: Corchado, E., Wu, X., Oja, E., Herrero, Á., Baruque, B. (eds) Hybrid Artificial Intelligence Systems. HAIS 2009. Lecture Notes in Computer Science, vol 5572. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-02319-4_73

DOI: https://doi.org/10.1007/978-3-642-02319-4_73
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-02318-7
Online ISBN: 978-3-642-02319-4
eBook Packages: Computer Science, Computer Science (R0)
