Comparing Shape-Constrained Regression Algorithms for Data Validation

  • Conference paper
  • First Online:
Computer Aided Systems Theory – EUROCAST 2022 (EUROCAST 2022)

Abstract

Industrial and scientific applications handle volumes of data too large for manual validation by humans. We therefore require automated data validation approaches that can take the prior knowledge of domain experts into account to produce dependable, trustworthy assessments of data quality. Prior knowledge is often available as rules that describe how inputs interact with regard to the target, e.g., the target must be monotonically decreasing and convex over increasing input values. Domain experts can validate multiple such interactions at a glance. However, existing rule-based data validation approaches are unable to consider these constraints. In this work, we compare different shape-constrained regression algorithms for the purpose of data validation based on their classification accuracy and runtime performance.
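
To make the idea of validating data against a shape constraint concrete, the following is a minimal sketch and not one of the algorithms compared in the paper. It swaps in isotonic regression (pool-adjacent-violators) as a simple stand-in for a shape-constrained regression model and covers only the "monotonically decreasing" constraint, not convexity. The function names `fit_decreasing`, `rmse`, and `is_valid`, as well as the error threshold, are hypothetical choices made for illustration.

```python
def fit_decreasing(y):
    """Least-squares fit of a non-increasing sequence to y.

    Assumes y is ordered by increasing input value. Uses the
    pool-adjacent-violators scheme: neighbouring blocks are merged
    whenever their means violate the required order.
    """
    blocks = []  # each block is [sum of values, number of values]
    for value in y:
        blocks.append([value, 1])
        # A later block must not have a larger mean than the block before it.
        while (len(blocks) > 1
               and blocks[-1][0] / blocks[-1][1] > blocks[-2][0] / blocks[-2][1]):
            s, n = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += n
    fitted = []
    for s, n in blocks:
        fitted.extend([s / n] * n)
    return fitted


def rmse(y, fitted):
    """Root mean squared error between observations and the constrained fit."""
    return (sum((a - b) ** 2 for a, b in zip(y, fitted)) / len(y)) ** 0.5


def is_valid(y, threshold):
    """Flag data as valid if a constraint-satisfying model fits it well.

    The threshold is a hypothetical, application-specific tolerance.
    """
    return rmse(y, fit_decreasing(y)) <= threshold


y_good = [5.0, 4.0, 3.5, 3.0, 2.9]  # already decreasing
y_bad = [5.0, 4.0, 6.0, 3.0, 2.9]   # one sample violates the constraint

print(is_valid(y_good, threshold=0.1))
print(is_valid(y_bad, threshold=0.1))
```

The design choice here is that the constraint is enforced by the model rather than checked pointwise on the data: data that agrees with the expert rule can be fitted with a small error, while data that contradicts it cannot, so the fit error serves as the validation signal.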

Notes

  1. https://www.mosek.com.

  2. https://www.miba.com.

Acknowledgement

The financial support by the Christian Doppler Research Association, the Austrian Federal Ministry for Digital and Economic Affairs and the National Foundation for Research, Technology and Development is gratefully acknowledged.

Author information

Corresponding author

Correspondence to Florian Bachinger.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Bachinger, F., Kronberger, G. (2022). Comparing Shape-Constrained Regression Algorithms for Data Validation. In: Moreno-Díaz, R., Pichler, F., Quesada-Arencibia, A. (eds) Computer Aided Systems Theory – EUROCAST 2022. EUROCAST 2022. Lecture Notes in Computer Science, vol 13789. Springer, Cham. https://doi.org/10.1007/978-3-031-25312-6_17

  • DOI: https://doi.org/10.1007/978-3-031-25312-6_17

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-25311-9

  • Online ISBN: 978-3-031-25312-6

  • eBook Packages: Computer Science, Computer Science (R0)
