
Machine Learning

Validating your Machine Learning Model

Going beyond k-Fold Cross-Validation

Maarten Grootendorst
Towards Data Science
8 min read · Sep 26, 2019


Photo by NeONBRAND on Unsplash

I believe that one of the most underrated aspects of creating a Machine Learning model is thorough validation. Using proper validation techniques helps you understand your model and, most importantly, obtain an unbiased estimate of its generalization performance.

There is no single validation method that works in all scenarios. It is important to understand whether you are dealing with groups or time-indexed data, and whether your validation procedure is leaking information.

Which validation method is right for my use case?

When researching these aspects, I found plenty of articles describing evaluation techniques, but their coverage of validation techniques typically stops at k-Fold cross-validation.

I would like to show you the world beyond k-Fold CV, going one step further into Nested CV and LOOCV, as well as into model selection techniques.

The following methods for validation will be demonstrated:

  • Train/test split
  • k-Fold Cross-Validation
  • Leave-one-out Cross-Validation
  • Leave-one-group-out Cross-Validation
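To make these four strategies concrete before diving into each one, here is a minimal scikit-learn sketch (not taken from the article itself); the iris dataset, the logistic regression model, and the synthetic group labels are assumptions chosen purely for illustration.

import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import (train_test_split, cross_val_score,
                                     KFold, LeaveOneOut, LeaveOneGroupOut)

X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000)

# 1. Train/test split: hold out a fixed portion of the data for evaluation
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2,
                                                    random_state=42)
print("Train/test accuracy:", model.fit(X_train, y_train).score(X_test, y_test))

# 2. k-Fold Cross-Validation: average performance over k train/validation splits
kfold_scores = cross_val_score(model, X, y,
                               cv=KFold(n_splits=5, shuffle=True, random_state=42))
print("5-Fold CV accuracy:", kfold_scores.mean())

# 3. Leave-one-out Cross-Validation: each sample is the validation set exactly once
loo_scores = cross_val_score(model, X, y, cv=LeaveOneOut())
print("LOOCV accuracy:", loo_scores.mean())

# 4. Leave-one-group-out Cross-Validation: entire groups are held out together
#    (group labels here are synthetic, purely for illustration)
groups = np.random.default_rng(42).integers(0, 5, size=len(y))
logo_scores = cross_val_score(model, X, y, groups=groups, cv=LeaveOneGroupOut())
print("Leave-one-group-out accuracy:", logo_scores.mean())

The sections that follow look at when each of these splitting strategies is appropriate.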
