Training data quality is a measure of how well a data set serves its purpose in a given ML use case. To improve quality, you must first measure it accurately. As Pearson’s Law states, “When performance is measured, performance improves.” This guide walks you through four core metrics for measuring your data labeling so that you can improve its quality. Because the goal of training data is to represent the answers a model needs to predict, we use the same metrics that are used to evaluate model performance.
Use these four core quality metrics to evaluate a labeled data set: