What is an overfitted model? How do I check if the model is overfitting? How do I solve this overfitting problem?
Let’s say we want to predict if a student will land a job interview based on her resume.
Now, assume we train a model from a dataset of 10,000 resumes and their outcomes.
Next, we try the model out on the original dataset, and it predicts outcomes with 99% accuracy
But now comes the bad news.
When we run the model on a new (“unseen”) dataset of resumes, we only get 50% accuracy
Our model doesn’t generalize well from our training data to unseen data.
This is known as overfitting
If our model does much better on the training set than on the test set, then we’re likely overfitting.
Steps to prevent:
2.Train with more data