What is balance and imbalance data? Is it applicable to the regression Target variable as well? I assume balance data is measured on TV (Target variable only)
Class imbalance is mainly encountered in classification problems. When one class of the target variable is in a very high proportion as compared to the other class, we refer to this phenomenon as a class imbalance. For example, in the customer churn analysis dataset, we may have up to 95% of the data related to non-churn customers but only 5% of the data related to churned customers. In Machine Learning problem where predicting the churn is our goal, this kind of data makes difficult for the model to learn patterns about churn customers. We overcome this challenge using oversampling, undersampling or SMOTE technique.