While doing pre-profiling on a dataset (the Movies data from your GitHub), I received warnings about high cardinality in categorical columns. How should I prepare charts for this kind of data? One option I see is to group values to reduce the number of categories, but looking at the data, that seems difficult to me.
If a column has very high cardinality, meaning that almost every row holds a different value, it may not provide any useful information about the target variable.
If that is the case, we can drop that feature/column from the analysis and focus on the other features.
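One way to decide which columns to drop is to compare the number of unique values with the number of rows. The snippet below is only a rough sketch: the sample data, the column names, the `high_cardinality_columns` helper and the 0.5 threshold are all assumptions made for illustration, not part of the Movies dataset or of any profiling library.

```python
import pandas as pd

# Hypothetical sample standing in for the movies data (column names are assumed).
df = pd.DataFrame({
    "title": ["A", "B", "C", "D", "E", "F"],
    "genre": ["Drama", "Comedy", "Drama", "Action", "Drama", "Comedy"],
    "revenue": [10.0, 20.0, 15.0, 40.0, 12.0, 22.0],
})

def high_cardinality_columns(frame, max_ratio=0.5):
    """Return non-numeric columns whose unique-to-row ratio exceeds max_ratio."""
    cat_cols = frame.select_dtypes(exclude="number").columns
    ratios = frame[cat_cols].nunique() / len(frame)
    return ratios[ratios > max_ratio].index.tolist()

# The 0.5 threshold is an arbitrary assumption; tune it for your data.
to_drop = high_cardinality_columns(df)
reduced = df.drop(columns=to_drop)
```

Here `title` is unique in every row, so it is flagged and dropped, while `genre` repeats often enough to be kept. It is worth checking a flagged column by eye before dropping it, since the ratio alone does not tell you whether the column relates to the target.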
However, if the cardinality is low, or the feature's values can be grouped into bins, we can keep working with that feature, provided it offers some significant insight into the target variable.
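When grouping by hand looks tough, a common shortcut is to keep only the most frequent categories and lump everything else into a single bucket before charting. The following is a minimal sketch of that idea; the `lump_rare` helper, the "Other" label, the `top_n` value and the sample genres are assumptions for illustration only.

```python
import pandas as pd

def lump_rare(series, top_n=2, other_label="Other"):
    """Keep the top_n most frequent values; replace the rest with other_label."""
    top = series.value_counts().nlargest(top_n).index
    return series.where(series.isin(top), other_label)

# Hypothetical genre column; the real data will have many more categories.
genres = pd.Series(
    ["Drama", "Comedy", "Drama", "Action", "Drama", "Comedy", "Horror", "Western"]
)

grouped = lump_rare(genres, top_n=2)
counts = grouped.value_counts()  # category frequencies, ready for a bar chart
```

The resulting `counts` has only three categories (the two most frequent plus "Other"), so a bar chart such as `counts.plot(kind="barh")` stays readable. That call needs matplotlib installed. How many categories to keep is a judgment call that depends on the data.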