If we use one-hot encoding on a categorical field that has a lot of distinct values, we end up with a lot of separate fields for it (each holding 0 or 1). My question is: do we still use one-hot encoding in this situation, or is there another way?
You should use one-hot encoding when the categorical values have no order; otherwise you are good to go with a label encoder. If you one-hot encode an unordered feature, the model will treat the values as unrelated to each other, whereas integer labels can be read as a ranking. For example, consider a feature will_play? with the values yes, no and maybe. These are three different answers with no relationship between them, so one-hot encoding fits. Please go through this link for a clear view of what both methods do.
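
To make the difference concrete, here is a minimal sketch in plain Python applied to the will_play? example. The helper names label_encode and one_hot_encode are hypothetical and written only for illustration; in practice you would probably reach for a library encoder instead, and the sorted category order used here is just an assumption to keep the output deterministic.

```python
def label_encode(values):
    """Map each distinct category to an integer code.

    Categories are sorted alphabetically (an assumption for this sketch),
    so the codes carry an order that may mean nothing for the data.
    """
    categories = sorted(set(values))
    index = {cat: i for i, cat in enumerate(categories)}
    return [index[v] for v in values], categories


def one_hot_encode(values):
    """Turn each value into a 0/1 row with one column per distinct category."""
    categories = sorted(set(values))
    rows = [[1 if v == cat else 0 for cat in categories] for v in values]
    return rows, categories


will_play = ["yes", "no", "maybe", "yes"]

codes, categories = label_encode(will_play)
rows, _ = one_hot_encode(will_play)

print(categories)  # column order used by both encoders
print(codes)       # one integer per value
for row in rows:
    print(row)     # one 0/1 column per distinct category
```

Note that the integer codes suggest maybe < no < yes, which a model may treat as a real ordering, while the one-hot rows avoid that at the cost of one column per distinct value. That column growth is exactly the concern raised in the question for fields with many distinct values.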