There is no clear yes/no answer to this question, but scaling one-hot-encoded or dummy-encoded features is generally not mandatory. The intuition behind why it is not mandatory is as follows:
Let's say you have two one-hot encoded vectors,
A = [0, 1, 0] and B = [1, 0, 0]. You can see that
|A| = √(0*0 + 1*1 + 0*0) and
|B| = √(1*1 + 0*0 + 0*0)
will always equal 1, and the distance between them will be
√(1*1 + 1*1) = √2 ≈ 1.41.
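A quick check in Python (using NumPy purely as an illustration, not part of the original answer) confirms these numbers:

```python
import numpy as np

# Two one-hot encoded vectors from the example above
A = np.array([0, 1, 0])
B = np.array([1, 0, 0])

# Every one-hot vector has magnitude 1
print(np.linalg.norm(A))      # 1.0
print(np.linalg.norm(B))      # 1.0

# The Euclidean distance between any two distinct one-hot vectors is sqrt(2)
print(np.linalg.norm(A - B))  # 1.4142135623730951
```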
So it is clear from this why standardization is not needed: the magnitude of every one-hot encoded vector is 1 and the distance between any two of them is √2, so the variance in these features is not large enough to justify standardizing them. When should you consider standardization? When you have vectors like [1, 1, 1, 0, 1, 1] and [0, 0, 0, 0, 0, 1], where the number of active positions, and therefore the magnitude, varies a lot.
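A minimal sketch along the same lines (the specific vectors are illustrative assumptions, not from the original answer) shows how the magnitudes and distances spread out in this multi-hot case, which is why standardization can become worth considering:

```python
import numpy as np

# Multi-hot style vectors where the number of active positions differs a lot
C = np.array([1, 1, 1, 0, 1, 1])
D = np.array([0, 0, 0, 0, 0, 1])

# Magnitudes now differ noticeably: sqrt(5) ≈ 2.24 vs 1.0
print(np.linalg.norm(C), np.linalg.norm(D))

# Distance is sqrt(4) = 2, larger than the sqrt(2) seen for plain one-hot vectors
print(np.linalg.norm(C - D))
```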