Below is a simple example that encodes the same array with LabelEncoder, OneHotEncoder, and LabelBinarizer.
As far as I can see, OneHotEncoder needs the data to be integer-encoded before it can produce its encoding (this was the case in scikit-learn releases before 0.20), whereas LabelBinarizer works on the raw labels directly.
from numpy import array
from sklearn.preprocessing import LabelEncoder
from sklearn.preprocessing import OneHotEncoder
from sklearn.preprocessing import LabelBinarizer
# define example
data = ['cold', 'cold', 'warm', 'cold', 'hot', 'hot', 'warm', 'cold',
        'warm', 'hot']
values = array(data)
print("Data:", values)
# integer encode: each distinct label is mapped to an integer
label_encoder = LabelEncoder()
integer_encoded = label_encoder.fit_transform(values)
print("Label Encoder:", integer_encoded)
# one-hot encode: the encoder expects a 2-D array of shape (n_samples, 1)
onehot_encoder = OneHotEncoder()
integer_encoded = integer_encoded.reshape(len(integer_encoded), 1)
# the default result is a sparse matrix; .toarray() makes it dense for printing
onehot_encoded = onehot_encoder.fit_transform(integer_encoded).toarray()
print("OneHot Encoder:", onehot_encoded)
# binary encode: works on the string labels directly
lb = LabelBinarizer()
print("Label Binarizer:", lb.fit_transform(values))
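To check that the two routes really produce the same matrix for this data, a minimal sketch along these lines may help. It assumes a Python 3 environment with NumPy and scikit-learn installed; the variable names `binarized` and `same` are only illustrative.

```python
import numpy as np
from sklearn.preprocessing import LabelBinarizer, LabelEncoder, OneHotEncoder

values = np.array(['cold', 'cold', 'warm', 'cold', 'hot',
                   'hot', 'warm', 'cold', 'warm', 'hot'])

# Route 1: integer-encode, then one-hot encode the column of integers.
integer_encoded = LabelEncoder().fit_transform(values).reshape(-1, 1)
onehot_encoded = OneHotEncoder().fit_transform(integer_encoded).toarray()

# Route 2: binarize the string labels directly.
lb = LabelBinarizer()
binarized = lb.fit_transform(values)

# Columns follow the sorted class labels.
print("Classes:", lb.classes_)

# Compare the two results element by element.
same = np.array_equal(onehot_encoded, binarized)
print("Same encoding:", same)
```

The values match here, although OneHotEncoder returns floats and LabelBinarizer returns integers.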
Another good link that explains OneHotEncoder is: Explain onehotencoder using python
There may be other valid differences between the two that more experienced users can explain.
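One such difference that I am aware of shows up when there are only two classes: LabelBinarizer then returns a single 0/1 column, while OneHotEncoder still returns one column per class. The sketch below illustrates this under the same assumptions as above (Python 3 with NumPy and scikit-learn); the `two_class` array is made up for the illustration.

```python
import numpy as np
from sklearn.preprocessing import LabelBinarizer, LabelEncoder, OneHotEncoder

# Hypothetical data with only two distinct labels.
two_class = np.array(['cold', 'hot', 'hot', 'cold'])

# LabelBinarizer collapses the binary case into a single column.
lb_out = LabelBinarizer().fit_transform(two_class)

# OneHotEncoder keeps one column per class (integer-encoded first, as above).
integers = LabelEncoder().fit_transform(two_class).reshape(-1, 1)
ohe_out = OneHotEncoder().fit_transform(integers).toarray()

print("LabelBinarizer shape:", lb_out.shape)
print("OneHotEncoder shape:", ohe_out.shape)
```

This matters mostly when downstream code expects a fixed number of columns per class.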