I'm struggling to get this pipeline to work. I'm working on a text classification problem with two features: one is binary and the other is text (TF-IDF vectorized). I want to oversample one of the classes, so I defined my own function for it. Here's my attempt so far: `
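from sklearn import svm
from sklearn.pipeline import FeatureUnion, Pipeline
from sklearn.preprocessing import FunctionTransformer

# Column selectors: pull the text column and the binary column out of the DataFrame.
get_text_data = FunctionTransformer(lambda x: x['FinalText'], validate=False)
get_numeric_data = FunctionTransformer(lambda x: x[['boolean']], validate=False)

pipe_svm = Pipeline([
    ('features', FeatureUnion([
        ('numeric_features', Pipeline([
            ('selector', get_numeric_data)
        ])),
        ('text_features', Pipeline([
            ('selector', get_text_data),
            # The function is called here, so this step holds its return value.
            ('xtrain', CustomOversampling(X_train['FinalText']))
        ]))
    ])),
    ('clf', svm.LinearSVC(class_weight='balanced'))
])
pipe_svm.fit(X_train, y_train)`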
The oversampling function looks like this:
def CustomOversampling(input):
    .....
    return Combinedmatrix, combinedyframe
Building the pipeline fails with this error:
TypeError: Last step of Pipeline should implement fit or be the string 'passthrough'. '(<157911x10951 sparse matrix of type '<class 'numpy.float64'>'
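For what it's worth, the error text suggests the 'xtrain' step is not an estimator but the tuple returned by CustomOversampling(...), since the call runs while the pipeline is being built. Below is a minimal sketch of one possible way around that, assuming the oversampling can be done on the training data before fitting. The helper oversample_minority is a hypothetical stand-in for CustomOversampling (its body is not shown above); it simply duplicates random minority-class rows. The toy DataFrame and the 'tfidf' step name are also made up for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import FeatureUnion, Pipeline
from sklearn.preprocessing import FunctionTransformer
from sklearn.svm import LinearSVC


def select_text(frame):
    # Return the text column as a 1-D Series for the vectorizer.
    return frame["FinalText"]


def select_numeric(frame):
    # Return the binary column as a 2-D frame so it can be stacked.
    return frame[["boolean"]]


def oversample_minority(X, y, random_state=0):
    """Hypothetical stand-in for CustomOversampling: duplicate random
    minority-class rows until both classes have the same count.
    Assumes X and y share a unique index."""
    counts = y.value_counts()
    minority, majority = counts.idxmin(), counts.idxmax()
    n_extra = int(counts[majority] - counts[minority])
    rng = np.random.default_rng(random_state)
    extra = rng.choice(y.index[y == minority].to_numpy(), size=n_extra, replace=True)
    X_res = pd.concat([X, X.loc[extra]], ignore_index=True)
    y_res = pd.concat([y, y.loc[extra]], ignore_index=True)
    return X_res, y_res


# Toy data (made up) with the same column names as in the question.
X_train = pd.DataFrame({
    "FinalText": ["good product", "bad service", "great value",
                  "good value", "great service", "bad product"],
    "boolean": [1, 0, 1, 1, 1, 0],
})
y_train = pd.Series([1, 0, 1, 1, 1, 0])

# Every pipeline step is an estimator object; nothing is called at build time.
pipe_svm = Pipeline([
    ("features", FeatureUnion([
        ("numeric_features", FunctionTransformer(select_numeric, validate=False)),
        ("text_features", Pipeline([
            ("selector", FunctionTransformer(select_text, validate=False)),
            ("tfidf", TfidfVectorizer()),
        ])),
    ])),
    ("clf", LinearSVC(class_weight="balanced")),
])

# Oversample the training rows first, then fit on the resampled data.
X_res, y_res = oversample_minority(X_train, y_train)
pipe_svm.fit(X_res, y_res)
predictions = pipe_svm.predict(X_train)
```

Resampling outside the pipeline keeps every step a plain transformer, but it also means the resampling is not repeated inside cross-validation folds. If that matters, a pipeline that supports sampler steps (for example the one in the imbalanced-learn package) may be a better fit.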
question from:
https://stackoverflow.com/questions/65838392/feature-union-and-function-returns-with-pipelines