Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


pandas - could not convert string to float: 'CC6000'

I am trying to build a Machine Learning model that would predict the delay (the difference between the clear_date and the due_in_date) from the given dataset.

I've split the dataset into x_train, y_train, x_test and validation_set. I'm using the LinearRegression model from the sklearn library. When I try to fit the model on my data, I get this error:

could not convert string to float: 'CC6000'

How can I resolve this?

Here are the pictures of x_train and y_train:
[1]: https://i.stack.imgur.com/8RP2J.png
[2]: https://i.stack.imgur.com/jB7qN.png
[3]: https://i.stack.imgur.com/bDRQH.png

question from:https://stackoverflow.com/questions/65949555/could-not-convert-string-to-float-cc6000


1 Answer


It seems that you have a string hidden in your dataframe: "CC6000".

Linear Regression only works with numerical samples, so it can't handle this string.
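To see where the message comes from, here is a minimal sketch that reproduces the same kind of failure without fitting a model. It assumes that fitting involves converting the features to a float array, which is roughly what happens inside the estimator. The column names `cust_code` and `amount` are made up for illustration and are not taken from your data.

```python
import pandas as pd

# Hypothetical toy frame: one text column, one numeric column.
X = pd.DataFrame({
    "cust_code": ["CC6000", "U001"],
    "amount": [100.0, 250.0],
})

# Forcing the whole frame into a float array fails on the text column.
try:
    X.to_numpy(dtype=float)
    message = None
except ValueError as exc:
    message = str(exc)

print(message)
```

The numeric column converts fine; it is the text column that triggers the error, so the fix has to happen on that column before fitting.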

I have looked at your data and haven't seen this string, but it has to be there somewhere. Once you find it, you have two options: if it is the only string in an otherwise numeric column, drop that sample; if the whole feature is categorical, either encode it or remove the feature.
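A rough sketch of both options, using made-up toy data (the column names `amount`, `code` and `days` are assumptions, not your real schema). Which option applies depends on what the column actually holds.

```python
import pandas as pd

# Case A: a numeric column with one stray string in it (hypothetical data).
stray = pd.DataFrame({"amount": ["100.0", "CC6000", "75.5"], "days": [3, 5, 1]})
# Unparseable values become NaN, then those rows are dropped.
stray["amount"] = pd.to_numeric(stray["amount"], errors="coerce")
cleaned = stray.dropna(subset=["amount"])

# Case B: the whole column is categorical (hypothetical data).
categorical = pd.DataFrame({"code": ["U001", "CC6000", "U001"], "days": [3, 5, 1]})
# Option 1: one-hot encode the feature so every column is numeric.
encoded = pd.get_dummies(categorical, columns=["code"])
# Option 2: remove the feature entirely.
dropped = categorical.drop(columns=["code"])

print(cleaned)
print(encoded)
print(dropped)
```

One-hot encoding is only one possible encoding; with many distinct codes it creates many columns, so dropping the feature or using a different encoder may be the better choice.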

To look for this string, try something like:

df.isin(['CC6000']).any()
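The line above returns one True/False value per column. A slightly longer sketch, again on made-up data with hypothetical column names, narrows this down to the column names and rows involved, and also lists every non-numeric column in case 'CC6000' is not the only string:

```python
import pandas as pd

# Hypothetical toy frame standing in for x_train.
df = pd.DataFrame({
    "cust_code": ["U001", "CC6000", "U002"],
    "amount": [100.0, 250.0, 75.5],
})

# Names of the columns that contain the exact value.
hits = df.isin(["CC6000"]).any()
cols_with_value = hits[hits].index.tolist()

# Rows where any column holds the value.
rows_with_value = df[df.isin(["CC6000"]).any(axis=1)]

# All columns that are not numeric; these need cleaning or encoding.
non_numeric = df.select_dtypes(exclude="number").columns.tolist()

print(cols_with_value)
print(rows_with_value)
print(non_numeric)
```

If `non_numeric` lists columns you expected to be numeric, those are the ones to inspect first.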
