Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share

python - Machine learning Prediction - Failed to convert a NumPy array to a Tensor

I'm working on a movie rating prediction problem for my personal machine learning practice. I have two CSV files: one with movies (movieId, title, genres) and one with ratings (userId, movieId, rating, timestamp).

After some data preprocessing (word embeddings for the movie titles, one-hot encoding for the genres) and shuffling the final dataframe, I ended up with this:

    userId  movieId  rating  embeddings                                         genres
0      545     2020     5.0  [0.081246674, 0.046522498, -0.014943261, 0.025...  [0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, ...
1      427     3186     2.0  [0.09334839, 0.057055157, -0.020527517, 0.0301...  [0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, ...
2      102     2144     3.0  [0.062349755, 0.04466611, -0.011009981, 0.0187...  [0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...
3       30     5927     4.0  [0.18021354, 0.119208135, -0.036116328, 0.0466...  [0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, ...
4      537     1022     3.0  [0.026805451, 0.025356086, -0.004603084, 0.013...  [0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, ...
...    ...      ...     ...  ...                                                ...

After that I tried to split my data into X and y and apply train_test_split:

X = df_ratings.drop(['rating'], axis=1).values
y = df_ratings['rating'].values

I tried converting both arrays to float32, but it failed:

X = np.asarray(X).astype('float32')
y = np.asarray(y).astype('float32')

TypeError                                 Traceback (most recent call last)
TypeError: only size-1 arrays can be converted to Python scalars

The above exception was the direct cause of the following exception:

ValueError                                Traceback (most recent call last)
<ipython-input-46-ee1241369db7> in <module>
----> 1 X = np.asarray(X).astype('float32')
      2 y = np.asarray(y).astype('float32')

ValueError: setting an array element with a sequence

from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

I created my model (I don't know if this is fully correct) and tried to fit my training data to it:

def movies_model():
    model = Sequential()
    # Add layers
    model.add(Dense(512, input_dim = X_train.shape[1], activation='relu'))
    model.add(Dense(512, input_dim = X_train.shape[1], activation='relu'))
    model.add(Dense(256, activation='relu'))
    model.add(Dense(256, activation='relu'))
    model.add(Dense(128, activation='relu'))
    model.add(Dense(1, activation="linear"))
    return model

optimizer = tf.keras.optimizers.Adam()

model.compile(optimizer=optimizer, loss='mean_absolute_error')

history = model.fit(X_train, y_train, epochs=10, batch_size=1024, verbose=1)

I got this error:

ValueError: Failed to convert a NumPy array to a Tensor (Unsupported object type int).

Each cell in the embeddings and genres columns is a numpy.ndarray.

I've searched for a hint but found nothing. I would appreciate it if you could help me figure out where the error comes from. (I have also tried converting embeddings and genres to other types.)

df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 100004 entries, 0 to 100003
Data columns (total 5 columns):
 #   Column      Non-Null Count   Dtype  
---  ------      --------------   -----  
 0   userId      100004 non-null  int64  
 1   movieId     100004 non-null  int64  
 2   rating      100004 non-null  float64
 3   embeddings  100004 non-null  object 
 4   genres      100004 non-null  object 
dtypes: float64(1), int64(2), object(2)
Question from: https://stackoverflow.com/questions/65887340/machine-learning-prediction-failed-to-convert-a-numpy-array-to-a-tensor


1 Answer

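Convert X and y to float32 before splitting the data. Note that this cast only succeeds when every column holds plain numbers, so the array-valued embeddings and genres columns must first be expanded into separate numeric columns.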

import numpy as np

X = np.asarray(X).astype(dtype=np.float32)
y = np.asarray(y).astype(dtype=np.float32)
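
The cast above raises "ValueError: setting an array element with a sequence" as long as the embeddings and genres cells still hold arrays. Here is a minimal sketch of one way to expand them into a 2-D float matrix first. It assumes that every row's embedding has the same length, and likewise for genres. The helper name build_feature_matrix and the toy values are hypothetical and not taken from the original post.

```python
import numpy as np
import pandas as pd


def build_feature_matrix(df, array_cols=("embeddings", "genres"), drop_cols=("rating",)):
    """Return a 2-D float32 matrix with array-valued columns expanded.

    Assumes each cell in `array_cols` is a 1-D sequence and that all rows
    of a given column share the same length.
    """
    scalar_cols = [c for c in df.columns if c not in array_cols and c not in drop_cols]
    parts = [df[scalar_cols].to_numpy(dtype=np.float32)]
    for col in array_cols:
        # np.stack turns a list of equal-length arrays into shape (n_rows, length).
        parts.append(np.stack(df[col].tolist()).astype(np.float32))
    return np.hstack(parts)


# Toy frame mimicking the layout in the question (values are made up).
df_demo = pd.DataFrame({
    "userId": [545, 427],
    "movieId": [2020, 3186],
    "rating": [5.0, 2.0],
    "embeddings": [np.array([0.08, 0.04, -0.01]), np.array([0.09, 0.05, -0.02])],
    "genres": [np.array([0, 1]), np.array([1, 0])],
})

X = build_feature_matrix(df_demo)
y = df_demo["rating"].to_numpy(dtype=np.float32)
print(X.shape, X.dtype)  # (2, 7) float32
```

The resulting X has a plain float32 dtype instead of object, which is the form that train_test_split output can be passed to model.fit without the tensor conversion error.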

Edit:

Remove the duplicated first Dense layer. Only the first layer of the model should specify input_dim.

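from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

def movies_model():
    model = Sequential()
    # Only the first layer declares the input size; later layers infer it.
    model.add(Dense(512, input_dim=X_train.shape[1], activation='relu'))
    model.add(Dense(256, activation='relu'))
    model.add(Dense(256, activation='relu'))
    model.add(Dense(128, activation='relu'))
    # Single linear unit that outputs the predicted rating (regression).
    model.add(Dense(1, activation='linear'))
    return model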
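
The snippet in the question defines movies_model() but never calls it before model.compile. Here is a minimal end-to-end sketch, assuming TensorFlow 2.x and using random stand-in data in place of the real features. The names build_movies_model and n_features are hypothetical, and tf.keras.Input is used here in place of input_dim, which is a deliberate swap from the code above rather than part of the original answer.

```python
import numpy as np
import tensorflow as tf


def build_movies_model(n_features):
    """Fully connected regression network mirroring the layer sizes above."""
    return tf.keras.Sequential([
        tf.keras.Input(shape=(n_features,)),  # input size declared once
        tf.keras.layers.Dense(512, activation="relu"),
        tf.keras.layers.Dense(256, activation="relu"),
        tf.keras.layers.Dense(256, activation="relu"),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dense(1, activation="linear"),  # predicted rating
    ])


# Random stand-in data; real code would use the flattened float32 features.
rng = np.random.default_rng(0)
X_train = rng.random((64, 7), dtype=np.float32)
y_train = rng.uniform(0.5, 5.0, size=64).astype(np.float32)

model = build_movies_model(X_train.shape[1])  # instantiate before compiling
model.compile(optimizer=tf.keras.optimizers.Adam(), loss="mean_absolute_error")
history = model.fit(X_train, y_train, epochs=1, batch_size=32, verbose=0)

predictions = model.predict(X_train, verbose=0)
print(predictions.shape)  # (64, 1)
```

Passing the feature count as an argument keeps the model builder independent of a global X_train, so it can be reused once the real feature matrix has been prepared.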
