I use the following code to test CatBoostClassifier:
import numpy as np
from catboost import CatBoostClassifier, Pool
# initialize data
train_data = np.random.randint(0, 100, size=(100, 10))
train_labels = np.random.randint(0, 2, size=(100))
test_data = Pool(train_data, train_labels)  # What is Pool? When should it be used?
# test_data = np.random.randint(0, 100, size=(20, 10))  # Usually we would pass a NumPy array here, not a Pool
model = CatBoostClassifier(iterations=2,
                           depth=2,
                           learning_rate=1,
                           loss_function='Logloss',
                           verbose=True)
# train the model
model.fit(train_data, train_labels)
# make the prediction using the resulting model
preds_class = model.predict(test_data)
preds_proba = model.predict_proba(test_data)
print("class = ", preds_class)
print("proba = ", preds_proba)
The documentation describes Pool like this:
Pool used in CatBoost as a data structure to train model from.
I think we would usually use a NumPy array rather than a Pool.
For example:
test_data = np.random.randint(0, 100, size=(20, 10))
I did not find any other usage of Pool, so I would like to know: when should we use Pool instead of a NumPy array?
question from:
https://stackoverflow.com/questions/65838830/what-is-pool-in-catboostwhen-to-use-pool-instead-of-numpy-array