In class-imbalance settings, artificially balancing the test/validation set makes no sense: these sets must remain realistic. You want to test your classifier's performance in the real-world setting where, say, the negative class makes up 99% of the samples, in order to see how well your model does at predicting the 1% positive class of interest without too many false positives. Artificially inflating the minority class or reducing the majority one leads to unrealistic performance metrics that bear no relation to the real-world problem you are trying to solve.
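To make this concrete, here is a small arithmetic sketch of how precision shifts with class prevalence when the classifier itself is unchanged. The true-positive and false-positive rates below are hypothetical numbers chosen for illustration, and `precision_at_prevalence` is a helper name I made up for this example:

```python
def precision_at_prevalence(tpr: float, fpr: float, prevalence: float) -> float:
    """Precision implied by a fixed TPR/FPR at a given positive-class prevalence."""
    true_pos = tpr * prevalence            # fraction of all samples that are true positives
    false_pos = fpr * (1.0 - prevalence)   # fraction of all samples that are false positives
    return true_pos / (true_pos + false_pos)


# Hypothetical classifier: catches 90% of positives, flags 5% of negatives.
TPR, FPR = 0.90, 0.05

realistic = precision_at_prevalence(TPR, FPR, 0.01)  # 1% positives, as in the real world
balanced = precision_at_prevalence(TPR, FPR, 0.50)   # artificially balanced test set

print(f"precision at 1% prevalence:  {realistic:.3f}")  # 0.154
print(f"precision at 50% prevalence: {balanced:.3f}")   # 0.947
```

Under these assumed rates, the same model looks like it is right about 95% of the time it predicts the positive class on a balanced test set, but only about 15% of the time at the real 1% prevalence.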
For corroboration, here is Max Kuhn, creator of the caret R package and co-author of the (highly recommended) Applied Predictive Modeling textbook, in Chapter 11: Subsampling For Class Imbalances of the caret ebook:
You would never want to artificially balance the test set; its class frequencies should be in-line with what one would see “in the wild”.
Re-balancing makes sense only in the training set, so as to prevent the classifier from simply and naively classifying all instances as negative for a perceived accuracy of 99%.
Hence, you can rest assured that, in the setting you describe, the rebalancing is applied only to the training set/folds.
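As a rough illustration of rebalancing only the training portion, here is a minimal pure-Python sketch. It uses plain random oversampling on toy labels; this is not how caret implements it, and `stratified_split` and `oversample_minority` are hypothetical helpers written just for this example:

```python
import random
from collections import Counter


def stratified_split(labels, test_fraction, rng):
    """Return (train_idx, test_idx), keeping class proportions in both parts."""
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)
    train_idx, test_idx = [], []
    for idxs in by_class.values():
        idxs = idxs[:]                      # copy before shuffling
        rng.shuffle(idxs)
        n_test = round(len(idxs) * test_fraction)
        test_idx.extend(idxs[:n_test])
        train_idx.extend(idxs[n_test:])
    return train_idx, test_idx


def oversample_minority(indices, labels, rng):
    """Duplicate randomly chosen minority indices until all classes are equal in size."""
    by_class = {}
    for i in indices:
        by_class.setdefault(labels[i], []).append(i)
    target = max(len(v) for v in by_class.values())
    balanced = []
    for idxs in by_class.values():
        balanced.extend(idxs)
        balanced.extend(rng.choices(idxs, k=target - len(idxs)))
    return balanced


rng = random.Random(0)
labels = [1] * 10 + [0] * 990               # toy data: 1% positive class

train_idx, test_idx = stratified_split(labels, 0.2, rng)
train_balanced = oversample_minority(train_idx, labels, rng)  # training set only

train_counts = Counter(labels[i] for i in train_balanced)
test_counts = Counter(labels[i] for i in test_idx)
print("train after oversampling:", dict(train_counts))
print("test (left untouched):   ", dict(test_counts))
```

The training indices end up with 792 samples per class, while the test indices keep the original 2 positives to 198 negatives. With cross-validation the same idea applies per fold: resample inside each training fold and evaluate on the held-out fold as is.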