single : Feature ranking based on each single feature's prediction accuracy (wrapper method)
sfs : Feature selection using sequential forward selection (wrapper method)
ga : Feature selection using the genetic algorithm in Matlab (wrapper method)
rf : Feature ranking using random forest (embedded method)
stepwisefit : Feature selection based on stepwise fitting (embedded method)
boost : Feature selection using AdaBoost with decision stumps as the weak learner (embedded method)
svmrfe_ori : Feature ranking using SVM-recursive feature elimination (SVM-RFE), the original linear version (embedded method)
svmrfe_ker : Feature ranking using the kernel version of SVM-RFE (embedded method)
Representative sample selection (active learning)
cluster : Sample selection based on cluster centers
ted : Transductive experimental design
llr : Locally linear reconstruction
ks : Kennard-Stone algorithm
Interfaces
Feature processing
[Xnew, model] = ftProc_xxx_tr(X,Y,param) % training
Xnew = ftProc_xxx_te(model,X) % test
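For example, a minimal sketch of the feature-processing interface; 'pca' is an assumed algorithm name here, and the label argument is left empty for an unsupervised transform:
Xtrain = rand(100,20); Xtest = rand(30,20);            % toy data
param = struct();                                      % empty struct -> use the documented defaults
[XtrainNew, model] = ftProc_pca_tr(Xtrain, [], param); % fit the transform on the training data
XtestNew = ftProc_pca_te(model, Xtest);                % reuse the trained model on the test data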
Classification
model = classf_xxx_tr(X,Y,param) % training
[pred,prob] = classf_xxx_te(model,Xtest) % test, return the predicted labels and probabilities (optional)
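A minimal sketch, assuming 'svm' is one of the available classifier names (backed by libsvm, see Dependencies):
Xtrain = rand(100,20); Ytrain = randi(2,100,1); Xtest = rand(30,20);  % toy data
param = struct();                               % default parameters
model = classf_svm_tr(Xtrain, Ytrain, param);   % train the classifier
[pred, prob] = classf_svm_te(model, Xtest);     % predicted labels and (optional) probabilities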
Regression
model = regress_xxx_tr(X,Y,param) % training
rv = regress_xxx_te(model,Xtest) % test, return the predicted values
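A minimal sketch, assuming 'svr' is one of the available regressor names:
Xtrain = rand(100,20); Ytrain = rand(100,1); Xtest = rand(30,20);  % toy data
param = struct();                               % default parameters
model = regress_svr_tr(Xtrain, Ytrain, param);  % train the regressor
rv = regress_svr_te(model, Xtest);              % predicted values for the test samples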
Feature selection
[ftRank,ftScore] = ftSel_xxx(ft,target,param) % return the feature rank (or subset) and scores (optional)
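For instance, with the random-forest ranking listed above (assuming its function follows the ftSel_<name> pattern):
X = rand(100,20); Y = randi(2,100,1);          % toy data: features and targets
[ftRank, ftScore] = ftSel_rf(X, Y, struct());  % feature indices ranked by importance, with scores
Xtop = X(:, ftRank(1:5));                      % keep the five top-ranked features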
Representative sample selection (active learning)
smpList = smpSel_xxx(X,nSel,param) % return the indices of the selected samples
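For instance, with the Kennard-Stone algorithm listed above (assuming its function follows the smpSel_<name> pattern):
X = rand(200,20);                        % toy sample pool
nSel = 20;                               % number of representative samples to select
smpList = smpSel_ks(X, nSel, struct());  % row indices of the selected samples
Xrep = X(smpList, :);                    % the representative subset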
Please see test.m for sample usages.
In addition, there are three uniform wrappers: ftProc_, classf_, and regress_. They take an algorithm name string as input and combine the training and test phases.
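The exact wrapper signatures are documented in the corresponding files; the sketch below only illustrates the underlying idea of dispatching on an algorithm name string to the matching _tr/_te pair ('knn' is a hypothetical name here):
algName = 'knn';                                                  % algorithm chosen by name; Xtrain, Ytrain, Xtest, param as in the examples above
model = feval(['classf_' algName '_tr'], Xtrain, Ytrain, param);  % training phase
pred = feval(['classf_' algName '_te'], model, Xtest);            % test phase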
Characteristics
The training (tr) and test (te) phases are split for feature processing, classification and regression to allow more flexible use. For example, one trained model can be applied multiple times.
The struct "param" is used to pass parameters to algorithms.
Default parameters are set clearly at the top of each algorithm's code, along with explanations (see the sketch at the end of this section for overriding one).
In brief, I aimed for three main objectives when developing this toolbox:
A unified and simple interface;
Easy inspection and modification of algorithm parameters, avoiding tedious parameter setting and checking;
Extensibility: a simple file structure makes it easier to modify the algorithms.
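As a small illustration of the points above (the parameter field name below is hypothetical; the real names and defaults are listed at the top of each algorithm file):
param = struct();                              % start from the documented defaults
param.someOption = 5;                          % hypothetical field: override a single default
model = classf_xxx_tr(Xtrain, Ytrain, param);  % train once...
predA = classf_xxx_te(model, XtestA);          % ...then apply the trained model
predB = classf_xxx_te(model, XtestB);          % as many times as needed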
Dependencies
In the toolbox, 20 algorithms are self-implemented, 11 are wrappers for or mainly based on Matlab functions, and 9 are wrappers for or mainly based on third-party toolboxes, which are listed below. These are included in the project; however, you may need to recompile some of them depending on your platform.
SVM and SVR: Chih-Chung Chang and Chih-Jen Lin, libsvm (this toolbox is well known and easy to find online)
Thanks to the authors and MathWorks Inc.! I know there are many important algorithms not yet included in the toolbox, so everyone is welcome to contribute new code! Also, if you find any bugs in the code, please don't hesitate to let me know!