[Repost] Random Forest Libraries for Matlab

What is a random forest? Random forest is a classification technique proposed by Leo Breiman (2001). Given a set of class-labeled data, it builds a set of classification trees. Each tree is developed from a bootstrap sample of the training data. When developing individual trees, an arbitrary subset of attributes is drawn (hence the term "random"), from which the best attribute for the split is selected. The classification is based on the majority vote of the individually developed tree classifiers in the forest.

For a more detailed explanation, see: http://en.wikipedia.org/wiki/Random_forest
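The procedure described above (bootstrap sample per tree, random attribute subset per split, majority vote) can be illustrated with a minimal pure-Python sketch. This is not the code of any library linked in this post. The function names, the Gini split criterion, the depth limit, and the toy data are all assumptions made for illustration.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def majority(labels):
    """Most frequent label in a list."""
    return Counter(labels).most_common(1)[0][0]

def best_split(X, y, features):
    """Best (feature, threshold) among the candidate features, or None."""
    best, best_score = None, gini(y)
    for f in features:
        values = sorted({row[f] for row in X})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2  # midpoint between neighbouring values
            left = [lab for row, lab in zip(X, y) if row[f] <= t]
            right = [lab for row, lab in zip(X, y) if row[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if score < best_score:
                best, best_score = (f, t), score
    return best

def build_tree(X, y, n_sub, rng, depth=0, max_depth=5):
    """Grow one tree; each split sees only a random subset of attributes."""
    if depth == max_depth or len(set(y)) == 1:
        return majority(y)  # leaf: a plain label (labels must not be tuples)
    features = rng.sample(range(len(X[0])), n_sub)
    split = best_split(X, y, features)
    if split is None:
        return majority(y)
    f, t = split
    left = [(row, lab) for row, lab in zip(X, y) if row[f] <= t]
    right = [(row, lab) for row, lab in zip(X, y) if row[f] > t]
    return (f, t,
            build_tree([r for r, _ in left], [l for _, l in left],
                       n_sub, rng, depth + 1, max_depth),
            build_tree([r for r, _ in right], [l for _, l in right],
                       n_sub, rng, depth + 1, max_depth))

def predict_tree(node, row):
    """Walk from the root to a leaf and return its label."""
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node

def train_forest(X, y, n_trees=25, n_sub=1, seed=0):
    """Train each tree on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in range(len(X))]
        forest.append(build_tree([X[i] for i in idx], [y[i] for i in idx],
                                 n_sub, rng))
    return forest

def predict_forest(forest, row):
    """Majority vote over all trees in the forest."""
    return majority([predict_tree(tree, row) for tree in forest])

# Toy data (hypothetical): two well-separated classes in two attributes.
X = [[0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [0.9, 0.8], [0.8, 0.9], [0.7, 0.7]]
y = ["a", "a", "a", "b", "b", "b"]
forest = train_forest(X, y)
print(predict_forest(forest, [0.15, 0.15]), predict_forest(forest, [0.85, 0.85]))
```

Here `n_sub` is the number of attributes drawn at each split. Practical implementations add features such as out-of-bag error estimates, which this sketch leaves out.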

Matlab library downloads

Original implementation:

http://www.stat.berkeley.edu/~breiman/RandomForests/cc_software.htm (a new version is to be released soon)

Implementation ported from R:

http://randomforest-matlab.googlecode.com/files/Windows-Precompiled-RF_MexStandalone-v0.02-.zip

 

Applications of ensemble classification based on random forests:

ENSEMBLE CLASSIFICATION

(1) A conference paper investigating binary classification strategies with ensemble classification has been published [Chan J.C.-W., Demarchi, L., Van De Voorde, T., & Canters, F. (2008), "Binary classification strategies for mapping urban land cover with ensemble classifiers", Proceedings of IEEE International Geoscience and Remote Sensing Symposium (IGARSS), July 6-11, 2008, Boston, Massachusetts, USA. Vol. III, pp. 1004-1007] (see Annex A.9). Since the data sets related to HABISTAT were not ready at the beginning of 2008, a study on binary classification with ensemble classifiers was conducted using 2 data sets in suburban areas.

In the paper, two binary classification strategies were examined to further extend the strength of ensemble classifiers for mapping urban objects. The first strategy was a one-against-one approach: a pairwise binary classification in which n(n-1)/2 classifiers are created, n being the number of classes. Each of the n(n-1)/2 classifiers was trained using only training cases from two classes at a time, and the ensemble was then combined by majority voting. The second strategy was a one-against-all binary approach: if there are n classes, with a ∈ {1, …, n} being one of the classes, then n classifiers were generated, each representing a binary classification of a versus non-a. The ensemble was combined using accuracy estimates obtained for each class. Both binary strategies were applied to two single classifiers (decision trees and artificial neural networks) and two ensemble classifiers (Random Forest and Adaboost). Two multi-source data sets were used: one prepared for an object-based classification and one for a conventional pixel-based approach.

Our results indicate that ensemble classifiers generate significantly higher accuracies than a single classifier. Compared to a single C5.0 tree, Random Forest and Adaboost increased the accuracy by 2 to 12%, depending on the data set used. Applying binary classification strategies often increases accuracy, but only marginally (by 1 to 3%). All increases are statistically significant, except on one occasion. Coupling ensemble classifiers with binary classification always yielded the highest accuracies. For our first data set, the highest accuracy was obtained with Adaboost and a one-against-one strategy, 4.3% better than for a single tree; for the second data set, it was obtained with Random Forest and a one-against-all strategy, 13.6% higher than for a single tree. While the results show statistically significant improvement, the increase in accuracy from binary classification is marginal. Given its long training time, we have to consider carefully whether this strategy is worthwhile.
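The two binary strategies described in item (1) can be sketched in a few lines of Python. This is only an illustration, not the paper's code. The nearest-centroid base learner is a stand-in for the classifiers actually studied. The report does not spell out the rule for combining one-against-all outputs by accuracy, so the rule below (trust a positive answer in proportion to the classifier's training accuracy) is an assumption, as are all function names and the toy data.

```python
from collections import Counter
from itertools import combinations

def train_centroid(X, y):
    """Stand-in base learner (assumption): predict the nearest class centroid."""
    groups = {}
    for row, lab in zip(X, y):
        groups.setdefault(lab, []).append(row)
    cents = {lab: [sum(col) / len(rows) for col in zip(*rows)]
             for lab, rows in groups.items()}
    def predict(row):
        return min(cents, key=lambda lab: sum((a - b) ** 2
                                              for a, b in zip(row, cents[lab])))
    return predict

def train_one_vs_one(X, y, train=train_centroid):
    """One classifier per pair of classes: n(n-1)/2 classifiers in total."""
    models = []
    for a, b in combinations(sorted(set(y)), 2):
        pairs = [(row, lab) for row, lab in zip(X, y) if lab in (a, b)]
        models.append(train([r for r, _ in pairs], [l for _, l in pairs]))
    return models

def predict_one_vs_one(models, row):
    """Combine the pairwise classifiers by majority voting."""
    return Counter(model(row) for model in models).most_common(1)[0][0]

def train_one_vs_all(X, y, train=train_centroid):
    """One classifier per class a: 'a' (True) versus 'non-a' (False)."""
    models = {}
    for a in sorted(set(y)):
        binary = [lab == a for lab in y]
        model = train(X, binary)
        # Accuracy estimated on the training data (an assumption).
        acc = sum(model(r) == b for r, b in zip(X, binary)) / len(X)
        models[a] = (model, acc)
    return models

def predict_one_vs_all(models, row):
    """Assumed combination rule: score each class by its accuracy estimate."""
    def score(a):
        model, acc = models[a]
        return acc if model(row) else 1.0 - acc
    return max(models, key=score)

# Toy data (hypothetical): three classes in two attributes.
X = [[0, 0], [1, 0], [0, 1], [10, 0], [9, 0], [10, 1], [0, 10], [1, 10], [0, 9]]
y = ["A", "A", "A", "B", "B", "B", "C", "C", "C"]
ovo = train_one_vs_one(X, y)
ova = train_one_vs_all(X, y)
print(len(ovo), predict_one_vs_one(ovo, [9, 1]), predict_one_vs_all(ova, [9, 1]))
```

With n = 3 classes, the one-against-one ensemble holds 3 classifiers and the one-against-all ensemble holds 3 as well. The two counts diverge quickly as n grows, which is one source of the longer training time.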

(2) We used the ensemble classifier Random Forest to produce four levels of classification using 3 different data sets in the framework of workpackage Validation WP 5200. The data used for this experiment are AHS airborne data. A total of 12 classifications were made (see Figure 9). The Random Forest results were compared with the performance of other classifiers: Linear Discriminant Analysis and Markov Random Field.

The processing has problems with the number of training samples and with spatial independence (see Table 5). This issue with the training, testing and validation sets was discussed during the mid-term evaluation and is under investigation.

Figure 9. Validation exercise using airborne AHS data. The columns represent 3 data sets and the rows represent 4 levels of classification. Classifications were done using Random Forest.

Table 5. Classification scheme and training size at each level.

(3) The use of ensemble classification was studied in all classification tasks with spaceborne data. Two conference papers on the classification of heathlands using superresolution-enhanced CHRIS data were presented. Random Forest was used for the classifications, and it gave rather consistent and satisfactory results. Below are two illustrations (Figure 10 and Figure 11) of the application of Random Forest to the original CHRIS and the superresolution-enhanced CHRIS data sets. For more details, please refer to the paper attached in Annex A.8. Random Forest seems to have worked very well with our data sets. We will continue to use this ensemble classifier and investigate its strengths.

Figure 10. Random Forest classification of SR CHRIS (Kalmthout, Belgium). Results presented at IGARSS, July 6-11, 2008, Boston, Massachusetts, USA. (see Annex B of annual report #1)

Figure 11. Random Forest classification of SR CHRIS (Ginkel, the Netherlands). Results presented at the 6th EARSeL SIG Imaging Spectroscopy workshop 2009, Tel Aviv, March 16-19 2009. (see Annex A.8)

Source: http://habistat.vgt.vito.be/modules/Results/EC.php

 

Random forest implementation provided by the Orange software

http://orange.biolab.si/doc/widgets/_static/Classify/RandomForest.htm

 

Reposted from http://blog.csdn.net/alaclp/article/details/7484699

