Matlab Toolbox for Dimensionality Reduction (v0.7.2, November 2010)


The Matlab Toolbox for Dimensionality Reduction contains Matlab implementations of 33 techniques for dimensionality reduction and metric learning. Many of the implementations were developed from scratch, whereas others are improved versions of software that was already available on the Web. The implementations in the toolbox are conservative in their use of memory. The toolbox is available for download here.

Currently, the Matlab Toolbox for Dimensionality Reduction contains the following techniques:

  1. Principal Component Analysis (PCA)
  2. Probabilistic PCA
  3. Factor Analysis (FA)
  4. Sammon mapping
  5. Linear Discriminant Analysis (LDA)
  6. Multidimensional scaling (MDS)
  7. Isomap
  8. Landmark Isomap
  9. Locally Linear Embedding (LLE)
  10. Laplacian Eigenmaps
  11. Hessian LLE
  12. Local Tangent Space Alignment (LTSA)
  13. Conformal Eigenmaps (extension of LLE)
  14. Maximum Variance Unfolding (extension of LLE)
  15. Landmark MVU (LandmarkMVU)
  16. Fast Maximum Variance Unfolding (FastMVU)
  17. Kernel PCA
  18. Generalized Discriminant Analysis (GDA)
  19. Diffusion maps
  20. Neighborhood Preserving Embedding (NPE)
  21. Locality Preserving Projection (LPP)
  22. Linear Local Tangent Space Alignment (LLTSA)
  23. Stochastic Proximity Embedding (SPE)
  24. Multilayer autoencoders (training by RBM + backpropagation or by an evolutionary algorithm)
  25. Locally Linear Coordination (LLC)
  26. Manifold charting
  27. Coordinated Factor Analysis (CFA)
  28. Gaussian Process Latent Variable Model (GPLVM)
  29. Stochastic Neighbor Embedding (SNE)
  30. Symmetric SNE (SymSNE)
  31. new: t-Distributed Stochastic Neighbor Embedding (t-SNE)
  32. new: Neighborhood Components Analysis (NCA)
  33. new: Maximally Collapsing Metric Learning (MCML)

In addition to the techniques for dimensionality reduction, the toolbox contains implementations of 6 techniques for intrinsic dimensionality estimation, as well as functions for out-of-sample extension, prewhitening of data, and the generation of toy datasets.

The toolbox provides easy access to all these implementations. Basically, the only command you need to execute is:

mapped_data = compute_mapping(data, method, # of dimensions, parameters)

The function assumes that the columns of the data matrix are the dimensions and the rows are the instances. The function also accepts PRTools datasets. Information on how to specify the parameters of a particular technique can be obtained by typing help compute_mapping at the Matlab prompt. For more instructions on how to install and use the toolbox, please read the Readme.txt file.
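
As a rough illustration of the call above, the following sketch reduces synthetic data to two dimensions with PCA. It assumes the toolbox is already on the Matlab path; the random data, the variable names, and the choice of two target dimensions are placeholders for illustration only.

```matlab
% Minimal sketch, assuming the toolbox is on the Matlab path.
% Synthetic data: 500 instances (rows), 10 dimensions (columns).
X = randn(500, 10);

% Target dimensionality (an arbitrary choice for this example).
no_dims = 2;

% The second output holds information about the learned mapping.
[mapped_data, mapping] = compute_mapping(X, 'PCA', no_dims);

% mapped_data has one row per instance and no_dims columns.
disp(size(mapped_data));
```

Other techniques are selected by changing the method string; the parameters they accept are listed by help compute_mapping.
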

You are free to use, modify, or redistribute this software in any way you want, but only for non-commercial purposes. The use of the toolbox is at your own risk; the author is not responsible for any damage resulting from errors in the software. I would appreciate it if you refer to the toolbox or its author in your papers.

For more information on the techniques implemented in the toolbox, we refer to the following publications:

  1. L.J.P. van der Maaten, E.O. Postma, and H.J. van den Herik. Dimensionality Reduction: A Comparative Review. Tilburg University Technical Report, TiCC-TR 2009-005, 2009.

  2. L.J.P. van der Maaten and G.E. Hinton. Visualizing High-Dimensional Data Using t-SNE. Journal of Machine Learning Research 9(Nov):2579-2605, 2008.

Frequently asked questions

  1. When using the toolbox, the code quits saying that some function could not be found?
    Nine times out of ten, such errors are the result of forgetting to add the toolbox to the Matlab path. You can add the toolbox to the Matlab path by typing addpath(genpath('installation_folder/drtoolbox')). Another probable cause is a naming conflict with another toolbox (e.g., another toolbox with a PCA function). You can investigate such errors using Matlab's which function. If Matlab complains that it cannot find the BSXFUN function, your Matlab is likely to be very outdated. You may try using this code as a surrogate.
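
The path setup and the conflict check described in this answer can be sketched as follows. The folder name is the placeholder from the text, not a real location, and pca is used only as an example of a function name that another toolbox might also define.

```matlab
% Placeholder path from the text; replace it with your own installation folder.
toolbox_dir = 'installation_folder/drtoolbox';

% Add the toolbox folder and all of its subfolders to the Matlab path.
addpath(genpath(toolbox_dir));

% Show which file Matlab will actually call for compute_mapping.
which compute_mapping

% List every pca on the path to spot a naming conflict with another toolbox.
which -all pca
```

If which -all lists more than one file, the first entry is the one Matlab executes.
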

  2. Besides reducing the dimensionality of my data, Isomap/LLE/Laplacian Eigenmaps/LTSA also reduced the number of data points? Where did these points go?
    You may observe this behavior in most techniques that are based on neighborhood graphs. Isomap/LLE/Laplacian Eigenmaps/LTSA can only embed data that gives rise to a connected neighborhood graph. If the neighborhood graph is not connected, the implementations only embed the largest connected component of the neighborhood graph. You can obtain the indices of the embedded data points from mapping.conn_comp (where mapping is the second output of the compute_mapping function). If you really need to have all your data points embedded, don't use a manifold learner.
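
A minimal sketch of how mapping.conn_comp might be used to keep labels aligned with the embedded points. It assumes the toolbox is on the path and that the fourth argument to Isomap is the number of nearest neighbors (check help compute_mapping); the data, the labels, and the neighborhood size of 12 are made up for illustration.

```matlab
% Synthetic data and labels, for illustration only.
X = randn(400, 10);
labels = randi(3, 400, 1);

% Assumption: the fourth argument is the neighborhood size k for Isomap.
[mappedX, mapping] = compute_mapping(X, 'Isomap', 2, 12);

% conn_comp holds the indices of the points that were actually embedded.
if isfield(mapping, 'conn_comp')
    kept = mapping.conn_comp;
    labels_kept = labels(kept);               % labels aligned with rows of mappedX
    dropped = setdiff(1:size(X, 1), kept);    % points outside the largest component
    fprintf('%d of %d points were embedded\n', numel(kept), size(X, 1));
end
```

Increasing the neighborhood size usually makes the graph more likely to be connected, at the cost of a coarser approximation of the manifold.
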

  3. How do I provide label information to the supervised techniques/metric learners?

You should specify label information to supervised techniques by setting the elements of the first column of the data matrix to the label of the corresponding data point. To this end, the labels should be numeric.
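
The convention above can be sketched as follows for a two-class problem. The data are synthetic and the variable names are illustrative; the sketch assumes the toolbox is on the path and uses LDA, which yields at most one dimension for two classes.

```matlab
% Synthetic two-class data: 200 instances, 5 dimensions.
X = randn(200, 5);
labels = [ones(100, 1); 2 * ones(100, 1)];    % numeric labels, one per instance

% Shift the second class along the first dimension so the classes differ.
X(101:end, 1) = X(101:end, 1) + 3;

% Prepend the labels as the first column of the data matrix.
labeled_X = [labels X];

% With two classes, LDA can produce at most one discriminant dimension.
[mappedX, mapping] = compute_mapping(labeled_X, 'LDA', 1);
```
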

  4. How do I project low-dimensional data back into the data space?

Back-projection is currently not implemented in the toolbox. Note that back-projection can only be implemented for linear techniques, for autoencoders, and for the GPLVM.

  5. Which techniques support an exact out-of-sample extension?

Only parametric dimensionality reduction techniques, i.e., techniques that learn an explicit function between the data space and the low-dimensional latent space, support exact out-of-sample extension. All linear techniques (PCA, LDA, NCA, MCML, LPP, and NPE) support exact out-of-sample extension, and autoencoders do too.
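
A hedged sketch of out-of-sample extension with a linear technique. It assumes that the toolbox's out-of-sample function is named out_of_sample and takes the new points followed by the mapping struct; please confirm the name and argument order in Readme.txt. The training and test data are synthetic.

```matlab
% Synthetic training and test sets with the same number of dimensions.
train_X = randn(300, 10);
test_X  = randn(50, 10);

% Learn a parametric (linear) mapping on the training data.
[mapped_train, mapping] = compute_mapping(train_X, 'PCA', 2);

% Assumption: out_of_sample(points, mapping) applies the learned mapping
% to new points; verify against the toolbox documentation.
mapped_test = out_of_sample(test_X, mapping);

% mapped_test should have one row per test instance.
disp(size(mapped_test));
```
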

  6. Which technique should I use to visualize high-dimensional data in a scatter plot?
    Although I am slightly biased, t-SNE typically is very good at visualizing data. Manifold learners typically perform disappointingly for data visualization due to a problem in their covariance constraint. Parametric techniques are typically not well suited for visualization, because they constrain the mapping between the data and the visualization.

Bugs/questions/suggestions

You think you have found a bug? Please check the FAQ first! If the FAQ does not answer your question, feel free to send me an email!

