STrenD: Subspace Trend Discovery
Revision as of 15:32, 30 November 2011
Clustering
Clustering reduces the dimensionality of the data, which speeds up the analysis and yields a cleaner-looking progression tree. Clustering is performed on both sides of the data: samples and features.
For sample clustering, only one parameter, "coherence", is used. Coherence is measured as the average Pearson correlation coefficient within each module, and the parameter should be set between 0 and 1. The larger the coherence, the more correlated the module. For feature clustering, there is a second parameter besides "coherence": "merge coherence", a threshold on the Pearson correlation coefficient between two clustered modules. If the correlation coefficient of two modules exceeds the merge coherence, the two modules are merged. Its range is also 0 to 1.
Recommended parameter settings:
Sample clustering: coherence = 0.95;
Feature clustering: coherence = 0.9; merge coherence = 0.9.
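The two measures described in this section can be illustrated with a minimal sketch. It is not taken from the STrenD source code: the function names are hypothetical, and representing each module by its mean profile when comparing two modules is an assumption, since the text above does not say how the correlation between two modules is computed.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # correlation is undefined for a constant vector
    return cov / sqrt(var_x * var_y)


def module_coherence(module):
    """Average pairwise Pearson correlation among the vectors of a module."""
    pairs = list(combinations(module, 2))
    if not pairs:
        return 1.0  # a single-member module is trivially coherent
    return sum(pearson(a, b) for a, b in pairs) / len(pairs)


def module_mean(module):
    """Element-wise mean of the vectors in a module (assumed representative)."""
    return [sum(column) / len(column) for column in zip(*module)]


def should_merge(module_a, module_b, merge_coherence=0.9):
    """True if the correlation between two modules exceeds the threshold."""
    r = pearson(module_mean(module_a), module_mean(module_b))
    return r > merge_coherence
```

For example, `module_coherence([[1, 2, 3], [2, 4, 6]])` evaluates to 1.0 up to floating-point rounding because the two vectors are perfectly correlated, so this module would pass a coherence threshold of 0.9 or 0.95.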
Minimum Spanning Tree
Build a minimum spanning tree (MST) for each clustered module to show how the cells are related to each other within that module.
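As an illustration of this step, the sketch below builds an MST over a small set of cells using Prim's algorithm. It assumes that each cell is a point described by the features of one module and that Euclidean distance is the edge weight. The metric and the function name are assumptions, not details confirmed by this page.

```python
from math import dist


def minimum_spanning_tree(points):
    """Prim's algorithm on the complete Euclidean graph over `points`.

    Returns a list of (i, j, distance) edges, where i and j are indices
    into `points`.
    """
    n = len(points)
    if n == 0:
        return []
    # best[j] = (distance from j to the tree, index of its closest tree node)
    best = {j: (dist(points[0], points[j]), 0) for j in range(1, n)}
    edges = []
    while best:
        # Attach the node that is currently closest to the tree.
        j = min(best, key=lambda k: best[k][0])
        d, i = best.pop(j)
        edges.append((i, j, d))
        # The newly attached node may offer shorter links to the rest.
        for k in best:
            d_new = dist(points[j], points[k])
            if d_new < best[k][0]:
                best[k] = (d_new, j)
    return edges
```

For three cells at `(0, 0)`, `(1, 0)` and `(5, 0)`, the tree links cell 0 to cell 1 and cell 1 to cell 2, so neighbouring cells in the tree are those closest to each other in the module's feature space.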
Similarity Matrix
Based on the MSTs, apply the Earth Mover's Distance (EMD) to every pair of modules to measure how similar the modules are to each other.
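To give a feel for the EMD itself, the sketch below computes it in its simplest form, between two one-dimensional histograms defined on the same equally spaced bins, and fills a pairwise matrix. This page does not specify how STrenD derives the distributions from the MSTs, so the histogram inputs and the function names are hypothetical.

```python
from itertools import accumulate


def emd_1d(hist_a, hist_b):
    """EMD between two 1-D histograms on the same equally spaced bins.

    Both histograms are normalized to unit mass; the result is expressed
    in units of bin widths.
    """
    if len(hist_a) != len(hist_b):
        raise ValueError("histograms must have the same number of bins")
    total_a, total_b = sum(hist_a), sum(hist_b)
    if total_a <= 0 or total_b <= 0:
        raise ValueError("histograms must have positive total mass")
    norm_a = [v / total_a for v in hist_a]
    norm_b = [v / total_b for v in hist_b]
    # In 1-D, the EMD equals the summed absolute difference of the CDFs.
    cumulative = accumulate(a - b for a, b in zip(norm_a, norm_b))
    return sum(abs(c) for c in cumulative)


def emd_matrix(histograms):
    """Symmetric matrix of pairwise EMD values, one histogram per module."""
    n = len(histograms)
    return [[emd_1d(histograms[i], histograms[j]) for j in range(n)]
            for i in range(n)]
```

Since the EMD is a distance, smaller entries in the matrix indicate more similar modules, and the diagonal is zero.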
Progression Tree
Inspect the visualized similarity matrix and choose the most suitable modules for generating the progression tree.