STrenD: Subspace Trend Discovery

From FarsightWiki
Revision as of 15:32, 30 November 2011 by Yan Xu (Talk | contribs)
Clustering

Clustering reduces the dimensionality of the data, which speeds up the analysis and yields a cleaner progression tree. It is applied on both sides of the data: samples and features.

For sample clustering, only one parameter, "coherence", is taken into consideration. Coherence is measured as the average Pearson correlation coefficient within each module, and its value should be in the range 0-1. The larger the coherence, the more strongly correlated the members of the module. Feature clustering uses "coherence" plus a second parameter, "merge coherence", which is a threshold on the Pearson correlation coefficient between two clustered modules: if the correlation coefficient of two modules exceeds the merge coherence, the two modules are merged. Its range is also 0-1.

Recommended parameter settings:

Sample clustering: coherence = 0.95;

Feature clustering: coherence = 0.9; merge coherence = 0.9.
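
The two parameters can be illustrated with a minimal Python sketch. It is not the FARSIGHT implementation: it assumes a module is a list of equal-length numeric profiles, that coherence is the mean Pearson correlation over all pairs of profiles in the module, and that two modules are compared through their element-wise mean profiles. The function names `pearson`, `module_coherence`, `mean_profile` and `should_merge` are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / (norm_x * norm_y)


def module_coherence(module):
    """Average pairwise Pearson correlation of the profiles in a module.

    Assumption: the average is taken over all pairs of profiles.
    """
    pairs = list(combinations(module, 2))
    if not pairs:
        raise ValueError("a module needs at least two profiles")
    return sum(pearson(a, b) for a, b in pairs) / len(pairs)


def mean_profile(module):
    """Element-wise mean of the profiles in a module."""
    return [sum(column) / len(column) for column in zip(*module)]


def should_merge(module_a, module_b, merge_coherence=0.9):
    """True if the correlation of the two modules exceeds the threshold.

    Assumption: modules are compared through their mean profiles.
    """
    r = pearson(mean_profile(module_a), mean_profile(module_b))
    return r > merge_coherence


module_a = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0]]
module_b = [[1.0, 2.0, 4.0]]
print(module_coherence(module_a))
print(should_merge(module_a, module_b, merge_coherence=0.9))
```

With the recommended settings, a feature module would be accepted when `module_coherence` is at least 0.9, and two accepted modules would be merged when `should_merge` returns true at a threshold of 0.9.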

Minimum Spanning Tree

Build a minimum spanning tree (MST) for each module produced by clustering, to show how the cells are related to each other within that module.
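
As an illustration of this step, the following sketch builds an MST over a handful of cells with Prim's algorithm. It assumes each cell is a point whose coordinates are the feature values of one module and that edge weights are Euclidean distances; the page does not state which distance or MST algorithm is actually used, and `minimum_spanning_tree` is a hypothetical name.

```python
from math import dist  # Euclidean distance, Python 3.8+


def minimum_spanning_tree(points):
    """Return MST edges as (parent_index, child_index, weight) tuples.

    Prim's algorithm on the complete graph over `points`,
    with Euclidean distance as the edge weight (an assumption).
    """
    n = len(points)
    if n == 0:
        return []
    in_tree = [False] * n
    best_cost = [float("inf")] * n   # cheapest known edge into each node
    best_parent = [-1] * n           # tree node that edge comes from
    best_cost[0] = 0.0               # start growing the tree from cell 0
    edges = []
    for _ in range(n):
        # Pick the cheapest node that is not yet in the tree.
        u = min((i for i in range(n) if not in_tree[i]),
                key=lambda i: best_cost[i])
        in_tree[u] = True
        if best_parent[u] >= 0:
            edges.append((best_parent[u], u, best_cost[u]))
        # Relax the edges from the newly added node.
        for v in range(n):
            if not in_tree[v]:
                d = dist(points[u], points[v])
                if d < best_cost[v]:
                    best_cost[v] = d
                    best_parent[v] = u
    return edges


# Three cells described by the two features of one hypothetical module.
cells = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)]
for parent, child, weight in minimum_spanning_tree(cells):
    print(parent, child, weight)
```

One such tree would be built per module, each using only the features that belong to that module.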

Similarity Matrix

Based on the MSTs, apply the Earth Mover's Distance (EMD) to every pair of modules to measure how similar the modules are to one another.
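
The page does not describe how the distributions compared by EMD are derived from the MSTs, so the sketch below only illustrates the EMD itself and the pairwise matrix. It assumes each module has already been summarized as a histogram over a shared set of equally spaced bins, which is a simplification for illustration; in that one-dimensional case the EMD equals the sum of absolute differences between the cumulative distributions. The names `emd_1d` and `emd_matrix` are hypothetical.

```python
def emd_1d(hist_a, hist_b):
    """Earth Mover's Distance between two histograms on the same bins.

    Both histograms are normalized to unit mass; bins are assumed to be
    one unit apart, so the result is expressed in bin widths.
    """
    if len(hist_a) != len(hist_b):
        raise ValueError("histograms must use the same bins")
    total_a = sum(hist_a)
    total_b = sum(hist_b)
    if total_a <= 0 or total_b <= 0:
        raise ValueError("histograms must have positive mass")
    carried = 0.0  # mass that still has to be moved to the next bin
    work = 0.0     # total mass times distance moved so far
    for a, b in zip(hist_a, hist_b):
        carried += a / total_a - b / total_b
        work += abs(carried)
    return work


def emd_matrix(histograms):
    """Symmetric matrix of pairwise EMD values between module histograms."""
    n = len(histograms)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = emd_1d(histograms[i], histograms[j])
            matrix[i][j] = d
            matrix[j][i] = d
    return matrix


# Hypothetical per-module histograms over three shared bins.
modules = [[1, 0, 0], [0, 0, 1], [1, 0, 0]]
for row in emd_matrix(modules):
    print(row)
```

A small EMD value means two modules are similar, so a matrix like this one holds distances; it would need to be inverted or rescaled if a similarity score is wanted for display.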

Progression Tree

Inspect the visualized similarity matrix and choose the most suitable modules for generating the progression tree.
