Increasingly, high-dimensional genomics data are becoming available for many organisms. Here, we develop OrthoClust for simultaneously clustering data across multiple species. OrthoClust is a computational framework that integrates the co-association networks of individual species by utilizing the orthology relationships of genes between species. It outputs optimized modules that are fundamentally cross-species and that can be either conserved or species-specific. We demonstrate the application of OrthoClust using the RNA-Seq expression profiles of Caenorhabditis elegans and Drosophila melanogaster from the modENCODE consortium. A potential application of cross-species modules is to infer putative analogous functions of uncharacterized elements, such as non-coding RNAs, based on guilt-by-association.
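To make the idea of cross-species modules concrete, the following is a minimal sketch, not the OrthoClust implementation. It assumes each species contributes a list of co-association edges and that orthology is given as gene pairs linking the two species. The gene names, the function names `cross_species_modules` and `label_module`, and the use of connected components as the grouping step are all assumptions made for illustration; connected components stand in for the module optimization that OrthoClust actually performs, which is not described in this abstract.

```python
from collections import defaultdict


def cross_species_modules(edges_by_species, orthologs):
    """Group genes into modules using connected components of the combined graph.

    edges_by_species: {species: [(gene_a, gene_b), ...]} co-association edges
    orthologs: [((species_a, gene_a), (species_b, gene_b)), ...]
    NOTE: connected components are a simple stand-in, not OrthoClust's optimizer.
    """
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Within-species co-association edges.
    for species, edges in edges_by_species.items():
        for gene_a, gene_b in edges:
            union((species, gene_a), (species, gene_b))

    # Orthology edges couple the per-species networks.
    for node_a, node_b in orthologs:
        union(node_a, node_b)

    groups = defaultdict(set)
    for node in list(parent):
        groups[find(node)].add(node)
    return list(groups.values())


def label_module(module):
    """A module spanning more than one species is called conserved here."""
    species = {sp for sp, _gene in module}
    return "conserved" if len(species) > 1 else "species-specific"


# Hypothetical toy data: gene identifiers are invented for illustration.
edges_by_species = {
    "worm": [("w1", "w2"), ("w2", "w3"), ("w4", "w5")],
    "fly": [("f1", "f2"), ("f3", "f4")],
}
orthologs = [(("worm", "w1"), ("fly", "f1"))]

modules = cross_species_modules(edges_by_species, orthologs)
labels = sorted(label_module(m) for m in modules)
print(labels)
```

In this toy input, the single orthology pair merges one worm group with one fly group into a module that spans both species, while the remaining groups stay within one species. A real analysis would replace the grouping step with a proper module-detection procedure on the coupled networks.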
Dartmouth Digital Commons Citation
Yan, Koon-Kiu; Wang, Daifeng; Rozowsky, Joel; Zheng, Henry; Cheng, Chao; and Gerstein, Mark, "OrthoClust: An Orthology-Based Network Framework for Clustering Data Across Multiple Species" (2014). Dartmouth Scholarship. 896.