seegraph - Dynamic Visualization of Coexpression in Systems Genetics Data
Biologists hope to address grand scientific challenges by exploring the abundance of data made available through microarray analysis and other high-throughput techniques. However, the impact of this large volume of data is limited unless researchers can effectively assimilate the entirety of this complex information and integrate it into their daily research; interactive visualization tools are called for to support the effort. Specifically, typical studies of gene coexpression can make use of novel visualization tools that enable the dynamic formulation and fine-tuning of hypotheses to aid the process of evaluating sensitivity of key parameters and achieving data reduction. These tools should allow biologists to develop an intuitive understanding of the structure of biological networks and discover genes that reside in critical positions in networks and pathways.
By using a graph as a universal data representation of correlation in gene expression data, our novel visualization tool employs several techniques that, when used together, provide innovative analytical capabilities. Our tool for interacting with gene coexpression data integrates graph layout, qualitative subgraph extraction through a novel 2D user interface (based on BTD transformation of adjacency matrices), quantitative data selection using graph-theoretic algorithms or by querying an optimized B-tree, dynamic level-of-detail graph abstraction, and template-based fuzzy classification using neural networks. We demonstrate our system using a real-world workflow from a large-scale systems genetics study of mammalian gene coexpression. This work has been published in IEEE Transactions on Visualization and Computer Graphics, Vol. 14, No. 5, 2008.
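To make the idea of a graph as a representation of correlation concrete, here is a minimal sketch in plain Python. It is not the seegraph implementation: the function names, the toy expression values, the 0.9 threshold, and the use of absolute Pearson correlation are all assumptions chosen for illustration. Connected components stand in for the graph-theoretic selection step only as a simple example; the study itself relied on paraclique extraction.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    denom = sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if denom == 0:
        return 0.0  # a constant profile has no defined correlation; treat as 0
    return sum(a * b for a, b in zip(dx, dy)) / denom


def coexpression_graph(expression, threshold):
    """Adjacency sets linking genes whose |correlation| meets the threshold."""
    graph = {gene: set() for gene in expression}
    for g, h in combinations(expression, 2):
        if abs(pearson(expression[g], expression[h])) >= threshold:
            graph[g].add(h)
            graph[h].add(g)
    return graph


def connected_components(graph):
    """Groups of mutually reachable genes, found by depth-first search."""
    seen, components = set(), []
    for start in graph:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            component.add(node)
            stack.extend(graph[node] - seen)
        components.append(component)
    return components


# Hypothetical toy data: gene name -> expression level across five samples.
toy_expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [2.1, 3.9, 6.2, 8.0, 9.8],
    "geneC": [5.0, 4.1, 2.9, 2.0, 1.1],
    "geneD": [3.0, 1.0, 4.0, 1.0, 5.0],
}

graph = coexpression_graph(toy_expression, threshold=0.9)
groups = connected_components(graph)
```

In this toy data, geneA, geneB, and geneC end up in one group (geneC is strongly anticorrelated with the other two), while geneD remains isolated. Raising or lowering the threshold changes which edges survive, which is the kind of parameter sensitivity an interactive tool lets a biologist explore.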
In the screenshot above, we show the BTD-based user interface and the resulting level-of-detail reduction of a dataset of more than 7,000 genes to 10 groups. Using a tool like this, researchers have discovered gene networks in which a single putatively coregulating gene is a potential target for knock-out study, together with proximity information for other potential regulatory genes that are undergoing further study. Our work led to the discovery of candidate genes that can affect the expression of several genes throughout the genome that play a role in the locomotor response of mice exposed to methamphetamine and cocaine.
Credits: The systems genetics data was provided by Dr. Elissa Chesler et al. under the auspices of Oak Ridge National Laboratory's Life Sciences Division. Subsequent paraclique extraction and data processing were performed by Dr. Michael Langston et al. as part of the ongoing collaboration between The University of Tennessee and Oak Ridge National Laboratory.
Joshua New, Wesley Kendall, Jian Huang, Elissa Chesler, 'Dynamic Visualization of Gene Coexpression in Systems Genetics Data', IEEE Transactions on Visualization and Computer Graphics, 14(5), pp. 1081-1094, 2008.