Page 745 – Stanford Medicine X

A powerful knowledge engine for genomics data analysis

Nahil Sobh sobh@illinois.edu


The KnowEnG BD2K Center at UIUC (in partnership with Mayo Clinic) is dedicated to building the “Knowledge Engine for Genomics” (KnowEnG), a Cloud-based E-science framework for genomics that gives biomedical scientists access to powerful data mining and machine learning methods for extracting insights from genomics data. Scientists will bring their own data sets to the KnowEnG portal in the form of spreadsheets and use KnowEnG to analyze them in light of a massive compendium of community data sets. These community data sets, stored in the form of the “Knowledge Network” (a heterogeneous network of genes and their relationships and annotations), will encapsulate prior knowledge that is incorporated into the analysis of user data sets.
I will describe the current progress of the KnowEnG Center, emphasizing the novel algorithms that we have developed and applied to discover mechanisms underlying diverse phenotypes such as drug response and social behavior. For instance, we have developed:
(1) a technique based on diffusion component analysis that identifies cancer pathways associated with drug response,
(2) an approach that uses network-smoothing of gene expression data and random walks with restart on the Knowledge Network to better rank cytotoxicity-related genes,
(3) a probabilistic graphical model that integrates genotype, gene expression and transcription factor-DNA binding data with drug response data to identify regulatory mechanisms of drug response variation across individuals, and
(4) random walk-based methods for gene set characterization, as an alternative to existing techniques such as gene set enrichment analysis, which we have used to glean systems-level insights about the social regulation of aggressive behavior.
I will present key ideas of these new approaches to knowledge-guided analysis of omic data sets, as well as major features of the Cloud-based knowledge engine enabling these analyses.
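
As a rough illustration of the random walk with restart idea mentioned in items (2) and (4), the sketch below ranks nodes of a small toy network by their proximity to a seed set. It is a generic textbook formulation, not the KnowEnG implementation: the function name, the parameters (restart, tol, max_iter), the dictionary-based network format and the node labels A to D are all assumptions made for this example.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.5, tol=1e-10, max_iter=1000):
    """Score every node by its proximity to the seed set.

    adjacency: dict mapping node -> dict of neighbor -> edge weight.
               Assumed symmetric, with every node present as a key.
    seeds:     nodes where the walker starts and restarts
               (for example, a user-supplied gene set).
    """
    nodes = list(adjacency)
    seeds = set(seeds)
    # Restart distribution: uniform over the seed nodes.
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    total_weight = {n: sum(adjacency[n].values()) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        # With probability `restart` the walker jumps back to a seed.
        nxt = {n: restart * p0[n] for n in nodes}
        for u in nodes:
            moving = (1.0 - restart) * p[u]
            if total_weight[u] == 0:
                # Isolated node: send its mass back to the seeds.
                for s in seeds:
                    nxt[s] += moving / len(seeds)
                continue
            # Otherwise the walker follows an edge chosen in proportion to weight.
            for v, w in adjacency[u].items():
                nxt[v] += moving * w / total_weight[u]
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p


# Toy network (hypothetical): a simple path A - B - C - D, seeded at A.
toy_network = {
    "A": {"B": 1.0},
    "B": {"A": 1.0, "C": 1.0},
    "C": {"B": 1.0, "D": 1.0},
    "D": {"C": 1.0},
}
scores = random_walk_with_restart(toy_network, seeds=["A"])
ranking = sorted(scores, key=scores.get, reverse=True)
```

In this toy case the scores form a probability distribution over the nodes and fall off with distance from the seed, which is the property that makes such scores usable for ranking genes relative to a query gene set.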
