The Duke Institute for Genome Sciences & Policy (IGSP), in partnership with the School of Medicine, is launching a new core resource aimed at making sense of the rising tide of genomic data. The new resource is expected to be up and running by August, offering consultation and services to Duke researchers anywhere on campus for analyzing and interpreting big data, including datasets involving genome sequences and biology, gene expression, proteomics and more.
The new core resource will be led by Director David Corcoran and complements the institute's stable of data-generating genome facilities. It is the latest in a series of new ventures coming out of an IGSP 2.0 evaluation and planning process launched by IGSP Director Huntington Willard early last year in anticipation of the IGSP's tenth-year review in 2012.
“There has been widespread recognition of the need for a substantial Duke community resource for computational and statistical analysis of large and complex datasets,” Willard said. “There are now literally hundreds of laboratories that are generating and receiving enormous datasets from our microarray facility, the genome sequencing facility, the proteomics facility and the RNAi facility. It has become clear that everyone underestimated the bottleneck that would be created in terms of data storage, data analysis and statistical and computational support. This new resource addresses this need.”
“There is a giant gap that this facility can fill,” Corcoran said. “The demand is huge, and I can only imagine that it will continue to grow in the coming years.”
Corcoran speaks from experience. He has a master's degree in biostatistics and a PhD in human genetics from the University of Pittsburgh and was most recently a member of Uwe Ohler's computational biology research group in the IGSP, where he led collaborative computational and systems biology projects on everything from Epstein–Barr virus infection to the Arabidopsis root. Now, his goal is to extend his knowledge, experience and skills to researchers across the Duke campus whose projects may be daunting in the complexity and scale of their high-dimensional data.
The open-access resource will provide production-mode data analysis, interactive and open-ended collaborative research support for projects extending from concept through to publication, and best practices in data management for a range of data-intensive projects. The facility will also offer training and consultation for researchers who need guidance or are interested in learning more about the statistical and computational analyses themselves.
Initially, the resource is expected to include a master's- or PhD-level bioinformatician with experience in the analysis of high-throughput genomic datasets, as well as a software engineer and IT liaison. In addition to analysis services and training, the core will make on-demand computational infrastructure available for exploring large datasets.
Learn more at http://genome.duke.edu/cores/analysis/