Major advances in understanding the regulation and organization of the human genome

JBC releases series of articles about ENCODE results

Sept. 5, 2012 – The National Human Genome Research Institute today announced the results of a five-year international study of the regulation and organization of the human genome. The project is named ENCODE, which stands for the Encyclopedia of DNA Elements. In conjunction with the release of those results, The Journal of Biological Chemistry has published a series of minireviews that focus on several aspects of the findings.


Introduction to the Thematic Minireview Series on results from the ENCODE Project: Integrative global analyses of regulatory regions in the human genome 

The Encyclopedia of DNA Elements (ENCODE) Project is an international collaboration of research groups funded by the National Human Genome Research Institute, with the goal of delineating all functional elements encoded in the human genome. This project began in 2003 with a targeted analysis of a selected 1% of the human genome. The results from the pilot project were published in 2007, and a second phase of funding was then provided to scale the project to the entire human genome. Genome-scale projects in ENCODE involve the identification and quantification of RNA species in whole cells and subcellular compartments, mapping of protein-coding and non-coding genes by manual review and experimental methods, delineation of chromatin and DNA accessibility, mapping of histone modifications and transcription factor binding sites by chromatin immunoprecipitation (ChIP), and measurement of DNA methylation. More recently, ENCODE has adopted additional approaches that have not yet resulted in extensive datasets, including the examination of long-range chromatin interactions, the analysis of RNA binding proteins, and the validation of transcriptional enhancers and silencers. To date, more than 2000 datasets have been deposited for public use by the ENCODE Project at the University of California, Santa Cruz (UCSC) Genome Browser; to encourage public use of the datasets, a “user’s guide” to the ENCODE datasets has been published. As the second phase of the ENCODE Project nears completion, the ENCODE Consortium has prepared a large, integrative manuscript that includes analyses of experiments from 147 cell types and provides a summary of their functional annotation of the human genome. Additionally, other more narrowly focused studies on subsets of ENCODE data have been or will soon be published; a list of ENCODE publications is available online.

“The ENCODE project not only generated an enormous body of data about our genome, but it also analyzed many issues to better understand how the genome functions in different types of cells. These insights from integrative analyses are really stories about how molecular machines interact with each other and work on DNA to produce the proteins and RNAs needed for each cell to function within our bodies,” explains Ross Hardison of Pennsylvania State University, one of the JBC authors.

Hardison continued: “The Journal of Biological Chemistry recognized that the results from the ENCODE project also would catalyze much new research from biochemists and molecular biologists around the world. Hence, the journal commissioned these articles not only to communicate the insights from the papers now being published but also to stimulate more research in the broader community.”

The human genome consists of about 3 billion DNA base pairs, but only a small percentage of that DNA actually codes for proteins. The roles and functions of the remaining genetic information were unclear to scientists, and it was even referred to as “junk DNA.” But the results of the ENCODE project are filling this knowledge gap. The findings revealed that more than 80 percent of the human genome is associated with biological function.

The study showed in a comprehensive way that proteins switch genes on and off regularly – and can do so at distances far from the genes they regulate – and it determined sites on chromosomes that interact, the locations where chemical modifications to DNA can influence gene expression, and how the functional forms of RNA can regulate the expression of genetic information.

The results establish the ways in which genetic information is controlled and expressed in specific cell types and distinguish particular regulatory regions that may contribute to diseases.

“The deeper knowledge of gene regulation coming from the ENCODE project will have a positive impact on medical science,” Hardison emphasizes. For example, recent genetic studies have revealed many genomic locations that can affect a person’s susceptibility to common diseases. The ENCODE data show that many of these regions are involved in gene regulation, and the data provide hypotheses for how variations in these regions can affect disease susceptibility, adds Hardison.

The effort behind the ENCODE project was extraordinary. More than 440 scientists in 32 labs in the United States, the United Kingdom, Spain, Singapore and Japan performed more than 1,600 sets of experiments on 147 cell types. The results were published today in one main integrative paper and five other papers in the journal Nature, 18 papers in Genome Research and six papers in Genome Biology.