ITHACA, New York – A new study by researchers at Penn State and Cornell describes an effort to produce the most comprehensive, high-resolution map to date of chromosome architecture and gene regulation in yeast, a major step toward a better understanding of development, evolution and environmental responses in higher organisms.
Specifically, the study identified precise binding sites for more than 400 different chromosomal proteins in the yeast genome, most of which regulate the expression of genes.
Yeast cells provide a simple model system containing 6,000 genes, most of which are found in other organisms, including humans, making them excellent candidates for studying essential genes and complex biological pathways.
The paper, “A High-Resolution Protein Architecture of the Budding Yeast Genome,” was published March 10 in the journal Nature.
“It’s a much more complex proposition, but like the yeast genome sequence that preceded the human sequence, I’m sure we’ll be able to see the organizational structure of the human genome,” said senior author B. Franklin Pugh ’83, professor of molecular biology and genetics in the College of Arts and Sciences and a former professor at Pennsylvania State University, where he began this work. Matthew Rossi, a research assistant professor at Penn State, is the paper’s first author.
The team used a technique called ChIP-exo to map the binding sites of about 400 different proteins that interact with the yeast genome, some at only a few sites and others at thousands.
The team conducted more than 1,200 individual ChIP-exo experiments, generating billions of individual data points. Analyzing this volume of data required Penn State’s supercomputing clusters and the development of several new bioinformatics tools to identify patterns and uncover the organization of regulatory proteins in the yeast genome. The analysis revealed a surprisingly small number of unique protein groupings that are used repeatedly across the yeast genome.
The study revealed two distinct architectures of gene regulation, broadening the traditional paradigm. So-called constitutive genes, which perform basic “housekeeping” functions and are always active at low levels, require only a basic set of regulatory controls, while inducible genes, which are activated by environmental cues, have a more specialized architecture.
The classic model of gene regulation involves proteins called “transcription factors,” which bind to specific DNA sequences to control the expression of a nearby gene. However, the researchers found that housekeeping genes, which make up the majority of genes in yeast, lack the protein-DNA architecture that would allow specific transcription factors to bind, an architecture that is a hallmark of inducible genes.
“The accuracy and completeness of the data allowed us to identify 21 protein groups and also to determine the absence of specific regulatory control signals in the housekeeping genes,” said co-author Shaun Mahony, assistant professor of biochemistry and molecular biology at Penn State. “The computational methods that we developed for analyzing this data could serve as a starting point for further development of gene regulation studies in more complex organisms.”
Co-authors include Associate Professor William K.M. Lai and postdoctoral researcher Chetvan Mittal, both of whom work in the BIOG lab; Greta de Kellogg, director of the Epigenetics Facility at the Cornell Biotechnology Resource Center; and Penn State researchers Prashant K. Kuntala, Naomi Yamada, Nitika Badgatia, Gaurai Cozo, Kylie Bucklund, Nina B. Farrell, Thomas R. Blanda, Joshua de Mayrose, Anne V. Basting, Caitlin S. Mystreta, David J. Rocco and Emily S. Perkinson.
This work was supported by the National Institutes of Health, the National Science Foundation and the Penn State Institute for Computational and Data Sciences, with computation performed on Penn State’s Roar supercomputer.