The deep learning-based tool predicts transcription factors using protein sequences as input
A joint research team from KAIST and UCSD has developed a deep neural network called DeepTFactor that predicts transcription factors from protein sequences. DeepTFactor will serve as a useful tool for understanding the regulatory systems of organisms and for accelerating the use of deep learning to solve biological problems.
A transcription factor is a protein that binds to specific DNA sequences to control the initiation of transcription. Analyzing transcriptional regulation shows how organisms control gene expression in response to genetic or environmental changes. Identifying an organism's transcription factors is therefore the first step in analyzing its transcriptional regulatory system.
Previously, transcription factors were predicted either by analyzing sequence homology with already characterized transcription factors or by data-driven approaches such as machine learning. Traditional machine learning models require a rigorous feature selection process that relies on domain expertise, such as calculating the physicochemical properties of molecules or analyzing the homology of biological sequences. Deep learning, by contrast, can learn latent features for a specific task directly from the data.
A joint research team composed of Ph.D. candidate Gi Bae Kim and Distinguished Professor Sang Yup Lee of the Department of Chemical and Biomolecular Engineering at KAIST, and Ye Gao and Professor Bernhard O. Palsson of the Department of Biochemical Engineering at the University of California, San Diego (UCSD), reported a deep learning-based tool for predicting transcription factors. Their paper, “DeepTFactor: A deep learning-based tool for the prediction of transcription factors,” was published online in PNAS.
Their article describes the development of DeepTFactor, a deep learning-based tool that uses three parallel convolutional neural networks to predict whether a given protein sequence is a transcription factor. The joint research team predicted 332 transcription factors of Escherichia coli K-12 MG1655 using DeepTFactor and validated its performance by experimentally confirming the genome-wide binding sites of three predicted transcription factors (YqhC, YiaU, and YahB).
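To make the idea of parallel convolutional branches concrete, the following is a minimal sketch in plain Python, not DeepTFactor's actual implementation. It one-hot encodes a protein sequence, scans it with three branches of 1D filters of different widths, and concatenates the pooled outputs into one feature vector. The kernel widths (3, 5, 7), the number of filters, the random untrained weights, and the example sequence are all assumptions chosen for illustration; the article does not state these details, and a real model would add trained weights and a classification layer.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def one_hot(seq):
    """Encode a protein sequence as a list of 20-dimensional one-hot rows."""
    return [[1.0 if aa == a else 0.0 for a in AMINO_ACIDS] for aa in seq]


def conv_branch(x, kernels):
    """Run one branch: 1D convolution, ReLU, then global max pooling.

    Each kernel is a list of `width` rows with 20 weights per row.
    Returns one pooled activation per kernel.
    """
    pooled = []
    for kernel in kernels:
        width = len(kernel)
        best = 0.0  # starting at 0 is equivalent to applying ReLU before max pooling
        for start in range(len(x) - width + 1):
            act = sum(
                kernel[i][j] * x[start + i][j]
                for i in range(width)
                for j in range(len(AMINO_ACIDS))
            )
            best = max(best, act)
        pooled.append(best)
    return pooled


def random_kernels(rng, n_filters, width):
    """Create untrained filters (illustrative only)."""
    return [
        [[rng.uniform(-1.0, 1.0) for _ in AMINO_ACIDS] for _ in range(width)]
        for _ in range(n_filters)
    ]


def parallel_cnn_features(seq, branches):
    """Feed the same encoded sequence to every branch and concatenate the results."""
    x = one_hot(seq)
    features = []
    for kernels in branches:
        features.extend(conv_branch(x, kernels))
    return features


rng = random.Random(0)
# Three parallel branches with different (assumed) kernel widths.
branches = [random_kernels(rng, n_filters=4, width=w) for w in (3, 5, 7)]
features = parallel_cnn_features("MKTAYIAKQRQISFVKSHFSRQ", branches)
print(len(features))  # 3 branches x 4 filters = 12 features
```

Using different kernel widths in parallel lets each branch respond to sequence motifs of a different length, which is one common motivation for this kind of design.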
The joint research team also used a saliency method to understand the reasoning process of DeepTFactor. The researchers confirmed that although information on the DNA-binding domains of transcription factors was not explicitly provided during training, DeepTFactor learned it implicitly and used it for prediction. Unlike previous transcription factor prediction tools, which were developed only for the protein sequences of specific organisms, DeepTFactor is expected to perform well in analyzing the transcriptional systems of all organisms.
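As a rough illustration of what a saliency analysis measures, the sketch below uses occlusion, a simple perturbation-based variant: each residue is masked in turn and the resulting drop in the model's score is recorded. The article does not say which saliency method the team used, so this is not their procedure. The scoring function `toy_score` is a hypothetical stand-in for a trained model, and the sequence is made up.

```python
def toy_score(seq):
    """Hypothetical stand-in for a trained model: fraction of K/R residues."""
    if not seq:
        return 0.0
    return sum(aa in "KR" for aa in seq) / len(seq)


def occlusion_saliency(seq, score_fn, mask="X"):
    """Return, for each position, the score drop caused by masking that residue."""
    base = score_fn(seq)
    return [
        base - score_fn(seq[:i] + mask + seq[i + 1:])
        for i in range(len(seq))
    ]


saliency = occlusion_saliency("MAKRLG", toy_score)
# Positions whose masking lowers the score most are the ones the model relies on.
important = [i for i, s in enumerate(saliency) if s > 0]
print(important)  # [2, 3]
```

With a real model, high-saliency positions that cluster inside known DNA-binding domains would indicate that the model had picked up on those regions without being told about them.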
Distinguished Professor Sang Yup Lee said, “DeepTFactor can be used to discover unknown transcription factors among the numerous protein sequences that have not yet been characterized. DeepTFactor is expected to serve as an important tool for analyzing the regulatory systems of organisms of interest.”
This work was supported by the Technology Development Program to Solve Climate Changes on Systems Metabolic Engineering for Biorefineries (NRF-2012M1A2A2026556 and NRF-2012M1A2A2026557) from the Ministry of Science and ICT through the National Research Foundation (NRF) of Korea.
KAIST is the first and top-ranked university of science and technology in Korea. It was established in 1971 by the Korean government to educate scientists and engineers committed to industrialization and economic growth in Korea.
Since then, KAIST and its 64,739 alumni have been a gateway to advanced science, technology, innovation, and entrepreneurship. KAIST has emerged as one of the most innovative universities, with more than 10,000 students enrolled in five colleges and seven schools, including 1,039 international students from 90 countries.
On the eve of its 50th anniversary in 2021, KAIST continues to strive to make the world better through the pursuit of education, research, entrepreneurship, and globalization.