Computational Methods for Protein Structure Prediction & Modeling V1 - Xu Xu and Liang


Preface

structure–activity relationship. A number of software packages for structure-based design are compared.

Chapter 17 (Protein Structure Prediction as a Systems Problem) provides a novel systematic view on solving the complex problem of protein structure prediction. It introduces the consensus-based approach, the pipeline approach, and expert systems for predicting protein structure and for inferring protein functions. This chapter also discusses issues such as benchmark data and evaluation metrics. An example of protein structure prediction at genome-wide scale is also given.

Chapter 18 (Resources and Infrastructure for Structural Bioinformatics) describes tools, databases, and other resources for protein structure analysis and prediction available on the Internet. These include the PDB and related databases and servers, structural visualization tools, protein sequence and function databases, as well as resources for RNA structure modeling and prediction. It also gives information on major journals, professional societies, and conferences in the field.

Appendix 1 (Biological and Chemical Basics Related to Protein Structures) introduces the central dogma of molecular biology, macromolecules in the cell (DNA, RNA, protein), amino acid residues, the peptide chain, the primary, secondary, tertiary, and quaternary structure of proteins, and protein evolution.

Appendix 2 (Computer Science for Structural Informatics) discusses computer science concepts that are essential for effective computation in protein structure prediction. These include efficient data structures, computational complexity and NP-hardness, various algorithmic techniques, parallel computing, and programming.

Appendix 3 (Physical and Chemical Basis for Structural Bioinformatics) covers basic concepts of our physical world, including unit systems, coordinate systems, and energy surfaces. It also describes biochemical and biophysical concepts such as chemical reactions, peptide bonds, covalent bonds, hydrogen bonds, electrostatic interactions, van der Waals interactions, and hydrophobic interactions. In addition, this appendix discusses basic concepts from thermodynamics and statistical mechanics. Computational sampling techniques such as molecular dynamics and the Monte Carlo method are also discussed.

Appendix 4 (Mathematics and Statistics for Studying Protein Structures) covers basic concepts in mathematics and statistics that are often used in structural bioinformatics studies. These include probability distributions (uniform, Gaussian, binomial and multinomial, Dirichlet and gamma, and extreme value distributions); the basics of information theory, including entropy, relative entropy, and mutual information; Markov processes and hidden Markov models; hypothesis testing; statistical inference (maximum likelihood, expectation maximization, and the Bayesian approach); and statistical sampling (rejection sampling, Gibbs sampling, and the Metropolis–Hastings algorithm).

Ying Xu

Dong Xu

Jie Liang

John Wooley

April 2006

Acknowledgments

During the editing of this book, we, the editors, have received tremendous help from many friends, colleagues, and family members, to whom we would like to take this opportunity to express our deep gratitude and appreciation. First, we would like to thank Dr. Eli Greenbaum of Oak Ridge National Laboratory, who encouraged us to start this book project and contacted the publisher at Springer on our behalf. We are very grateful to the following colleagues, who critically reviewed drafts of the chapters at various stages: Nick Alexandrov, Nir Ben-Tal, Natasja Brooijmans, Chris Bystroff, Pablo Chacon, Luonan Chen, Zhong Chen, Yong Duan, Roland Dunbrack, Daniel Fischer, Juntao Guo, Jaap Heringa, Xiche Hu, Ana Kitazono, Ioan Kosztin, Sandeep Kumar, Xiang Li, Guohui Lin, Zhijie Liu, Hui Lu, Alex Mackerell, Kunbin Qu, Robert C. Rizzo, Ilya Shindyalov, Ambuj Singh, Alex Tropsha, Iosif Vaisman, Ilya Vakser, Stella Veretnik, Björn Wallner, Jin Wang, Zhexin Xiang, Yang Dai, Xin Yuan, and Yaoqi Zhou. Their invaluable input on the scientific content, the pedagogical style, and the writing style helped to improve these chapters significantly. We also want to thank Ms. Joan Yantko of the University of Georgia for her tireless help on numerous fronts in this book project, including handling a large number of email communications between the editors and the authors and chasing busy authors for their revisions and other materials. Last but not least, we want to thank our families for their constant support and encouragement while we worked on this book project.


Contents

Contributors
1	A Historical Perspective and Overview of Protein Structure Prediction
	John C. Wooley and Yuzhen Ye
2	Empirical Force Fields
	Alexander D. MacKerell, Jr.
3	Knowledge-Based Energy Functions for Computational Studies of Proteins
	Xiang Li and Jie Liang
4	Computational Methods for Domain Partitioning of Protein Structures
	Stella Veretnik and Ilya Shindyalov
5	Protein Structure Comparison and Classification
	Orhan Çamoğlu and Ambuj K. Singh
6	Computation of Protein Geometry and Its Applications: Packing and Function Prediction
	Jie Liang
7	Local Structure Prediction of Proteins
	Victor A. Simossis and Jaap Heringa
8	Protein Contact Map Prediction
	Xin Yuan and Christopher Bystroff
9	Modeling Protein Aggregate Assembly and Structure
	Jun-tao Guo, Carol K. Hall, Ying Xu, and Ronald B. Wetzel
10	Homology-Based Modeling of Protein Structure
	Zhexin Xiang
11	Modeling Protein Structures Based on Density Maps at Intermediate Resolutions
	Jianpeng Ma
Index

Contributors

Natasja Brooijmans

Chemical and Screening Sciences

Wyeth Research

Pearl River, New York 10965

Christopher Bystroff

Department of Biology

Rensselaer Polytechnic Institute

Troy, New York 12180

Liming Cai

Department of Computer Science

University of Georgia

Athens, Georgia 30602-7404

Orhan Çamoğlu

Department of Computer Science

University of California Santa Barbara

Santa Barbara, California 93106

Yang Dai

Department of Bioengineering

University of Illinois at Chicago

Chicago, Illinois 60607-7052

Haobo Guo

Department of Biochemistry and Cellular and Molecular Biology

University of Tennessee

Knoxville, Tennessee 37996

Hong Guo

Department of Biochemistry and Cellular and Molecular Biology

University of Tennessee

Knoxville, Tennessee 37996

Jun-tao Guo

Department of Biochemistry and Molecular Biology

University of Georgia

Athens, Georgia 30602-7229

Carol K. Hall

Department of Chemical and Biomolecular Engineering

North Carolina State University

Raleigh, North Carolina 27695

Jaap Heringa

Centre for Integrative Bioinformatics Vrije Universiteit

1081 HV Amsterdam, The Netherlands


Xiche Hu

Department of Chemistry

University of Toledo

Toledo, Ohio 43606

Ling-Hong Hung

Department of Microbiology

University of Washington

Seattle, Washington 98195-7242

Xiang Li

Department of Bioengineering

University of Illinois at Chicago

Chicago, Illinois 60607-7052

Jie Liang

Department of Bioengineering

University of Illinois at Chicago

Chicago, Illinois 60607-7052

Guohui Lin

Department of Computing Science

University of Alberta

Edmonton, Alberta T6G 2E8, Canada

Zhijie Liu

Department of Biochemistry and Molecular Biology

University of Georgia

Athens, Georgia 30602-7229

Hui Lu

Department of Bioengineering

University of Illinois at Chicago

Chicago, Illinois 60607-7052

Jianpeng Ma

Department of Biochemistry and Molecular Biology

Baylor College of Medicine

Houston, Texas 77030

and

Department of Bioengineering

Rice University

Houston, Texas 77005

Alexander D. MacKerell, Jr.

Department of Pharmaceutical Chemistry

School of Pharmacy

University of Maryland

Baltimore, Maryland 21201

Shing-Chung Ngan

Department of Microbiology

University of Washington

Seattle, Washington 98195-7242

Ognjen Perišić

Department of Bioengineering

University of Illinois at Chicago

Chicago, Illinois 60607-7052

Brian Pierce

Department of Biomedical Engineering

Boston University

Boston, Massachusetts 02215

Kunbin Qu

Department of Chemistry

Rigel Pharmaceuticals, Inc.

San Francisco, California 94080

Stella Veretnik

San Diego Supercomputer Center

University of California San Diego

San Diego, California 92093-0505

Zhiping Weng

Department of Biomedical Engineering

Boston University

Boston, Massachusetts 02215

Ram Samudrala

Department of Microbiology

University of Washington

Seattle, Washington 98195-7242

Ilya Shindyalov

San Diego Supercomputer Center

University of California San Diego

San Diego, California 92093-0505

Victor A. Simossis

Centre for Integrative Bioinformatics Vrije Universiteit

1081 HV Amsterdam, The Netherlands

Ambuj K. Singh

Department of Computer Science

University of California Santa Barbara

Santa Barbara, California 93106

Ronald B. Wetzel

Department of Structural Biology

Pittsburgh Institute for Neurodegenerative Diseases

University of Pittsburgh School of Medicine

Pittsburgh, Pennsylvania 15260

John C. Wooley

Associate Vice Chancellor for Research

University of California San Diego

San Diego, California 92093-0043

Zhexin Xiang

Center for Molecular Modeling

Center for Information Technology

National Institutes of Health

Bethesda, Maryland 20892-5624

Dong Xu

Computer Science Department

University of Missouri-Columbia

Columbia, Missouri 65211-2060

Ying Xu

Institute of Bioinformatics and Department of Biochemistry and Molecular Biology

University of Georgia

Athens, Georgia 30602-7229

Yuzhen Ye

Bioinformatics and Systems Biology Department

The Burnham Institute for Medical Research

La Jolla, California 92037

Xin Yuan

Department of Computer Science

Florida State University

Tallahassee, Florida 32306

1 A Historical Perspective and Overview of Protein Structure Prediction

John C. Wooley and Yuzhen Ye

1.1 Introduction

Proteins carry out many different biological functions, yet all are composed of one or more polypeptide chains, each containing from several to hundreds or even thousands of the 20 amino acids. During the 1950s, at the dawn of modern biochemistry, an essential question for biochemists was to understand the structure and function of these polypeptide chains. The sequences of proteins, also referred to as their primary structures, determine the different chemical properties of different proteins, and thus continue to captivate much of the attention of biochemists. As an early step in characterizing protein chemistry, British biochemist Frederick Sanger designed an experimental method to identify the sequence of insulin (Sanger et al., 1955). He became the first person to obtain the primary structure of a protein and in 1958 won his first Nobel Prize in Chemistry. This important progress in sequencing did not answer the question of whether a single (individual) protein has a distinctive shape in three dimensions (3D), and if so, what factors determine its 3D architecture. However, during the period when Sanger was studying the primary structure of proteins, American biochemist Christian Anfinsen observed that the active polypeptide chain of a model protein, bovine pancreatic ribonuclease (RNase), could fold spontaneously into a unique 3D structure, which was later called the native conformation of the protein (Anfinsen et al., 1954). Anfinsen also studied the refolding of the RNase enzyme and observed that an enzyme unfolded under extreme chemical conditions could refold spontaneously into its native conformation once the environment was returned to natural conditions (Anfinsen et al., 1961). By 1962, Anfinsen had developed his theory of protein folding (which was summarized in his 1972 Nobel acceptance speech): “The native conformation is determined by the totality of interatomic interactions and hence, by the amino acid sequence, in a given environment.”

Anfinsen’s theory of protein folding established the foundation for solving the protein structure prediction problem, i.e., for predicting the native conformation of a protein from its primary sequence, because all the information needed to predict the native conformation is encoded in the sequence. The early approaches to solving this problem were based solely on the thermodynamics of protein folding. Scheraga and his colleagues applied several computer search techniques to investigate the free energy of numerous local minimum-energy conformations in an attempt to find the global minimum conformation, i.e., the thermodynamically most stable conformation of the protein (Gibson and Scheraga, 1967a,b; Scott et al., 1967). The major challenge for an energy minimization approach to protein structure prediction is that proteins are very flexible; thus, their potential conformation space is too large to be enumerated. [The observation that proteins nevertheless fold reliably and quickly to their native conformation, despite the huge space of possible conformations, is known as “Levinthal’s paradox” (Levinthal, 1968).] To address this issue, one needs both an accurate energy function to compute the energy of a given protein conformation and a rapid computer search algorithm. Progress in peptide molecular mechanics enabled the development of molecular force fields that describe the physical interactions between atoms using Newton’s equations of motion. In general, the interactions considered in a force field include covalent bonds and noncovalent interactions, such as electrostatic interactions, van der Waals interactions, and, sometimes, hydrogen bonds and hydrophobic interactions. The parameters used in these force fields were obtained through experimental studies of small organic molecules. On the other hand, many computational methods developed in the fields of optimization theory and mechanics have been applied to rapid conformation search. These fall into two categories: the molecular dynamics method and the Brownian dynamics (or stochastic dynamics) method. Both methods sample a portion of the potential protein conformations and evaluate their free energy. Molecular dynamics samples conformations by simulating the protein motion based on Newton’s equations, starting from an arbitrarily chosen protein conformation. Brownian dynamics, instead, uses the Monte Carlo random sampling technique or its derivatives to evaluate protein conformations. Combining various force fields and conformation search methods, many software packages were developed, such as AMBER (Pearlman et al., 1995), CHARMM (Brooks et al., 1983), and GROMOS (van Gunsteren and Berendsen, 1990), all aimed at using computer simulations to predict the native conformation of proteins.
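The pairing of an energy function with a random conformational search can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from AMBER, CHARMM, or GROMOS: the "protein" is a short chain of beads, the energy has only a harmonic bond term and a 12-6 Lennard-Jones term with arbitrary parameters, and the search uses the Metropolis acceptance rule as one common form of Monte Carlo sampling. The function names (`lennard_jones`, `harmonic_bond`, `total_energy`, `metropolis_search`) are hypothetical.

```python
import math
import random


def lennard_jones(r, epsilon=1.0, sigma=1.0):
    """12-6 Lennard-Jones term, a common model of van der Waals interactions."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def harmonic_bond(r, k=100.0, r0=1.0):
    """Harmonic penalty for stretching a covalent bond away from length r0."""
    return 0.5 * k * (r - r0) ** 2


def total_energy(coords):
    """Toy energy: bonds between chain neighbours, Lennard-Jones for all other pairs."""
    energy = 0.0
    n = len(coords)
    for i in range(n):
        for j in range(i + 1, n):
            r = math.dist(coords[i], coords[j])
            if j == i + 1:
                energy += harmonic_bond(r)
            else:
                energy += lennard_jones(r)
    return energy


def metropolis_search(coords, steps=2000, step_size=0.1, kT=0.5, seed=0):
    """Randomly perturb one bead at a time and accept moves by the Metropolis rule."""
    rng = random.Random(seed)
    current = [list(p) for p in coords]  # work on a copy of the input
    e_current = total_energy(current)
    best, e_best = [p[:] for p in current], e_current
    for _ in range(steps):
        i = rng.randrange(len(current))
        old = current[i][:]
        current[i] = [x + rng.uniform(-step_size, step_size) for x in old]
        e_new = total_energy(current)
        # Always accept downhill moves; accept uphill moves with Boltzmann probability.
        if e_new <= e_current or rng.random() < math.exp(-(e_new - e_current) / kT):
            e_current = e_new
            if e_new < e_best:
                best, e_best = [p[:] for p in current], e_new
        else:
            current[i] = old  # reject the move and restore the bead
    return best, e_best


if __name__ == "__main__":
    # Start from a fully extended five-bead chain along the x axis.
    start = [[float(i), 0.0, 0.0] for i in range(5)]
    best, e_best = metropolis_search(start)
    print("initial energy:", total_energy(start))
    print("lowest energy found:", e_best)
```

Even in this toy setting the search only ever visits a small sample of conformations, which is the practical limitation the text describes for real proteins.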

Despite the great theoretical interest in energy minimization methods, they have not been very successful in practice because of the huge search space of potential protein conformations. In 1975, Levitt and Warshel used a simplified protein structure representation and successfully folded a small protein [bovine pancreatic trypsin inhibitor (BPTI), 58 amino acid residues] into its native conformation from an open-chain conformation using energy minimization (Levitt and Warshel, 1975). Little progress, however, has been made since then; the simulation usually takes an unrealistic amount of computing time, and the final prediction is not very satisfactory. For instance, in 1998, Duan and Kollman reported a simulation of one small protein (the villin headpiece subdomain, 36 amino acid residues), running on a Cray T3D and then a Cray T3E supercomputer, that took months of computation with the entire machine dedicated to the problem (Duan and Kollman, 1998). Even though the resulting structure was reasonably folded and showed some resemblance to the native structure, the simulated and native structures did not completely match. Currently, energy minimization methods are largely used to refine a low-resolution initial structure obtained by experimental methods or by comparative modeling (Levitt and Lifson, 1969).
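To give a rough sense of why the search space is called huge, the following back-of-the-envelope sketch counts conformations for the two proteins mentioned above. The figure of three conformational states per residue and the evaluation rate are illustrative assumptions, not values from the text, and the function names are hypothetical.

```python
def conformation_count(n_residues, states_per_residue=3):
    """Number of chain conformations if each residue independently adopts one of a few states."""
    return states_per_residue ** n_residues


def enumeration_years(n_residues, states_per_residue=3, rate_per_second=1e12):
    """Years needed to visit every conformation at an assumed evaluation rate."""
    seconds = conformation_count(n_residues, states_per_residue) / rate_per_second
    return seconds / (365.25 * 24 * 3600)


if __name__ == "__main__":
    # Residue counts taken from the text: villin headpiece subdomain (36) and BPTI (58).
    for name, n in [("villin headpiece subdomain", 36), ("BPTI", 58)]:
        print(f"{name}: {conformation_count(n):.2e} conformations, "
              f"{enumeration_years(n):.2e} years to enumerate")
```

Under these assumptions the counts are on the order of 10^17 for 36 residues and 10^27 for 58 residues, so adding 22 residues multiplies the search space by roughly ten orders of magnitude.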
