344 |
Dunbrack |
predictive distribution and can be achieved by making draws from the posterior distribution and from these values, making draws from the likelihood function, i.e.,
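$$
p(\tilde{y} \mid y) = \int_{\Theta} p(\tilde{y} \mid \theta)\, p(\theta \mid y)\, d\theta = \frac{\int_{\Theta} p(\tilde{y} \mid \theta)\, p(y \mid \theta)\, p(\theta)\, d\theta}{\int_{\Theta} p(y \mid \theta)\, p(\theta)\, d\theta} \tag{56}
$$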
This distribution resembles the data closely for rotamer (3, 3, 3) but also forms a very reasonable distribution when there are only seven data points (3, 3, 1). A good posterior predictive distribution for any protein structural feature can be used in simulations of protein folding or structure prediction.
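The two-stage simulation described above (draw θ from the posterior, then draw new data from the likelihood given that θ) can be sketched in a few lines. This is only an illustration under stated assumptions: it takes a Dirichlet prior with a multinomial likelihood, so that the posterior is again Dirichlet, and the counts, the prior parameters, and the function name `posterior_predictive_draws` are hypothetical rather than taken from the rotamer analysis in this chapter.

```python
import numpy as np


def posterior_predictive_draws(counts, prior_alpha, n_new, n_draws, seed=0):
    """Simulate the posterior predictive distribution of Eq. (56).

    Assumes a Dirichlet prior and a multinomial likelihood (an illustrative
    choice). Returns the posterior draws of theta and the simulated data sets.
    """
    rng = np.random.default_rng(seed)
    counts = np.asarray(counts, dtype=float)
    prior_alpha = np.asarray(prior_alpha, dtype=float)

    # Conjugacy: Dirichlet(alpha) prior + multinomial counts
    # gives a Dirichlet(alpha + counts) posterior, p(theta | y).
    post_alpha = prior_alpha + counts

    # Step 1: draw theta from the posterior.
    theta = rng.dirichlet(post_alpha, size=n_draws)

    # Step 2: for each theta, draw new data from the likelihood p(y~ | theta).
    y_new = np.array([rng.multinomial(n_new, t) for t in theta])
    return theta, y_new


if __name__ == "__main__":
    # Hypothetical example: seven observations spread over three categories,
    # with a flat Dirichlet(1, 1, 1) prior.
    theta, y_new = posterior_predictive_draws(
        counts=[4, 2, 1], prior_alpha=[1.0, 1.0, 1.0], n_new=7, n_draws=5000
    )
    print("mean simulated frequencies:", y_new.mean(axis=0) / 7)
```

Averaging the simulated data sets approximates the integral in Eq. (56) by Monte Carlo. With sparse data the simulated frequencies stay close to the prior, and they move toward the observed frequencies as counts accumulate.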
V. CONCLUSION
The field of statistics arose in the eighteenth and nineteenth centuries because of the need to develop good public policy based on demographic and economic data. Applications in the natural sciences were immediate, but natural scientists have generally lagged behind social scientists in their knowledge of modern statistics. This is unfortunate, because many algorithms and methodologies developed in the last 20 years or so make sophisticated analysis of very complex data sets feasible. Bayesian statistics has been used fruitfully in molecular and structural biology in recent years but has enjoyed more applications in genetics, clinical research, and the social sciences. Bayesian methods are particularly useful in modeling complex data, where the distribution of information may be uneven or hierarchical. This is true not only of the sequence and structure databases described in this chapter but also of more recently developed experimental methods such as DNA microarrays for analyzing mRNA expression levels over many thousands of genes [100–106]. The computational challenges for this kind of data are immense [107,108]. Particularly now, when the influx of data in biology is overwhelming, Bayesian statistical analysis promises to be an important tool.
ACKNOWLEDGMENTS
I thank Prof. Marc Sobel of Temple University for many useful discussions on Bayesian statistics. This work was funded in part by an appropriation from the Commonwealth of Pennsylvania and NIH Grant CA06927.
REFERENCES
1. RL Dunbrack Jr. Culling the PDB by resolution and sequence identity. 1999. http://www.fccc.edu/research/labs/dunbrack/culledpdb.html
2. CA Orengo, AD Michie, S Jones, DT Jones, MB Swindells, JM Thornton. CATH—A hierarchic classification of protein domain structures. Structure 5:1093–1108, 1997.
3. L Holm, C Sander. Touring protein fold space with Dali/FSSP. Nucleic Acids Res 26:316–319, 1998.
Bayesian Statistics |
345 |
4. TJ Hubbard, B Ailey, SE Brenner, AG Murzin, C Chothia. SCOP: A structural classification of proteins database. Nucleic Acids Res 27:254–256, 1999.
5. JS Shoemaker, IS Painter, BS Weir. Bayesian statistics in genetics: A guide for the uninitiated. Trends Genet 15:354–358, 1999.
6. S Greenland. Probability logic and probability induction. Epidemiology 9:322–332, 1998.
7. GM Petersen, G Parmigiani, D Thomas. Missense mutations in disease genes: A Bayesian approach to evaluate causality. Am J Hum Genet 62:1516–1524, 1998.
8. DA Berry, DK Stangl, eds. Bayesian Biostatistics. New York: Marcel Dekker, 1996.
9. G D’Agostini. Bayesian reasoning in high energy physics: Principles and applications. CERN Lectures, 1998.
10. TJ Loredo. In: PF Fougère, ed. From Laplace to Supernova SN 1987A: Bayesian Inference in Astrophysics. Dordrecht, The Netherlands: Kluwer, 1990, pp 81–142.
11. TJ Loredo. In: ED Feigelson, GJ Babu, eds. The Promise of Bayesian Inference for Astrophysics. New York: Springer-Verlag, 1992, pp 275–297.
12. E Parent, P Hubert, B Bobée, J Miquel, eds. Statistical and Bayesian Methods in Hydrological Sciences. Paris: UNESCO Press, 1998.
13. CE Buck, WG Cavanaugh, CD Litton. The Bayesian Approach to Interpreting Archaeological Data. New York: Wiley, 1996.
14. A Zellner. An Introduction to Bayesian Inference in Econometrics. New York: Wiley, 1971.
15. J Zhu, JS Liu, CE Lawrence. Bayesian adaptive sequence alignment algorithms. Bioinformatics 14:25–39, 1998.
16. K Sjölander, K Karplus, M Brown, R Hughey, A Krogh, IS Mian, D Haussler. Dirichlet mixtures: A method for improved detection of weak but significant protein sequence homology. Comput Appl Biosci 12:327–345, 1996.
17. K Karplus, K Sjolander, C Barrett, M Cline, D Haussler, R Hughey, L Holm, C Sander. Predicting protein structure using hidden Markov models. Proteins Suppl: 134–139, 1997.
18. RH Lathrop, TF Smith. Global optimum protein threading with gapped alignment and empirical pair score functions. J Mol Biol 255:641–665, 1996.
19. RH Lathrop, JR Rogers Jr, TF Smith, JV White. A Bayes-optimal sequence–structure theory that unifies protein sequence–structure recognition and alignment. Bull Math Biol 60:1039–1071, 1998.
20. RA Chylla, JL Markley. Improved frequency resolution in multidimensional constant-time experiments by multidimensional Bayesian analysis. J Biomol NMR 3:515–533, 1993.
21. DA d’Avignon, GL Bretthorst, ME Holtzer, A Holtzer. Thermodynamics and kinetics of a folded–folded′ transition at valine-9 of a GCN4-like leucine zipper. Biophys J 76:2752–2759, 1999.
22. JA Lukin, AP Gove, SN Talukdar, C Ho. Automated probabilistic method for assigning backbone resonances of (13C,15N)-labeled proteins. J Biomol NMR 9:151–166, 1997.
23. MT McMahon, E Oldfield. Determination of order parameters and correlation times in proteins: A comparison between Bayesian, Monte Carlo and simple graphical methods. J Biomol NMR 13:133–137, 1999.
24. MF Ochs, RS Stoyanova, F Arias-Mendoza, TR Brown. A new method for spectral decomposition using a bilinear Bayesian approach. J Magn Reson 137:161–176, 1999.
25. TO Yeates. The asymmetric regions of rotation functions between Patterson functions of arbitrarily high symmetry. Acta Crystallogr A 49:138–141, 1993.
26. S Doublie, S Xiang, CJ Gilmore, G Bricogne, CW Carter Jr. Overcoming non-isomorphism by phase permutation and likelihood scoring: Solution of the TrpRS crystal structure. Acta Crystallogr A 50:164–182, 1994.
27. CW Carter Jr. Entropy, likelihood and phase determination. Structure 3:147–150, 1995.
28. RL Dunbrack Jr, FE Cohen. Bayesian statistical analysis of protein sidechain rotamer preferences. Protein Sci 6:1661–1681, 1997.
29. P Baldi, S Brunak. Bioinformatics: The Machine Learning Approach. Cambridge, MA: MIT Press, 1998.
30. ET Jaynes. Probability Theory: The Logic of Science. http://bayes.wustl.edu/etj/prob.html. 1999.
31. M Gardner. The Second Scientific American Book of Mathematical Puzzles and Diversions. New York: Simon and Schuster, 1961.
32. T Bayes. An essay towards solving a problem in the doctrine of chances. Phil Trans Roy Soc Lond 53:370, 1763.
33. PS Laplace. Théorie Analytique des Probabilités. Paris: Courcier, 1812.
34. TM Porter. The Rise of Statistical Thinking. Princeton, NJ: Princeton Univ Press, 1988.
35. JO Berger, M Delampady. Testing precise hypotheses. Stat Sci 2:317–352, 1987.
36. TS Kuhn. Structure of Scientific Revolutions. Chicago: Univ Chicago Press, 1974.
37. DV Lindley. The 1988 Wald Memorial Lecture: The present position of Bayesian statistics. Stat Sci 5:44–89, 1990.
38. H Jeffreys. Theory of Probability. Oxford: Clarendon Press, 1939.
39. LJ Savage. The Foundations of Statistics. New York: Wiley, 1954.
40. WR Gilks, S Richardson, DJ Spiegelhalter, eds. Markov Chain Monte Carlo in Practice. London: Chapman & Hall, 1996.
41. IJ Good. The Bayes/non-Bayes compromise: A brief review. J Am Stat Assoc 87:597–606, 1992.
42. J Cornfield. In: DL Meyer, RO Collier, eds. The Frequency Theory of Probability, Bayes’ Theorem, and Sequential Clinical Trials. Bloomington, IN: Phi Delta Kappa, 1970, pp 1–28.
43. M Bower, FE Cohen, RL Dunbrack Jr. Prediction of protein sidechain rotamers from a backbone-dependent rotamer library: A new homology modeling tool. J Mol Biol 267:1268–1282, 1997.
44. A Gelman, JB Carlin, HS Stern, DB Rubin. Bayesian Data Analysis. London: Chapman & Hall, 1995.
45. N Metropolis, S Ulam. The Monte Carlo method. J Am Stat Assoc 44:335–341, 1949.
46. N Metropolis, AW Rosenbluth, MN Rosenbluth, AH Teller, E Teller. Equation of state calculations by fast computing machines. J Chem Phys 21:1087–1092, 1953.
47. WK Hastings. Monte Carlo sampling methods using Markov chains and their applications. Biometrika 57:97–109, 1970.
48. CP Robert. In: WR Gilks, S Richardson, DJ Spiegelhalter, eds. Mixtures of Distributions: Inference and estimation. London: Chapman & Hall, 1996, pp 441–464.
49. M Gribskov, AD McLachlan, D Eisenberg. Profile analysis: Detection of distantly related proteins. Proc Natl Acad Sci USA 84:4355–4358, 1987.
50. JU Bowie, ND Clarke, CO Pabo, RT Sauer. Identification of protein folds: Matching hydrophobicity patterns of sequence sets with solvent accessibility patterns of known structures. Proteins Struct Funct Genet 7:257–264, 1990.
51. M Brown, R Hughey, A Krogh, IS Mian, K Sjolander, D Haussler. Using Dirichlet mixture priors to derive hidden Markov models for protein families. Intelligent Systems in Molecular Biology 1:47–55, 1993.
52. K Karplus. Evaluating regularizers for estimating distributions of amino acids. Intelligent Systems in Molecular Biology 3:188–196, 1995.
53. RL Tatusov, EV Koonin, DJ Lipman. A genomic perspective on protein families. Science 278:631–637, 1997.
54. TL Bailey, M Gribskov. The megaprior heuristic for discovering protein sequence patterns. Intelligent Systems in Molecular Biology 4:15–24, 1996.
55. S Pietrokovski, JG Henikoff, S Henikoff. The BLOCKS database—A system for protein classification. Nucleic Acids Res 24:197–200, 1996.
56. C Dodge, R Schneider, C Sander. The HSSP database of protein structure–sequence alignments and family profiles. Nucleic Acids Res 26:313–315, 1998.
57. AE Sluder, SW Mathews, D Hough, VP Yin, CV Maina. The nuclear receptor superfamily has undergone extensive proliferation and diversification in nematodes. Genome Res 9:103–120, 1999.
58. MO Dayhoff, WC Barker, PJ McLaughlin. Inferences from protein and nucleic acid sequences: Early molecular evolution, divergence of kingdoms and rates of change. Orig Life 5:311–330, 1974.
59. MO Dayhoff. The origin and evolution of protein superfamilies. Fed Proc 35:2132–2138, 1976.
60. WC Barker, MO Dayhoff. Evolution of homologous physiological mechanisms based on protein sequence data. Comp Biochem Physiol [B] 62:1–5, 1979.
61. JS Liu, CE Lawrence. Bayesian inference on biopolymer models. Bioinformatics 15:38–52, 1999.
62. TF Smith, MS Waterman. Identification of common molecular subsequences. J Mol Biol 147:195–197, 1981.
63. M Hendlich, P Lackner, S Weitckus, H Flöckner, R Froschauer, K Gottsbacher, G Casari, MJ Sippl. Identification of native protein folds amongst a large number of incorrect models. J Mol Biol 216:167–180, 1990.
64. MAS Saqi, PA Bates, MJE Sternberg. Towards an automatic method of predicting protein structure by homology: An evaluation of suboptimal sequence alignments. Protein Eng 5:305–311, 1992.
65. DT Jones, WR Taylor, JM Thornton. A new approach to protein fold recognition. Nature 358:86–89, 1992.
66. SH Bryant, CE Lawrence. An empirical energy function for threading protein sequence through the folding motif. Proteins Struct Funct Genet 16:92–112, 1993.
67. R Abagyan, D Frishman, P Argos. Recognition of distantly related proteins through energy calculations. Proteins Struct Funct Genet 19:132–140, 1994.
68. TJ Hubbard, J Park. Fold recognition and ab initio structure predictions using hidden Markov models and β-strand pair potentials. Proteins Struct Funct Genet 23:398–402, 1995.
69. NN Alexandrov. SARFing the PDB. Protein Eng 9:727–732, 1996.
70. D Fischer, D Eisenberg. Protein fold recognition using sequence-derived predictions. Protein Sci 5:947–955, 1996.
71. TR Defay, FE Cohen. Multiple sequence information for threading algorithms. J Mol Biol 262:314–323, 1996.
72. B Rost, R Schneider, C Sander. Protein fold recognition by prediction-based threading. J Mol Biol 270:471–480, 1997.
73. WR Taylor. Multiple sequence threading: An analysis of alignment quality and stability. J Mol Biol 269:902–943, 1997.
74. V DiFrancesco, J Garnier, PJ Munson. Protein topology recognition from secondary structure sequences: Application of the hidden Markov models to the alpha class proteins. J Mol Biol 267:446–463, 1997.
75. S Henikoff, JG Henikoff. Performance evaluation of amino acid substitution matrices. Proteins 17:49–61, 1993.
76. JG Henikoff, S Henikoff. BLOCKS database and its applications. Methods Enzymol 266:88–105, 1996.
77. S Henikoff, JG Henikoff, S Pietrokovski. BLOCKS+: A non-redundant database of protein alignment blocks derived from multiple compilations. Bioinformatics 15:471–479, 1999.
78. M Gerstein, M Levitt. A structural census of the current population of protein sequences. Proc Natl Acad Sci USA 94:11911–11916, 1997.
79. PY Chou, GD Fasman. Prediction of the secondary structure of proteins from their amino acid sequence. Adv Enzymol Relat Areas Mol Biol 47:45–148, 1978.
80. JF Gibrat, J Garnier, B Robson. Further developments of protein secondary structure prediction using information theory. New parameters and consideration of residue pairs. J Mol Biol 198:425–443, 1987.
81. N Qian, TJ Sejnowski. Predicting the secondary structure of globular proteins using neural network models. J Mol Biol 202:865–884, 1988.
82. LH Holley, M Karplus. Protein secondary structure prediction with a neural network. Proc Natl Acad Sci USA 86:152–156, 1989.
83. B Rost, C Sander. Combining evolutionary information and neural networks to predict protein secondary structure. Proteins Struct Funct Genet 19:55–72, 1994.
84. AL Delcher, S Kasif, HR Goldberg, WH Hsu. Protein secondary structure modelling with probabilistic networks. Intelligent Systems in Molecular Biology 1:109–117, 1993.
85. JM Chandonia, M Karplus. Neural networks for secondary structure and structural class predictions. Protein Sci 4:275–285, 1995.
86. JM Chandonia, M Karplus. The importance of larger data sets for protein secondary structure prediction with neural networks. Protein Sci 5:768–774, 1996.
87. GE Arnold, AK Dunker, SJ Johns, RJ Douthart. Use of conditional probabilities for determining relationships between amino acid sequence and protein secondary structure. Proteins 12:382–399, 1992.
88. P Stolorz, A Lapedes, Y Xia. Predicting protein secondary structure using neural net and statistical methods. J Mol Biol 225:363–377, 1992.
89. MJ Thompson, RA Goldstein. Predicting protein secondary structure with probabilistic schemata of evolutionarily derived information. Protein Sci 6:1963–1975, 1997.
90. MJ Thompson, RA Goldstein. Predicting solvent accessibility: Higher accuracy using Bayesian statistics and optimized residue substitution classes. Proteins Struct Funct Genet 25:38–47, 1996.
91. J Janin, S Wodak, M Levitt, B Maigret. Conformations of amino acid side-chains in proteins. J Mol Biol 125:357–386, 1978.
92. E Benedetti, G Morelli, G Nemethy, HA Scheraga. Statistical and energetic analysis of sidechain conformations in oligopeptides. Int J Peptide Protein Res 22:1–15, 1983.
93. JW Ponder, FM Richards. Tertiary templates for proteins: Use of packing criteria in the enumeration of allowed sequences for different structural classes. J Mol Biol 193:775–792, 1987.
94. MJ McGregor, SA Islam, MJE Sternberg. Analysis of the relationship between sidechain conformation and secondary structure in globular proteins. J Mol Biol 198:295–310, 1987.
95. RL Dunbrack Jr, M Karplus. Backbone-dependent rotamer library for proteins: Application to sidechain prediction. J Mol Biol 230:543–571, 1993.
96. RL Dunbrack Jr, M Karplus. Conformational analysis of the backbone-dependent rotamer preferences of protein sidechains. Nature Struct Biol 1:334–340, 1994.
97. H Schrauber, F Eisenhaber, P Argos. Rotamers: To be or not to be? An analysis of amino acid sidechain conformations in globular proteins. J Mol Biol 230:592–612, 1993.
98. J Kuszewski, AM Gronenborn, GM Clore. Improving the quality of NMR and crystallographic protein structures by means of a conformational database potential derived from structure databases. Protein Sci 5:1067–1080, 1996.
99. BI Dahiyat, SL Mayo. Protein design automation. Protein Sci 5:895–903, 1996.
100. M Schena, D Shalon, RW Davis, PO Brown. Quantitative monitoring of gene expression patterns with a complementary DNA microarray. Science 270:467–470, 1995.
101. M Schena, D Shalon, R Heller, A Chai, PO Brown, RW Davis. Parallel human genome analysis: Microarray-based expression monitoring of 1000 genes. Proc Natl Acad Sci USA 93:10614–10619, 1996.
102. D Shalon, SJ Smith, PO Brown. A DNA microarray system for analyzing complex DNA samples using two-color fluorescent probe hybridization. Genome Res 6:639–645, 1996.
103. MB Eisen, PT Spellman, PO Brown, D Botstein. Cluster analysis and display of genome-wide expression patterns. Proc Natl Acad Sci USA 95:14863–14868, 1998.
104. M Wilson, J DeRisi, HH Kristensen, P Imboden, S Rane, PO Brown, GK Schoolnik. Exploring drug-induced alterations in gene expression in Mycobacterium tuberculosis by microarray hybridization. Proc Natl Acad Sci USA 96:12833–12838, 1999.
105. GP Yang, DT Ross, WW Kuang, PO Brown, RJ Weigel. Combining SSH and cDNA microarrays for rapid identification of differentially expressed genes. Nucleic Acids Res 27:1517–1523, 1999.
106. VR Iyer, MB Eisen, DT Ross, G Schuler, T Moore, JCF Lee, JM Trent, LM Staudt, J Hudson Jr, MS Boguski, D Lashkari, D Shalon, D Botstein, PO Brown. The transcriptional program in the response of human fibroblasts to serum. Science 283:83–87, 1999.
107. MQ Zhang. Large-scale gene expression data analysis: A new challenge to computational biologists. Genome Res 9:681–688, 1999.
108. JM Claverie. Computational methods for the identification of differential and coordinated gene expression. Hum Mol Genet 8:1821–1832, 1999.
109. S-PLUS, Version 3.4. Mathsoft Inc., 1996.