
Physics of biomolecules and cells


R.F. Bruinsma: Physics of Protein-DNA Interaction


molecules in a solvent like water has the following general form:

$$\mu([C]) = k_BT \ln([C]\,\nu_C) + \mu_C. \tag{2.2}$$

The first term, with νC the volume of the molecule, is very similar to the free energy per particle of an ideal gas and it is indeed due to the translational degrees of freedom of the solute particles. The second term, the “standard chemical potential”, can be viewed as the intrinsic free energy per solute particle meaning that it depends on the type of solute molecule, and on temperature and pressure, but not on concentration.

Assume that there is a very small variation δN in the number of R|DNA complexes. According to the reaction scheme R + DNA ↔ R|DNA, there must be a corresponding variation of −δN in the number of uncomplexed DNA molecules and in the number of uncomplexed repressor molecules. The change in G equals:

$$\delta G = \big[\mu([R|DNA]) - \mu([DNA]) - \mu([R])\big]\,\delta N. \tag{2.3}$$

The Second Law of Thermodynamics demands that δG = 0, so µ([R|DNA]) = µ([DNA]) + µ([R]). Using equation (2.2) and this condition gives:

$$\frac{[R][DNA]}{[R|DNA]} = \frac{1}{\nu}\exp\left(-\frac{\Delta G^0}{k_BT}\right) \tag{2.4}$$

where we introduced the following two quantities:

$$\Delta G^0 = \mu_R + \mu_{DNA} - \mu_{R|DNA}, \qquad \nu = \frac{\nu_R\,\nu_{DNA}}{\nu_{R|DNA}}. \tag{2.5}$$

The energy scale ∆G0 is called the “Standard Free Energy Change” of the reaction. At the intuitive level, you can think of it as the free energy gain when a repressor combines with a DNA strand, the binding energy in other words. The quantity ν has dimensions of a volume. You can think of it as the “reaction volume”: if the repressor is located inside this volume, then it can bind to DNA.
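For readers who want the intermediate step, equation (2.4) follows by inserting equation (2.2) for each of the three species into the equilibrium condition µ([R|DNA]) = µ([DNA]) + µ([R]). The following is a sketch of that algebra, using the definitions of ∆G0 and ν from equation (2.5):

```latex
\begin{align*}
k_BT\ln\big([R|DNA]\,\nu_{R|DNA}\big) + \mu_{R|DNA}
  &= k_BT\ln\big([R]\,\nu_R\big) + \mu_R
   + k_BT\ln\big([DNA]\,\nu_{DNA}\big) + \mu_{DNA} \\
k_BT\ln\frac{[R][DNA]\,\nu_R\,\nu_{DNA}}{[R|DNA]\,\nu_{R|DNA}}
  &= -\big(\mu_R + \mu_{DNA} - \mu_{R|DNA}\big) = -\Delta G^0 \\
\frac{[R][DNA]}{[R|DNA]}
  &= \frac{1}{\nu}\exp\left(-\frac{\Delta G^0}{k_BT}\right)
\end{align*}
```

With this sign convention a positive ∆G0 (a free energy gain on binding) makes the ratio on the left small, that is, most repressors are bound.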

Equation (2.4) is a special case of a fundamental principle of chemical thermodynamics: the Law of Mass Action. The Law of Mass Action is such an important principle that the right hand side of equation (2.4) has its own name and symbol, the equilibrium constant Keq:

 

$$K_{eq} = \frac{1}{\nu}\exp\left(-\frac{\Delta G^0}{k_BT}\right). \tag{2.6}$$

Equilibrium constants of associative reactions have dimensions of concentration, so they are expressed in Molar. Using the Law of Mass Action, the equilibrium constant can be obtained by measuring the concentrations, and hence the standard free energy change. The beauty is that in this way we obtain an important microscopic quantity, the standard free energy change, by measuring purely macroscopic quantities.

When such an experiment is performed in a test-tube (“in vitro”) on a DNA/repressor solution [6], one finds that the result is very sensitive to the absence or presence of the operator sequence on the DNA:

$$K_{eq} \approx \begin{cases} 10^{-10}\ \mathrm{M} & \text{operator DNA} \\ 10^{-4}\ \mathrm{M} & \text{non-operator DNA.} \end{cases} \tag{2.7}$$

This large difference between the specific and non-specific equilibrium constants is the thermodynamic signature of the ability of repressor proteins to read DNA sequences.

We call the interaction between lac repressor and operator DNA the “specific” protein-DNA interaction and that with non-operator DNA the “non-specific” interaction. You might expect the equilibrium constant for the non-specific interaction to be independent of the DNA sequence but it actually can vary over two orders of magnitude when the non-operator sequence is varied. Later, this will turn out to be a quite important effect. From equations (2.6) and (2.7), one finds that the standard free energy change for the operator case, ∆G0 (specific), is of the order of 20 to 25 kBT while for the non-operator case ∆G0 (non-specific) is of the order of 5 to 10 kBT.
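These orders of magnitude can be checked numerically. The sketch below inverts the relation Keq = (1/ν) exp(−∆G0/kBT), assuming (this value is not given in the text) that 1/ν corresponds to a reference concentration of about 1 Molar. The function name is illustrative only.

```python
import math

def binding_free_energy_in_kT(k_eq_molar, inv_nu_molar=1.0):
    """Invert K_eq = (1/nu) * exp(-dG0 / kBT) for dG0 / kBT.

    inv_nu_molar is 1/nu expressed as a concentration. The default of
    1 Molar is an assumption made for this estimate, not a measured value.
    """
    return -math.log(k_eq_molar / inv_nu_molar)

# Order-of-magnitude equilibrium constants for the two cases.
dg_specific = binding_free_energy_in_kT(1e-10)    # operator DNA
dg_nonspecific = binding_free_energy_in_kT(1e-4)  # non-operator DNA

print(f"dG0(specific)     ~ {dg_specific:.1f} kBT")
print(f"dG0(non-specific) ~ {dg_nonspecific:.1f} kBT")
```

Because the dependence on 1/ν is only logarithmic, changing the assumed reference concentration by a factor of ten shifts both estimates by about 2.3 kBT.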

What happens if we apply the Law of Mass Action to conditions relevant to the crowded interior of E. coli (rather than test-tubes)? The genome of E. coli contains about 10⁷ base-pairs (or “bp”) restricted to a volume of the order of one µm³. Let’s approximate the non-operator part of the bacterial genome as a fairly concentrated solution of short (10 bp) DNA sequences having a concentration of the order of 10⁶/µm³ (10 milliMolar). First suppose that the lac repressors all are bound to lactose molecules, so they will not recognize the operator sequence. Let F be the fraction of unbound lac repressors. This “free fraction” can be related to the equilibrium constant through the Law of Mass Action:

$$F = \frac{[R]}{[R] + [R|DNA]} = \frac{[R][DNA]/[R|DNA]}{[R][DNA]/[R|DNA] + [DNA]} = \frac{K_{eq}}{K_{eq} + [DNA]}. \tag{2.8}$$


The Law of Mass Action was used in the last step. If we insert the measured value of the equilibrium constant (for the non-specific interaction) and our estimated value for [DNA], we find that F is of the order of 10⁻² (the non-specific equilibrium constant Keq is measured in the absence of lactose so we are assuming here that lactose binding does not affect the non-specific interaction). That is an interesting result! Induced lac repressors in E. coli still “live” most of the time on DNA even though they do not recognize the operator sequence.
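The estimate of the free fraction can be reproduced in a few lines. This is a minimal sketch using the order-of-magnitude values quoted above (Keq of about 10⁻⁴ Molar for non-operator DNA and a DNA site concentration of about 10 milliMolar); the function and constant names are illustrative.

```python
def free_fraction(k_eq_molar, dna_molar):
    """Fraction of repressors not bound to DNA: F = K_eq / (K_eq + [DNA])."""
    return k_eq_molar / (k_eq_molar + dna_molar)

K_EQ_NONSPECIFIC = 1e-4  # Molar, order of magnitude for non-operator DNA
DNA_SITES = 1e-2         # Molar, the 10 milliMolar estimate for E. coli

F = free_fraction(K_EQ_NONSPECIFIC, DNA_SITES)
print(f"free fraction F ~ {F:.4f}")
```

Since [DNA] is much larger than Keq here, F is close to Keq/[DNA], so the result scales inversely with the estimated DNA site concentration.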

Intermezzo: Energy scales in molecular biochemistry

This 25 kBT value for ∆G0 (specific) is a typical energy scale for the complexation of biological macromolecules. On the one hand, this energy scale must be sufficiently high compared with the thermal energy scale kBT so thermal fluctuations do not break up the complex. On the other hand, the energy scale must be sufficiently low so the binding is reversible and can be easily disrupted when required for the signaling process. In molecular biology, the universal “energy currency” for driving thermodynamically unfavorable processes is the hydrolysis of an ATP molecule: ATP + H2O → ADP + Pi + H⁺, which delivers about 10 kBT in free energy. A 25 kBT value for the binding energy is thus quite reasonable. Protein complexes are in general maintained by multiple “weak bonds”, such as the van der Waals attraction, hydrogen bonds, and the “polar” interaction (i.e. screened electrostatic interaction), all of the order of kBT. Spatial patterns of these weak links provide a basis for highly specific “lock-and-key” type recognition between proteins. This must be contrasted with the covalent “strong bonds” (of the order of a hundred kBT) that maintain the structural integrity of the macromolecules.

2.1.2 Statistical mechanics and operator occupancy

Now assume that the lactose concentration has dropped so the lac repressor proteins can bind to the operator sequence. Efficient design requires a high probability for the operator site to be occupied (to avoid unwanted gene transcription). We will compute the operator occupancy probability P using elementary statistical mechanics. Let there be M copies of the lac repressor distributed over N possible sites of the bacterial genome (with N, of the order of 10⁷, large compared to M). We will neglect the small fraction of free repressors. There are then A(N, M) = N(N − 1)···(N − M) ways to distribute the M proteins over the N non-operator sites and there are C(N, M) = M[N(N − 1)···(N − (M − 1))] ways of choosing one of the M proteins to occupy the operator site and distribute the remaining M − 1 proteins over the non-operator sites, treating the proteins as classical, distinguishable objects. Let the Boltzmann factor of a protein occupying an operator site, respectively, a non-operator site, be Bs,ns = exp(+∆G0(specific, non-specific)/kBT). The occupation probability is then

$$P = \frac{C(N, M)\,B_s\,(B_{ns})^{M-1}}{C(N, M)\,B_s\,(B_{ns})^{M-1} + A(N, M)\,(B_{ns})^{M}}. \tag{2.9}$$

This simplifies to

 

 

 

 

 

 

$$P = \frac{1}{1 + \dfrac{N}{M}\exp\left(-\Delta\Delta G^0/k_BT\right)} \tag{2.10}$$

where ∆∆G0 = ∆G0 (specific) − ∆G0 (non-specific) is the difference between the specific and non-specific binding energies [7].

When we put in the “numbers” for the binding energy obtained earlier something interesting shows up: the very large number N and the very small number exp(−∆∆G0/kBT) nearly cancel each other (N exp(−∆∆G0/kBT) is about 10). Suppose we wanted to make sure that the operator is occupied at least 99% of the time. According to equation (2.10), that requires the number of copies M of the lac repressor to exceed 10³. The actual number of lac repressors of an E. coli bacterium is maintained at a comparable value (about 10²). There is thus a “design connection” between the values of the specific and non-specific binding energies on the one hand and the number of repressor copies maintained by the cell on the other hand. Simple statistical mechanics arguments provide us with insight into how the “working parameters” are set for bacterial gene expression. The most important lesson is that the value of quantities such as ∆G0 (specific), ∆G0 (non-specific), N and M must be understood in the light of the functioning of the bacterium as an integrated system.
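The numbers in this argument are easy to reproduce. The sketch below evaluates the occupancy P = 1/(1 + (N/M) exp(−∆∆G0/kBT)) and inverts it for the copy number M. It assumes N = 10⁷ and exp(−∆∆G0/kBT) = 10⁻⁶ (the ratio of the two equilibrium constants quoted earlier); the function names are illustrative.

```python
import math

def operator_occupancy(m_copies, n_sites, ddg_in_kT):
    """Occupancy P = 1 / (1 + (N/M) * exp(-ddG0/kBT))."""
    return 1.0 / (1.0 + (n_sites / m_copies) * math.exp(-ddg_in_kT))

def copies_needed(target_p, n_sites, ddg_in_kT):
    """Copy number M that gives occupancy target_p (inverse of the formula above)."""
    return n_sites * math.exp(-ddg_in_kT) * target_p / (1.0 - target_p)

N_SITES = 1e7
DDG = math.log(1e6)  # assumed: exp(-ddG0/kBT) = 1e-6, so N * exp(-ddG0/kBT) = 10

for m in (10, 100, 1000):
    print(f"M = {m:5d}  ->  P = {operator_occupancy(m, N_SITES, DDG):.3f}")

m_99 = copies_needed(0.99, N_SITES, DDG)
print(f"M needed for 99% occupancy ~ {m_99:.0f}")
```

The required copy number grows steeply with the target occupancy because of the factor P/(1 − P) in the inverted formula.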

What is puzzling at this stage is why we need the non-specific interaction in the first place. According to equation (2.10), if we turned off the non-specific interaction, we would only need about 10 repressor copies. We will return to that question in the discussion of the kinetics.

2.1.3 Entropy, enthalpy, and direct read-out

The Gibbs Free Energy is defined as

$$G = H - TS. \tag{2.11}$$

It is the sum of an “energetic” term, the enthalpy H = E + PV (E is the internal energy), and an “entropic” term. The change in Gibbs Free Energy ∆G0 that takes place when a repressor molecule binds to DNA can be obtained from the equilibrium constant. Can we obtain the separate enthalpic and entropic contributions as well? Under conditions of fixed pressure and temperature, the change in enthalpy equals:

$$\Delta H = \Delta E + P\,\Delta V = \Delta Q \tag{2.12}$$

with ∆Q the heat released (which is why we call the enthalpy also the “heat function”). We call chemical reactions exothermic if ∆Q > 0 and endothermic if ∆Q < 0. Endothermic reactions are interesting because the driving mechanism is entropy increase rather than reduction of the potential energy of interaction between molecules. The heat released by a reaction can be measured by calorimetry so the change in enthalpy can be found. Since the total change of the Gibbs Free Energy is known, we can also deduce the change in entropy.
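The bookkeeping in the last step can be written out explicitly. Assuming all changes are taken at the same fixed temperature and with the same sign convention, the definition G = H − TS gives:

```latex
\Delta G^0 = \Delta H - T\,\Delta S
\quad\Longrightarrow\quad
-T\,\Delta S = \Delta G^0 - \Delta H
```

Here ∆G0 is taken from the measured equilibrium constant and ∆H from calorimetry, so the entropic term is fixed by their difference.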

When the enthalpic and entropic contributions ∆H and −T∆S are determined in this manner for the interaction between the lac repressor and DNA, one finds the following results [8].

Specific interaction

The dominant contribution to ∆G0 is entropic. As a function of temperature, −T∆S decreases significantly with T. ∆H is negative so the reaction is endothermic.

Non-specific interaction

The dominant contribution to ∆G0 is again entropic, but −T∆S now does not depend significantly on temperature. The enthalpic contribution is again negative.

Both are surprising results. To see why, we turn to the results of structure determinations of protein-DNA complexes. It is possible to grow crystals of repressor proteins complexed with short bits of DNA, known as “co-crystals”. X-ray diffraction experiments on these crystals allow us to determine atomic positions with a resolution of 2 Å, and sometimes even better than that [9]. Below we show the result of such an experiment for the case of “cro”, a very simple bacterial repressor (unlike the lac repressor).

Fig. 9. Cro-repressor/DNA Complex. First panel: chemical bonds. Second panel: cartoon showing reading heads.

The first panel shows the pattern of chemical bonds. There is a C2 rotation symmetry. This symmetry is a characteristic of many prokaryote repressor proteins. The DNA operator sequence has a corresponding (approximate) rotation symmetry. Simple repressor proteins like cro address the DNA with “reading heads”. A reading head is an α-helix that can be inserted into the major or minor groove of the DNA double helix (usually the major groove). The second panel is a cartoon of the cro repressor/DNA complex showing the α-helices of the protein. There are two reading heads visible, one near the top and one near the bottom. The ends of certain side chains of the reading head can establish specific links with certain DNA bases. An example is the interaction between the amino-acid Arginine and the base Guanine shown in Figure 10 below.

The Arg side chain terminates with two N-H pairs. The two hydrogen atoms are positively charged and they “fit” exactly with negatively charged nitrogen and oxygen atoms of the Guanine base. The nitrogen and oxygen atoms act as proton acceptors so hydrogen bonds can be established, indicated in the figure by the two ovals. Base-pairs are surrounded by a unique combination of proton donors and proton acceptors that can be read by specific amino-acids. For instance, the amino-acid Glutamine “recognizes” an A-T pair in the major groove of DNA, just as Arginine recognizes a G-C pair in the major groove, while Asparagine recognizes a G-C pair in the minor groove.

We call this the “Direct Read-Out” mechanism [10] and it is based on hydrogen bonding between amino-acids and nucleic acids.

Intermezzo: The second code

Molecular biologists have established long lists detailing contacts between the amino-acids of DNA-binding proteins and DNA base-pairs [11]. They originally hoped they could determine a “second code”. By this they mean a one-to-one relation between amino-acids and base-pairs so they could predict to which base-pair sequence a given repressor protein would bind. That would enable design of highly specific drugs turning on or off particular genes. Unfortunately, there appears to be no universal second code. DNA-associating proteins come in different design forms. The same amino-acids interact differently in different types of proteins.

Fig. 10. Direct read-out.

There is an obvious discrepancy between the Direct Read-Out model and the results obtained from thermodynamics. If hydrogen bonding between the reading heads and DNA really were the dominant binding mechanism, then DNA/repressor binding should have been enthalpic in nature and formation of the complex would be associated with a loss of entropy. The puzzle is that there can be little doubt that Direct Read-Out is an important mechanism for the reading of DNA sequence by proteins.

2.1.4 The lac repressor complex: A molecular machine

The resolution of this paradox comes from X-ray structural studies of lac repressor/DNA co-crystals [12] shown below.

Fig. 11. The lac repressor complex.

The actual structure responsible for the repression of gene transcription is a complex consisting of two lac repressor protein dimers, so four copies in all. They bind pair-wise to two separate operator sequences; note the four reading heads. The four reading heads are pair-wise attached to the body of the complex by a linker unit that undergoes an order-disorder transition upon lactose binding. In the presence of lactose, the complex adopts a structure in which the linker unit is disordered and the reading heads cannot be inserted into the DNA major groove. Release of the lactose produces ordering of the linkers and allows insertion of the reading heads into DNA. In addition, the transition brings two hydrophobic surfaces, belonging to the two dimers, into close contact. It seems reasonable to assume that if lactose-free repressor monomers or dimers move along non-operator DNA, locate the operator sequence, and form the full four-protein repressor complex, then the hydrophobic attraction plays a central role as well, so we can understand at least qualitatively why the specific binding of the lac repressor has an entropic character. The intervening DNA sequence between the two operator sequences loops around as shown in Figure 12. Interestingly, another protein, known as CAP, binds to the DNA sequence inside the loop. This stabilizes the loop but once the loop opens, it also stimulates gene expression!

The reading heads thus are only a small part of the lac repressor complex. We could view the complex as a molecular detector and amplifier. The binding of lactose to the repressor complex triggers a large structural transition that breaks up the complex and opens the loop. Release of the lactose closes the loop and restores the complex. Note that there is an analogy between the operation of the lac repressor complex and the molecular motors discussed in J. Howard’s lectures where the binding/release of ATP and ADP drives a cyclical structural transition that performs work.

Fig. 12. Order-disorder Transition of the lac repressor complex.


The thermodynamics of the non-specific interaction also is puzzling. The binding free energy had a different temperature dependence, indicating that the hydrophobic interaction is not the interaction that dominates the thermodynamics, which is indeed the case. The structural studies tell us that the linker units connecting the reading heads of the lac repressor to the body of the protein are positively charged and interact with the adjacent minor groove: the non-specific interaction is electrostatic in origin. We must understand how electrostatic attraction can have an entropic character. We will postpone addressing this question to Section 4.

At this point, you should go to the following Web site where you will find an elegant tutorial on the structural changes of the lac repressor tetramer and its interaction with DNA.

http://www.worthpublishers.com/lehninger3D/index title.html

2.2 Kinetics of repressor-DNA interaction

We now turn to the third engineering requirement: reactivity. How quickly does a lac repressor respond to environmental changes, such as a reduction in lactose concentration? We start again with a discussion of in vitro experiments.

2.2.1 Reaction kinetics

The rate of change with time of the concentration of a repressor-DNA complex is the sum of two terms: a positive contribution due to complex formation between a previously unbound DNA molecule and a previously free repressor, and a negative contribution due to complex break-up. At sufficiently low concentrations, the first term must be proportional to the probability of finding a free DNA molecule and a free repressor molecule at the same site, and the second term must be proportional to the concentration of the complex:

$$\frac{d}{dt}[R|DNA] = k_a[R][DNA] - k_d[R|DNA]. \tag{2.13}$$
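As a numerical illustration of this rate equation, the sketch below integrates d[R|DNA]/dt = ka[R][DNA] − kd[R|DNA] with a simple forward-Euler step until the concentrations stop changing. The rate constants, total concentrations, step size and function name are all hypothetical and in arbitrary units; they are chosen only to show the relaxation, not to model the lac repressor.

```python
def relax_to_steady_state(k_a, k_d, r_total, dna_total, dt=1e-3, steps=20000):
    """Forward-Euler integration of d[C]/dt = k_a [R][DNA] - k_d [C].

    [R] and [DNA] follow from conservation: [R] = r_total - [C] and
    [DNA] = dna_total - [C]. All quantities are in arbitrary,
    mutually consistent units.
    """
    c = 0.0  # start with no complexes
    for _ in range(steps):
        r = r_total - c
        dna = dna_total - c
        c += dt * (k_a * r * dna - k_d * c)
    return r_total - c, dna_total - c, c

# Hypothetical values, chosen only for illustration.
K_A, K_D = 2.0, 0.5
r, dna, c = relax_to_steady_state(K_A, K_D, r_total=1.0, dna_total=1.5)

print("[R][DNA]/[R|DNA] at steady state:", r * dna / c)
print("k_d / k_a                       :", K_D / K_A)
```

Once the time derivative vanishes, the two terms on the right hand side balance, so the ratio [R][DNA]/[R|DNA] settles at kd/ka for any choice of total concentrations.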

The proportionality constants ka and kd are called, respectively, the “on-rate” and the “off-rate”. These constants are supposed not to depend on concentration though they can be quite strongly temperature dependent. The off-rate really does have dimensions of a rate but the (so-called) on-rate has dimensions of Volume/Time (chemists and biologists have a free-and-easy attitude to units). The on-rate and the off-rate have a surprising connection. Under conditions of thermodynamic equilibrium, the concentrations of the reactants obviously must be constant, so the left hand side
