CONFORMATIONS, ENERGETICS, AND KINETICS OF PEPTIDES IN MEMBRANE ENVIRONMENT

Chapter1 : We present the exact solution of a microscopic statistical mechanical model for the transformation of a long polypeptide between an unstructured coil conformation and an α-helix conformation. The polypeptide is assumed to be adsorbed to the interface between a polar and a non-polar environment such as realized by water and the lipid bilayer of a membrane. The interfacial coil-helix transformation is the first stage in the folding process of helical membrane proteins. Depending on the values of model parameters, the conformation changes as a crossover, a discontinuous transition, or a continuous transition with helicity in the role of order parameter. Our model is constructed as a system of statistically interacting quasiparticles that are activated from the helix pseudo-vacuum. The particles represent links between adjacent residues in coil conformation that form a self-avoiding random walk in two dimensions. Explicit results are presented for helicity, entropy, heat capacity, and the average numbers and sizes of both coil and helix segments. Chapter 2: We investigate profiles of local attributes (densities of entropy, enthalpy, free energy, and helicity) for the backbone of long polypeptides in the heterogeneous environment of a lipid bilayer or cell membrane. From these profiles we infer landscapes of global attributes for the backbone of short peptides with given position and orientation in that environment. Our methodology interprets the broken internal H-bonds along the backbone of the polypeptide as statistically interacting quasiparticles activated from the helix reference state. The interaction depends on the local environment (ranging from polar to non-polar), in particular on the availability of external H-bonds (with H2O molecules or lipid headgroups) to replace internal H-bonds. The helicity landscape in particular is an essential prerequisite for the continuation of this part of the project with focus on the sidechain contributions to the free-energy landscapes. The full free-energy landscapes are expected to yield information on insertion conditions and likely insertion pathways. Chapter 3: We present the first part in the design of a kinetic model for the insertion of short peptides, including variants of pHLIP, into a lipid bilayer. The process under scrutiny combines a transport phenomenon and a change in protonation status of negatively charged sites near the C terminus. The two kinetic phenomena influence each other and set different time scales. Processes with a significant range of time scales, known to be a challenge for molecular dynamics simulations, are shown to be within the scope of the kinetic modeling presented here, which is based on interlocking Markov chain processes. The two processes governing protonation status and transport are run individually and then in combination. This makes it possible to investigate feedback mechanisms between the two component processes.


Introduction
The folding mechanisms of water-soluble proteins from primary to secondary and higher-order structures has been thoroughly investigated over many years. In the study of protein translocation pathways into and across cell membranes, which is a very important active area of current research, one important problem requiring further elucidation is the coil-helix transition that accompanies the insertion of a polypeptide into a lipid bilayer. The theoretical modeling of this ubiquitous process in biological matter is fairly complex due to the heterogeneous environment in which conformational changes occur and the simultaneity or rapid succession of conformational change and translocation. Experimental studies are limited to the small selection of polypeptides that are water soluble and undergo controllable insertion/folding and exit/unfolding processes.
The folding of all helical membrane proteins/peptides, independent of the insertion mechanism, is governed by the formation of helical segments in the lipid bilayer environment. This process is driven by hydrophobic interactions and hydrogen bonding [1,2,3,4]. Its two main steps are the transformation from coil to interfacial helix and the insertion of the helix into the membrane with transmembrane orientation. Variants of the pH Low Insertion Peptide (pHLIP) family are water soluble and prove to be well suited for the investigation of membraneassociated folding and unfolding [5,6], reversibly driven by changes in pH. A drop in pH leads to the protonation of negatively charged side chains, which enhances the hydrophobicity of the peptide and initiates the aforementioned two-step process of folding and insertion. A subsequent rise in pH reverses the process: the peptide unfolds and exits. Recent experimental studies have already established important thermodynamic and kinetic parameters of the peptide-membrane interaction [7,8,9].
What has been lacking for these and related experiments is a microscopic statistical mechanical model with experimentally testable attributes that is amenable to an exact analysis. Our goal is to construct, solve, develop, and test such a model in three successive stages. The first stage, which is the theme of this paper, involves the design and solution of a microscopic model that describes the coil-helix transformation of a long polypeptide adsorbed to the lipid bilayer of a membrane (see Fig. 1.1). Such a model is an indispensable part of a theory of membrane- associated folding and will be used as the foundation for the next two stages. They include (a) the investigation of profiles of local attributes for generic polypeptides and landscapes of global attributes for short peptides such as pHLIP in the heterogeneous water/lipid environment and (b) the kinetics of insertion and exit as can be inferred from the landscapes of free energy and conformational attributes.
The pioneering theoretical studies of coil-helix transformations and related phenomena that appeared throughout 1960s were admirably compiled and reviewed in a monograph by Poland and Scheraga [10]. A series of model systems were introduced at that time. Many of them are still being used today in textbooks and research papers. This includes the familiar Zimm-Bragg model [11] and generalizations thereof, all amenable to the transfer matrix method of analysis. Also emerging at that time was the highly original Lifson method of statistical me-chanical analysis [12], the scope of which includes (smooth) crossovers and (sharp) transitions [13,14]. The need at this time for yet another model solved by yet a different method is dictated by the three stages of our project as will become apparent in what follows.
In Sec. 1.2 we present the microscopic model of our design as a system of statistically interacting links and describe the method of its exact analysis. Depending on the parameter settings the exact solution produces a conformational change in the form of a crossover or a transition (Sec. 1.3). The transition may be of first or second order as we discuss in Sec. 1.4 with focus on the helicity (order parameter) and entropy (measure of disorder) and other quantities. In Sec. 1.5 we summarize the main advances of this work and point out their role as the foundation for the continuation of this project in two different directions in the arenas of biological physics and statistical mechanics. We also discuss how this work connects to recent studies by other researchers and how the continuation of this project can benefit from those studies.

Model system
The microscopic model that we present here is a system of statistically interacting quasiparticles with shapes. The methodology employed for its exact statistical mechanical analysis is built on the concept of fractional statistics, invented by Haldane [15], and developed by Wu [16], Isakov [17], Anghel [18], and others [19] in the context of quantum many-body systems. The adaptation of this approach to classical statistical mechanical systems of particles with shapes was developed in a recent series of studies with applications to Ising spins [20,21,22,23], jammed granular matter [24,25], lattice gases with long-range interactions [26], and DNA under tension [27]. The application to the coil-helix transition of a long polypeptide adsorbed to a water-lipid interface worked out in the following is conceptually simple but surprisingly rich in scope.

Coil segments from helix vacuum
The reference state (pseudo-vacuum) of our model system is the ordered helix conformation of N residues bound by peptide bonds into N −1 links and stabilized by internal hydrogen bonds along the backbone. Thermal fluctuations or environmental change cause the nucleation of disordered coil segments, which then grow by unravelling adjacent helical order.
In our model the coil segments are represented by thermally activated links that combine to form a self-avoiding random walk between the ends of successive helical segments. Both coil and helix segments are confined to the water-lipid interface ( Fig. 1.1). Coil segments carry configurational entropy that grows with their size and range in the interfacial plane. That range is controllable at a microscopic level by the integer-valued model parameter µ, henceforth called range parameter.
Each residue can be in µ + 1 states, of which one (denoted h) represents the helix conformation and µ (numbered 1, . . . , µ) represent the coil conformation.
Access by a residue to these states is constrained by the states of its neighboring residues as illustrated in Fig. 1.2 for the case µ = 3.
Helical links (hh) are short and form straight segments (horizontal,in Fig. 1.2) in the plane of the water-lipid interface. Coil links are more extended and directed either horizontally (11,22,33) or vertically (h1, 12, 23, 32, 21, 1h). Our model allows each coil segment to randomly explore the interface on one side of the helical direction without intersecting itself. 1 That space is discretized and constrained by the length of the segment and by the number of distinct coil states.
We assign different activation energies to coil links relative to hh links depend-1 The one-side restriction has no significant impact on the quantities we are calculating in this work. Generalizations to coil segments that explore the plane of the interface more freely are planned. ing on whether they contribute to nucleation or to growth. To the former we assign the activation energy n and to the latter g (two model parameters). Nucleation of one coil link requires the simultaneous break-up of several internal hydrogen bonds whereas growth proceeds by the break-up of one bond per link (two bonds shared by different pairs of residues). On our way to calculating a partition function we now face the task of counting microstates of given link content.

Combinatorics of links
For the combinatorial analysis we introduce a set of statistically interacting quasiparticles that contain individual links or pairs of links. It turns out that we need 2µ species of particles. In the case of µ = 3 they are the six species and the multiplicity expression [15,16,17,18,19,20,21],   The particles form nested structures as indicated in Fig. 1.2. We have species from three categories in the taxonomy of Ref. [21]: one species of hosts, two species of hybrids, and three species of tags. Hosts cannot be hosted, tags cannot host any particles from a different species, hybrids can do both.
A system of N residues in the helix pseudo-vacuum has the capacity of nucleating a coil segment at A 1 = N − 2 different locations by activating a host particle with activation energy 1 = n . The activation of particles from any species reduces the capacity d 1 of the system for placing further hosts on account of (1.3) and g 1m > 0. Hosts and hybrids have twice the size of tags. The former thus reduce the capacity at double the rate of the latter.
For any mix of particles the system has a finite capacity. When that capacity has been reached, we have d 1 = 1, which makes the associated binomial factor in (1.2) equal to one. If we attempt to add one more particle from any species, d 1 becomes zero or a negative integer and, in consequence, the associated binomial factor vanishes. The helix pseudo-vacuum has zero capacity for the placement of hybrids and tags, as implied by A 2 = · · · = A 6 = 0. Such capacity of the system is generated dynamically by the placement of particles with hosting capacity. This generation of capacity is encoded in the negative interaction coefficients. Hosts 1 generate capacity for placing hybrids 2 and tags 4. Hybrids 2, in turn, generate capacity for placing hybrids 3 and tags, 4, 5 etc. Tags do not generate capacity for placing any particles.
The particle content of the coil segment of N = 16 residues shown in Fig. 1.2 is N 1 = 1, N 2 = 2, N 3 = 2, N 4 = 1, N 5 = 3, N 6 = 1. Its activation energy (1.1) thus becomes E(1, 2, 2, 1, 3, 1) − E pv = n + 13 g and the number of coil segments of equal contour length and with the same particle content is, according to (1.2), (1.3), W (1, 2, 2, 1, 3, 1) = 360. Further microstates with equal activation energy are generated if we exchange hybrids or tags from one species by hybrids or tags from a different species or if we replace hybrids by pairs of tags (or vice versa), all within the constraints imposed by the nesting. The constraints are encoded in the multiplicity expression. It does not allow spurious particle combinations.
The generalization to any µ is straightforward: the zoo of 2µ particle species now comprises one host, µ − 1 hybrids, and µ tags, labeled consecutively in this order. The host has activation energy n , reflecting the nucleation of coil segments, whereas hybrids and tags have activation energies 2 g and g , respectively, reflecting the growth of coil segments. The capacity constants remain the same for each The nonzero interaction coefficients generalize naturally in accordance with the sizes and nested structure of the particles: g 1m = 2 (1) for m = 1, . . . , µ (µ + 1, . . . , 2µ); g m m = −1 for three sets of index pairs: (i) m = m − 1, m = 2, . . . , µ; . . , 2µ−1. The case µ = 1 has no hybrids: g 11 = 2, g 12 = 1, g 21 = −1, g 22 = 0. It is equivalent to the Zimm-Bragg model [11]. With the combinatorial analysis completed we turn to the statistical mechanical analysis.

Free energy of polypeptide
The partition function for the adsorbed polypeptide, modeled as a system of statistically interacting and thermally activated particles [15,16,17,18,19,20,21], depends on energy (1.1) and multiplicity (1.2) with ingredients m , A m , g mm from Sec. 1.2.2. The thermal equilibrium macrostate in the thermodynamic limit follows from an extremum principle. Its implementation yields the partition function for a macroscopic system in the (generic) form [16,17,18,19,20,26], 5) where M = 2µ in our case and the (real, positive) w m are solutions of the coupled nonlinear equations, (1.6) The average number of particles from species m are derived from the coupled linear equations, It is useful and economical to express all results as functions of the two control 9) with an additional dependence on the discrete range parameter µ implied. The nucleation parameter τ is a measure of cooperativity and controls the average length of coil and helix segments. High cooperativity (τ 1) means a high nucleation threshold. Low cooperativity (τ 1) means little difference in enthalpic cost of nucleation and growth. The growth parameter t controls the preference of one or the other conformation. Coil is preferred at small t and helix at large t.
Equations (1.6) for w 1 , . . . , w 2µ with parameters t, τ used on the left and the g mm from Sec. 1.2.2 used on the right can be reduced to a single polynomial equation of order µ + 1 for w µ+1 : 10) where the S m (w) are Chebyshev polynomials of the second kind. Among all the solutions of Eq. (1.10) there exists exactly one, 11) that satisfies the criterion of physical relevance, requiring that (1.11) and all the remaining w m inferred from it via 12) are non-negative. The derivation of this reduction is outlined in Appendix A.
The Gibbs free energy per residue inferred from (1.5) then depends on that physical solution as follows:  16) the density of (helix or coil) segments, 17) and the average sizes of helix segments and coil segments, The population densitiesN m . = N m /N , m = 1, . . . , 2µ, of particles can be extracted from the solution of the linear Eqs. (1.7) as shown in Appendix A.

Structure of solution
Changing the level of pH primarily affects the growth parameter t. At normal pH we have t 1, which favors the random coil conformation. A drop in pH pushes the growth parameter to higher values, t > 1, which increasingly favors a conformation with helical ordering. 3 Depending on the value of the nucleation parameter τ and the discrete parameter µ, which controls the amount of entropy that coil segments can generate, the growth of helicity takes place in a crossover or in a transition of first or second order. To illuminate the criteria for these alternatives we investigate the nature of the physically relevant solution (1.11) of Eq. (1.10), in particular the singularities it acquires in the limits τ → 0 at µ < ∞ and µ → ∞ at τ > 0.

Crossover
For τ > 0 and µ < ∞ the solution w(t, τ ) is bounded from below by 19) which is the location of the last zero of S µ (w). That value is only realized at t = 0 as illustrated in Fig. 1.3(a). For t 1 the solution converges toward the linear asymptote, (1.20) Note that w 0 depends on µ but not on τ whereas w as depends on τ but not on µ.
The smooth dependence on t of w(t, τ ) for τ > 0 and µ < ∞ describes a crossover from low helicity at small t to high helicity at large t.
The Zimm-Bragg model [11] is represented by the case µ = 1, for which (1.10) . Panel (a) shows the emergence of a singularity in the limit τ → 0 for µ = 1, 2, 3 and panel (b) the emergence of a singularity in the limit µ → ∞ at τ ≥ 0.
is a quadratic equation with physical solution It is mathematically equivalent to an Ising chain. 4 The case µ = 2 is qualitatively different in that each coil segment now carries entropy. The physical solution of the associated (cubic) Eq. (1.10) reads For 3 ≤ µ < ∞ and τ > 0 the solution (1.11) must be determined numerically.
In that context it is advisable to rewrite (1.10) as 23) using the function (1. 24) inferred from trigonometric/hyperbolic representations of Chebyshev polynomials.
For large µ the functions r µ (w) are much smoother than the polynomials S µ (w).
Standard methods with initial values from the analytic solution for µ → ∞ derived in Sec. 1.3.3 below work quite well.
(1. 26) It describes a discontinuous phase transition between a pure coil at t < t 0 and a pure helix at t > t 0 in a sense that requires some explanations (Sec. 1.4). Discontinuities are manifest in the order parameter and the entropy. The latter is associated with a latent heat.

Second-order transition
In the limit µ → ∞ at τ > 0 the solution (1.11) acquires a quadratic cusp at (1. 27) Performing that limit in (1.24) yields (1.28) The resulting analytic solution then reads and is graphically represented in Fig. 1.3(b). The singularity at t + c , with t 0 = 3 for µ = ∞ represents a continuous transition between a coil phase (t < t c ) and a helix phase (t > t c ) with helical ordering subject to thermal fluctuations.
Expression (1.31) is to be interpreted as an asymptotic expansion with coefficients that diverge as τ → 0. In that limit the cusp turns linear as in (1.25).

Order and disorder
Helix means order and coil means disorder, clearly. However, both attributes can be looked at from different angles and a more comprehensive picture emerges.
In the following we investigate several thermodynamic quantities, derived from the free energy (1.13) as functions of the experimentally controllable growth parameter t at fixed values of the other two parameters τ and µ.
Each quantity will illuminate the competition between order and disorder from a somewhat different vantage point. All are functions of w 1 (t, τ ), which depends on the solution (1.11) of (1.10) via (1.12). The analytic expression in the limit τ → 0 for µ < ∞ as inferred from (1.25) reads 32) and the analytic result in the limit µ → ∞ at τ > 0 as inferred from (1.29) becomes (1.33)

Helicity and entropy
We begin by considering the two thermodynamic functions that represent order and disorder most directly: helicity (1. 16), and entropy (1.14), (1.35) In Figs. 1.4 and 1.5 we show the dependence of helicity and entropy on the growth parameter t at fixed nucleation parameter τ (five curves) and range parameter µ (two panels). At finite cooperativity (τ > 0) the helicity crosses over from a low to a high value near t 0 . The rise in helicity becomes sharper with increasing cooperativity and turns into a step discontinuity in the limit τ → 0. Analytically, expression (1.34) with (1.32) substituted yields While order as reflected in the helicity increases monotonically with t, the disorder as reflected in the entropy is not monotonically decreasing. It shows a shallow maximum at t 1 separate from the shoulder at t t 0 . The reason for this difference is that there is only one source of order -helical links -but two sources of disorder: disorder inside coil segments and disorder in the sequence of helical/coil segments of diverse lengths. It is the first source of disorder that produces the shoulder and the second source that produces the shallow maximum. The Zimm-Bragg case µ = 1 is pathological in this respect. It produces coil segments without internal entropy as noted and commented on before [28].
As τ → 0, the segments grow larger in size and become fewer in numbers (see Sec. 1.4.2 below). This reduces disorder of the second kind. At infinite cooperativity the entropy turns into a step discontinuity of µ-dependent height and location.
Expression (1.35) with (1.32) substituted becomes This discontinuity signals the presence of a latent heat (see Sec. 1.4.3 below).
Next we investigate the same measures of order and disorder as functions of t at fixed τ = 1.0 (low cooperativity) or τ = 0.2 (high cooperativity) for increasing numbers µ of coil states per residue including the limit µ → ∞. Our results are shown in Figs. 1.6 and 1.7. The crossover behavior for small µ turns into a continuous order-disorder transition as µ → ∞. With µ increasing, the internal source of disorder in coil segments gains dominance over the entropy of mixing between coil and helix segments. The shoulder in S/k B becomes flatter, higher, and sharper. In the limit µ → ∞ at t < t c , the helicity approaches zero identically and the entropy approaches the valueS/k B = ln 3, independent of τ . Disorder defeats order hands down.
The helix phase at t > t c , by contrast, remains a battleground between ordering and disordering tendencies. Both the helicity and the entropy expressions, (1.39) have linear cusps at t c with slopes that diverge in the limit τ → 0. The leading critical singularities at t + c arē (1.41) With the growth parameter increasing from t c the helicity steeply rises from zero and gradually bends over toward its saturation value whereas the entropy steeply descends from a high value and gradually approaches zero. The plots suggest that cooperativity impedes the onset of ordering, yet assists the quick rise of ordering once it has set in.

Segments of coil and helix
Further insight into how helical ordering grows during the crossover or near the transition point between conformations can be gained from the two quantities (1.17) and (1.18), representing, respectively, the density and average length of segments in one or the other conformation. Coil segments alternate with helix segments. Hence they come in equal numbers. However, their average lengths vary independently with t. Parametric representations can be constructed as before. We use (1.34) andN (1.42) We first examine the t-dependence ofN seg , L cs , and L hs near the first-order transition that takes place at t 0 in the limit τ → 0. Our results for µ = 2, 3 are shown in Figs. 1.8 and 1.9. We observe that the density of segments is near zero at small t. Here the system is strongly coil-like. The segments grow in numbers with t increasing. They become most numerous at t 0 , where ordering and disordering tendencies compete evenly. The density of segments becomes smaller again as t further increases into the stability regime of the helix conformation. The maximum ofN seg at t 0 strongly depends on τ . In the limit τ → 0 we haveN seg ≡ 0, which means that, in a macroscopic system, the number of segments grows more slowly (if at all) than the number of residues. Unsurprisingly, L cs decreases and L hs increases monotonically with t. As expected, both variations are enhanced by cooperativity. Most interesting is the limit τ → 0. The exact expressions for the average lengths of helix segments at t < t 0 and coil segments at t > 0 read 44) respectively, with r µ (w) from (1.24). Both expressions diverge ∝ |t − t 0 | −1 as t approaches t 0 from opposite sides and then stay infinite.
These results tell us that the macrostate with zero helicity and saturated entropy at t < t 0 still contains helix segments albeit only in numbers that do not add up to a nonzero density but still produce a well-defined average size. They coexist with an equal number of coil segments of macroscopic lengths. Conversely, the macrostate of zero entropy (per residue) and saturated helicity at t > t 0 is not a single helical domain. Here helix segments of short average length in numbers that amount to zero density coexist with an equal number of helix segments of macroscopic lengths.
A different picture emerges near the second-order transition at t = t c in the limit µ → ∞. Results for the density of segments at high and low cooperativity are shown in Fig. 1.10 and results for their average lengths in Fig. 1.11. This includes numerical results for µ < ∞ and analytical results for µ = ∞. The density of segments vanishes identically in the coil phase (t < t c ) and then rises in a linear cusp to a smooth maximum in the helix phase (t > t c ) : (1. 45) The average length of coil segments in the helix phase, 46) diverges at t + c and remains infinite in the coil phase. The average length of helix segments, by contrast, remains finite in both phases, 47) where again, t 0 = 3 for µ = ∞ in (1.45)-(1.47). The graph of L hs is continuous and smooth at t c . The singularity is of higher order. Only in the helix phase does the shape of the curve depend on τ .
The most striking feature in the data shown concerns the helix segments.
Unlike in the case of the first-order transition, the ordered phase near t c supports a significant density of coil and helix segments of comparable finite size. The average length of helix segments depends only weakly on µ and only moderately on τ , in strong contrast to the average length of coil segments, which exhibits strong dependences on both parameters.

Heat capacity and latent heat
The heat capacity,C . = T (∂S/∂T ) n, g , illuminates the competition between order and disorder from yet a different angle. From (1.35) we derivē (1.48) Figure 1.12 shows the dependence of the heat capacity on the growth parameter for the case µ = 2 at moderate to high cooperativity and for the case µ = ∞ over a wider range of cooperativity. At t < t 0 = 2 in panel (a) we observe a weak signal that is associated with the entropy caused by alternating coil and helix segments as discussed previously.
This contribution fades away at high cooperativity (best seen in the inset) as the density of segments diminishes. The peak at t t 0 , on the other hand, is associated with the entropy inside coil segments. With increasing cooperativity, this contribution grows in a more and more narrow range at t 0 . Similar structures have been obtained in recent Monte Carlos simulations, albeit upon variation of temperature and in a somewhat different scenario [29].
In the limit τ → 0 for µ < ∞, where the coil-helix crossover sharpens into a first-order transition at t 0 , the heat capacity approaches zero everywhere except at the transition point, where it diverges and produces, via (1.37), a latent heat of magnitude g . Conversely, in the limit µ → ∞ at τ > 0, where the coil-helix crossover turns into a second-order transition at t c , the heat capacity approaches zero in the coil conformation and remains nonzero in the helix conformation as shown in panel (b). When the transition changes from second to first order when τ → 0 for µ = ∞, implying t c → t 0 = 3, the heat capacity throughout the helix conformation approaches zero as illustrated in the inset.

Conclusion and outlook
We have launched this project mainly for the purpose of interpreting (ongoing and projected) experiments on pHLIP. In this first of three stages of analysis we have constructed a microscopic model for the pH-driven coil-helix conformational change of a long polypeptide adsorbed to a water-lipid interface. We have employed a methodology that facilitates the exact statistical mechanical analysis of our model. The three model parameters t, τ, µ have settings for which the conformation changes either in a crossover, a first-order transition, or a second-order transition.
We have carried out the analysis to the extent needed for a discussion of the sources and agents of order and disorder. Our results include the t-dependence of the helicity (order parameter), the average numbers and the average lengths of helix and coil segments, the entropy, and the heat capacity. The behavior of these quantities near the continuous or the discontinuous transition has been given special attention. We have plotted all quantities versus t at constant τ and µ for a reason. The experimentally relevant processes for which we use our model will primarily involve variations of the growth parameter t. These variations are caused by changes in pH. The targeted peptides, adsorbed to the water-lipid interface, include side chains that are strongly hydrophobic (e.g. Leu) and side chains that are negatively charged (e.g. Asp, Glu).
A drop in pH leads to the protonation of the negatively charged side chains and, therefore, enhances the overall hydrophobicity. The backbone of a coil segment thus pushed past the lipid headgroups is now more likely to satisfy an Hbond internally than externally. The enthalpic cost for broken internal H-bonds increases. This cost is encoded in t. Any increase in t favors a growth of helix segments at the expense of coil segments. A rise in pH has the opposite effect.
The value of t decreases. Coil segments grow at the expense of helix segments.
The cooperativity parameter τ , by contrast, is much less sensitive to a change in pH. In the nucleation process of coil segments from the helix conformation, for example, the internal H-bonds are much more isolated from environmental influences than are those at the border between coil segments and helix segments.
At this point, our project has reached a fork, where natural continuations point in two different directions and address the interests of somewhat different audiences. These continuations, already in the works, are outlined as follows.

Heterogeneous environment and short peptides
In one continuation we begin by considering long polypeptides that are no longer confined to a plane parallel to a flat water-lipid interface. The growth parameter t, which drives the conformational change, then becomes a field t(x) and acquires a profile that depends on the local medium. Here x is a position coordinate in the direction perpendicular to the plane of the membrane. Such circumstances pose a serious challenge to any existing model and its method of analysis. However, the methodology used here is well positioned in that respect.
It has already been proven (in different applications [25,27,30]) to be adaptable to heterogeneous environments.
The shape of the parameter field t(x) will be determined by the availability of polar molecules to satisfy external H-bonds along the backbone of the peptide.
The dominant factor that shapes the field t(x) will be the density profile ρ w (x) across the membrane, for which data from experiments [31] and simulations [32] are available. Subdominant factors include electrostatic interactions and fluidmechanical properties of lipids.
From the analysis of our extended model emerge profiles for the densities of free energy, enthalpy, entropy, and helicity of long polypeptides that traverse the heterogeneous environment (ranging from polar to non-polar) along some path that is subject to conformational constraints [33]. These profiles, in turn, will be interpreted as propensities for the statistical mechanical behavior of short peptides in the same environment.
At this stage of the analysis, additional enthalpic and entropic effects involving the side chains, the semi-fluid bilayer of lipid amphiphiles, and the hydrogen- This first continuation can also benefit from recent studies in the same area of research. Not yet included in our modeling are effects related to torsion and tension, which are bound to be present in the heterogeneous membrane environment. Experimental, computational, and analytic studies of force-extension and torque-twist characteristics and the associated steric constraints [34,35,36,37] will be of great value for that purpose. The kinetic modeling of pHLIP insertion while undergoing a conformational change will find valuable guidance from recent studies that have investigated the fluctuation properties of helical polymers in confined environments including narrow channels [38,39,40] and studies that have investigated the Brownian dynamics of polymers in the membrane environment [41].

Extensions of analysis, model, and scope
A second continuation focuses on the statistical mechanics of phase transitions and critical singularities in the context of the microscopic model presented in this work and extensions thereof. It is well known that the presence of a phase transition at nonzero temperature in a system that is, in some sense, one-dimensional requires interactions of long-range to stabilize an ordered phase in the face of strong thermal fluctuations. In our model, which is truly microscopic and analyzed exactly, this stabilizing agent comes in the form of quasiparticles that extend over entire coil segments (hosts) or over parts thereof (hybrids).
In the context of the experiments that motivated this work we have examined conformational changes driven by the control parameter t at fixed τ, µ as reflected in just a few relevant quantities. The phase transitions that occur in the limits τ → 0 (first-order) or µ → ∞ (second order) produce different singularities in other quantities of no less interest for the statistical mechanical analysis. Such quantities of general interest include a mechanical response function, a correlation length, and a correlation function. The further analysis of critical exponents and scaling laws is best presented in a more general framework and along with model extensions that remain inside the reach of our method of exact analysis.
In a final note, we should like to draw the reader's attention to a different set of applications, for which our statistical mechanical model and its extensions are likely to produce significant new insights. These applications investigate the statistical mechanics and the dynamics of DNA melting (thermal denaturation) [13,14,42,43,44,45,46,47,48] or the loop formation in RNA [49]. Of particular interest is the loop exponent in the configurational entropy of loop formation [14,49], which is frequently used as an adjustable parameter. The further development of our project aims for the analytic calculation of loop exponents pertaining to realistic scenarios.
Discussions of and debates about crossovers, first-order transitions, and secondorder transitions are at the center of most of these studies. The transcription and adaptation of our methodology to this particular physics context is already in progress. The main challenge in the endeavor is the extension of the self-avoiding random walk to three dimensions.
[  The solubility in water of pHLIP is important. The presence and positioning in the sequence of the charged residues and polar residues is instrumental for this key attribute. The hydrophobic residues provide the affinity for adsorption of pHLIP to the water-lipid interface. The protonatable negative charges at and near the (inserting) C terminus make pHLIP sensitive to the experimentally controllable and reversible change in pH.
The Trp residues are markers for fluorescent spectroscopy, by which the insertion and exit processes are monitored. The three variants with Trp at different positions in the sequence of residues promise to yield clues about the modes of insertion and exit. The conformational changes between coil and helix segments that accompany both insertion and exit are monitored by CD and OCD spectroscopy. The helix-inhibiting Pro residue near the center of the sequence may be instrumental for the mechanics of the insertion process 1 . The following sections are about fields, profiles, and landscapes: fields of environmental parameters (Sec. 2.2), profiles of local properties for long polypeptides (Sec. 2.3), and landscapes of global properties of short peptides (Sec. 2.4).

Membrane environment
We consider a patch of lipid bilayer with negligible curvature. Heterogeneity is then associated with the spatial coordinate x, perpendicular to the bilayer. As a convention we set x = 0 at the center of the bilayer. The outside of the cell or liposome is at positive x and the inside at negative x. Any effects of curvature are higher-order corrections to the results presented in the following.
The membrane environment is characterized by several parameters. In the context of this work the dominant parameter field is the concentration of H 2 O molecules. Hydrophobic interactions are prevalent. Subdominant parameter fields involve electrostatic interactions including trans-membrane, surface, and dipole potentials [5]. Further parameter fields are related to properties of lipids, notably the profile of lateral pressure and the entropy reduction along the contact line with the peptide. We begin by examining the effect of the dominant environmental parameter.

Density field of water
We assume that the density field, ρ w (x), of H 2 O molecules is symmetric under reflection. It is a dimensionless quantity varying between ρ w (x) = 1 sufficiently far from the lipids and ρ w (x) 0 near the center of the bilayer. We use a smoothedramp density field as a model representation in our statistical mechanical analysis: It has two adjustable parameters, x b /x a > 1 and x s /x a > 0. A density field of such shape ( Fig. 2.3) is well-established from experiment [6] and computer simulations [7]. A peptide of 35 residues 2 in helix conformation would have a length of roughly 50Å, if we assume that each helical link adds an element l h 1.5Å of length in the direction of the axis [8]. The length of the peptide in the coil conformation is a fluctuating quantity. The actual size of each link is l a 4Å. If the peptide is fully extended it reaches the contour length of ∼ 140Å, almost three times its length in helix conformation.
In a homogeneous medium the coil conformation can be modelled by a selfavoiding random walk. The average end-to-end distance predicted by an unre-stricted random walk is ∼ 24Å, about half of its value in helix conformation.
Geometrical and dynamical constraints [9] make the average end-to-end distance somewhat longer. In the membrane environment the average end-to-end distance between N terminus and C terminus is much harder to estimate and best handled case by case.

Free-energy
The term free-energy landscape as introduced in the title and then invoked throughout this work is in need of some explanation. The system under consideration includes a peptide in an environment consisting of a lipid bilayer surrounded by water. The thermodynamic potential in use is the Gibbs free energy G. For a homogeneous system it can be expressed in the form, where U is the internal energy, H the enthalpy, S the entropy, and V the volume.
The control variables are the pressure p and the temperature T .
Our system is homogeneous in some but not all respects. We only consider situations at uniform T , typically room temperature. The pressure is uniform in the aqueous environment and its normal component, p N , also across the bilayer.
However, the lateral component, p L , has a characteristic profile that averages out to the value of the normal pressure.
For the purpose of this study we only consider quasistatic processes of a restricted type in which both T , and p N remain constant, at ambient pressure and room temperature, respectively. One natural energy scale, therefore, uses k B T rm 4.0 × 10 −21 J 0.58kcal/mol (at T rm = 293K) as its unit. These processes involve the translocation of the peptide, accompanied by changes in conformation of the peptide and in its interactions with water and lipids.
Each such process can be described as a path in some parameter space. The variation of G along any such path is then a section of a free-energy landscape. Path segments where ∆G < 0 are favorable and path segments with ∆G > 0 unfavorable regarding spontaneous occurrence. Paths that are all 'downhill' are likely to have fast realizations in experiments and paths that have significant barriers between initial and final points may be realized only on much slower time scales.
In much of this study we categorize all changes of G as either enthalpic or entropic in nature. We write, and refer to changes with ∆H < 0 as producing an enthalpic gain and changes with ∆S > 0 as producing an entropic gain. In both cases a gain is thus associated with a negative contribution to the free energy and loss with a positive contribution. (p h L − p l L )∆V > 0, comes into play when a peptide segment (e.g. a residue) of volume ∆V translocates from a position of low lateral pressure, p l L , to position of high lateral pressure, p h L . It is quite challenging, in general, to estimate all these contributions with some accuracy. Existing estimates found in the literature vary widely and often contradict each other, in part due to differences in underlying assumptions. In some instances, several contributions are lumped together such as in tabulated data for hydrophobic transfer free energies.

Enthalpic cost of H-bonds
In the α-helix conformation the backbone of each residue is involved in two H-bonds. The CO group of residue n is acceptor to the NH group of residue n + 4 acting as donor (Fig. 2.4). The helix conformation thus involves one internal H-bond per residue. In the conversion of a helix segment into a coil segment, a number of internal H-bonds must be broken, for which there is an enthalpic cost.
H−bond 13   then more likely to result in an enthalpic gain than loss.

N terminus
In the context of the methodology developed in [1], the enthalpic contribution to the conformational tendency in the heterogeneous membrane environment is accounted for by turning the activation energies of coil links from the helix reference state into fields 4 . We use the general ansatz, 4) for their dependence on the density field of water, ρ w (x), which represents the dominant environmental influence.
Near the center of the lipid bilayer we have ρ w (x) 1, which maximizes (x) to roughly the strength of an internal H-bond. At positions closer to the lipid-water interface, (x) decreases as ρ w (x) increases. This change is due to the growing probability that internal H-bonds can be replaced by external ones. The parameter α determines whether in the aqueous environment, with ρ w (x) 1, we have an enthalpic gain (α > 1) or an enthalpic loss (α < 1) 5 .

Entropic cost of H-bonds
The enthalpic cost reduction associated with external H-bonds in the polar environment of liquid water comes with an entropic price that has yet to be included in the accounting. Every H-bond formed between an exposed backbone CO or NH group with an H 2 O molecule immobilizes that water molecule and thus lowers its contribution to the entropy.
It is hard to estimate this entropy reduction from first principles but, in all likelihood, it is somewhat larger in magnitude than the entropic gain per residue produced when a segment of (ordered) helix transforms into a segment of (dis- which we estimate to be an upper limit.
The main message for Sec. 2.4 below is that the entropic contribution to the free energy of the peptide backbone in coil conformation is likely to be positive, i.e. unfavorable, when the peptide is contact with water. Here we have one factor favoring insertion. It will be weighed against other factors yet to be identified and examined 6 .

Model for peptide conformation
Previously The model solved in [1], therefore, assigns different activation energies for the control of nucleation than for the control of growth, namely 1 . = n for hosts, 2 = · · · µ . = 2 g for hybrids, and µ+1 = · · · 2µ . = g for tags. These activation energies in units of the thermal energy k B T are usefully expressed by the nucleation parameter τ (also named cooperativity) and the growth parameter t: In addition to these two continuous control parameters, the discrete model parameter µ controls the range of the random walk away from the axis of the local helix segments. All model features used in the following are reviewed in Appendix B.

Model parameter field
Of the three model parameters the growth parameter t is the one most sensitive to the environment by far. We begin our model adaptation to the membrane environment by keeping the cooperativity parameter τ and the number µ of coils states per residue uniform, while we turn t into a field.
For this purpose we now use the ansatz (2.4) for the construction of two fields of scaled activation energy, one for hosts, the other for hybrids and tags 7 . The former (m = 1) we leave undetermined and the latter (m = 2, . . . , 2µ) we link to the density field of water: 8) where Hb /k B T represents the scaled energy of an H-bond and α H 1 is the enthalpy parameter introduced previously, now assumed to be equal for hybrids and tags.
The growth parameter field, is environmentally sensitive via the shape of ρ w (x) and the value of α H . The cooperativity, is kept as a continuous model parameter controlling the nucleation of coil segments.
The function K 1 (x) is then determined via (2.10).
Our reasoning for this choice of modelling is that cooperativity, which controls the nucleation of coil segments, is a process initiated by thermal fluctuations within the backbone of an intact segment of α-helix. Multiple H-bonds must be broken simultaneously. They are all protected from immediate contact with the environment. Nucleation is affected indirectly by an environmental change from non-polar to polar. In the non-polar environment the nucleation energy barrier is followed by a high plateau and in the polar environment by a low plateau. The difference is accounted for by the growth parameter field. For the coil conformation to be stable, the growth parameter must be below the upper dashed line (at t = 3) at the very least. We see that is not the case unless α H > 1, i.e. unless the breaking internal H-bonds along the backbone of the polypeptide and replacing them by external H-bonds with available H 2 O molecules is an enthalpic gain. Now we have the extended model system ready for applications to the hetero-geneous membrane environment. We have converted t into the field (2.9) and keep the control parameters τ and µ as in [1]. We have already stated reasons not to turn the nucleation parameter τ into a field. Regarding discrete model parameter µ 8 we will primarily consider the cases µ = 2 and µ = ∞, for which analytic solutions are available to our analysis in extension to the solutions presented in [1].
The two values span a range that is correlated with a range of entropy generated inside coil segments of given length.

Profiles
The analysis as carried out in the following yields profiles for some relevant local attributes of peptides in this heterogeneous membrane environment: densities of free energyḠ, enthalpyH, entropyS, and helicityN hl . These profiles are still attributes of long, generic polypeptides. What is taken into account at this stage are the internal H-bonds along the backbone of the polypeptide and external Hbonds with water or lipid headgroups depending on their availability. Also taken into account is the entropic effect the exposed backbone in coil conformation has on the surrounding water or lipid molecules.
The polypeptide is oriented perpendicular to the membrane and the length per residue is taken to be independent of the conformation. We can then use the expressions forN hl (t, τ ),S(t, τ ),Ḡ(t, τ ), andH(t, τ ) from Appendix B with t = t(x) from (2.8), (2.9) and any choice of τ . The growth parameter field t(x) determines all profiles via local relations. Profiles for µ = 2 are shown Fig. 2.7 and profiles for µ = ∞ in Fig. 2.8.
Well inside the lipid bilayer, at x 0, the helix conformation is firmly es-8 It is fair to argue that the coil becomes more constrained in the lipid environment than in the aqueous environment, which suggest that we should use a larger value of µ in the latter than in the former. To implement the restricted space available to the polypeptide coil in this way is difficult from a technical point of view. It would entail approximations of little control and is likely to obscure other environmental effects. We, therefore, opt to account for the space restrictions imposed by the lipids in different ways. Therefore, the order parameter (helicity) is very close to saturation whereas the densities of enthalpy and entropy are near zero. In consequence, the free-energy density of the polypeptide rises only imperceptibly above its (zero) reference value as well. As we move the position x away from the center of the bilayer, out of the membrane environment into the aqueous environment, the helicity decreases and the entropy increases, the former reflecting a drop in (helical) order and the latter a rise in (coil-like) disorder, both associated with the same conformational change.
In There are some qualitiative and some quantitative differences between the curves for µ = 2 shown in Fig. 2.7 and those for µ = ∞ shown in Fig. 2.8. The case µ = ∞ produces pure coil with maximum entropy in the aqueous environment.
Pure coil means zero helicity. The entropy that can be generated in coil segments by the case µ = ∞ is significantly higher than that generated by the case µ = 2.
The enthalpic spikes near the interface are more pronounced in the case µ = ∞. This difference is attributable to an entropic effect. The breaking of an Hbond at significant enthalpic cost is more likely to happen if the entropy produced is large (µ = ∞) than if it is small (µ = 2).
The free-energy profiles in Figs. 2.7 and 2.8 tell us that the incentives for the insertion of peptides must come from a source other than what has already been taken into account. The free-energy densityḠ(x) is significantly higher in the membrane environment than in the aqueous environment. The enthalpic balance strongly depends on the parameter α H but it is not decisive. The entropic contribution strongly favors coil over helix, which, in this context, means water over lipids. At this stage of the analysis we consider one source that modifies the free-energy density profile toward favoring insertion. Further sources, associated with side chains and their interaction with lipids.
The replacement of backbone internal H-bonds with external H-bonds that immobilize H 2 O molecules from the aqueous environment, while providing an enthalpic discount as already accounted for, comes at a significant entropic cost. This effect can be taken into account via an amended free-energy density constructed as follows:  Note also that for α > 1 the global minimum switches position from lipid environment to water environment as the ∆S H /k B is lowered below 1.0. The significance of these features for stable and metastable states of short peptides will be further investigated below.

Landscapes
At this next stage of the statistical mechanical analysis we interpret the profiles calculated in Sec. 2.3 as propensities of residues in short peptides such as pHLIP.
The density profiles represent average local attributes of long, generic polypeptides in and around the membrane environment. Here we use them as one factor affecting the behavior of residues of short peptides in the same environment, specifically their ends as follows: 13) for n = 2, . . . , 1 2 N R if N R is even. The function l(x) is a part of the modeling that varies with context (see Sec. 2.4.3 below). The expression for free energy per residue and for helicity can then be calculated from the profile functions evaluated at specific positions viā 15) if N R is odd, andḠ 17) if N R is even. The functionN hl (x) and the backbone contribution to the function G H (x) that enter these expressions are those used in Sec. 2.3 for profiles of long polypeptides. We now present results for two cases of this scenario. Results for two cases of this scenario follow in Sec. 2.4.3 below.

Scenario #2
Here we consider free-energy landscapes for short peptides of varying position and orientation. The model peptide in this scenario consists of two straight segments as illustrated in Fig. 2.11. In applications to pHLIP, the position of the kink could naturally be associated with the helix inhibiting proline residue but it can be anywhere in the sequence. One simulation study [10] places a kink at the position of the Asp residue which is somewhat closer to the N terminus than the Pro residue. We use the position x n K of the kink on the normal to the bilayer and the angles θ N , θ C of the segments with the N, C termini, respectively, as the variables that uniquely specify the position and orientation of the peptide in the membrane environment. The distances between residues in each segment are modeled as before but with all positions of residues now generated recursively from the position of the kink.
Free-energy landscapes that depend on three continuous variables are hard to visualize in a compact fashion. We must adapt the mode of graphical presentation to best serve our chief purpose, which is to explore different pathways of insertion such as have been suggested, for example, on the basis of experimental evidence. We therefore investigate the variation of free energy along different insertion pathways. Each pathway is described by a specific synchronized variation of the variables indicated in Fig. 2.11. In practice, we express all three variables as functions of a single parameter with range 0 ≤ z ≤ 1. Each set of functions, x n K (z), θ N (z), θ C (z), traces a specific insertion pathway.
In pHLIP applications the initial state (named state II) is an adsorbed peptide, largely oriented parallel to the bilayer and positioned close to the lipid headgroups.
The final state (named state III) is the TM state, oriented perpendicular to the bilayer, with the C terminus having crossed the membrane. Hence the initial and final values of the three variables are where the values x ini n K , x fin n K are the results of optimizations. Along each insertion pathway we can calculate the free energy as we have done in Sec. 2.4.1. Modified versions of (2.12), (2.14), (2.15) read x n−1 = x n + l(x n ) cos θ N : n = n K , . . . , 2, (2.19) x n+1 = x n + l(x n ) cos θ C : n = n K , . . . , N R − 1,  No matter whether we assign little entropy to coil segments (µ = 2) or a lot of entropy (µ = ∞), the landscapes have similar shapes. The helicity landscapes are almost independent of the enthalpy parameter. That parameter affects the free-energy primarily in the aqueous environment as expected.

Landscapes from backbone alone
Insertion into the membrane is clearly favored in all three cases and for both variants of the model. The plots also tell us that insertion is accompanied by a conformational change from coil to helix. For the longest peptide the minimum in free energy is not as deep and the maximum in helicity is not as high as is the case for the two shorter ones. The obvious reason is that the former has significant flanking ends that remain in water.
Of particular interest is the free energy barrier that separates states with the center of the peptide in aqueous or membrane environments, the former mostly in coil conformation and the latter in helix conformation. This free-energy barrier is very shallow for µ = 2 and only exists if α H > 1. For µ = ∞, on the other hand, it is more conspicuous and present even for α = 1. This difference is related to the higher entropy that must be expelled by coil segments for µ = ∞ when they order into helix segments before they can cash in the enthalpic benefit of the lipid environment.
One message we take from this simplest scenario is that insertion is not automatic. An environmental change may be needed to push the peptide over the barrier. One environmental change that acts on protonatable charges carried by some side chains (e.g. Asp and Glu) is an increase in acidity. The consequences for pHLIP are well documented by experiments [11,12,13,14]. Let us recall that the extent of insertion as predicted by free-energy landscapes and the extent of ordering as predicted by helicity landscapes can be directly monitored experimentally, namely by tryptophan fluorescence and by circular dichroism experiments, respectively.
An improved level of modeling takes into account that the distance between adjacent residues is different in the coil and helix conformations. This feature can readily be implemented in the framework of our methodology. Here we keep the orientation of the peptide perpendicular to the bilayer. The distance between successive residues now depends on the local conformation of the backbone at their position in the membrane environment. That conformation is either coil or helix with probabilities for which we use propensity profile as calculated in Sec. 2.3. We where we use (averaged) lengths l c for coil segments and l h for helix segments.
The values of l c and l h are two physical parameters in this model version.
We again mark the location of the peptide as the position of its center residue or center link and generate the relative positions of residues recursively from that center as in (2.12) and (2.13). The expression for free energy per residue and for helicity do not change from (2.14)-(2.16) in structure.
Whereas the shift l h associated with a helix segment remains fixed at the value   2.14. Helicity (peaked at center) and scaled free-energy of peptide with N R residues oriented and positioned as described in the text versus the coordinate x 0 of the central residue for µ = 2 (left) and µ = ∞ (right) and three sizes. The solid and dashed curves pertain to the values α H = 1 and α H = 1.05, 1.1 of the enthalpy parameter, respectively. The length specifications are l h = 1.5Åand l c = 3.0Å.
bend becomes more pronounced. In addition we observe a systematic change in a feature already present in Fig. 2.12. The locations where both landscapes level off shift further away from the center of them membrane.
Consider the the peptide with N R = 35 residues. It just fits into the non-polar space between the two layers of headgroups. As its center residue is gradually displaced away from the middle of the membrane the two stretches of residues on either side respond differently two the changing environment. The outer stretch gradually moves into the polar environment, converting from helix to coil in the process. The inner half remains in helix conformation. The outer stretch gets continually longer a while the inner stretch maintains its length.
For as long as the center residue is inside the membrane thus observe a steady descent of the helicity and a steady rise of the free energy. When the center residues moves into the non-polar environment, then the inner stretch begins to unravel from helix to coil and thus increase is length. The implication is the helical portion of the inner stretch now moves away from the center of the membrane at a slower rate than the center residue does. The consequences are that the both landscapes now change more slowly and level off later.
The systematic dependence on N R of the landscapes remain very similar in the improved model as is evident when we compare the panels of Fig. 2.12 with the corresponding panels of Figs. 2.13 and 2.14.

Conclusion and outlook
The continuation of this project will focus on the effects of side chains and additional effects of the lipids. The contributions to the free-energy landscape originating from the side chains of a peptide with a given sequence of residues are manifold. Some of them are associated with properties of the lipid bilayer that we have yet to discuss. For this investigation the protagonist will be a peptide with a specific sequence of residues, in particular the pHLIP variants identified in Fig. 2

.2.
We shall estimate the peptide free energy (in the sense explained in Sec. 2.2.2) in three principal configurations relative to its environment and along specific pathways between two of them. The three states are • State I: pHLIP is in aqueous solution and in coil conformation.
• State II: pHLIP is adsorbed to the outside interface of a cell membrane with interstitial fluid or of a liposome with water. At high pH the degree of adsorption is shallow and the conformation is coil. At low pH the degree of adsorption is deeper and the conformation is α-helix, at least to a large part.
• State III: pHLIP is in a trans-membrane state with a helical central part and short coil-like flanking ends.
Given that the side chains of residues range from strongly hydrophobic to strongly hydrophilic, it is not surprising that they contribute significantly to the free-energy landscapes. Some side chains carry (positive or negative) electric charges. The protonation of the negatively charged Glu and Asp residues and of the negatively charged C terminus at low pH changes the overall hydrophobicity significantly. The current consensus is that the protonations caused by a drop in pH trigger the destabilization of the adsorbed state II in favor of the transmembrane state III.
For the continuation of our modeling we plan to employ and compare existing data for transfer free energies in order to estimate the side-chain contribution to the free-energy owing to hydrophobicity. Our goal is to compare such free-energy contributions between states I, II, and III at high pH and at low pH.
The primary effect of the lipids has been present in this study from the outset.
The lipids produce the heterogeneous membrane environment described in Sec. 2.2 and govern the conformational propensities of peptides placed into this environment as described in Sec. 2.3. In the projected continuation we are concerned with secondary effects of the lipids as they interact with pHLIP in its adsorbed or trans-membrane states. Our focus will be on three such effects that appear most relevant for this study: • lateral pressure profile within lipid bilayer, • hydrophobic mismatch of peptide in trans-membrane state, • lipid entropy reduction due to contact with peptide The first two effects are enthalpic in nature and the third entropic. It is not a priori clear which effects are the most important. Further effects may require consideration. [2] G. A. Nemnes and D.-V. Anghel, "Fractional exclusion statistics in nonhomogeneous interacting particle systems," Roman. Rep. Phys., vol. 66, no. 336, 2014. ates at the bilayer surface, followed by insertion [1,2,3,4,5]. The evidence from fluorescent experiments is mostly indirect [6,7,8] and the results of simulation studies are somewhat ambiguous in this respect [5,9,10,11,12,13,14,15].

List of References
The kinetic modeling presented in this work is custom-made, as were two predecessor studies [16,17] that serve as its foundation, for variants of the pH Low Insertion Peptide (pHLIP) family but not to the exclusion of other peptides with similar attributes. All variants of pHLIP considered here are moderately hydrophobic membrane peptides containing protonatable residues. Our published biophysical data indicate that at neutral and high pH, pHLIP is monomeric and largely unstructured [18].
Peptides in aqueous solution (state I) coexist with peptides adsorbed to the surface of a lipid bilayer (state II). The fraction of the adsorbed peptides is controlled by the lipid-to-peptide ratio. Lowering the pH shifts the equilibrium towards membrane insertion and the formation of a trans-membrane (TM) helix  The process of the peptide association with the membrane is experimentally distinguishable from the process of peptide partitioning into bilayer. The latter is accompanied by the coil-helix transition and triggered by a drop in pH. Caloric experiments [19,20] have established that the bilayer affinity of the peptide is about 30-50 times higher at low pH than at high pH.
Another fact is that pHLIP insertion comes in the wake of the protonation of Asp/Glu residues in the trans-membrane (TM) part and the inserting end (C terminus) of the peptide. The protonation leads to an increase in hydrophobicity which, in turn, triggers the peptide to partition into the bilayer and fold across the lipid bilayer [21,22]. Since the peptide is located at the border between polar (aqueous) and non-polar (membrane) environments, the pK a for the protonation of Asp and Glu residues is significantly shifted to higher values [23]. Moreover, the pK a of individual groups changes during peptide propagation into a bilayer [18,20,22,24,25,26].
In Sec. 3.2 we describe the working hypothesis underlying our kinetic modeling, first in general terms and then regarding specific tasks ahead. The physical phenomena to be described involve a combination of processes on a range of time scales and with interactions that affect these time scales. For molecular dynamics (MD) simulations this is a serious challenge. Our Markov chain modeling is designed to cope with this challenge. It can do that at a price that is significant (spatial resolution) but still worth the effort.
Next we present Markov chain models for two separate processes. One process describes changes in protonation status of one protonatable residue at a given location in the membrane environment with a controllable level of pH in water process that interlocks all three component processes is the logical next step of kinetic modeling.

Working Hypothesis
The proposed modeling of the peptide insertion is largely anchored by experimental data on pHLIP.

Experimental evidence
The initial state in the kinetic model is a pHLIP molecule adsorbed at the surface of a bilayer in coil conformation. The first step in the insertion process, taking place on a ∼ 100ms time scale, is associated with a coil-helix transformation [27].
The second step (insertion, formation of TM helical state) would be completed within the same time scale if no protonatable residues or polar cargo were present at the peptide inserting end (C terminus) [28]. The presence of protonatable residues (or polar cargo) leads to the stabilization of intermediate states. These intermediate states involve helical segments located near the water-lipid interface and oriented mostly parallel to it.
There are two dominant effective forces: a hydrophobic force F in pointing toward the center of the bilayer and an electric force F out pointing away from the bilayer. The force F out acts directly on the negative charges of the Asp/Glu residues while the force F in originates from nearby non-polar residues (e.g. Leu, Ile). It is well established that the removal of Asp/Glu residues from the peptide inserting end increases the rate of insertion by orders of magnitude, thus greatly reducing the stopping power of F out [28].

Forces in control of insertion
Consider a coil segment near one of the Asp residues. At pH 8 the Asp residue has a high probability for being in a deprotonated (negatively charged) state. There are two major effective forces acting on that segment: a hydrophobic force F in pointing toward the lipids (positive x) and an electric force F out pointing away from the lipids (negative x) (Fig. 3.2). The force F out mostly originates from negatively charged Asp residues. The force F in originates from nearby hydrophobic residues (Leu and Ile, for example). Both forces depend on the heterogeneity of the medium, its shifting attribute from polar to non-polar with increasing position coordinate x. We expect both forces to have profiles with maxima (in strength) near the interface, and tails fading in both directions. For a stable equilibrium near the interface to exist, for which there is ample experimental evidence at high pH, the maximum of F in must be further out than the maximum of F out as sketched. This establishes a potential minimum at the position of the vertical arrows. Near this mechanical equilibrium position the force F in + F out acts like a restoring force.
The average state of protonation of the Asp residue is determined by its pK a in a given environment. In aqueous solution the pK a of protonation of Asp residues is around 4. As the environment changes from polar to non-polar across the interface, the pK a of protonation rises. Hence the probability of protonation at constant pH increases while the Asp residue moves into the lipid environment.

Kinetics of protonatable residue
There are strong experimental indicators that the insertion of a pHLIP molecule from (adsorbed) state II to the (transmembrane) state III is driven by the kinetics of protonatable residues near the C terminus. In the modeling presented in the following we focus on one such residue and consider two processes that overlap in time. We first analyze them separately, in Secs. 3.3 and 3.4, and then, in Sec. 3.5, we combine them. They are • a process of protonation/deprotonation of Asp/Glu residues controlled by the availability of protons and the polar/nonpolar nature of the environment, • a process of in/out motion through the heterogeneous membrane environment driven by unbalanced hydrophobic and electrostatic forces.
The speed and direction of each process depends on the instantaneous status of the other process.
We begin with the design of Markov chain models for each process, which we then combine into a single stochastic process. The challenge we face is to simultaneously accommodate all relevant time scales. In a Markov chain model one universal clock ticks at the rate in which transition matrices are multiplied.
The physical time scales are controlled via adjustments in transition rates.

Protonation status of residue
We begin with a negatively charged, protonatable residue such as Glu or Asp in water. If the rate constants of deprotonation and protonation are k d and k p , respectively, then chemical equilibrium in water dictates (as detailed in Appendix C) that the ratio depends on the level of pH and the pK a of the residue as follows: If the residue is located in the heterogeneous membrane environment, i.e. away from pure water, then the pH in (3.1) must be replaced by 1 2 pK w , where pK w is the pK of water in that environment. In Appendix C we present further details about the modification of (3.1). We conclude that between pure water and the center of the membrane both k d and k p decrease with k d decreasing faster than k p . In the following modeling we introduce parameters that reflect key attributes of the rate constants k p , k d .

Markov chain
The Markov chain model discretises the line of positions between pure water and the center of the membrane into an array of N cells (see Fig. 3.3). For convenience we assign odd i and even i+1 to the same position of the protonatable residue in its protonated (pro) and deprotonated (dep) states, respectively. The index i = 1, . . . , 2N thus specifies the position of the residue and its pro/dep status The physics is contained in the nonzero off-diagonal elements A eo and A oe between states with odd index i and even index i+1. The diagonal elements are determined by the requirement that probability is normalized:

Transition rates
For the specification of the transition probabilities of protonation and deprotonation as encoded in the matrix elements A eo and A oe , respectively, we divide the range of positions into three intervals of variable relative size, representing the exterior, interface, and interior parts of the membrane, respectively (see Fig. 3.3).
Across the interface, both transition probabilities change between the values, (3.4) The drop of one rate by a factor a p and the other by a factor of a 2 p , where 0 < a p < 1, encodes the slowing-down of both processes between polar and non-polar environments. The linear variation of A eo and A oe across the interface as indicated in Fig. 3.3 is the default choice. It can be modified on the basis of empirical evidence.
The rationale behind our choices of parametrization for these rates is explained in Appendix C. The significance of the three model parameters are the following: • K p sets the time scale for the pro/dep process associated with the computational clock. The latter advances by one tick per matrix multiplication in the Markov chain model.
• b d is controlled by the level of pH.
• a p is controlled by the change in pK a (and pK w ) between water and membrane interior.

Time scale
With all ingredients for the kinetics of the protonation status in place we must choose an initial state as encoded in a (normalized) probability vector, P(0) = P 1 (0), P 2 (0), . . . , P 2N (0) . (3.5) It specifies the initial probability for the residue being at a given position in a given protonation status. Here we are concerned with the time evolution of the pro/dep status alone. As mentioned previously, on the computational clock time advances one tick per matrix multiplication, expressed as follows: For the Markov chain process specified by matrix A all times t will be stated in units of K −1 p . This has two advantages.
• By choosing the value of K p sufficiently small we can ensure that the computational clock provides the necessary time resolution of the underlying physical process.
• For values of K p below a certain threshold, the results expressed as functions of t become independent of K p .

Stationary state and equilibration
The structure of the matrix A guarantees that a stationary state P(∞) exists.
The block-diagonal nature of A conserves probability within each pair of states, P i (n) = P o (n), P i+1 (n) = P e (n), : i = 1, 3, . . . 2N − 1. (3.7) The stationary state can then be computed iteratively, 8) or determined from the eigenvalue equation, 9) and the normalization as an auxiliary condition. We find (3.10) For the purpose of illustrating the local equilibration of the protonation status for situations where the water is at high pH or low pH we consider

High level of pH
For this case, the deprotonation rate A oe in water is much higher than the protonation rate A eo , yielding, as we shall see, a high probability for the residue to be deprotonated. Both rates slow down as the position shifts from the exterior to the interior region. The deprotonation rate slows down more rapidly as argued in Appendix C. In Fig. 3.5 we show the dependence on physical time t of the probabilities P i , P i+1 with opposite initial values for positions in the exterior, interface, and interior regions. The value of K p was chosen sufficiently small to ensure optimal time resolution.  Figure 3.5. Equilibration of probabilities P 3 , P 4 in the exterior region, P 7 , P 8 at the center of the interface, and P 11 , P 12 in the interior region at a high level of pH (in water). The parameters are specified in Fig. 3.4(a). The initial state for each probability shown is either fully protonated or fully deprotonated. All curves were produced with K p = 0.01. The dashed lines represent the equilibrium values from (3.10).
Taking into account the different horizontal scales we can clearly see the progressive slowing down of the pro/dep process as we move from the exterior region, Fig. 3.5(a), across the interface, Fig. 3.5(b), to the interior, Fig. 3.5(c). The time it takes to reach equilibrium is much the for opposite initial states. We can also observe how the rising pK a between exterior and interior regions changes the equilibrium values (3.10). At high pH, the equilibrium in water strongly favors deprotonation. That tendency becomes much weaker as the position shifts from the polar exterior to the non-polar interior.

Low level of pH
Here we are concerned with the time evolution of the pro/dep process at constant, low pH. For that purpose we have lowered the relevant parameter from b d = 4 to b d = 0.25. The parameter a p is left unchanged. The modified transition rates at low pH are shown in Fig. 3.4(b) along with the stationary state (3.10).
We see that the lower pH affects the rate of deprotonation, A oe , more strongly than the rate of protonation, A eo . The former is now much lower both in water and inside the membrane. This shifts the equilibrium toward the protonated state everywhere.
The equilibration process at the same positions as in Fig. 3.4 with the same initial states but now in an environment of lower pH produces the results shown in  Figure 3.6. Equilibration of probabilities P 3 , P 4 in the exterior region, P 7 , P 8 at the center of the interface, and P 11 , P 12 in the interior region at a high level of pH (in water). The parameters are specified in Fig. 3.4(a). The initial state for each probability shown is either fully protonated or fully deprotonated. All curves were produced with K p = 0.01. The dashed lines represent the equilibrium values from (3.10).

Drop of pH
Lowering the pH as part of a pHLIP insertion experiment involves the mixing of fluids, which takes place on its own time scale. The so-called dead time before the first data are taken is of the order of 10ms, which is certainly much longer than the time scale on which the protonation status equilibrates, at least, in water.
Equilibration is slower in the membrane environment but, in all likelihood, comes to completion well within the dead time. This conclusion will be used in Sec. 3.5, where we combine the kinetics of protonation status and motion.

Motion of residue
Depending on whether the residue is protonated or deprotonated it experiences different forces. We have already identified these forces in Sec. 3.2.2 as being predominantly hydrophobic and elecrostatic in nature with profiles as sketched in Fig. 3.2.
It is reasonable to argue that the translocation of the protonatable residue is an overdamped motion driven by these forces. The velocity is then proportional to the magnitude of the resultant force. In the Markov chain model, the units of both distance and time are fixed. The velocity of motion is accounted for by the probability of a step of unit distance in (i → i + 2) or out (i → i − 2) in the time period assigned to one matrix multiplication (computational clock).

Transition rates
The motion of the residue with fixed protonation status is encoded in the transition matrix, 11) where in each row we have, in addition to the nonzero diagonal element B i,i , one is proportional to the magnitude of the resultant force. The values of the diagonal elements again follow from the normalization condition, (3.12) The direction and magnitude of the resultant force depends on the position and pro/dep state of the residue. The force is F in if the residue is protonated and As a default, we use piecewise linear force profiles with F in and F out related to each other by two successive reflections as illustrated in Fig. 3.7. This choice, which can be modified as prompted by empirical evidence, produces a stable equilibrium at the center of the membrane when only F in is acting as is the case when the residue is protonated. When both forces are acting as is the case when the residue is deprotonated, then they produce a stable equilibrium state at the center of the interface.  This simple model force profile leaves us with a single model parameter c f .
We write (3.13c) whereF in = F in /F max andF out = F out /F max are scaled forces.

Time scale and stationarity
Equations ( In the following we separately investigate the motion of protonated and deprotonated residues as driven by the applicable forces. The protonated residue only experiences the hydrophobic force F in . The deprotonated residue also experiences the electrostatic force F out .

Protonated residue
Locations of the residue in its protonated state have odd indices i. The curves in Fig P 1 is monotonically decreasing as t increases and will approach zero asymptotically for t → ∞. The P i for intermediate positions i = 3, 5, . . . , 11 are represented (in that order) by the curves that start at zero and approach a smooth maximum at later and later times. All these curves also reach zero asymptotically for t → ∞. P 13 also starts from zero but is monotonically increasing and will level off at 1 asymptotically for t → ∞. The non-monotonic variation in the heights of the maxima reflects information about average position and velocity as will be analyzed below.
In Fig. 3.8(b) we have only changed the initial state. We now assume that the (protonated) residue is in any position with equal probability. All P i except P 13 reach zero asymptotically for t → ∞. The residue again ends up with unit probability at the center of the membrane. The curves for low values of i are monotonically decreasing whereas the curves for higher i go through a maximum.
This feature encodes information about the unidirectional motion to be analyzed later.

Deprotonated residue
Locations of the residue in its deprotonated state have even indices i. The   In panel (a) P 10 , P 12 , and P 14 stay strictly zero. The probability P 2 is monotonically decreasing and the probability P 8 monotonically increasing between the same extreme values. The probabilities P 4 , and P 6 start from zero and go back to zero asymptotically. The initial rise is quicker and higher for the former. The long-time asymptotics is the same for both. In panel (b) the reflection symmetry of the combined forces makes the curves overlap in pairs. Naturally, the final state is the same in both panels.

Location and speed
Some of the information contained in Figs. 3.8 and 3.9 can be unlocked by using different formats. For example, if our chief interest is in the location of the residue and the speed at which it changes location under given circumstances we can proceed as follows.
We consider the cases where the initial state is either i = 1 or i = 2 as represented by Figs. 3.8(a) and 3.9(a). Both residues can be at N distinct scaled positions across the range 0 ≤ x i ≤ 1. The position x i is related to the (odd or even) state index of the protonated or deprotonated residue via (3.14) The average position x of the protonated and the deprotonated residue as functions of physical time t is then calculated as different weighted averages of state probabilities.
The results are shown as two curves in Fig. 3.10. The protonated residue, which experiences only the hydrophobic force F in , moves to the center of the membrane (x = 1) whereas the deprotonated residue, which is also subject to the electrostatic force F out , moves to the interface (x = 0.5).
Both residues are released in water (x = 0) at time t = 0. They reach their destination within 30 units of physical time. That time unit is arbitrary in the sense that it can be assigned any value based on empirical evidence. We have seen that changing the value of c f below a certain threshold leaves the shape of the curves on the time scale t invariant.
In the overdamped dynamics used here the residues inertia is neglected, whcih The protonated residue experiences a stronger force than the deprotonated residue.
In consequence the former moves faster and reaches its destination a bit earlier even though it travels a longer distance.

Motion with status change
In Secs. 3.3 and 3.4 we have explored two processes that we intend to combine here. The physical time scale relative to the tick of the computational clock was controllable by a single parameter in each case. That parameter was K p for matrix A and c f for matrix B. In both processes we achieved results whose t-dependence were independent of those parameter values provided they were chosen sufficiently small. The unit of t in each process was arbitrary and could be assigned any value based on empirical evidence.
One way to combine the two processes within the framework of the same methodology employs the transition matrix (3. 15) with A from (3.2) and B from (3.11). Whereas the t-dependence of Markov chain process driven by matrix A alone or by matrix B alone are independent of the relevant parameters K p and c f , respectively, provided they are sufficiently small, the t-dependence of the Markov chain process driven by matrix W varies with the ratio r w . = c f /K p . A small (large) value of that ratio means that the kinetics of motion is slow (fast) compared to the kinetics of protonation and deprotonation.
Variations of r w are likely to produce very diverse results as will be explored in the following.
The dynamics of W has a sort of adiabatic regime. For small r w all motion takes place with the pro/dep status always close to being equilibrated. In practice this means that the motion consists of tiny steps forward and back that average out to an effectively smooth motion of a residue with fractional protonation status. If we release the residue at some location then its subsequent motion does not significantly depend on its initial protonation status. This adiabatic limit is expected to produce the least complex dynamics.
As we investigate the dynamics of the process W for progressively larger values of r w , the motion becomes dependent on the initial protonation status. The residue is now able to move a significant distance, at least in some locations, before the protonation status has a chance to switch.
From Sec. 3.3 we know that the protonation status shifts its equilibrium as a function of position and the equilibrium value in water is controllable by the level of pH. From Sec. 3.4 we know that the deprotonated residue tends to settle at the interface and the protonated residue in the center of the membrane.
With the level of pH we now control the average initial protonation status with which the residue is being released in water. It is to be expected that the residue will either settle at the interface (adsorbed state) or in the center of the membrane (inserted state) with probabilities that depend on the level of pH.

Stationarity
It makes sense to begin our exploration with stationary states, i.e. solutions of the eigenvalue equation, (3. 16) These solutions are all independent of the initial state of the residue, both regarding location and protonation status. We are looking for the physically relevant regions in the space spanned by the three parameters a p (controlling the change in pK a between exterior and interior), b d (controlling the pH in the exterior), and r w (controlling the relative time scales of processes A and B).
The solution of (3. 16) for given values of these three parameters produces a unique stationary probability distribution, 17) containing all information about that state. In Fig. 3.11 we show stationary distributions for for a model with N = 7, where we only vary parameter b d .
Panel (a) represents a situation of high pH. The protonatable residue is with very high probability in state i = 8. This state represents a deprotonated residue at the center of the interface. As we decrease b d in the three steps we observe the probabilities to shift towards locations in the interior and then also toward protonation. In panel(d) the residue is with overwhelming probability in state i = 13, meaning protonated and located at the center of the membrane.
As we explore the parameter space in search of the physically relevant and interesting regions, it is advisable to focus on quantities that are graphically more i , yet express the most important information. Two natural selections are the mean scaled position of the residue, 18) with x i from (3.14), and the probabilities 19) for its protonation status. The latter are, of course, complementary: P  How are these stationary states affected when we vary the other two parameters a p and r w ? This is shown in Fig. 3.13. In panel (a) we vary a p . Recall that the lowest value of a p represents the highest rise in pK a of the residue between exterior and interior positions. We observe that variations in that parameter have little impact at low pH (small b d ). The effect is largest at intermediate pH and still significant at high pH. Residues with a large shift in pK a tend to be located deeper in the membrane at high pH than residues with a smaller shift. The former tend to have a higher protonation probability than the latter. The second trend is, of course, at least in part, a consequence of the first trend.
By strong contrast, but unsurprisingly, a variation in r w produces much weaker changes as illustrated in panel (b) for just one case. Decreasing r w from unity by as much as a factor of 10 (not shown) has little effect. Increasing r w by the same factor produces the effect shown. Recall that changing r w means changing the relative time scales between the two intertwined processes A and B. A factor of 10 either way is huge, yet we have to keep in mind that here we are studying stationary states. The trajectory in time from any initial state toward the unique stationary state is expected by much more strongly affected by variations in r w .

Kinetics
We begin our investigation of the kinetics of insertion of the protonatable residue by tracking its motion in time for a given initial state under different circumstances. In Fig. 3.14 we show the position of the residue as a function of time under variations of two parameters.
Panel (a) represents a case of sluggish motion, r w = 1. The residue starts in the exterior region moves to the interface at high pH or to the center of the interior region at low pH. The residue gets to its destination on very similar time scales. It makes almost no difference wheter the inital state of the residue is protonated or deprotonated. We know from Fig. 3.11 that the final state is with high probability deprotonated at high pH and protonated at low pH. This makes sense if we take into account that the force in the exterior region is stronger for the protonated residue than for the deprotonated residue. At high pH the (average) protonation goes down or stays down whereas at low pH it goes up or stays up.
Next we examine the kinetics of a situation where the protonatable residue is in the stationary state at high pH and then experiences a sudden drop in pH.
The initial state is the one identified in Sec. 3.5.1 and described in the context of Fig. 3.11(a). The residue is, with high probability, deprotonated and located at the center of the interface.
When parameter controlling the pH is switched from the initial value, b  in the sharp rise of probability P 9 . All the probabilities shown then change more slowly toward their long-time asymptotic values. In this slower process we see the kinetics of relocation in action. The highest probabilities position the residue either near the center of the interface (states 7,8) or at the center of the membrane (states 13,14). The probabilities for the residue to be at intermediate positions are smaller. If the residue is still at the interface it is more likely to be deprotonated, if it has moved to the interior it is more likely to be protonated.

Variation of pK a
There is considerable uncertainty regarding the the amount and the rate by which the pK a of the protonatable residues change between the (exterior) aqueous environment and the (interior) highly non-polar environment. While the modeling carried out thus far ties the rate of change to the amount of change, we can vary the amount through the model parameter a p .
In Fig. 3.17 we show what the effect on the stationary probability distribution as discussed in the context of Fig. 3.11 is if we assume a significantly smaller or a significantly larger change in pK a . The larger change is represented by the value a p = 0.2 and the data point connected by solid lines whereas the smaller change is represented by the value a p = 0.8 and the data point connected by dashed lines.
The case shown in Fig. 3.11 represents an intermediate change in pK a .
The main insight from this exercise is that that the insertion tendency is enhanced by a larger change in pK a . The difference between the two sets of data is highest at intermediate pH. In panel (c), for example, a small change in pK a has the residue still with high probability at the interface in a deprotonated state, whereas the a large change in pK a has it with high probability at the center of the membrane in a protonated state. The enhanced tendency of insertion is already present at high pH as seen in panel (a). Only at low pH is the role of the parameter a p of marginal significance.
The main insight already gained from stationarity is largely confirmed by results portraying the kinetics such as shown in Fig. 3 in pK a [panel (b)]. The trend towards insertion is equally strong for the case oh low pH despite the different (average) starting position.

Average-force approximation
The type of motion we have been considering throughout this study is overdamped and slow. However, what matters in our kinetic modeling is relative slowness, a measure captured by the parameter r w . It compares the speed of the pro/dep process with the speed of translocation. For r w 1 and smaller, the residue does not move significantly between pro/dep events whereas for r w 1 it does We recall that the pro/dep transition rates vary a great deal between the aqueous environment and the membrane interior. Their slowing down is attributable to two factors: the scarcity of free protons as reaction partners and the highly endothermic nature of proton production in the non-polar medium. Allowing for situations with large r w has made the modeling significantly more complex than it would be if r w 1 could be guaranteed for all applications likely to be encountered.
Here we explore the limiting case in our modeling where the distance traveled between pro/dep events can be assumed to be tiny during the entire process. In this limit, the Markov chain model can be simplified as follows. Instead of the transition matrix W, which is the product of the two non-commuting matrices A and B, representing pro/dep status change and translocation, respectively, we now use a matrix C with the same structure as matrix B but modified elements.
The nonzero off-diagonal elements of B depend on the force experienced by the residue in a given pro/dep status as encoded in (3.12).  The corresponding elements in the matrix C, by contrast, depend on the average pro/dep status of the residue at any given position, as determined in Sec. 3.3.4.
Keeping the same numbering of states, which now contains some redundancy, we 20) for the off-diagonal elements and infer the diagonal elements from the normalization condition as in (3.12).
In Fig. 3.19 we show the data of Fig. 3.14 amended by data (shown dashed) representing the average-force approximation. As expected the approximation is very accurate for r w = 1, where the pro/dep status equilibrates faster than the translocation, but turns unreliable at r w = 10, where significant motion between pro/dep are likely in some cases.

Project extensions
Naturally, the kinetics of pHLIP during its insertion into a membrane is more complex on several counts. It contains several protonatable residues and has a protonatable C terminus, all at different locations in the membrane environment and thus subject to different pro/dep transition rates. The relative positions between these residues and the C terminus is constrained in ways that depend on the conformation of the peptide (flexible coil or rigid α-helix).
In addition, we have to take into account, the forces acting on the other residues. The strongest inward forces will be experienced by the hydrophobic residues (e.g Leu) and the strongest outward forces by the positively charged Arg residue and N terminus. The relative locations of these sites are again constrained by the orientation and conformation of the peptide.
Here we have arrived at a point of closest contact with part two of the threepart modeling outlined in Sec. 3.2. In Ref. [17] we have explored pHLIP insertion pathways that are all downhill in the free-energy landscape or have to overcome, at most, small energy barriers. Free-energy landscapes are suggestive of realistic insertion pathways. They merely hint at the distinction between fast and slow pathways by the presence and size of energy barriers.
Insertion time scales of predictive significance require a kinetic study or a simulation study based on realistic rates for at least the dominant microscopic processes involved. This certainly includes the pro/dep transition rates and the rates of translocation of residues as investigated in Secs. 3.3-3.5 of this work. One challenge for simulations is to deal with processes that involve a wide range of time scales. The approach taken in this study faces several challenges including the following.
The parameters K p , a p , b d , which have served the purpose for which they have been introduced reasonably well, must be more closely linked to chemical data such as pH in water, the pK w of H 2 O molecules in the non-polar lipid environment, and the pK a of protonatable residues. The forces acting on individual residues in the modeling carried out thus far have strongly simplified profiles with a single parameter c f setting the time scale for translocation. A significant refinement will be necessary in this aspect as well.
The design of our modeling is such that it is capable of accommodating these desired extensions. The number of discrete locations can readily be scaled up to improve spatial resolution. It is also straightforward to extend the space to include both sides of the membrane including a layer of water on either side at either the same or different levels of pH.
Our scheme of combining processes with individual time scales via controlled products of distinct transition matrices can be extended to include any number of relevant factors that govern the kinetics of insertion. As the level of complexity increases, a tensor notation of the Markov chain process will become more natural.

List of References
[1] R. E. Jacobs and S. H. White, "The nature of the hydrophobic binding of small peptides at the bilayer interface: implications for the insertion of transbilayer helices," Biochemistry, vol. 28, no. 8, pp. 3421-3437, 1989.
Next we show that these recursive relations can be satisfied by Chebyshev polynomials of the second kind, which themselves are generated recursively from S 0 (w) = 1 and S 1 (w) = w via If (A.14) holds for some m then we infer from (A.11) that it also holds for m − 1: This validates (A.14) for m = 2, . . . , µ. We use (A.10) to obtain w 1 = S µ (w) τ S µ−1 (w) .
The two sources of disorder identified in Sec. 1.4.1, namely the disorder in the sequence of coil/helix segments of diverse lengths and disorder within individual coil segments, are related to the population densitiesN m of 2µ species of particles from three catgories (hosts, hybrids, and tags).
Hosts (m = 1) generate coil segments out of the helix pseudo-vacuum whereas hybrids (m = 2, . . . , µ) and tags (m = µ + 1, . . . , 2µ) extend coil segments at the expense of helix segments. Thermally excited hosts at random locations along the polypeptide helix thus produce one source of disorder and germinate the other source of disorder via the thermal excitation of hybrids and tags nested inside.

APPENDIX B Statistically interacting polymer links
The origin in quantum many-body theory of the methodology used here and its adaptation to problems of current interest in classical statistical mechanics has already been presented from several different angles. In chapter 1 we worked out one particular application with mathematical rigor. The extensions presented in Secs. 2.3 and 2.4 are all built from that foundation.
Here we summarize those results from chapter 1 that are being used as the main building blocks for these extensions. The microscopic model for the coilhelix transition of a long polypeptide at a water-lipid interface solved in Chapter 1 has three parameters: the growth parameter t, the nucleation parameter τ (both continuous) and the (discrete) parameter µ that numbers the coil states available to each residue.
For our extensions we only use the cases µ = 2 and µ = ∞ at τ > 0 and we only use selected results. We quote the relevant expressions in ways that are easy to trace back to their derivations in chapter 1.

Rate constants of protonation and deprotonation
Here we present some details about our reasoning that led to the model parameters used in Sec. 3.3. We consider (protonatable) Asp residues in a range of locations between pure water at given level of pH and the center of the membrane.
The focus is on the rates k p and k d of protonation and deprotonation, respectively, in particular their dependence on location and pH. The reaction at hand is where we define k + . = k d + k p . The solution for initial condition n(0) = n 0 reads n(t) = n 0 e −k + t + k p N k + 1 − e −k + t . (C.7) The stationary solution of (C.6), n N = 1 1 + k d /k p = P s , (C. 8) represents the probability P s that any given Asp residue at that location is protonated. It is approached by (C.7) exponentially fast at the rate k + .
The ratio k d /k p is determined by the dissociation equilibrium. In pure water we have H + = 10 −pH , A − H + AH = 10 −pKa , (C. 9) and, in any environment, Comparison of (C.10) with (C.8) with use of (C.9) yields the key result (3.1) used in Sec. 3.3: k d k p = 10 pH−pKa . (C.11) Given that pK a 4 for Asp in water, the ratio (C.11) is large at neutral pH and becomes small at sufficiently low pH, implying that Asp is very likely to be deprotonated at neutral pH and to become protonated when the aqueous environment turns sufficiently acidic.
At locations away from pure water toward the membrane interior, the dissociation equilibrium condition is more complex. It now depends on the pK w of H 2 O in addition to the pK a of Asp. Both quantities are expected to rise. Evidence from MD simulations [Vila-Vicosa et al. 2018] strongly suggests that the pK a of Asp rises from 4 to near 7 between water and membrane interior. The pK w likewise rises from twice the pH to a value yet to be estimated. It is reasonable to expect that if the pH in water is lowered, then the pK w will go down throughout the membrane environment.
A different chain of reasoning can be employed to estimate the dependence on position of k d and k p in the membrane environment. The relative dielectric constant, (x) is known to vary between a high value, 1 80, in pure water and a much lower value, 2 5, inside the membrane. Key for our argument is that all electrostatic energies are reduced by a factor (x) −1 .
Deprotonation creates two charges. Its energetic cost rises steeply between polar and non-polar environments. The activation energy per charge associated with deprotonation thus contains a factor (x) −1 . It is then fair to conclude that the rate constant of deprotonation of a neutral Asp residue is suppressed as follows: The rate constant of protonation of a charged Asp residue at a given location x is also suppressed but less so and only indirectly due to the suppressed availability of protons. Only the charge of the proton affects the rate. The suppressed density of deprotonated residues itself is already accounted for by the factor (N − n) in (C.3). Therefore, we can write k p ∝ e −βEe/ (x) , (C.13) which, with (C.12), implies k d k p ∝ e −βEe/ (x) . (C.14) If this reasoning is sound then it follows that both rate constants are exponentially suppressed on the way from water into the membrane but the ratio is also exponentially suppressed. Comparison of (C.14) with (C.11) then implies that the pK w rises more slowly than the pK a as the location of the protonatable residue changes from water toward the center of the membrane. Our modeling of the matrix elements in (3.2) are based on these conclusions.