It is a residue-coupled model for predicting β-turns in proteins. A tetrapeptide can generally be expressed as

R_iR_i+1R_i+2R_i+3

where R_i represents the amino acid at position i, R_i+1 represents the amino acid at position i+1, and so forth. Since there are 20 different amino acids, the number of possible tetrapeptides is 20⁴ = 1.6 × 10⁵. Tetrapeptides can be classified into two categories: the β-turn set, denoted by S⁺, and the non-β-turn set, denoted by S^-.

Given a tetrapeptide, its attribution to the β-turn set S⁺ or to the non-β-turn set S^- can be expressed by an attribute function ø. If the amino acid at each of the tetrapeptide subsites can be treated as an independent element, i.e. there is no coupling at all among these subsites, then its attribute functions for the β-turn set S⁺ and the non-β-turn set S^- can be expressed, respectively, as

ø₀⁺(R_iR_i+1R_i+2R_i+3)=P_i⁺(R_i)P_i+1⁺(R_i+1)P_i+2⁺(R_i+2)P_i+3⁺(R_i+3)

ø₀^-(R_iR_i+1R_i+2R_i+3)=P_i^-(R_i)P_i+1^-(R_i+1)P_i+2^-(R_i+2)P_i+3^-(R_i+3)

where P_i+u⁺(R_i+u) (u = 0, 1, 2, 3) is the probability of amino acid R_i+u occurring at subsite i+u in the β-turn set S⁺; its value can be derived from a training dataset consisting only of β-turn tetrapeptides. P_i+u^-(R_i+u) has the same meaning as P_i+u⁺(R_i+u), except that it is associated with the non-β-turn set S^- and its value should be derived from a training dataset consisting only of non-β-turn tetrapeptides. The attribute function thus formulated for a tetrapeptide corresponds to a zero-order Markov chain.
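The zero-order attribute function can be illustrated with a minimal Python sketch. It assumes the subsite probabilities are estimated as plain relative frequencies in the training set; the function names `position_probabilities` and `phi0` and the four toy tetrapeptides are hypothetical and serve only as an illustration, not as real training data.

```python
from collections import Counter

def position_probabilities(tetrapeptides):
    """Estimate P_i+u(R) for u = 0..3 as relative frequencies at each subsite."""
    counts = [Counter() for _ in range(4)]
    for pep in tetrapeptides:
        for u, residue in enumerate(pep):
            counts[u][residue] += 1
    n = len(tetrapeptides)
    return [{r: c / n for r, c in counts[u].items()} for u in range(4)]

def phi0(tetrapeptide, probs):
    """Zero-order attribute function: product of the four subsite probabilities."""
    value = 1.0
    for u, residue in enumerate(tetrapeptide):
        # A residue never seen at this subsite gets probability 0 (no smoothing).
        value *= probs[u].get(residue, 0.0)
    return value

# Made-up toy "β-turn" training set, for illustration only.
turn_set = ["NPDG", "DPSG", "NGKD", "DPDG"]
p_plus = position_probabilities(turn_set)
print(phi0("NPDG", p_plus))  # 0.5 * 0.75 * 0.5 * 0.75
```

The same two functions would be applied to a non-β-turn training set to obtain ø₀^-. With realistic data, unseen residues would probably need some form of smoothing, which this sketch omits.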

If the coupling of a residue with its adjacent residue must be taken into account, the probabilities described above are replaced by first-order conditional probabilities. According to the first-order Markov chain, the attribute function ø for the β-turn set S⁺ is then given as

ø⁺(R_iR_i+1R_i+2R_i+3)=g·P_i⁺(R_i)P_i+1⁺(R_i+1|R_i)P_i+2⁺(R_i+2|R_i+1)P_i+3⁺(R_i+3|R_i+2)

where g = 10⁴ is an amplifying factor that brings the values into a range that is easier to handle. P_i⁺(R_i) is the probability of amino acid R_i occurring at subsite i in the β-turn tetrapeptide set S⁺; it is independent of the other subsites because R_i is located at the first position of the four-subsite sequence. P_i+1⁺(R_i+1|R_i) is the probability of amino acid R_i+1 occurring at subsite i+1, given that R_i has occurred at position i; P_i+2⁺(R_i+2|R_i+1) is the probability of amino acid R_i+2 occurring at subsite i+2, given that R_i+1 has occurred at position i+1; and so forth. Similarly, for non-β-turns, the attribute function ø is

ø^-(R_iR_i+1R_i+2R_i+3)=g·P_i^-(R_i)P_i+1^-(R_i+1|R_i)P_i+2^-(R_i+2|R_i+1)P_i+3^-(R_i+3|R_i+2)
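As a rough sketch of the first-order (residue-coupled) attribute function, the Python code below estimates the first-subsite probabilities and the three sets of conditional probabilities by counting, then multiplies them together. It assumes that g enters as a simple multiplicative factor and that conditional probabilities are unsmoothed count ratios; the names `train_first_order` and `phi` and the toy tetrapeptides are hypothetical.

```python
from collections import Counter

G = 1e4  # amplifying factor g = 10^4 from the text

def train_first_order(tetrapeptides):
    """Return (P_i(R), [P_i+1(.|.), P_i+2(.|.), P_i+3(.|.)]) from a training set."""
    first = Counter(pep[0] for pep in tetrapeptides)
    pair_counts = [Counter() for _ in range(3)]   # (previous, current) residue pairs
    prev_counts = [Counter() for _ in range(3)]   # counts of the conditioning residue
    for pep in tetrapeptides:
        for u in range(3):
            pair_counts[u][(pep[u], pep[u + 1])] += 1
            prev_counts[u][pep[u]] += 1
    n = len(tetrapeptides)
    p_first = {r: c / n for r, c in first.items()}
    p_cond = [
        {pair: c / prev_counts[u][pair[0]] for pair, c in pair_counts[u].items()}
        for u in range(3)
    ]
    return p_first, p_cond

def phi(tetrapeptide, model, g=G):
    """First-order attribute function: g * P(R_i) * product of P(R_i+u+1 | R_i+u)."""
    p_first, p_cond = model
    value = g * p_first.get(tetrapeptide[0], 0.0)
    for u in range(3):
        value *= p_cond[u].get((tetrapeptide[u], tetrapeptide[u + 1]), 0.0)
    return value

# Made-up toy "β-turn" training set, for illustration only.
turn_set = ["NPDG", "DPSG", "NGKD", "DPDG"]
model_plus = train_first_order(turn_set)
print(phi("NPDG", model_plus), phi("DPDG", model_plus))
```

Training the same model on non-β-turn tetrapeptides would give the corresponding ø^-.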

For a given tetrapeptide R_iR_i+1R_i+2R_i+3, if its attribute function for the β-turn set S⁺ is greater than that for the non-β-turn set S^-, i.e. ø⁺ > ø^-, then the tetrapeptide is predicted to be a β-turn; otherwise, it is predicted to be a non-β-turn. A discrimination function D is given as

D(R_iR_i+1R_i+2R_i+3)=w⁺ø⁺(R_iR_i+1R_i+2R_i+3) - w^-ø^-(R_iR_i+1R_i+2R_i+3)

where w⁺ and w^- are the weight factors for the probabilities derived from the β-turn and non-β-turn training sets, respectively. Unless there is a special reason to do otherwise, both are set to one, i.e. w⁺ = w^- = 1. Thus, the criterion for β-turn prediction for a given tetrapeptide in proteins can be formulated as:

R_iR_i+1R_i+2R_i+3 is predicted to be a β-turn if D(R_iR_i+1R_i+2R_i+3) > 0, and a non-β-turn otherwise.
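The decision step can be sketched in Python by taking D as the weighted difference w⁺ø⁺ - w^-ø^-, which reduces to the rule ø⁺ > ø^- when both weights are one. The helper names and the numeric ø values are hypothetical; `tetrapeptide_windows` only illustrates how a protein sequence could be scanned four residues at a time.

```python
def tetrapeptide_windows(sequence):
    """Yield (start_index, tetrapeptide) for every window of four residues."""
    for i in range(len(sequence) - 3):
        yield i, sequence[i:i + 4]

def discrimination(phi_plus, phi_minus, w_plus=1.0, w_minus=1.0):
    """Assumed form of D: weighted difference of the two attribute functions."""
    return w_plus * phi_plus - w_minus * phi_minus

def is_beta_turn(phi_plus, phi_minus, w_plus=1.0, w_minus=1.0):
    """Predict a β-turn when D > 0, a non-β-turn otherwise."""
    return discrimination(phi_plus, phi_minus, w_plus, w_minus) > 0

# Hypothetical attribute values for one tetrapeptide, for illustration only.
print(is_beta_turn(phi_plus=1666.7, phi_minus=420.0))
print(list(tetrapeptide_windows("ANPDGK")))
```

Raising w⁺ relative to w^- makes the predictor more willing to call a β-turn, which is one reason the weights might be set away from one.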

Here we use S¹, S^1', S², S^2', S⁶, S⁸, and S^- to represent the tetrapeptide sets of type I β-turn, type I' β-turn, type II β-turn, type II' β-turn, type VI β-turn, type VIII β-turn, and non-β-turn, respectively. Given a tetrapeptide, its attribution to the sets S¹, S^1', S², S^2', S⁶, S⁸, and S^- can be expressed by ø¹, ø^1', ø², ø^2', ø⁶, ø⁸, and ø^-, respectively. These attribute functions can be calculated for each turn type in the same way as for the turn and non-turn sets given above.

For a given tetrapeptide R_iR_i+1R_i+2R_i+3, if its attribute function for the β-turn set S¹ is greater than that for the β-turn set S², i.e. ø¹ > ø², then the tetrapeptide has a greater propensity for β-turn type I than for β-turn type II. On this rationale, the tetrapeptide R_iR_i+1R_i+2R_i+3 is predicted to belong to the structural type for which ø has the maximum value, which can be formulated as follows. Suppose

ø^× = max {ø¹, ø^1', ø², ø^2', ø⁶, ø⁸, ø^-}

where the operator max means taking the maximum of the quantities in the braces; then the superscript × identifies the set S^× (× = 1, 1', 2, 2', 6, 8, or -) to which the tetrapeptide R_iR_i+1R_i+2R_i+3 is predicted to belong.
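The maximum rule amounts to an argmax over the seven attribute values, as the short Python sketch below shows. The function name `predict_type` and the scores are hypothetical; in practice each ø would come from a model trained on the corresponding tetrapeptide set.

```python
def predict_type(phi_by_type):
    """Return the label whose attribute function value is largest."""
    return max(phi_by_type, key=phi_by_type.get)

# Hypothetical attribute values for one tetrapeptide, for illustration only.
scores = {
    "1": 310.0,   # type I
    "1'": 12.5,   # type I'
    "2": 845.0,   # type II
    "2'": 40.0,   # type II'
    "6": 3.2,     # type VI
    "8": 95.0,    # type VIII
    "-": 520.0,   # non-β-turn
}
print(predict_type(scores))
```

If two types tie for the maximum, `max` returns whichever appears first in the dictionary, so a real implementation would need an explicit tie-breaking rule.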