The SASA implicit solvation model Characteristics of the SASA model The SASA model is a fast implicit solvation model that is useful to simulate structured peptides and miniprotein motifs [1]. The polar and non-polar contributions of each atom to the free energy of solvation are assumed to be proportional to their solvent accessible surface areas. The SASA model uses only two surface-tension like solvation parameters (constants of proportionality) and approximates the solvent accessible surface area of each solute atom with a simple analytical function that is easily derivable. The electrostatic screening between solute charges is accounted for by using a distance dependent dielectric function and by neutralizing the formal charges (Asp, Glu, Arg, Lys, and the termini) as in the EEF1 model [2]. The SASA model has been successfully applied to peptides, removing the major artifacts of in vacuo simulations and reproducing reversible folding [3]. Benchmarks indicate that a simulation with SASA is only about 50% slower than an in vacuo simulation. Range and limitations The SASA model has been applied to structured peptides, see for instance [3]. However, it should not be used for large proteins mainly for two reasons. Firstly, it has been parameterized for small proteins [1] and secondly, the dielectric function does not take different environments into account, i.e., it does not distinguish whether or not the interacting partial charges are buried or on the protein surface. Theoretical aspects The potential energy of the system consisting of the solute and the solvent can be decomposed in three parts: the intra-solute potential energy U(X), the intra-solvent potential energy V(Y), and the interaction potential energy of the solute and the solvent W(X,Y), where X denotes the degrees of freedom of the solute and Y the degrees of freedom of the solvent. Integrating out all solvent degrees of freedom one obtains the potential of mean force W(X), also called the effective energy. It can be written as the sum of the intra-solute potential energy U(X) and the free energy of solvation, or mean solvation term, DW(X) that describes all solvent induced effects. This is a rigorous result from statistical mechanics. For more details consult, for instance, the review [4]. Any implicit solvation model based on the solvent accessible surface area approximates the major contribution to the free energy of solvation using the first solvation layer, and the screening effect of the solvent. Following this idea it is assumed that the mean solvation term is a sum of atomic contributions proportional to the solvent accessible surface area of each atom, plus an energy term that accounts for the screening. The constants of proportionality are the surface-tension like solvation parameters. Technical aspects of the SASA model There are three important aspects for any implicit solvation model that is based on the solvent accessible surface area: (A) how to calculate the atomic solvent accessible surface areas, (B) how many surface-tension like solvation parameters to use, and (C) how to account for the screening of solute charges. (A) Calculation of the solvent accessible surface area In the SASA model the solvent accessible surface area is approximated by a probabilistic approach. For details please refer to [5]. The base is a simple formula for the probability to hit the accessible surface of atom i if N atoms are present by choosing randomly a spot on the solvation shell of atom i. However, this formula is only true under the assumption that all the atoms are distributed randomly. This is of course not the case because of covalent geometry and the Pauli exclusion principle. Now instead of elaborating in sophisticated probability calculations the pragmatic approach of [6] is adapted where the original formula is parameterized by including two sets of parameters: a set of probabilistic parameters (called atom type parameters in [1,6]) and a set of connectivity parameters. The probabilistic parameters depend on the atom type and correct for systematic errors primarily due to hybridization. The connectivity parameters distinguish bound atoms from more distant ones. Also, the radii used to calculate the approximated solvent accessible surface were optimized for this purpose. (These radii are used only for the calculation of the approximated area and they are different from the radii used for the CHARMM van der Waals energy term.) The optimal values for the radii, probabilistic parameters, and connectivity parameters were taken from [6]. The atom types in [6] and the CHARMM atom types do not match exactly, so the most reasonable assignments were chosen, analogous to the choices in [7]. The only internal degree of freedom considered by this approximation is the interatomic distance. Therefore the major defect is the absence of a better correlation between exact and approximated areas upon changes in internal coordinates like dihedral angles. (B) Solvation parameters By default, the SASA model uses only two non-vanishing surface-tension like solvation parameters: one for hydrophobic and one for hydrophilic groups. The solvation of explicit hydrogen atoms is neglected. However, this can be changed by the user (see below). The solvation parameters were adjusted for the EEF1-modified CHARMM 19 polar hydrogen parameter set by Philippe Ferrara in a trial and error approach in 1999. The criterion was to minimize the root mean square deviation from the native state for six small proteins by performing molecular dynamics simulations of 1ns at 300K [1]. (C) Screening of solute charges For the screening of solute charges the SASA model uses a distant dependent screening function, eps(r)=2r, and neutralizes the charged groups of polar amino acids in exactly the same way as it is done in the EEF1 model of Lazaridis and Karplus [2]. Implementation in CHARMM To every CHARMM atom type the SASA model assigns a surface-tension like solvation parameter (zero by default for explicit hydrogen atoms and non-zero for hydrophobic and hydrophilic groups), a radius optimized for the approximation of the solvent accessible surface area, and a probabilistic parameter. Additionally, a connectivity parameter is assigned to every pair depending on whether the two atoms of the pair form a 1-2 or a more distant pair. These parameters are called the SASA parameters [1]. When initializing SASA, i.e., by invoking the SASA command, all the SASA parameters are printed to the CHARMM output file. A value of -999.000 means that this specific parameter hasn't yet been determined for SASA and therefore the corresponding CHARMM atom type can't be used by default for the SASA calculations - the user would have to assign a meaningful value. (Currently, SASA does not support the following CHARMM atom types by default: HA, HT, LP, CT, CM, NP, OH2, OM, OT, OS, and FE.) All SASA parameters can be changed by the user. For the corresponding syntax, see below. To evaluate the approximative formula for the solvent accessible surface area of an atom i, all neighbors of atom i within a certain cutoff are required. This cutoff is calculated by 2*(2.365A+1.4A)=7.53A, where 2.365A is the largest van der Waals radius in the CHARMM parameter set 19, and 1.4A is the radius of the solvent probe sphere. Most, but not all of these neighbors, are included in the nonbond pair list in CHARMM. The missing pairs are stored in a new pair list, the SASA pair list. More precisely, the SASA pair list contains all pairs that (1) are not in the nonbond pair list, that (2) do not belong to the fixed exclusions (as given in the topology file), and that (3) are within the above mentioned cutoff. The SASA pair list assures correct operation of the SASA model for any nonbond exclusion mode (-5 to 5). This means that for any nonbond exclusion mode from 1 to 5, the SASA energy and its derivatives are identical. The same is true for any nonbond exclusion mode from -5 and 0. The differences in the SASA energy and its derivatives between the nonbond exclusion modes regions of -5 to 0 and 1 to 5 stem from the fact that the fixed exclusions are treated differently: For an exclusion mode from -5 to 0 they are included in the nonbond pair list, opposed to an exclusion mode from 1 to 5 where they are excluded from the nonbond pair list. Caveat Please note that the radii used for the calculation of the approximated solvent accessible surface area are labeled 'van der Waals radii' in the SASA output of the CHARMM output file. This is consistent with the terminology used in [6]. However, these radii are different from and do not replace in any way the CHARMM default van der Waals radii used to calculate the van der Waals energy term. Additional Input Files Two additional files are needed to use SASA (taken from EEF1): (1) toph19_eef1.inp : This is a modification of toph19.inp where ionic side chains and termini are neutralized and contains an extra parameter type (CR). (2) param19_eef1.inp: This is a modification of param19.inp which includes the extra parameter type (CR). These files can be found in test/data/. * Menu: * Syntax:: Syntax of the SASA command * References:: Useful references * Example:: Input file
Syntax of the SASA command There is only one SASA command: SASA atom-selection [S<number> <real>] [R<number> <real>] [P<number> <real>] [fcon <real>] [ncon <real>] [surf] [infx] [newp] atom-selection:== (see *note select:(select.doc).) <number>:== number corresponding to a CHARMM atom type from param19.inp or param19_eef1.inp according to the following list: number = 001 for H number = 002 for HC number = 003 for HA number = 004 for HT number = 005 for LP number = 006 for CT number = 007 for C number = 008 for CH1E number = 009 for CH2E number = 010 for CH3E number = 011 for CR1E number = 012 for CM number = 013 for C1ES number = 014 for N number = 015 for NR number = 016 for NP number = 017 for NH1 number = 018 for NH2 number = 019 for NH3 number = 020 for NC2 number = 021 for O number = 022 for OC number = 023 for OH1 number = 024 for OH2 number = 025 for OM number = 026 for OT number = 027 for OS number = 028 for S number = 029 for SH1E number = 030 for FE number = 031 for CR The SASA command sets up the SASA model for a simulation. All the values have to be given on one command line. Invoking the SASA command a second time reinitializes all values either to the default or to the user specified values. atom-selection This determines the atoms to be used for the SASA calculations. All atoms that are not included in this selection are treated by SASA as if not existent. Their solvation free energies are not considered, i.e., are set to zero, and the decrease in the solvent accessible surface areas of the selected atoms due to the not selected ones is neglected. S<number> <real> Changes the surface-tension like solvation parameter of the CHARMM atom type corresponding to <number> (see list above) from the default value to <real>. R<number> <real> Changes the radius used by SASA (for the calculation of the approximated solvent accessible surface areas) of the CHARMM atom type corresponding to <number> (see list above) from the default value to <real>. P<number> <real> Changes the probabilistic parameter of the CHARMM atom type corresponding to <number> (see list above) from the default value to <real>. fcon <real> Changes the connectivity parameter for 1-2 pairs from the default value to <real>. ncon <real> Changes the connectivity parameter for more distant than 1-2 pairs from the default value to <real>. surf The approximated atomic solvent accessible surface areas are stored in WMAIN. infx Includes the fixed exclusion pairs in the SASA pair list. By default, the fixed exclusion atoms are not considered in the SASA surface calculations for historical reasons. This means that, for instance, for the CG of the residue PHE (see the topology file), CZ is not considered to calculate its accessible surface. To include all neighbors use the keyword infx, but note that SASA was not parameterized with this option. newp Triggers the use of the new (Haberthuer 2002) radii, probabilistic parameters, and connectivity parameters (see below). This keyword must be used with infx. The SASA standard setup [1,3] looks like this: nbond nbxmod 5 atom rdiel shift vatom vdistance vshift - cutnb 8.0 ctofnb 7.5 ctonnb 6.5 eps 2.0 e14fac 0.4 wmin 1.5 sasa selection (.not. hydrogen) end The nonbond options are the default nonbond options from the default param19.inp file with the exception of the eps value that is set to 2 instead of 1 and rdiel is used instead of cdiel. The default solvation parameters are -0.06 for hydrophilic groups (N, NR, NH1, NH2, NH3, NC2, O, OC, and OH1) and 0.012 for hydrophobic groups (C, CH1E, CH2E, CH3E, CR1E, S, SH1E, and CR) and 0.0 for explicit hydrogen atoms. Please note that the param19_eef1.inp file has different default nonbond options (especially the cutoffs are different) that were not used to parameterize SASA and therefore should not be used or used only with care in simulations with SASA since consistency is lost. If you select all atoms for SASA instead of excluding the (explicit) hydrogens, all atoms will be considered for calculating the solvent accessible surface area of each atom. However, since the solvation parameter for hydrogen atoms is zero by default, the solvation energy of the hydrogen atoms is also zero by default. If you insist on including the solvation energy due to the solute hydrogen atoms, you have to assign a non-zero solvation parameter to the hydrogen atoms by yourself, using 'S001 <real> S002 <real>' in the CHARMM input file. Be aware that this is not the default. Solvation Parameters for Proteins (not fully tested) A second set of surface-tension like solvation parameters has been derived: -0.144 for hydrophilic groups, 0.024 for hydrophobic groups and 0.0 for explicit hydrogen atoms with eps(r)=r. It is not the default. It seems to work better for large proteins but no results have been published up to date (June 2004). More Than One Chain If you run a simulation with several molecules (so that you have more than one segment identifier), make sure that you invoke the SASA command after the generation of the last segment since any molecule generated after the last use of the SASA command is not included in the SASA calculations. You don't have to invoke the SASA command after every generation of a segment, it is sufficient to use the SASA command once after the generation of the last segment. Accessing the Solvation Energy The SASA solvation energy is stored in the variable 'SASL'. Use '?SASL' in the CHARMM input file to access the value. New Surface Parameters A new set of surface parameters (radii, probabilistic parameters, and connectivity parameters) was derived in 2002 by Haberthuer. They give a better correlation with exact analytical surfaces than the original Hasel and Still surface parameters. (A set of 20 structures [10 native and 10 unfolded conformations] was used for the calibration.) The new surface parameters are not the default because the surface-tension like solvation parameters were optimized by Ferrara with the original Hasel and Still surface parameters. The following example illustrates how to setup SASA with the new surface parameters. nbond nbxmod 5 atom rdiel shift vatom vdistance vshift - cutnb 8.0 ctofnb 7.5 ctonnb 6.5 eps 2.0 e14fac 0.4 wmin 1.5 sasa infx newp
References [1] Ferrara, P.; Apostolakis, J.; Caflisch, A.; Evaluation of a Fast Implicit Solvent Model for Molecular Dynamics Simulations; Proteins 2002; 46; 24-33. [2] Lazaridis, T.; Karplus. M.; Effective Energy Function for Proteins in Solution; Proteins 1999; 288; 477-487. [3] Ferrara, P.; Caflisch, A.; Folding Simulations of a Three-Stranded Antiparallel Beta-Sheet Peptide; Proc. Natl. Acad. Sci. USA; 2000; 97; 10780. [4] Roux, B.; Simonson, T.; Implicit Solvent Models; Biophysical Chemistry; 78; 1999; 1-20. [5] Wodak, S. J.; Janin, J.; Analytical Approximation to the Solvent Accessible Surface Area of Proteins; Proc. Natl. Acad. Sci. USA; 1980; 77; 1736. [6] Hasel, W.; Hendrickson, T. F.; Clark, S. W.; A Rapid Approximation to the Solvent Accessible Surface Area of Atoms; Tetrahedron Computer Methodology 1988; Vol. 1; No. 2; 103-116. [7] Fraternali, F.; van Gunsteren, W. F.; An Efficient Mean Solvation Force Model for Use in Molecular Dynamics Simulations of Proteins in Aqueous Solution; J. Mol. Biol. 1996; 256; 939.
Examples Check test/c29test/sasa.inp for a more complex example with minimization, equilibration and dynamics. Here is a short version: * Example input file for the SASA implicit solvation model. * ! --- Begin generation procedure --- open read card name toph19_eef1.inp unit 30 read rtf card unit 30 close unit 30 open read card name param19_eef1.inp unit 30 read parameter card unit 30 close unit 30 open read card name filename.crd unit 30 read sequence coor unit 30 close unit 30 generate main warn setup ! --- End generation procedure --- ! --- Begin reading coordinates --- open read card name filename.crd unit 30 read coordinate card unit 30 close unit 30 ! --- End reading coordinates --- ! --- Begin setting up SASA --- ! Use the SASA standard setup. nbond nbxmod 5 atom rdiel shift vatom vdistance vshift - cutnb 8.0 ctofnb 7.5 ctonnb 6.5 eps 2.0 e14fac 0.4 wmin 1.5 sasa selection (.not. hydrogen) end ! --- End setting up SASA --- ! --- Begin minimization --- minimize sd nstep 300 nprint 20 tolgrad 0.1 minimize conj nstep 200 nprint 20 tolgrad 0.1 ! --- End minimization --- stop
NIH/DCRT/Laboratory for Structural Biology
FDA/CBER/OVRR Biophysics Laboratory
Modified, updated and generalized by C.L. Brooks, III
The Scripps Research Institute