CHARMM c32b1 cfti.doc



File: CFTI ]-[ Node: Top
Up: (perturb.doc) -=- Next: Constraints


 
         CFTI: conformational energy/free energy calculations

 
* Menu:

* Constraints::   Note on constrained optimization implementation
 
* CFTINT::        Description and syntax of standard conformational
                  free energy thermodynamic integration

* CFTIM::         Description and syntax of multidimensional conformational
                  free energy thermodynamic integration

 


File: CFTI ]-[ Node: Constraints
Up: Top -=- Previous: Top -=- Next: CFTINT


Constraints:

Energy minimization with holonomic constraints has been implemented.
There are no special commands for this option.
Charlie Brooks' TSM module allows for MD simulations with constrained
values of selected conformational coordinates: distances, angles,
dihedrals.
This has been expanded to also allow energy minimization using several
algorithms. The method is an alternative to harmonic restraints
for generating structures of flexible molecules with desired properties,
or for generating adiabatic profiles.

To use this option, enter the 'TSM' module and give a set of 'FIX'
commands to define the set of fixed internal coordinates
(see perturb.doc for details). Next, specify an energy minimization
(see minimiz.doc).

Algorithms that work: SD, CONJ, POWE
                      (ABNR works also, for reasons unclear to me, KK)




File: CFTI ]-[ Node: CFTINT
Up: Top -=- Previous: Constraints -=- Next: CFTIM


CFTI: standard (one-dimensional) conformational thermodynamic integration


Description of method

The method expands the capabilities of the TSM module.
The TSM module employs the Thermodynamic Perturbation (TP) approach
to conformational free energy simulations. The basis of the
calculation is an MD simulation with a constrained value of a
conformational coordinate. With minimal modifications, the
alternative Thermodynamic Integration (TI) method has been added.
In the modified code the user has the option of using TP only
(as previously) or activating TI, in which case the same
simulation and data files are used to give both TP and TI results.

[SYNTAX CFTI]
 
All commands are parsed by the TSM command parser, so should be
within a 'TSM ... END' block.
 

CFTI
     This command activates the thermodynamic integration calculation.
     The context of use should be the same as for a thermodynamic
     perturbation run, i.e. some coordinates should be fixed by
     'FIX ...', saving data to a disk file should be specified by
     'SAVI ...', and one perturbation should be defined by 'MOVE ...'.
     The derivative dA/dx is calculated for the coordinate defined
     in the 'MOVE ...' statement. This coordinate must also be
     fixed with 'FIX ...'; other coordinates may also be fixed if
     desired. The formula for dA/dx involves only averaging over
     the corresponding derivative of the potential energy U, omitting
     the 'Jacobian term' related to changes in phase space volume:
          dA/dx = <dU/dx>
          dU/dx = Sum(j=1,3N) (dU/dy_j)(dy_j/dx) 
          y_j, j=1,...,3N - atomic Cartesian coordinates
     Notes:
     1) The formatted data file generated by 'SAVI ...' may be read
     by both the TI postprocessing command (CFTJ) and the TP
     postprocessing command (POST). The SAVI 'NWIN' keyword has
     meaning only for TP; it can be set to an arbitrary value if only
     TI is to be used.
     2) For consistency with TP, the 'BY <real>' part of the 'MOVE'
     command was retained. The <real> value has meaning for TP only;
     it can be set to an arbitrary number if TI only is to be used.
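
The chain-rule formula above can be illustrated outside CHARMM with a
short Python sketch. It is not the CHARMM implementation: the harmonic
potential, the force constant k, the reference length r0 and all
function names are assumptions made for illustration. The coordinate x
is the A-B distance, atom A is held in place and atom B is displaced
along the A->B unit vector, so dy_j/dx is zero for A and the unit
vector for B.

```python
import math

def harmonic_bond_gradient(a, b, k=100.0, r0=1.5):
    """Cartesian gradients of the toy potential U = 0.5*k*(r - r0)**2."""
    d = [bi - ai for ai, bi in zip(a, b)]
    r = math.sqrt(sum(c * c for c in d))
    unit = [c / r for c in d]              # unit vector from A to B
    du_dr = k * (r - r0)
    grad_a = [-du_dr * u for u in unit]    # dU/dy_j for atom A
    grad_b = [du_dr * u for u in unit]     # dU/dy_j for atom B
    return grad_a, grad_b, unit

def du_dx(gradients, displacements):
    """Chain rule: dU/dx = Sum_j (dU/dy_j) * (dy_j/dx)."""
    return sum(g * t
               for grad, disp in zip(gradients, displacements)
               for g, t in zip(grad, disp))

a = [0.0, 0.0, 0.0]
b = [1.0, 2.0, 2.0]                        # A-B distance r = 3.0
grad_a, grad_b, unit = harmonic_bond_gradient(a, b)

# dy_j/dx: zero for the stationary atom A, A->B unit vector for atom B.
zero = [0.0, 0.0, 0.0]
value = du_dx([grad_a, grad_b], [zero, unit])
print(value)                               # analytic result is k*(r - r0)
```

In a real run the gradients come from the full force field and dA/dx
is the average of this quantity over the constrained trajectory.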

CFTJ [TEMP <real>] [UICP <int>] [CONT <int>]
     Command to calculate the conformational free energy derivative 
           dA/dx = <dU/dx> 
     as well as the energy-entropy components: d<U>/dx, -TdS/dx
     Data is read in from the formatted file generated by the 
     'SAVI ...' command
     TEMP - specifies temperature, needed for energy-entropy components
     UICP - specifies unit with data
     CONT - defines length of data block for error analysis
            e.g. if data file has 1000 entries, 'CONT 100' will
            divide data into 10 blocks and calculate the standard
            deviation of the mean of the block averages
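
The block analysis selected by CONT can be sketched in Python as
follows. This is an illustration only: the function name, the use of
the n-1 denominator, the handling of an incomplete final block and the
synthetic data are assumptions, not a transcription of the CHARMM code.

```python
import math

def block_error(values, cont):
    """Mean and standard deviation of the mean of the block averages."""
    nblocks = len(values) // cont          # incomplete last block is dropped
    if nblocks < 2:
        raise ValueError("need at least two complete blocks")
    block_means = [sum(values[i * cont:(i + 1) * cont]) / cont
                   for i in range(nblocks)]
    mean = sum(block_means) / nblocks
    # Sample variance of the block averages (n-1 denominator, assumed).
    var = sum((m - mean) ** 2 for m in block_means) / (nblocks - 1)
    return mean, math.sqrt(var / nblocks), nblocks

# Synthetic stand-in for 1000 saved dU/dx values (not CHARMM output).
data = [math.sin(0.01 * i) for i in range(1000)]
mean, err, nblocks = block_error(data, 100)
print(nblocks, mean, err)
```

With 1000 entries and a block length of 100 the data fall into 10
blocks, as in the 'CONT 100' example above.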
     
CFTA [FIRSt <int>] [NUNIt <int>] [BEGIn <int>] [STOP <int>] [SKIP <int>]
     [CONT <int>] [TEMP <real>]
     Command activates analysis of CFTI-generated trajectory.
     Trajectory coordinate file(s) should be in consecutive units
     FIRST, NUNI, BEGIN, STOP, SKIP - define trajectory reading
     CONT, TEMP - as in CFTJ

Examples of usage: see test cases cftidist.inp, cftiangl.inp, cftidihe.inp.
These test cases also compare the TI and TP results, showing that the
'Jacobian term' is small.




File: CFTI ]-[ Node: CFTIM
Up: Top -=- Previous: CFTINT -=- Next: Top

 
CFTM: multidimensional conformational thermodynamic integration


Description of method

This is a new approach. MD simulations are performed with several
conformational coordinates simultaneously constrained to fixed values.
The partial derivatives of the conformational free energy with
respect to all the coordinates in the fixed set are calculated
from this one simulation. The free energy gradient may be used 
in different ways to explore conformational free energy surfaces
of flexible molecules.

The method expands the capabilities of the TSM module.
Only TI calculations are possible; there is no corresponding TP
analysis.

[SYNTAX CFTM]
 
All commands are parsed by the TSM command parser, so should be
within a 'TSM ... END' block.
 

CFTM
     This command activates the multidimensional TI method.
     The context of use should be the same as for a thermodynamic
     perturbation run, i.e. several coordinates should be fixed by
     'FIX ...', saving data to a disk file should be specified by
     'SAVI ...', and a perturbation should be defined by a 'MOVE ...'
     statement for each of the fixed coordinates.
     Only the averages of the derivatives of the potential energy U
     are calculated; the 'Jacobian term' is ignored (see notes below
     and test cases).
          dA/dx_k = <dU/dx_k>    x_k, k=1,...,m - fixed coordinates
          dU/dx_k = Sum(j=1,3N) (dU/dy_j)(dy_j/dx_k) 
          y_j, j=1,...,3N - atomic Cartesian coordinates
     Notes:
     1) The formatted data file defined by 'SAVI ...' has a different
        format under CFTM than under CFTI. This file is only useful
        for CFTM post-processing.
     2) For consistency with TP, the 'BY <real>' part of the 'MOVE'
        command was retained. The <real> value has no meaning in CFTM.
        The 'INTE' keyword has to be specified within the 'MOVE' command.

CFTC [TEMP <real>] [UICP <int>] [CONT <int>]
     Command to calculate the conformational free energy derivatives
           dA/dx_i = <dU/dx_i> 
     as well as the energy-entropy components: d<U>/dx_i, -TdS/dx_i
     Data is read in from the formatted file generated by the 
     'SAVI ...' command
     TEMP - specifies temperature, needed for energy-entropy components
     UICP - specifies unit with data
     CONT - defines length of data block for error analysis
            e.g. if data file has 1000 entries, 'CONT 100' will
            divide data into 10 blocks and calculate the standard
            deviation of the mean of the block averages
     Output includes all individual partial derivatives, and
     optionally their analysis into groups. The derivative with
     respect to a path direction is also calculated.
     
CFTB [FIRSt <int>] [NUNIt <int>] [BEGIn <int>] [STOP <int>] [SKIP <int>]
     [CONT <int>] [TEMP <real>]
     Command activates analysis of CFTM-generated trajectory.
     Trajectory coordinate file(s) should be in consecutive units
     FIRST, NUNI, BEGIN, STOP, SKIP - define trajectory reading
     CONT, TEMP - as in CFTJ
     Output is the free energy gradient with respect to the set
     of fixed coordinates, the derivative along a specified direction
     (see DIRE) and optionally a group contribution analysis.

CFTS [FIRSt <int>] [NUNIt <int>] [BEGIn <int>] [STOP <int>] [SKIP <int>]
     [CONT <int>] [TEMP <real>] [DUNI <int>]
     Analogous to CFTB, additionally writes out potential energy and
     dU/dx_i to a disk file specified by DUNI.

NCOR NUMB <int>
     NUMB specifies the number of internal coordinates involved
     (=NICP).  Used in calculating the path derivative.

DIRE LAMB <int>
     <real, real, ... , real>
     The LAMB value specifies the step number (progress along the
     reaction path). The following line(s) contain NICP real numbers
     defining a path vector. The vector will be normalized automatically.
     The unit vector will be used to calculate the derivatives dA/dl,
     d<U>/dl, -TdS/dl along the path from the gradients.
     The real numbers correspond to weights of the fixed coordinates.
     Note: the vector components are read in free format
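
The path derivative described above amounts to normalizing the DIRE
vector and projecting the gradient onto it. The Python sketch below
shows that arithmetic; the function name and the numerical values are
hypothetical and are not taken from any CHARMM run.

```python
import math

def path_derivative(direction, gradient):
    """Normalize the path vector and project the gradient onto it."""
    norm = math.sqrt(sum(w * w for w in direction))
    if norm == 0.0:
        raise ValueError("path vector must not be zero")
    unit = [w / norm for w in direction]
    # dA/dl = Sum_k e_k * (dA/dx_k), with e the unit path vector.
    return unit, sum(e * g for e, g in zip(unit, gradient))

direction = [3.0, 4.0]      # raw weights, as they would follow 'DIRE'
gradient = [2.0, -1.0]      # hypothetical dA/dx_k values
unit, da_dl = path_derivative(direction, gradient)
print(unit, da_dl)
```

The same projection applies to the d<U>/dx_k and -TdS/dx_k components
to give d<U>/dl and -TdS/dl.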

CFTG NGRUp <int>
     <int, int, ..., int>
     <string,string,...,string>
     Define groups for group contribution analysis to free energy
     NGRUP is the number of groups.
     The following line(s) contain the integer group numbers of the 
     coordinates (LGRUP(J),J=1,NICP) in free format
     After that follow line(s) with group symbols (i.e. tags that
     will be used to denote the groups) in (20A4) format
     (GSYM(J),J=1,NGRUP)
     Example of usage:
     The system is a decapeptide, we calculate derivatives with
     respect to all phi and psi backbone dihedrals (NICP=18).
     In the 18 'MOVE ...' commands we specify the 9 phi first
     and the 9 psi at the end. The following will calculate and
     print out an aggregate of all phi and all psi contributions
     labelled by the tags 'PHI' and 'PSI':

     cftg ngrup 2
     1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2
     PHI PSI 
     cfts
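
The grouping in this example can be pictured with a short Python
sketch. It assumes that a group aggregate is the plain sum of the
per-coordinate contributions assigned to that group; that reading, the
function name and the numbers are assumptions for illustration and are
not taken from the CHARMM source.

```python
def group_sums(lgrup, gsym, contributions):
    """Sum per-coordinate contributions by 1-based group number."""
    totals = {tag: 0.0 for tag in gsym}
    for group, value in zip(lgrup, contributions):
        totals[gsym[group - 1]] += value
    return totals

lgrup = [1] * 9 + [2] * 9                  # 9 phi first, then 9 psi
gsym = ["PHI", "PSI"]
contributions = [0.5] * 9 + [-0.25] * 9    # made-up per-coordinate values
totals = group_sums(lgrup, gsym, contributions)
print(totals)
```

The order of the group numbers has to match the order of the
'MOVE ...' commands, which is why the phi dihedrals are listed first.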

Examples of usage: see test cases cftmala10.inp, cftmtst1.inp

Checks that the 'Jacobian term' is small: cftmtst2.inp, cftmtst3.inp,
cftmtst4.inp, cftmtst5.inp


>NOTE on sign of derivatives:

In both CFTI and CFTM it is possible to obtain a derivative value
with the incorrect sign by cleverly manipulating the atom selections in
the 'MOVE ...' command. A simple way of checking the sign is to run
a 1-D test case using both TI and TP postprocessing (see test cases
cftidist.inp, cftiangl.inp, cftidihe.inp).
A general rule is to think about how the coordinate
is defined and how motions of fragments influence it.

E.g. for a distance between atoms A and B, the coordinate is the
length of the vector from A to B. Perturbations (TP) involve
actual displacements of A and B along the =vector= from A to B.
Derivative calculations (TI) do not involve actual motions of atoms,
but rather predictions of how atomic positions will vary with
infinitesimal coordinate changes.
Moving B along this vector by delta > 0
will increase the coordinate, while moving A by delta will decrease
the coordinate. Alternatively, we can change the distance by delta by
moving A by -delta/2 and B by +delta/2.
To get the correct sign of the derivative you have to specify B as the
moving part, or specify both B and A in that order (B first, A next).
This is illustrated schematically below:

Correct scheme 1:
  FIX DIST <spec atom A> <spec atom B>
  MOVE DIST <spec atom A> <spec atom B> BY 1.0 INTE -
    sele <atom B> end
  
Correct scheme 2:
  FIX DIST <spec atom A> <spec atom B>
  MOVE DIST <spec atom A> <spec atom B> BY 1.0 INTE -
    sele <atom B> end sele <atom A> end
  
Both give the same result (I tested this, KK).
See test case cftidist.inp.
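
The sign rule for a distance can be checked with a few lines of Python.
This is a geometric sketch only, with arbitrary coordinates and
hypothetical helper names; it does not call CHARMM.

```python
import math

def distance(a, b):
    return math.sqrt(sum((bi - ai) ** 2 for ai, bi in zip(a, b)))

def shift(p, unit, step):
    """Displace point p by 'step' along the unit vector."""
    return [pi + step * ui for pi, ui in zip(p, unit)]

a = [1.0, 0.0, 0.0]
b = [1.0, 2.0, 2.0]
r = distance(a, b)
unit = [(bi - ai) / r for ai, bi in zip(a, b)]   # unit vector A -> B
delta = 0.01

# Moving B along A->B lengthens the distance by delta.
move_b = distance(a, shift(b, unit, delta)) - r
# Moving A along the same vector shortens it by delta.
move_a = distance(shift(a, unit, delta), b) - r
# Splitting the move: A by -delta/2 and B by +delta/2.
move_both = distance(shift(a, unit, -delta / 2),
                     shift(b, unit, delta / 2)) - r
print(move_b, move_a, move_both)
```

Selecting A alone as the moving part therefore corresponds to the
opposite sign of the coordinate change, which is the case the two
correct schemes avoid.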

The same holds true for a dihedral defined by atoms I-J-K-L.
Mentally divide the molecule into two parts by cutting through
the J-K bond. Atoms before the cut (I, J and all atoms bound to them
except K) form the first part; the rest of the atoms form the second
part. To distort the dihedral, we can either rotate the second part by
delta around the J-K axis, or rotate the first part by -delta/2 and the
second part by +delta/2. To get the correct derivative, either define
the second part as moving or define both parts in the correct order
(second, first). Here is an example for the alanine dipeptide. The
following defines the atoms (in toph19, see cftmtst1.inp):

    1    1 ACE  CH3    3.06258   0.64613   1.42088 ALA  1      0.00000
    2    1 ACE  C      2.33541  -0.68685   1.35313 ALA  1      0.00000
    3    1 ACE  O      2.01413  -1.29380   2.37725 ALA  1      0.00000
    4    2 ALA  N      2.07725  -1.18175   0.14371 ALA  2      0.00000
    5    2 ALA  H      2.45870  -0.76210  -0.65152 ALA  2      0.00000
    6    2 ALA  CA     1.35635  -2.43045  -0.00242 ALA  2      0.00000
    7    2 ALA  CB     0.69707  -2.49721  -1.37506 ALA  2      0.00000
    8    2 ALA  C      2.38192  -3.54475   0.11749 ALA  2      0.00000
    9    2 ALA  O      3.17467  -3.80389  -0.78914 ALA  2      0.00000
   10    3 CBX  N      2.41984  -4.12094   1.31507 ALA  3      0.00000
   11    3 CBX  H      1.92248  -3.68658   2.04150 ALA  3      0.00000
   12    3 CBX  CA     3.28397  -5.30373   1.59827 ALA  3      0.00000

The following is a correct set-up for a phi-psi gradient calculation
using the single-selection variant:

tsm
 fix dihe ala 1 c ala 2 n ala 2 ca ala 2 c toli 1.0e-5
 fix dihe ala 2 n ala 2 ca ala 2 c ala 3 n toli 1.0e-5
 maxi 100
 cftm
 move dihe ala 1 c ala 2 n ala 2 ca ala 2 c by 1.0 -
   inte sele bynum 6:12 end
 move dihe ala 2 n ala 2 ca ala 2 c ala 3 n by 1.0 -
   inte sele bynum 8:12 end
end

And here is the correct alternative double-selection variant:

tsm
 fix dihe ala 1 c ala 2 n ala 2 ca ala 2 c toli 1.0e-5
 fix dihe ala 2 n ala 2 ca ala 2 c ala 3 n toli 1.0e-5
 maxi 100
 cftm
 move dihe ala 1 c ala 2 n ala 2 ca ala 2 c by 1.0 -
   inte sele bynum 6:12 end  sele bynum 1:5 end
 move dihe ala 2 n ala 2 ca ala 2 c ala 3 n by 1.0 -
   inte sele bynum 8:12 end  sele bynum 1:7 end
end

See test cases cftidihe.inp, cftmala10.inp.
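
The dihedral rule can be checked numerically with the phi atoms from
the coordinate listing above (atoms 2, 4, 6 and 8). The Python sketch
below is illustrative only: the helper names are hypothetical, and the
dihedral sign convention and the right-handed rotation about the J->K
axis are assumptions of the sketch, not statements about CHARMM
internals.

```python
import math

def sub(u, v): return [a - b for a, b in zip(u, v)]
def dot(u, v): return sum(a * b for a, b in zip(u, v))
def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

def dihedral(i, j, k, l):
    """Signed dihedral I-J-K-L in radians."""
    b1, b2, b3 = sub(j, i), sub(k, j), sub(l, k)
    n1, n2 = cross(b1, b2), cross(b2, b3)
    b2_len = math.sqrt(dot(b2, b2))
    return math.atan2(dot(cross(n1, n2), b2) / b2_len, dot(n1, n2))

def rotate(p, origin, axis, angle):
    """Rodrigues rotation of p about a unit axis passing through origin."""
    v = sub(p, origin)
    c, s = math.cos(angle), math.sin(angle)
    kxv, kv = cross(axis, v), dot(axis, v)
    return [o + vi * c + xi * s + ai * kv * (1.0 - c)
            for o, vi, xi, ai in zip(origin, v, kxv, axis)]

def wrap(angle):
    """Map an angle difference into [-pi, pi)."""
    return (angle + math.pi) % (2.0 * math.pi) - math.pi

I = [2.33541, -0.68685, 1.35313]    # atom 2, ACE C
J = [2.07725, -1.18175, 0.14371]    # atom 4, ALA N
K = [1.35635, -2.43045, -0.00242]   # atom 6, ALA CA (lies on the axis)
L = [2.38192, -3.54475, 0.11749]    # atom 8, ALA C

jk = sub(K, J)
jk_len = math.sqrt(dot(jk, jk))
axis = [c / jk_len for c in jk]
delta = math.radians(1.0)
phi0 = dihedral(I, J, K, L)

# Rotate only the second part (here represented by L) by +delta.
second_only = wrap(dihedral(I, J, K, rotate(L, J, axis, delta)) - phi0)
# Rotate the first part by -delta/2 and the second part by +delta/2.
both = wrap(dihedral(rotate(I, J, axis, -delta / 2), J, K,
                     rotate(L, J, axis, delta / 2)) - phi0)
print(second_only, both, delta)
```

Both variants change the dihedral by +delta, while rotating only the
first part by +delta would change it by -delta.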


>NOTE on integrators:

With CHARMM c30a2x I have tested the LEAP, NOSE and NOSE VVER approaches,
which worked fine.

The LANGevin integrator LED TO INCORRECT FORCES stored in the formatted
data file (defined with 'SAVI ...'). Post-processing with 'CFTA' or
'CFTB' still worked fine, because these commands re-read the trajectory
and re-calculate the derivatives. 'CFTJ' and 'CFTC', which read the
data file, gave incorrect results! This is probably related to incorrect
placement of the 'CALL DYNICT' and 'CALL DYNICM' statements within the
dynamics files, so that the energy gradient DX,DY,DZ does not agree with
the coordinates X,Y,Z.  I will look into this at a later date. KK
