E. The Protein Parameter File

This file contains a variety of parameters used in protein modeling. Each parameter is identified by a four-letter keyword, and the data for that parameter is terminated with the keyword END. Several data types involve two-dimensional matrices of values for one amino acid type against another amino acid type. The order in which the amino acids appear in these matrices is defined by the keyword ORDEr followed by a list of the four-letter names of the amino acids in the appropriate order. The list is terminated with an asterisk (*).
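As an illustration of the ORDEr convention, the following Python sketch collects the residue names that follow the keyword, up to the terminating asterisk, and builds a name-to-index table of the kind needed to address a two-dimensional matrix. It is not QUANTA code. The function name read_order, the sample residue names, and the assumptions that tokens are separated by white space and that only the first four letters of the keyword are significant are all illustrative.

```python
def read_order(text):
    """Return the residue names listed after ORDEr, up to the asterisk.

    Assumptions (not confirmed by the file specification): tokens are
    separated by white space, the asterisk is a separate token, and the
    keyword is matched on its first four letters only.
    """
    names = []
    collecting = False
    for token in text.split():
        if not collecting:
            if token[:4].upper() == "ORDE":
                collecting = True
            continue
        if token == "*":
            return names
        names.append(token)
    raise ValueError("ORDEr list not found or not terminated with '*'")


# Hypothetical sample; a real file lists every amino acid name.
sample = "ORDEr ALA ARG ASN ASP *"
order = read_order(sample)

# Row/column index of each amino acid in a matrix that follows this order.
index = {name: i for i, name in enumerate(order)}
print(order, index["ASN"])
```

With such an index, the matrix value for a pair of amino acids would be looked up as matrix[index[a]][index[b]].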

This is a list of the residue names that are recognized as amino acids and displayed in the Sequence Viewer. Any other residue names in a mainly-protein MSF are regarded as solvent or ligand.

The format of the list is:

name code substitute

name: four-letter residue name

code: single character code used in the Sequence Viewer

substitute: for non-standard amino acids, many parameters have not been derived. If the four-letter name of a similar standard amino acid is entered here, the parameters for that standard amino acid are substituted.
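The three fields above can be read with a few lines of Python. This is a minimal sketch, assuming white-space separated fields and an optional third field. The names ResidueEntry, parse_residue_line and parameter_source, and the sample lines, are hypothetical and are not part of QUANTA.

```python
from typing import NamedTuple, Optional


class ResidueEntry(NamedTuple):
    name: str                  # four-letter residue name
    code: str                  # single character shown in the Sequence Viewer
    substitute: Optional[str]  # standard residue whose parameters are borrowed


def parse_residue_line(line):
    """Parse one 'name code substitute' line; substitute may be absent."""
    fields = line.split()
    if len(fields) < 2:
        raise ValueError("expected at least a name and a code: %r" % line)
    substitute = fields[2] if len(fields) > 2 else None
    return ResidueEntry(fields[0], fields[1], substitute)


def parameter_source(entry):
    """Name of the residue whose parameters apply to this entry."""
    return entry.substitute or entry.name


# Hypothetical sample lines, for illustration only.
for line in ["ALA A", "MSE M MET"]:
    entry = parse_residue_line(line)
    print(entry.name, entry.code, parameter_source(entry))
```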

This keyword is followed by a list of four-letter residue names that are recognized as water. This information is used by the protein-specific display utility Molecule Colors on the Protein Utilities palette.

This is followed by definitions of the classifications used in coloring sequences:

NAME classification 
group_type color name_1 name_2 ... name_n

classification: the name of the classification (e.g. Hydrophobicity). This name appears in the Molecule Colors dialog box so that you can select coloring by that classification.

group_type: the name for the type of a group of amino acids

color: the color that is applied to the group of amino acids

name_1, name_2, ...name_n: the four-letter names for the amino acids which belong to this group
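A classification block of this form maps naturally onto a dictionary from residue name to group and color. The Python sketch below assumes that group_type and color are each a single white-space delimited token, which the format line suggests but does not state. The function name parse_classification and the sample groups and colors are hypothetical.

```python
def parse_classification(lines):
    """Parse a 'NAME classification' header and its group lines.

    Returns the classification name and a dict that maps each residue
    name to a (group_type, color) pair.
    """
    header = lines[0].split(None, 1)
    if len(header) != 2 or header[0] != "NAME":
        raise ValueError("block must start with 'NAME classification'")
    classification = header[1].strip()

    color_of = {}
    for line in lines[1:]:
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or incomplete lines
        group_type, color = fields[0], fields[1]
        for residue in fields[2:]:
            color_of[residue] = (group_type, color)
    return classification, color_of


# Hypothetical sample block; group names and colors are illustrative.
block = [
    "NAME Hydrophobicity",
    "hydrophobic green ALA VAL LEU",
    "polar blue SER THR",
]
name, color_of = parse_classification(block)
print(name, color_of["VAL"])
```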

In modeling, side chain conformations may be copied between equivalent residues in homologous structures. If the residues are not of identical amino acid type, copying side chain torsions is only reasonable where the side chains have some structural similarity. The following table gives the maximum number of torsions that are copied between pairwise combinations of amino acid types.

Parameters that have one value for each amino acid type are defined here. Currently the only such parameter is ACCESS, the maximum solvent-accessible area.

name n_rotamer 
sec_str frequency tor_1 tor_2 ... tor_n

rotamer_type: the name of the author of the rotamer library

name: four-character residue name

n_rotamer: the number of rotamers for that residue type. The rotamers are listed on the following lines.

sec_str: if the rotamer is specific to one secondary structure type, then H (helix) or E (extended) is indicated

frequency: the observed percentage frequency of that rotamer

tor_1, tor_2 ... tor_n: the ideal torsion values for the rotamer. These are not necessary for all the torsions in the side chain.
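The rotamer entry described above can be read as follows. This Python sketch assumes white-space separated fields and treats a leading H or E as the optional sec_str field, so that a line without it begins directly with the frequency. That reading of the format, the function name parse_rotamers, and the sample values are assumptions made for illustration.

```python
def parse_rotamers(lines):
    """Parse a 'name n_rotamer' header followed by n_rotamer rotamer lines.

    Each rotamer is returned as (sec_str, frequency, torsions), where
    sec_str is 'H', 'E', or None when the rotamer is not specific to a
    secondary structure type.
    """
    name, count = lines[0].split()
    n_rotamer = int(count)

    rotamers = []
    for line in lines[1:1 + n_rotamer]:
        fields = line.split()
        sec_str = None
        if fields[0] in ("H", "E"):
            sec_str = fields.pop(0)
        frequency = float(fields[0])
        torsions = [float(value) for value in fields[1:]]
        rotamers.append((sec_str, frequency, torsions))

    if len(rotamers) != n_rotamer:
        raise ValueError("expected %d rotamers for %s" % (n_rotamer, name))
    return name, rotamers


# Hypothetical sample; the frequencies and torsions are not library values.
entry = ["VAL 3", "H 70.0 175.0", "20.0 -60.0", "E 10.0 60.0"]
residue, rotamers = parse_rotamers(entry)
print(residue, rotamers)
```

Because torsions need not be given for every side chain torsion, the torsion list is kept at whatever length the line supplies rather than being padded.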

n_res sec_str_code name
phi_1 psi_1
phi_2 psi_2
...
phi_n psi_n

The standard backbone folds are defined here. They are used to fold a chain into a standard structure in the Apply Conformation utility in the Model Backbone application.

n_res: number of residues in the fold. A negative value implies that the fold can be extended indefinitely.

sec_str_code: A numerical code for the secondary structure type used within QUANTA. The values are:

-3, -4   beta bulge
-2       beta strand
-1       possible beta strand with appropriate conformation but without correct hydrogen bonding
0        undefined
1        possible alpha helix with appropriate conformation but without correct hydrogen bonding
3        3-turn
4        4-turn
5        5-turn
6        alpha helix

name: the name of the fold which will be used in the interface dialog box

phi_n, psi_n: The main chain phi and psi angles, in degrees, for n_res residues. A value of -999.9 implies that there is no defined value for this torsion.
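The fold definition can be sketched in Python as shown here. The code table and the -999.9 sentinel come from the descriptions above. The function name parse_fold, the sample folds, and the assumption that a negative n_res is followed by abs(n_res) lines of angles are illustrative and are not confirmed by the file specification.

```python
# Secondary structure codes as listed in this section.
SEC_STR = {
    -4: "beta bulge", -3: "beta bulge", -2: "beta strand",
    -1: "possible beta strand", 0: "undefined",
    1: "possible alpha helix", 3: "3-turn", 4: "4-turn",
    5: "5-turn", 6: "alpha helix",
}
UNDEFINED = -999.9  # marks a torsion with no defined value


def parse_fold(lines):
    """Parse 'n_res sec_str_code name' followed by phi/psi lines."""
    head = lines[0].split(None, 2)
    n_res, code, name = int(head[0]), int(head[1]), head[2].strip()

    # Assumption: a negative n_res is followed by abs(n_res) angle lines.
    angles = []
    for line in lines[1:1 + abs(n_res)]:
        phi, psi = (float(value) for value in line.split()[:2])
        angles.append((None if phi == UNDEFINED else phi,
                       None if psi == UNDEFINED else psi))

    return {
        "name": name,
        "type": SEC_STR.get(code, "unknown"),
        "extendable": n_res < 0,  # negative n_res: fold can be extended
        "angles": angles,
    }


# Hypothetical sample folds; the angle values are illustrative.
helix = parse_fold(["-1 6 Alpha helix", "-57.0 -47.0"])
turn = parse_fold(["2 0 Example turn", "-999.9 120.0", "-60.0 -999.9"])
print(helix)
print(turn)
```

Undefined torsions are returned as None so that later code cannot mistake the sentinel for a real angle.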


© 2006 Accelrys Software Inc.