15. Protein Information


Overview

This utility retrieves textual information about PDB files from the protein structure database by reading the QUANTA file $HYD_LIB/database.dat. This database file contains information on all the PDB files currently in the Brookhaven Protein Data Bank. It is the same data file used by the structural database utility.

For example, you could use this utility to query for information on all the hemoglobins in the database. The query returns a list of all the hemoglobin PDB files, a short textual description of each, and data such as the number of residues in each PDB file.



Retrieving Protein Information

This utility retrieves textual information for each protein structure without reading in the protein structure. It is activated from the Protein Design menu and displays the Specify Proteins dialog box. The dialog box contains five options, several with preset defaults.

Once the PDB textual information has been retrieved, it is listed to the textport. This information includes:


Tools

This option displays the Specify Proteins dialog box. By setting different values in the option fields, you can retrieve either general or specific information.

Each of these options is described below.

This tool allows you to enter either a keyword or the PDB filename for the protein structure of interest. To enter more than one keyword or PDB name, click the OK button after each entry. The text already entered is saved, and the entry field is cleared so that additional text can be entered.

If more than one keyword or PDB name is entered, the entries are connected by a logical OR. For example, if two keywords are entered, information is retrieved for any protein that matches either keyword_1 or keyword_2.
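The OR behavior can be illustrated with a minimal Python sketch. This is not QUANTA's search code, and it does not read database.dat. The record layout (dictionaries with `pdb_name` and `description` fields), the case-insensitive matching, the placeholder entry names, and the function name `matches_any` are all assumptions made for illustration only.

```python
def matches_any(record, terms):
    """Return True if at least one term matches the record (logical OR)."""
    name = record["pdb_name"].lower()
    text = record["description"].lower()
    for term in terms:
        t = term.lower()
        # A term matches either the PDB name exactly or a word in the description.
        if t == name or t in text:
            return True
    return False


# Hypothetical records; real entries would come from the database file.
records = [
    {"pdb_name": "entry_a", "description": "pepsin, an acid protease"},
    {"pdb_name": "entry_b", "description": "deoxy hemoglobin"},
    {"pdb_name": "entry_c", "description": "lysozyme"},
]

hits = [r["pdb_name"] for r in records if matches_any(r, ["pepsin", "hemoglobin"])]
print(hits)
```

Because the terms are combined with OR, adding more keywords can only widen the list of hits, never narrow it.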

This tool enables you to limit the search to structures whose crystal structure determination had a resolution less than a given value. The lower the resolution value, the better the structure is resolved. For example, a resolution less than 2.0 Å is good. A resolution less than 3.0 Å may be acceptable for determining the main chain conformation and side chain positions, but some parts of the structure may not be resolved as well as others.
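As a rough sketch of how such a cutoff behaves, the Python fragment below keeps only records whose resolution value is below a chosen maximum. The record fields, the placeholder entry names, and the function name `within_resolution` are hypothetical and are not taken from QUANTA.

```python
def within_resolution(records, max_resolution):
    """Keep records whose resolution value (in angstroms) is below the cutoff."""
    return [r for r in records if r["resolution"] < max_resolution]


# Hypothetical records with made-up resolution values.
records = [
    {"pdb_name": "entry_a", "resolution": 1.8},
    {"pdb_name": "entry_b", "resolution": 2.5},
    {"pdb_name": "entry_c", "resolution": 3.2},
]

good = [r["pdb_name"] for r in within_resolution(records, 2.0)]
acceptable = [r["pdb_name"] for r in within_resolution(records, 3.0)]
print(good, acceptable)
```

Raising the cutoff from 2.0 to 3.0 admits more structures, at the cost of including some that are less well resolved.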

The search normally looks through every protein in the database, but this tool enables you to limit the range of proteins searched. This option is useful only if your database is set up with a known selection of proteins at a particular position in the database.

This tool enables you to specify an output log filename or accept the default name, info.log. Once the Search button is clicked, a command file for the search program is written and the search runs automatically. The results are written to the selected log file and also displayed in the textport. The default log file is automatically overwritten each time it is used.
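The practical consequence of the overwrite behavior is that results from an earlier search are lost unless a different log filename is chosen. The Python sketch below mimics this with an ordinary file opened in write mode. It only illustrates the effect and makes no claim about how QUANTA writes its log; the function name `write_log` and the result lines are invented for the example.

```python
import os
import tempfile


def write_log(path, lines):
    """Write result lines to path, replacing any previous contents."""
    with open(path, "w") as handle:  # "w" truncates an existing file
        for line in lines:
            handle.write(line + "\n")


# Use a temporary directory so the example does not touch real files.
log_dir = tempfile.mkdtemp()
log_path = os.path.join(log_dir, "info.log")

write_log(log_path, ["first search: 3 entries"])
write_log(log_path, ["second search: 1 entry"])

with open(log_path) as handle:
    contents = handle.read()
print(contents)
```

Only the output of the second call remains in the file, which is why the exercise later in this chapter names its own log file.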


Running a Protein Information Query

The following exercise demonstrates how to use the Protein Information utility and shows an example of typical information results.

1.   From the Protein Design menu select the Protein Information... option.

The Specify Proteins dialog box is displayed.

2.   Enter the following variables:

Search for keyword: pepsin

Maximum crystal resolution: 5

Output log file name: pepsin.log

3.   Click the Search button. The search is run and the results are displayed in the textport. Press <Enter> to continue scrolling through the information in the textport. To quit, press <q>, and then <Enter>. The information is automatically stored in the file pepsin.log for use later.



© 2006 Accelrys Software Inc.