Hingefind (X-PLOR) -

an algorithm to investigate domain motions in proteins.

9-24-96

This page contains documentation specifically for the X-PLOR version.

Content:

1. General Remarks about the X-PLOR Version
2. Files and Shell Scripts
3. Output Files

1. General Remarks about the X-PLOR Version

The X-PLOR (Axel T. Brunger, 1992) version is designed for longer batch jobs in which the user runs a large number of trials in various modes. It has all the features described in the paper; in particular, it can maintain the spatial connectivity of the found domains. To use this version, the user must obtain a free license for the X-PLOR software. The output psf and pdb files can be visualized with any standard graphics package.

2. Files and Shell Scripts

There are several files and scripts in this directory to set up the algorithm:

hingefind

A unix shell script that runs the X-PLOR job. It writes three X-PLOR stream files containing commands from which X-PLOR can compute the filenames and the tolerance used by the algorithm. It is recommended to run each case over a range of tolerances between 50% and 100% of the initial rms deviation. There are case descriptors from which hingefind.inp computes the names of the actual coordinate files (in pdb format) of the structures used in the comparison, here: foo.COO, soo.COO, bar.COO. Change these to your own filenames. Note that hingefind.inp will always interpret "foobar" as "foobar.COO", so you may want to edit hingefind.inp if you use another file suffix (e.g. ".pdb") instead of ".COO".
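The recommended tolerance range can be worked out ahead of a batch of runs. The following Python sketch is not part of the Hingefind distribution; the function names, the number of steps, and the example rms deviation of 2.0 are assumptions chosen for illustration. It lists evenly spaced tolerances between 50% and 100% of an initial rms deviation and mimics the "foobar" to "foobar.COO" naming rule described above.

```python
def tolerance_range(initial_rmsd, steps=6, low=0.5, high=1.0):
    """Return `steps` evenly spaced tolerances from low*rmsd to high*rmsd.

    The defaults (50% to 100%) follow the recommendation in the text;
    the number of steps is an arbitrary choice for this sketch.
    """
    if steps < 2:
        raise ValueError("need at least two steps")
    span = high - low
    return [round(initial_rmsd * (low + span * i / (steps - 1)), 3)
            for i in range(steps)]


def coordinate_filename(case, suffix=".COO"):
    """Mimic how a case descriptor is turned into a coordinate filename."""
    return case + suffix


if __name__ == "__main__":
    # 2.0 is a made-up initial rms deviation, not a value from a real run.
    for tol in tolerance_range(2.0):
        print(tol)
    print(coordinate_filename("foobar"))
```

How each tolerance is handed to the hingefind script depends on the script itself, so that step is left out here.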

partition.str

A stream file which contains the X-PLOR commands necessary to set up the structure. It may contain a pointer to a psf file. The segid "APO" is used for the parts of the protein which should be aligned; if another segid is used, hingefind.inp has to be modified. Note that the coordinates in the two compared pdb files must both be compatible with the structure. The pdb files may contain additional atoms, which do not have to be specified in partition.str if they are not used in the alignment. To appear in the output files, such additional structures should be given a segid different from "APO" in partition.str.

dum.top

An X-PLOR topology file with the residues of dummy molecules used in the algorithm for visualization of pivot points and axes.

prexplor.dim

The X-PLOR file which contains array sizes for compilation (35,000 atom version). It will probably be necessary to compile X-PLOR with the larger BUFMAX parameter for the loops. This executable is named "xl" in the hingefind script.

hingefind.inp

The X-PLOR script with the algorithm. There are a variety of variables and paths the user has to specify in the head of the file:

$ndomains: The maximum number of domains to be found (<= 999). Recommended: 999 to yield full partitioning. Small values (2-5) should be used for test runs.

$maxccounter: The maximum number of cycles of the "converge" loop. If the algorithm does not converge within the specified number of cycles (this was observed very rarely, in the "fas" partitioning mode at extreme tolerances), a warning message is written to the log file. Recommended: 10-20.

$ptmeth: This variable determines the mode of the partitioning part of the script:
"man" specifies manual assignment of domains, without partitioning. Up to 9 domains can be assigned and $ndomains must be smaller than 10.
"fas" selects the fast version of the automatic partitioning algorithm, in which the connectivity of the residues in the found domains is NOT maintained.
"slo" specifies the slow partitioning algorithm, which maintains the connectivity of the domains.

$subset: This variable determines the subset of protein atoms with segid "APO" that is used for partitioning when $ptmeth = "slo" or $ptmeth = "fas":
"cao" specifies C-alpha atoms only,
"bac" specifies the backbone atoms with names C, CA, N.

$cutdom: Many found domains will be very small. This cutoff, given as a number of residues, sets the minimum size a domain must have to be used for the determination of effective rotations.

store1...9: The selection attributes which allow the assignment of up to 9 domains by hand in the "man" mode.

$case1COO: String that specifies the input file for the coordinates in pdb format, or a pointer to a pdb file. The path has to be specified. X-PLOR can compute the filename from the variable $case1 defined in the stream file casefile1 written by the shell script. The coordinates are written to the main coordinate set.

$case2COO: String for the second pdb file (comparison coordinate set). Analogous to $case1COO.

$oname: Output pdb file with assigned domains, pivots, axes. The filename will be computed using some of the above variables and the $fname variable, which contains the tolerance as defined by the shell script.

$uname: Output psf file, analogous to $oname.

$dname: Output log file with information about the proposed effective rotations, residues, accuracies.
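Before a long batch job is submitted, the constraints stated for these variables can be checked by hand or with a small helper. The Python sketch below is only an illustration and does not read or modify hingefind.inp; the function name is hypothetical, and the lower bounds of 1 are assumptions that the text does not state.

```python
def check_settings(ndomains, maxccounter, ptmeth, subset, cutdom):
    """Return a list of problems found in a proposed set of values.

    The rules mirror the constraints described for hingefind.inp.
    The lower bounds of 1 are assumptions made for this sketch.
    """
    problems = []
    if not 1 <= ndomains <= 999:
        problems.append("ndomains must be between 1 and 999")
    if ptmeth not in ("man", "fas", "slo"):
        problems.append("ptmeth must be 'man', 'fas' or 'slo'")
    if ptmeth == "man" and ndomains >= 10:
        problems.append("ndomains must be smaller than 10 in 'man' mode")
    # The subset only matters for the automatic partitioning modes.
    if ptmeth in ("fas", "slo") and subset not in ("cao", "bac"):
        problems.append("subset must be 'cao' or 'bac'")
    if maxccounter < 1:
        problems.append("maxccounter must be at least 1")
    if cutdom < 1:
        problems.append("cutdom must be at least 1")
    return problems


if __name__ == "__main__":
    # Example values only; 15 lies in the recommended 10-20 range.
    print(check_settings(999, 15, "slo", "cao", 5))
    print(check_settings(12, 15, "man", "cao", 5))
```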

When a particular system was run with a range of tolerances in "fas" mode, one or more windows of optimum tolerance were found in which the relative errors were very small. It is therefore recommended to first try a range of tolerances in "fas" mode, find the window(s) of small error, and then run selected tolerances within the window(s) in "slo" mode with a higher number of domains. The error of the domain fitting was found to decrease with "slo" partitioning due to the connectivity of the domains.
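The search for windows of small error can be done by eye, or sketched in a few lines of Python once the relative errors have been collected from the log files of the "fas" runs. The function name, the error threshold, and the numbers in the example are invented for illustration; they are not results of a real run, and the log files are assumed to have been read by hand.

```python
def low_error_windows(results, max_error):
    """Group consecutive tolerances whose relative error is small.

    results:   list of (tolerance, relative_error) pairs from "fas" runs
    max_error: threshold below which an error counts as small (assumption)
    Returns a list of (lowest, highest) tolerance pairs, one per window.
    """
    windows = []
    start = prev = None
    for tol, err in sorted(results):
        if err <= max_error:
            if start is None:
                start = tol          # a new window opens here
            prev = tol
        elif start is not None:
            windows.append((start, prev))  # the window just closed
            start = None
    if start is not None:
        windows.append((start, prev))
    return windows


if __name__ == "__main__":
    # Made-up (tolerance, relative error) pairs for demonstration only.
    scan = [(1.0, 0.30), (1.2, 0.05), (1.4, 0.04),
            (1.6, 0.25), (1.8, 0.06), (2.0, 0.40)]
    print(low_error_windows(scan, max_error=0.1))
```

Tolerances inside the reported windows are then the candidates for the slower "slo" runs.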

3. Output Files

There are three output files, specified by the variables $oname, $uname and $dname: the pdb and psf files of the labeled structure, and the log file of the run. The pdb and psf files can be used to visualize the results of the algorithm. The data are labeled by segids:

"D0" is the unconverged rest of the protein,
"D1" is the reference domain of the protein ("rigid core"),
"D2", "D3", etc, are additional domains,
"R" denotes small domains with a size below $cutdom (see 2.),
"A2", "A3", etc, are the dummy molecules which visualize the effective rotation of the domains.

In addition, the user may find untouched atoms with the segid "APO" and with other segids not used in the alignment.
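For a quick overview of a result without a graphics package, the atoms per segid in the output pdb file can be counted. The Python sketch below is an illustration, not part of Hingefind. It assumes that the segid is stored in columns 73-76 of each ATOM or HETATM record, as in X-PLOR style pdb files; the function names and category labels are hypothetical.

```python
import re
from collections import Counter


def classify_segid(segid):
    """Map a segid to the meaning given in the list of labels above."""
    if segid == "D0":
        return "unconverged rest"
    if segid == "D1":
        return "reference domain"
    if re.fullmatch(r"D\d+", segid):
        return "additional domain"
    if segid == "R":
        return "small domain below cutoff"
    if re.fullmatch(r"A\d+", segid):
        return "rotation dummy"
    return "not partitioned"  # e.g. "APO" or segids outside the alignment


def count_segids(pdb_lines):
    """Count atoms per segid, assuming the segid sits in columns 73-76."""
    counts = Counter()
    for line in pdb_lines:
        if line.startswith(("ATOM", "HETATM")):
            counts[line[72:76].strip()] += 1
    return counts
```

A typical use would be to pass the lines of the file named by $oname to `count_segids` and print each segid together with `classify_segid(segid)` and its atom count.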

The dummy molecules show an arrow along the rotation axis, with its orientation representing the left-handed rotation about the axis. The constructed "pivot" can be found in the middle of the arrow and is connected to the COM (center of mass) of the domain in the main and in the comparison coordinate set to illustrate the rotation angle. The rotation angle and other useful information about the run, the domains, and the accuracy of the rotational fitting can be found in the self-explanatory log file.
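As a rough cross-check of what the dummy molecule shows, the angle subtended at the pivot by the two COMs can be computed from their coordinates. The Python sketch below assumes that this subtended angle is the quantity the two connecting lines illustrate; the function name and the coordinates in the example are made up, and the value reported in the log file remains the authoritative one.

```python
import math


def angle_at_pivot(pivot, com_main, com_comp):
    """Angle in degrees between the lines pivot-COM(main) and pivot-COM(comparison)."""
    v1 = [c - p for c, p in zip(com_main, pivot)]
    v2 = [c - p for c, p in zip(com_comp, pivot)]
    n1 = math.sqrt(sum(x * x for x in v1))
    n2 = math.sqrt(sum(x * x for x in v2))
    if n1 == 0.0 or n2 == 0.0:
        raise ValueError("a COM coincides with the pivot")
    dot = sum(x * y for x, y in zip(v1, v2))
    # Clamp against rounding errors before taking the arc cosine.
    cos_angle = max(-1.0, min(1.0, dot / (n1 * n2)))
    return math.degrees(math.acos(cos_angle))


if __name__ == "__main__":
    # Invented coordinates, chosen so that the two lines are perpendicular.
    print(angle_at_pivot((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)))
```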

NOTE: The X-PLOR log files would contain several Mbytes of data for each run, so the standard output is piped to /dev/null. The standard output should only be used for debugging of modified or augmented scripts.