ECALC (CCP4: Supported Program)
NAME
ecalc - calculate normalised structure amplitudes
SYNOPSIS
ecalc hklin foo.mtz hklout foo_e.mtz
[Keyworded input]
DESCRIPTION
The program ECALC calculates normalised structure amplitudes for
input to molecular replacement and direct methods programs, and
calculates origin-removed Patterson coefficients. If the output is an MTZ
file, it contains all columns of the input file plus F, E, SIGE, F2OR and
E2OR (see below).
KEYWORDED INPUT
The various data control lines are identified by keywords. Those
available are:
EXCLUDE, LABIN (compulsory), LABOUT, MULTAN, REFLECTIONS, RESOLUTION,
SCALE, SHELL, SNB, SPACEGROUP, TITLE
EXCLUDE [SIGP <nsigp>] [SIGPH <nsigph>]
[FPMAX <fpmax>] [FPHMAX <fphmax>] [DIFF <diffmax>]
Set criteria for excluding data from the generation of E values.
Large errors can distort the normalisation seriously.
Excluded data will still be written to the output file but there will be no
associated value for E; it will be flagged as a "Missing number".
The default is to include all data.
The following subkeys select the tests to be applied:
- SIGP <nsigp>
  exclude reflections if FP < <nsigp> * SIGP
- SIGPH <nsigph>
  exclude reflections if FPH < <nsigph> * SIGPH
- FPMAX <fpmax>
  exclude reflections if FP > <fpmax>
- FPHMAX <fphmax>
  exclude reflections if FPH > <fphmax>
- DIFF <diffmax>
  exclude reflections if abs(FPH-FP) > <diffmax>
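The tests above can be sketched in a few lines of Python. This is only an illustration of the stated criteria, not ECALC's own code: the function name `is_excluded` and its argument names are hypothetical, and a test is assumed to be switched off when its threshold is not supplied, which matches the default of including all data.

```python
from typing import Optional


def is_excluded(fp: float, sigfp: float,
                fph: Optional[float] = None, sigfph: Optional[float] = None,
                nsigp: Optional[float] = None, nsigph: Optional[float] = None,
                fpmax: Optional[float] = None, fphmax: Optional[float] = None,
                diffmax: Optional[float] = None) -> bool:
    """Return True if any selected EXCLUDE test rejects the reflection.

    Hypothetical helper: a test is applied only when its threshold is given.
    """
    # SIGP: reject weak native amplitudes
    if nsigp is not None and fp < nsigp * sigfp:
        return True
    # FPMAX: reject implausibly large native amplitudes
    if fpmax is not None and fp > fpmax:
        return True
    # The remaining tests need a derivative amplitude
    if fph is not None:
        if nsigph is not None and sigfph is not None and fph < nsigph * sigfph:
            return True  # SIGPH
        if fphmax is not None and fph > fphmax:
            return True  # FPHMAX
        if diffmax is not None and abs(fph - fp) > diffmax:
            return True  # DIFF
    return False
```

For example, `is_excluded(10.0, 5.0, nsigp=3.0)` rejects the reflection because 10 is less than 3 * 5, while the same reflection with no thresholds given is kept.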
TITLE <title>
Title for the output file (up to 80 characters). The text PRODUCED
BY ECALC will be appended to this title automatically.
SPACEGROUP <group>
Default: take the space group from the MTZ header.
<group> is the space group name or number as given in International Tables.
The symmetry operations for the space group are read from the file with
logical name SYMOP. Only the rotation part of the symmetry operations is
used, so for example 177 (P622), 178 (P6122) and 179 (P6522) are all
equivalent.
This keyword is required only if the symmetry information in the
reflection file header is missing or wrong.
RESOLUTION <resmax>
Default: take the maximum resolution from the MTZ header.
The value <resmax> is the resolution cutoff in Angstroms.
It is usually set to 0 to include all reflections.
SHELL <number>
Specifies the approximate number of reflections wanted per shell (default 200).
If this is too small, you are likely to get shells with no
reflections and the program will stop with the message "Empty
shell". If it is too big, there will be too few shells to give
sensible averages. Note that this number refers to
independent reflections, whereas the output shows the number in
a hemisphere of reciprocal space.
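The trade-off can be illustrated with a small Python sketch. It assumes that shells are equal-width bins in d*^3 (equal reciprocal-space volume, hence roughly equal counts), which is suggested by the binning described under PROGRAM STRUCTURE but is not confirmed in detail here. The names `assign_shells` and `empty_shells` are hypothetical.

```python
def assign_shells(dstar, per_shell=200):
    """Assign each reflection to an equal-width bin in d*^3.

    Assumption: the shell count is the reflection count divided by the
    requested number per shell; ECALC's exact rule may differ.
    """
    n_shells = max(1, round(len(dstar) / per_shell))
    dmax3 = max(dstar) ** 3
    shells = []
    for d in dstar:
        idx = int(n_shells * d ** 3 / dmax3)
        # The reflection at d*max itself falls in the last shell
        shells.append(min(idx, n_shells - 1))
    return shells, n_shells


def empty_shells(shells, n_shells):
    """Return the indices of shells that received no reflections."""
    counts = [0] * n_shells
    for s in shells:
        counts[s] += 1
    return [i for i, c in enumerate(counts) if c == 0]
```

With a very small `per_shell` and unevenly distributed data, `empty_shells` returns a non-empty list, which corresponds to the "Empty shell" condition described above.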
MULTAN
No further data are required on this line. Outputs E values in a
formatted ASCII file, e.g. for direct methods packages. Normally, however,
most direct methods programs calculate Es internally. The default is to
output E values in standard MTZ format, e.g. for ALMN.
SNB
No further data are required on this line. Outputs E values in a
formatted ASCII file suitable for SnB (Shake-and-Bake).
REFLECTIONS <nwant>
This applies only when reflections are written to a formatted ASCII file
rather than an MTZ file, i.e. in conjunction with the MULTAN or SNB cards.
The largest <nwant> Es are written to HKLOUT. The default is to write all
reflections. This cutoff may be necessary because some programs accept only
a limited number of reflections. Also, when Es are derived from |FPH-FP|,
small E values do not necessarily reflect the true E value calculated from
the heavy atom sub-structure. For instance, all the centric reflections
will have an E of zero.
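Selecting the largest <nwant> Es amounts to a partial sort. The following Python sketch shows the idea; the function name `largest_e` and the tuple layout `(h, k, l, e)` are hypothetical, and returning the records in descending order of E is an assumption, since the order in which ECALC writes them is not stated here.

```python
import heapq


def largest_e(reflections, nwant=None):
    """Keep the nwant reflections with the largest E.

    reflections: list of (h, k, l, e) tuples (hypothetical layout).
    With nwant=None all reflections are kept, matching the default.
    """
    if nwant is None or nwant >= len(reflections):
        return list(reflections)
    # nlargest returns the records sorted by E, largest first
    return heapq.nlargest(nwant, reflections, key=lambda r: r[3])
```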
LABIN <program label>=<file label> ...
Column label assignments for H, K, L, FP and optionally SIGFP, FPH and SIGFPH.
FP is the native structure amplitude, SIGFP is its standard deviation,
FPH is the derivative structure amplitude and SIGFPH is its standard deviation.
If FPH is assigned, the absolute difference between the columns assigned
to FP and FPH (i.e. the isomorphous difference), corrected for the bias due
to the standard deviations (in columns SIGFP and SIGFPH), is used to
calculate the F and E values for output. It is important that the standard
deviations in the file are sensible, otherwise this correction will not work.
Standard deviations (>0) are also used to check for the existence of
measured data but, unlike in other programs, no check is made on the Fs. It
is much better to use the missing number flag to check for unmeasured data.
If FPH is not assigned, the column assigned to FP is used to calculate the
E value. SIGFP>0 is used to check for the existence of measured data; as
above, FP itself is not checked.
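The bias-corrected isomorphous difference follows the formula given for the HKLOUT file, F = sqrt(max((FPH-FP)^2 - SIGFP^2 - SIGFPH^2, 0)). A minimal Python sketch of that formula, with the hypothetical function name `iso_diff_amplitude`:

```python
import math


def iso_diff_amplitude(fp, sigfp, fph, sigfph):
    """Isomorphous difference corrected for the standard-deviation bias.

    The squared difference is reduced by both variances and clamped at
    zero, so differences smaller than the combined error give F = 0.
    """
    diff_sq = (fph - fp) ** 2 - sigfp ** 2 - sigfph ** 2
    return math.sqrt(max(diff_sq, 0.0))
```

For example, a difference of 5 with standard deviations 3 and 4 gives F = 0, because 25 - 9 - 16 = 0. This shows why unrealistically large standard deviations would wipe out most of the signal.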
LABOUT <program label>=<file label> ...
This card is required only when writing reflections to an MTZ file. Four
items can be used with LABOUT: F, E, F2OR and E2OR. These are extra columns
written to the output MTZ file; the terms are explained below. LABOUT lets
you change the names of these columns, whose defaults are F, E, F2OR and
E2OR. It is essential to change the column names if they already exist in
the input MTZ file.
SCALE <scale>
The output columns F and F2OR will be scaled by the value <scale>. This is
normally not necessary, but you may wish to reduce F2OR to manageable values.
The default scale is 1.0.
INPUT AND OUTPUT FILES
The input and output files are:
- The control data file.
- HKLIN
  The input reflection data file in standard MTZ format.
- HKLOUT
  If no MULTAN/SNB keyword is specified, the output file is a reflection data
  file in MTZ format containing the items H K L (all input) + F E SIGE F2OR E2OR,
  where F=FP is copied from the input file if only FP is assigned, or
  F = sqrt(max((FPH-FP)^2 - SIGFP^2 - SIGFPH^2, 0)) if FPH is assigned as well.
  E is the normalised structure amplitude and SIGE is its standard deviation.
  F2OR is the origin-removed Patterson coefficient (F^2-<F^2>) and E2OR
  is the normalised origin-removed Patterson coefficient.
  Note that when using F2OR or E2OR to compute an origin-removed Patterson
  with FFT, the Patterson keyword must not be given, as the coefficients
  are already in squared form.
  For the MULTAN option the output is
  H K L 1000*E in FORMAT(3I4,I6), terminated by E=-1.
- SYMOP
  The library symmetry data file, normally defaulted.
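Two of the output conventions described for HKLOUT can be sketched in Python. The function names are hypothetical. In `origin_removed`, `mean_f2` stands for whatever local mean F^2 applies to the reflection, and how ECALC obtains it is not specified here. In `multan_record`, rounding 1000*E to the nearest integer is an assumption (the program may truncate instead), and the terminating record is not covered.

```python
def origin_removed(f, mean_f2):
    """Origin-removed Patterson coefficient F^2 - <F^2>.

    mean_f2 is the mean F^2 appropriate to this reflection (assumed to be
    supplied by the caller).
    """
    return f * f - mean_f2


def multan_record(h, k, l, e):
    """Format H K L 1000*E as in FORMAT(3I4,I6).

    Three integers in fields of width 4, then one in a field of width 6.
    Rounding to nearest is an assumption.
    """
    return f"{h:4d}{k:4d}{l:4d}{round(1000 * e):6d}"
```

Because F2OR is already a difference of squared quantities, it can be used directly as a Patterson coefficient, which is why the Patterson keyword must not be given to FFT.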
PRINTER OUTPUT
The line printer output may be divided into the following sections:
- Echo of the input control data.
- A table showing the distribution of the reflections in shells (chosen to give
  roughly equal numbers per shell) with mean d*^3, F^2, E^2-1 and (E^2-1)^2.
- Scatter plot of F versus d*^2 with a smoothed plot of r.m.s. F versus d*^2
  superimposed.
- Mean values of E^2 and (E^2-1)^2 by parity groups.
- Mean values of E^n where n = 1 to 6, and of |E^2-1|^n where n = 1 to 3.
  For each mean the theoretical value for the acentric, centric and
  hypercentric distributions is also tabulated.
- Cumulative distribution of E's for centric and acentric with theoretical
  values. This table can also be graphed with xloggraph.
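The theoretical values for the acentric and centric cases follow from the ideal Wilson distributions and can be reproduced with a short Python sketch. The function names are hypothetical, the hypercentric case is not covered, and the cumulative distribution is expressed here in terms of z = E^2, which may not be the variable ECALC tabulates.

```python
import math


def theoretical_moment(n, centric):
    """Mean of |E|^n for the ideal acentric or centric Wilson distribution."""
    if centric:
        # P(E) = sqrt(2/pi) * exp(-E^2 / 2)
        return 2 ** (n / 2) * math.gamma((n + 1) / 2) / math.sqrt(math.pi)
    # P(E) = 2 * E * exp(-E^2)
    return math.gamma(n / 2 + 1)


def observed_moment(e_values, n):
    """Mean of E^n over a list of observed E values."""
    return sum(e ** n for e in e_values) / len(e_values)


def cumulative_fraction(z, centric):
    """Theoretical fraction of reflections with E^2 <= z."""
    if centric:
        return math.erf(math.sqrt(z / 2))
    return 1.0 - math.exp(-z)
```

The first moments come out at about 0.886 (acentric) and 0.798 (centric), and the second moment is 1 in both cases. Comparing `observed_moment` against these values is the kind of check the tabulated means allow.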
PROGRAM FUNCTION
The program ECALC is used to calculate normalised structure amplitudes
for a reflection data set. The normalised structure amplitude for a
reflection is the structure amplitude divided by the product of
epsilon (a factor dependent on the Laue group symmetry) and the r.m.s.
value of the structure amplitudes at its sin(theta)/lambda value.
Rotation and translation functions calculated with these normalised
structure amplitudes give sharper maps, and their peak positions and
heights are less sensitive to changes in resolution limits than those of
the corresponding functions calculated with unnormalised structure
amplitudes.
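A minimal Python sketch of shell-based normalisation is given below. It is not ECALC's algorithm: it uses plain per-shell means with no smoothing or interpolation, and it applies epsilon through the common textbook convention E^2 = F^2 / (epsilon * <F^2>). Whether ECALC folds epsilon in exactly this way is an assumption. The function name `normalise` and its arguments are hypothetical.

```python
import math


def normalise(f_values, shell_index, epsilon=None):
    """Return E for each reflection from per-shell mean F^2.

    epsilon defaults to 1 for every reflection. Uses the textbook
    convention E = F / sqrt(epsilon * <F^2 / epsilon>), an assumption.
    """
    n = len(f_values)
    eps = epsilon if epsilon is not None else [1.0] * n
    sums, counts = {}, {}
    # Accumulate epsilon-corrected F^2 in each shell
    for f, s, e in zip(f_values, shell_index, eps):
        sums[s] = sums.get(s, 0.0) + f * f / e
        counts[s] = counts.get(s, 0) + 1
    mean_f2 = {s: sums[s] / counts[s] for s in sums}
    # Divide each amplitude by the local r.m.s. value
    return [f / math.sqrt(e * mean_f2[s])
            for f, s, e in zip(f_values, shell_index, eps)]
```

By construction the mean of E^2 is 1 in every shell, so the fall-off of F with resolution is removed. This is what makes functions computed from E values less dependent on the resolution limits chosen.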
EXAMPLES
Examples of the control data for calculating a set of normalised
structure factors.

ecalc hklin junk1.mtz hklout junk2.mtz << eof
TITLE TEST OF PROGRAM ECALC WITH C2HKL REFLECTION DATA
LABI FP=FO SIGFP=SIGFO
eof

ecalc hklin junk1.mtz hklout junk2.ref << eof
TITLE TEST OF PROGRAM ECALC WITH C2HKL REFLECTION DATA
LABI FP=FO SIGFP=SIGFO
SNB
eof

ecalc hklin junk1.mtz hklout junk2.dat << eof
TITLE TEST OF PROGRAM ECALC For isomorphous differences
LABI FP=FO SIGFP=SIGFO FPH=FPH1 SIGFPH=SIGFPH1
MULTAN
REFLECTION 1500
eof

ecalc hklin junk1.mtz hklout junk2.mtz << eof
TITL Es from isomorphous differences without sigma bias
LABI FP=FP FPH=FPHderv1
SCAL 0.001
SHEL 150
eof
PROGRAM STRUCTURE
The program structure is straightforward and involves three passes
through the input reflection data file. The structure is outlined
below:
- Open files.
- Pass 1 through reflection data: find the maximum F and S values and count
  the number of reflections. Print these values.
- Pass 2 through reflection data: collect F^2 values in bins of d*^3
  (sums and numbers of reflections). Print a table of these results.
  Apply adjacent channel smoothing to the points giving the average F^2
  and d*^3 values for these bins.
- Open the output MTZ file.
- Pass 3 through reflection data: calculate E values (using the function
  AVF), write the output reflection data and collect data for the
  statistics.
- Print scatter plot, average values of E^1 to E^6 and cumulative
  distribution of E's.
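The smoothing step in pass 2 can be pictured with the following Python sketch. The kernel weights (0.25, 0.5, 0.25) and the treatment of the end bins are assumptions chosen for illustration; the weights ECALC actually uses are not given here, and the function name `smooth_adjacent` is hypothetical.

```python
def smooth_adjacent(values, weights=(0.25, 0.5, 0.25)):
    """Smooth per-bin averages with a three-point kernel.

    Each bin is replaced by a weighted mean of itself and its two
    neighbours. End bins reuse their own value for the missing neighbour.
    The weights are an assumption.
    """
    n = len(values)
    out = []
    for i in range(n):
        left = values[i - 1] if i > 0 else values[i]
        right = values[i + 1] if i < n - 1 else values[i]
        out.append(weights[0] * left + weights[1] * values[i]
                   + weights[2] * right)
    return out
```

Smoothing of this kind damps the statistical noise in the bin averages before they are used to normalise individual reflections in pass 3.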
AUTHOR
Originator: Ian Tickle
Contact: Ian Tickle, Birkbeck College