010410

Report of Task Force on Numerical Criteria in Structural Genomics
Summary of conclusions

1. Shortcuts are not justified in structures determined for structural genomics, and success should be judged by quality in addition to quantity. Quality should be assessed by conventional validation criteria.

2. All data should be deposited so that other workers can complete or rerefine the structure.

3. Indicators of local quality (e.g. local density fit, B factor analysis, local density of NMR restraints) should be provided to users of structures.

4. It is not yet possible to set numerical criteria defining when a structure determination is completed. We recommend that structures be refined until there is no clear signal of where the model can be improved.

5. We recommend that at least a minimal set of numerical criteria for quality assessment be reported with structural data.

6. The criteria to be reported should be reassessed as methods develop and the database of structures expands.

Background

Creation of task force

The First International Structural Genomics Meeting (Hinxton, April 2000) brought together researchers from the international community as well as representatives of two agencies funding their efforts: the NIH and the Wellcome Trust. One of the major questions under discussion was the necessity for rapid release of structures determined through publicly-funded initiatives. The following statement is taken from the "Agreed Principles" document generated at the meeting:

It was agreed that a small task force would be set up to address the question of "numerical criteria". In the "Implementation" document from the meeting, the following appears:

Task force membership

The task force was set up to represent both the X-ray and NMR experimental communities. John Moult was appointed as an ex officio member.

Randy J. Read	Chair, X-ray	John L. Markley	NMR
Eleanor J. Dodson	X-ray	Michael Nilges	NMR
T. Alwyn Jones	X-ray	Yoshifumi Nishimura	NMR
Thomas C. Terwilliger	X-ray	John Moult	ex officio

Previous work on structure validation

The problem of setting numerical criteria for the completeness and accuracy of structures in structural genomics is obviously intimately connected with the issue of validation. This is an issue the structural community has been grappling with over the last decade. Two members of the task force (Eleanor Dodson, Alwyn Jones) have been involved in a European validation initiative.

However, validation to date has been concerned with achieving not only a basically-correct structure but the best possible structure from the data available. In the context of structural genomics, we might not want to go as far down the path of diminishing returns and may be willing to accept something less than perfection. Structures will be useful for many purposes, even at an intermediate stage of refinement, as long as serious errors are eliminated.

Scope of study

The remit of the task force was to determine whether it was possible to establish numerical criteria to judge the quality of structures determined in structural genomics projects, both as an indicator of when the structures are sufficiently reliable to be released to the public, and as a quality guide for users of those structures.

We sought, as far as possible, to take advantage of the extensive work that has been done in the area of validation. Deciding the extent to which the problems are the same required us to consider whether there should be different expectations for the quality of structures determined in structural genomics than in traditional structural biology.

Structural genomics projects will use both X-ray and NMR methods to determine structures. Validation has been an issue for a longer time in the crystallographic community, so it was important to consider whether the lessons learned in crystallography apply directly to NMR, or whether the problems are substantially different and require more study.

Finally, we considered the nature of numerical criteria that can be defined, and examined whether we could recommend targets for these criteria in different circumstances (depending on the technique used, as well as measures of parameter:observation ratio).

Questions considered

1. What are the expectations of quality for a structure determined in the structural genomics context? Is it enough to be confident that the fold is basically correct? Should the structure be the best that can be achieved from the data? Or should the goal be a tradeoff that maximizes throughput of structures of reasonable quality?

X-ray and NMR methods differ in whether or not an experiment can be designed just to define the fold. This is possible for NMR, where a limited set of constraint data will define the overall fold unambiguously without defining all the details of the conformation of the protein. Because the information in NMR is local, it is possible to design experiments that answer local questions. Obtaining more detail requires more work, so it is conceivable to improve throughput in defining fold space by appropriate experimental design. Nonetheless, an increase in the quality and quantity of NMR data will improve the ease and reliability of spectral assignments and structure determinations. Ambiguous or incorrect spectral assignments are commonly resolved or corrected in the process of refinement and validation; thus these are essential steps in NMR structure determination.

Crystallographic experiments have an all-or-none character because the information is spread throughout the data. In fact, if higher-resolution data are available, it is easier to determine crystal structures and to automate the process, so time is not saved by restricting resolution. While one might expect to be able to trace the chain correctly in an electron density map and then stop, it is the experience of task force members that errors in chain tracing are best detected in the course of further refinement. With both crystallography and NMR, it seems to be difficult to separate the processes of refinement and validation.

Nonetheless, it is true that there are diminishing returns in structure refinement for both techniques. In crystallography, most of the improvement in correcting the conformation of main chain and side chains comes early on, with a disproportionate amount of effort being required to determine the most probable conformer for side chains in poorly-ordered regions or to detect all the ordered solvent. In protein NMR the overall fold is revealed early, and a great deal of additional work may be needed to maximize the number of stereospecific assignments, assignments for longer side chains, and to sort out ambiguous NOE assignments. These refinements lead to better geometry and lower energies for models but rarely alter the global architecture. Eventually a point is reached at which additional constraints fail to alter the structure appreciably within the errors of the measurements. It is reasonable to have somewhat lower expectations of quality in the fine details of a structure determined in the context of structural genomics. Because the experimental data will also be available, an interested researcher could complete the fine details of refinement.

2. Is it possible to define a graduated scale indicating whether the structure is suitable for: fold analysis, design of mutagenesis experiments, drug design?

Some validation criteria (e.g. residue environment scores) give a global indication of whether the fold is basically correct. However, for uses requiring greater precision, it is much more important to have local quality indicators, scoring individual atoms, residues or ranges of residues. NMR backbone assignments, which commonly are determined as a first step in structural analysis, provide patterns of chemical shifts that can be used as reliable indicators of secondary structure; these results, plus key constraints from NOE or residual dipolar coupling may be useful for fold analysis and could be made available in advance of detailed structure determination.

3. What criteria can be defined based on the experimental data? (E.g. for X-ray: R, Rfree)

X-ray: The traditional R-factor (R = Σ||Fo|-|Fc|| / Σ|Fo|, with sums taken over all data) is a valuable statistic but has fallen out of favor as an absolute criterion because it is easily biased by over-fitting when the parameter:observation ratio is high. The Rfree statistic, computed using only cross-validation data that have been omitted from the refinement target, has been shown to be much more reliable. It is difficult to set sharp threshold values for the R-factors to expect from a good structure, although there are rules of thumb based on experience (tabulated below). Some work should be put into new scores that are more closely related to the likelihood targets used in modern refinement programs. The log-likelihood-gain per cross-validation reflection might be an interesting statistic, but interpretation of this score would require further study. Apart from agreement indicators, the quality of the experiment (data completeness, redundancy, signal-to-noise, merging statistics) must be described. Experimental data could also be used to obtain individual coordinate error estimates, through analysis of the normal matrix. Some local quality indicators, such as real-space fit to electron density or difference density quality (DDQ) analysis, are also based on experimental data. In addition, anomalous difference Fourier maps (which can be calculated provided that the unmerged Friedel pairs are deposited) can be used to test the identity of solvent atoms (e.g. water vs. Ca⁺⁺) and verify the position of sulfur atoms.
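As an illustration of the two agreement statistics described above, the following minimal sketch computes R over a list of reflections and then splits the same list into working and cross-validation subsets. The function names and the boolean flag list are assumptions made for this example; they do not correspond to any particular refinement program.

```python
def r_factor(f_obs, f_calc):
    """R = sum(||Fo| - |Fc||) / sum(|Fo|) over the supplied reflections."""
    if len(f_obs) != len(f_calc):
        raise ValueError("f_obs and f_calc must have the same length")
    denominator = sum(abs(fo) for fo in f_obs)
    if denominator == 0:
        raise ValueError("sum of |Fo| is zero")
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    return numerator / denominator


def r_work_and_free(f_obs, f_calc, free_flags):
    """Return (R_work, R_free) given one cross-validation flag per reflection.

    free_flags[i] is True when reflection i was omitted from the
    refinement target (hypothetical flag convention for this sketch).
    """
    work_obs, work_calc, free_obs, free_calc = [], [], [], []
    for fo, fc, flag in zip(f_obs, f_calc, free_flags):
        if flag:
            free_obs.append(fo)
            free_calc.append(fc)
        else:
            work_obs.append(fo)
            work_calc.append(fc)
    return r_factor(work_obs, work_calc), r_factor(free_obs, free_calc)
```

In practice both statistics would also be reported in resolution shells, which amounts to calling the same function on each shell's reflections.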

NMR: The conventional measures of quality of NMR structures are the agreement between the models and the input constraints for final refinement and the tightness of the family of conformers that represent the structure, usually represented as the positional root-mean-square deviation (rmsd) of the individual models from the mean structure, or as the circular variance (cv) of the (backbone) dihedral angles. A variety of experimental constraints can be employed, including NOEs, 3-bond J-couplings, residual dipolar couplings, chemical shifts, and coupling constants across hydrogen bonds. It is not yet established how to weight these relative to one another in refinement targets or in validation criteria. The equivalent of a free R-factor can be computed, by leaving out a subset of observations, but there is much work to be done in understanding how many observations to leave out, and how to weight the different types of observation. Back-calculation methods can be used to ensure that NMR structures are consistent with experimental results, and if some of these data (for example, chemical shifts or residual dipolar couplings) are not used in the refinement, they can provide an independent check against gross errors in the structure. A problem particular to NMR is that the 'observations' are derived quantities extracted from the data in a process that often involves a subjective element. In addition, there are not yet uniform procedures across laboratories for translating spectral data into constraints.
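The circular variance mentioned above can be sketched as follows. This assumes the commonly used definition, cv = 1 - |mean unit vector|, so that 0 means all conformers share the same angle and values near 1 mean the angle is essentially undetermined; the function name is an assumption for this example.

```python
import math


def circular_variance(angles_deg):
    """Circular variance of one dihedral angle across an ensemble.

    angles_deg holds the value of the same dihedral (for instance phi of
    one residue) in each conformer, in degrees.
    """
    if not angles_deg:
        raise ValueError("need at least one angle")
    n = len(angles_deg)
    mean_cos = sum(math.cos(math.radians(a)) for a in angles_deg) / n
    mean_sin = sum(math.sin(math.radians(a)) for a in angles_deg) / n
    # Length of the mean unit vector: 1 for identical angles, 0 for a
    # perfectly spread set.
    return 1.0 - math.hypot(mean_cos, mean_sin)
```

Working with unit vectors rather than raw angles avoids the wrap-around problem at ±180°, where angles such as -179° and 179° are close on the circle but far apart numerically.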

It is essential to deposit the experimental data no later than releasing the coordinates. For X-ray data, it would be preferable to deposit the unmerged data as well as the merged data, so that it would be possible to check for errors in spacegroup assignment or scaling/merging. For NMR, the depositions should include chemical shift assignments, the constraints used in the initial structure determination and final refinement (from NMR spectra or other sources), and, ideally, the raw data sets (prior to Fourier transformation) so that structures could be re-examined as improved methods emerge.

4. What criteria can be defined from the coordinates only (i.e. data-independent criteria)?

Some of the traditional geometric criteria (e.g. bond-length and bond-angle deviations) are not informative, as they reflect primarily the restraints used during refinement. However, torsion angle distributions can be very useful, partly because they are not commonly restrained. Even if torsion angles are restrained, additional information can be found from correlations between, for instance, χ₁ and χ₂ angles for side chains.

The Ramachandran plot of main-chain torsion angles is particularly powerful in detecting incorrect and badly/sloppily refined crystal structures. However, an examination of recent high-resolution structures has shown that the core regions of the Ramachandran plot are narrower than originally thought, so that the analysis by programs such as ProCheck is too forgiving. Analysis against a stringent-bounds Ramachandran plot (Kleywegt & Jones, 1996) is considerably more sensitive to structural errors.

Non-geometric criteria can also be defined: atomic contacts, side-chain environment analysis, void analysis, detection of unsatisfied hydrogen-bonding partners. A comprehensive survey of such criteria has been published by the EU 3-D Validation Network (1998), and software is available to perform such checks, e.g. WHATCHECK, PROCHECK, SQUID.

5. Are the data-independent criteria equally applicable to X-ray and NMR structures? Does the interpretation of some of these criteria depend on the technique used?

To a certain extent, these criteria are equally informative for X-ray and NMR structures. For instance, side-chain environment criteria, as implemented for instance in PROSA, will distinguish correct from incorrect folds for both techniques. However, because the NMR structural information tends to be more local in character (e.g. close contacts deduced from observations of NOEs), it is easier to satisfy the data while imposing geometrical restraints. For NMR structures computed with fewer restraints, torsion angles are poorly determined, which reduces the applicability of Ramachandran plots. As well, it appears to be easier to enforce a satisfactory Ramachandran plot on a structure determined by NMR.

To a large extent, geometry violations reflect the weighting of the corresponding restraints; if a geometry term is restrained, it becomes much less useful for validation. So the model must be accompanied by a description of the restraints applied in its refinement and the relative weights of those restraints. This must include special restraints, such as bonding distances to metal atoms. There is an argument for leaving some parameters unrestrained, e.g. main-chain torsion angles, to retain some unbiased validation criteria.

6. Can multiple criteria be combined into a single, meaningful numerical score with one threshold for acceptance (e.g. combined Z-score)? Or should a separate threshold be set for each criterion?

This is extremely difficult and open to misunderstanding, since to define such a score properly would require knowledge of the joint probability distribution of all the indicators. In any event, global criteria tend to hide local problems, so it is most informative to report a set of indicators of local structure quality through the chain.

7. Should the criteria/thresholds depend on measures of parameter:observation ratio such as resolution (X-ray) or number of restraints (NMR)?

Yes, for both techniques. There is a trend for errors in fitting the data or satisfying the restraints to decrease with increasing number of observations. Even unrestrained geometrical criteria (e.g. torsion angles) fit into narrower, more ideal distributions as the number of observations increases. This may partly reflect the intrinsic order of the structure (the average of a disordered structure tends not to have good geometry; fewer restraints can be observed in NMR for disordered regions) but also the difficulty in finding the global minimum of the target with too few observations.

For crystallography, it should be noted that parameter:observation ratio is affected not only by resolution but also by the overall solvent content of the crystals and, most importantly, the presence of non-crystallographic symmetry.

8. What kinds of errors arise in structure determinations? Do they differ depending on the technique? How can they be detected?

Types of errors do differ with technique.

For X-ray structures, possible types and level of error have been summarised by Jones & Kjeldgaard (1997): totally wrong fold, locally wrong fold (for instance, one subunit of multi-subunit structure), locally wrong structure (e.g. build main-chain through side-chain density), out-of-register error (especially arising in loops), wrong side-chain conformation, wrong main-chain conformation, incomplete model (lacking part of macromolecule, ligand or ordered solvent), overfitting (reflected in unrealistic deviations from target geometry, non-crystallographic symmetry). The incidence of these errors can be reduced by proper refinement technique, but some errors will inevitably remain, such as incorrect side-chain conformers in less well-ordered regions, incomplete description of ordered solvent. It is most important in practice to detect locally wrong structure (connectivity errors) and out-of-register errors. In a full refinement, sorting out the most probable conformations of the worst side-chains and defining the final details of the solvent structure might easily consume 90% of the investigator's time. This has little impact on the quality of the structure for most uses, so a greater level of error in these areas might be expected and tolerated in the context of structural genomics.

Errors in earlier NMR structures have been analysed by Doreleijers and coworkers (1998, 1999a,b). NMR data can be locally incomplete, leading to very local errors and ambiguities. However, spectral assignments associate observations with individual atoms or groups of atoms in the covalent structure of the protein, and, provided that these assignments are correct, the analysis is unlikely to yield globally wrong folds. Possible errors include: inversion of part or all of the topology, helices on the wrong side of a β-sheet, incorrect interhelical angles. Many of these errors can be avoided by validation against additional experimental data, such as residual dipolar couplings.

9. For the chosen numerical criteria, can we predict the rates of false positives and false negatives in finding errors? Will this be acceptable to the experimental and modelling communities? Or is more work required?

With modern validation tools, we can be confident in detecting the worst errors, such as wrong folds. However, the relative incidence of different kinds of error will change as structural genomics becomes more automated. Detecting those errors will be part of the automation process, and will hence be a subject for continued investigation. In addition, for X-ray work a much higher proportion of structures for structural genomics will be determined by Se-Met MAD experiments; such experiments provide landmarks in the form of methionine positions, which will reduce the chance of global fold errors compared to conventional crystallographic analyses.

It is difficult to devise a set of benchmark structures to test all the criteria, especially those that require experimental data, because the data have rarely been deposited for structures demonstrated to possess serious errors.

Numerical Criteria

There is a profusion of possible numerical criteria by which structures may be judged. Members of the task force agreed that the subset of criteria laid out in the tables below is the most informative.

Criteria based on X-ray experimental data

Table 1: Criteria measuring quality of experiment (report overall and as function of resolution)
Criterion	Target	Comments
Completeness	>95% of reflections to reported resolution	One of the most important factors in success. The low resolution or most intense reflections should not be missing selectively or overloaded.
Redundancy	>6-fold (including all wavelengths in MAD)	Important for rejecting outliers, which can hamper phasing and refinement. Lower redundancy must be accepted for crystals that suffer significant radiation damage.
Mean I/σ(I) for merged data	Higher values indicate better data	Good measure of signal-to-noise.
R_meas¹	Lower values indicate better data	Measure of agreement between replicate observations, corrected for bias from number of replicates.

¹Defined by Diederichs & Karplus (1997)
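The first, second and fourth rows of Table 1 can be computed from unmerged data roughly as sketched below. The input layout (a dictionary from unique reflection index to its list of measured intensities) and the function name are assumptions for this example. R_meas is written here as Σ_h sqrt(n_h/(n_h-1)) Σ_i |I_hi - <I_h>| / Σ_h Σ_i I_hi, following the definition of Diederichs & Karplus (1997), with singly measured reflections skipped because they carry no agreement information.

```python
import math


def merging_statistics(observations, n_possible):
    """Return (completeness, redundancy, R_meas) for unmerged intensities.

    observations: dict mapping a unique reflection index, e.g. (h, k, l),
                  to the list of intensities measured for it.
    n_possible:   number of unique reflections expected to the reported
                  resolution (supplied by the caller in this sketch).
    """
    if not observations or n_possible <= 0:
        raise ValueError("need at least one reflection and n_possible > 0")
    n_unique = len(observations)
    n_measured = sum(len(values) for values in observations.values())
    completeness = n_unique / n_possible
    redundancy = n_measured / n_unique

    numerator = 0.0
    denominator = 0.0
    for intensities in observations.values():
        n = len(intensities)
        if n < 2:
            continue  # a single measurement cannot be compared with replicates
        mean_i = sum(intensities) / n
        spread = sum(abs(i - mean_i) for i in intensities)
        numerator += math.sqrt(n / (n - 1)) * spread
        denominator += sum(intensities)
    r_meas = numerator / denominator if denominator else float("nan")
    return completeness, redundancy, r_meas
```

Running the same function on the reflections of each resolution shell gives the "as function of resolution" report requested in the table heading.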

Table 2: Criteria measuring quality of model (report overall and as function of resolution)
Criterion	Target for good structure	Comments
R	values slightly lower than R_free are expected²	Traditional measure of data agreement, but subject to severe bias from over-fitting.
R_free¹	< 0.32 at 3.0 Å resolution; < 0.26 at 2.5 Å resolution; < 0.22 at 2.0 Å resolution	Not subject to over-fitting bias, but subject to statistical error because of small number of observations used.
Residue density correlation³	No absolute target values	Regions significantly lower than mean are either incorrect or poorly ordered. Regions with out-of-register error often flanked by regions of low density correlation.

¹R-factor computed with a cross-validation subset of data, not used in the refinement target. In general, about 5-10% of observations should be set aside for cross-validation; at least 500 are required to reduce statistical error, and more than 2000 is excessive.
²A large difference between R and Rfree (more than 0.1 at 3 Å resolution, 0.05 at 2 Å resolution) indicates over-fitting.
³ Defined by Jones et al. (1991).
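The two rules of thumb in the footnotes to Table 2 can be written down as a short sketch. The function names are assumptions for this example. The footnote gives the over-fitting threshold only at 3 Å and 2 Å; the linear interpolation between those two points, and the clamping outside them, are assumptions of this sketch rather than recommendations of the task force.

```python
def free_set_size(n_reflections, fraction=0.05, minimum=500, maximum=2000):
    """Number of reflections to set aside for cross-validation.

    Takes the requested fraction (5-10% suggested) and keeps the result
    between the minimum and maximum counts given in the footnote.
    """
    if not 0.0 < fraction < 1.0:
        raise ValueError("fraction must lie between 0 and 1")
    if n_reflections <= minimum:
        # Guard added for this sketch: very small data sets need a
        # separate judgement and are not handled here.
        raise ValueError("too few reflections for the minimum free set")
    size = round(n_reflections * fraction)
    return max(minimum, min(maximum, size))


def max_r_gap(resolution):
    """Largest R_free - R tolerated before over-fitting is suspected."""
    low_res, low_gap = 3.0, 0.10    # from the footnote
    high_res, high_gap = 2.0, 0.05  # from the footnote
    clamped = max(high_res, min(low_res, resolution))
    # Linear interpolation between the two quoted points (assumption).
    slope = (low_gap - high_gap) / (low_res - high_res)
    return high_gap + (clamped - high_res) * slope


def looks_overfitted(r, r_free, resolution):
    """True when the R_free - R gap exceeds the rule-of-thumb limit."""
    return (r_free - r) > max_r_gap(resolution)
```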

Criteria based on NMR experimental data

Because the field is evolving rapidly, any numerical criteria for NMR structures of proteins must be considered tentative. The following criteria, which are representative of some that have been proposed, may provide topics for useful discussion.

Table 3: Criteria measuring the quality of experimental data
Criterion	Target (all tentative)	Comments
Completeness of backbone assignments	> 85%	Assignments to ¹Hα, ¹³Cα, and ¹³C' are particularly valuable for analysis of secondary structure
Redundancy in assignment pathways	?	Should provide a useful measure of confidence in individual assignments that would give higher weight to structures determined from proteins labeled with ¹⁵N and ¹³C.
Completeness of NMR-observable proton contacts within 4 Å	> 50%	(Doreleijers et al., 1999a)
Structural constraints per residue	> 12	For well-defined regions. These may be a mixture of various types of constraints

Table 4: Criteria measuring the quality of the model
Criterion	Target (tentative)	Comments
Atom coordinate rmsds within the set of conformers representing the structure	backbone atoms < 1 Å; all atoms < 2 Å	These are left deliberately loose so as not to encourage over-refinement of structures
NOE restraint violations (rmsd)	< 0.05 Å	(Doreleijers et al., 1998)
Persistent NOE violations across the ensemble of structures	None > 0.5 Å	(Doreleijers et al., 1998)
Back calculation of NMR observables from the family of models	?	This could be NOESY spectra, chemical shifts, dipolar couplings, etc., as has been demonstrated in the literature. Consensus approaches are likely to develop in the next few years.
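The first row of Table 4 refers to the positional rmsd of each conformer from the mean structure. A minimal sketch is given below. It assumes that the conformers have already been superimposed and that every model lists the same atoms in the same order; the function name and input layout are assumptions for this example.

```python
import math


def rmsd_from_mean(conformers):
    """Per-conformer rmsd from the mean structure of the ensemble.

    conformers: list of models, each a list of (x, y, z) tuples for the
                same atoms in the same order, already superimposed.
    Returns one rmsd value per model, in the units of the coordinates.
    """
    if not conformers or not conformers[0]:
        raise ValueError("need at least one conformer with at least one atom")
    n_models = len(conformers)
    n_atoms = len(conformers[0])
    if any(len(model) != n_atoms for model in conformers):
        raise ValueError("all conformers must contain the same atoms")

    # Mean position of every atom across the ensemble.
    mean = [
        tuple(sum(model[a][k] for model in conformers) / n_models for k in range(3))
        for a in range(n_atoms)
    ]

    rmsds = []
    for model in conformers:
        squared = sum(
            (model[a][k] - mean[a][k]) ** 2 for a in range(n_atoms) for k in range(3)
        )
        rmsds.append(math.sqrt(squared / n_atoms))
    return rmsds
```

Passing only backbone atoms, or all atoms, gives the two figures for which the table lists separate targets.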

Table 5: Criteria based on coordinates

Criterion	Target	Comments
Ramachandran outliers¹	< 10%, < 5% for well-refined high-resolution structures	Perhaps the most sensitive test for serious errors, particularly for X-ray structures.
Unsatisfied H-bonding partners²	Few in core	Indication of chemical sensibility of structure
Symmetry clashes	None	X-ray only. Close contacts are often not restrained between symmetry-related molecules.

¹Fraction outside core region of stringent bounds Ramachandran plot as defined by Kleywegt & Jones (1996).
²Computed in WhatCheck
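The Ramachandran criterion in Table 5 is a simple fraction once the core region is known. The sketch below takes the core region as a caller-supplied function, because the stringent-bounds region of Kleywegt & Jones (1996) is not reproduced in this report. The `toy_core` predicate is a placeholder for demonstration only and is NOT that region.

```python
def ramachandran_outlier_fraction(phi_psi, in_core):
    """Fraction of residues whose (phi, psi) falls outside the core region.

    phi_psi: list of (phi, psi) pairs in degrees.
    in_core: function (phi, psi) -> bool defining the core region; it must
             be supplied by the caller from a published definition.
    """
    if not phi_psi:
        raise ValueError("need at least one residue")
    outliers = sum(1 for phi, psi in phi_psi if not in_core(phi, psi))
    return outliers / len(phi_psi)


def toy_core(phi, psi):
    """Placeholder only: treats every negative phi as 'core'.

    This is a hypothetical stand-in so that the example runs; it is not
    a validated Ramachandran region.
    """
    return -180.0 <= phi <= 0.0
```

The resulting fraction would be compared with the targets in the table (below 10%, or below 5% for well-refined high-resolution structures) only when a proper core definition is used.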

Recommendations

1. Despite the pressures for high throughput, the experiment should be conducted carefully. Careful collection of complete data will lead to better structures that are more easily validated. Suggestions for good practice are summarized in Appendix 1 for X-ray and in Appendix 2 for NMR.

2. All data should be deposited, preferably in an unreduced form. For X-ray data, unmerged integrated intensities (omitting outliers but including systematically-absent axial reflections) should be deposited along with the final, merged intensities and amplitudes for all wavelengths and/or derivatives. Merged data should include entries for Friedel pairs. Cross-validation reflections should be flagged. (It is not clear at present whether raw images are sufficiently useful to justify the storage costs.) For NMR, raw data sets (prior to Fourier transformation) should be deposited; until validation software is developed that uses time domain data, objectively-chosen peak lists from the different experiments should also be deposited. It is not sufficient to provide just the derived constraints. For conventional approaches to NMR structure determination in which assignments are determined first and structures solved and refined, the assignments could be deposited when completed and revised with the final structure deposition. The staged deposition of backbone chemical shifts is important, because they provide a measure of the secondary structure and useful information for functional proteomics and drug discovery in advance of the structure.

3. Refinement of final models should be carried out at least to a point of diminishing returns, which can be assessed by the lack of a clear signal of where the model can be improved (e.g. flat log-likelihood-gradient map for crystal structure). X-ray models should include coordinates for heavy/anomalous atoms used in phasing. Atoms for which there is no associated electron density should be flagged in a consistent fashion (to be determined) to avoid misleading the user of the structure. NMR models should include a representative ensemble of structures that satisfy the experimental data. NMR depositions should include chemical shift assignments, and the constraints used in the initial structure determination and final refinement.

4. To allow appropriate interpretation of validation criteria, models should be deposited together with a complete description of the restraints used and their relative weights. For many structures it will be sufficient to state which restraint library has been used, but any special restraints (e.g. bond distances to metal atoms) should be defined.

5. Tools should be provided by the database providers to allow users to evaluate the quality of fit of the model to the experimental data. One example is the Electron Density Server at the University of Uppsala, which presents molecules in electron density as Web-based objects. Indicators of local quality should be computed from the deposited data: residue density correlation for X-ray, and running averages (computed over a window of residues) of the RMSD from restraints and NOE completeness for NMR.

6. Software used in structural genomics should implement accepted principles of good practice.

7. In the assessment of structural genomics projects for further funding, success should be judged not only by the number of structures produced but also by the quality of those structures.

8. It is not yet possible to set numerical criteria defining when a structure determination is completed.
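Recommendation 5 calls for running averages, computed over a window of residues, of per-residue quality indicators. One possible form is sketched below; the centred window, its truncation at the chain termini, and the default width of five residues are assumptions of this example.

```python
def running_average(values, window=5):
    """Centred running average of a per-residue indicator.

    values: one number per residue, in chain order (for instance the rmsd
            from restraints, or NOE completeness).
    window: odd number of residues in the window; near the termini the
            window is truncated rather than padded.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    half = window // 2
    averaged = []
    for i in range(len(values)):
        segment = values[max(0, i - half): i + half + 1]
        averaged.append(sum(segment) / len(segment))
    return averaged
```

Smoothing of this kind makes stretches of consistently poor values stand out from isolated noisy residues, which is what a user scanning for unreliable regions needs.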

Suggestions for further study

1. Structural genomics will produce a database of structures and associated data that should be monitored to develop more robust criteria for correctness of fold and quality of structure.

2. Methods to estimate errors in individual atomic positions require further research.

3. There is a need for refinement targets for NMR that refer more directly to the experimental data.

4. The types and frequency of errors that arise in high-throughput structure determination should be monitored. This should feed back to automation efforts.

Bibliography

EU 3-D Validation Network. (1998). Who checks the checkers? Four validation tools applied to eight atomic resolution structures. J. Mol. Biol. 276: 417-436.

Diederichs, K. & Karplus, P.A. (1997). Improved R-factors for diffraction data analysis in macromolecular crystallography. Nature Structural Biology 4: 269-275.

Dodson, E.J., Davies, G.J., Lamzin, V.S., Murshudov, G.N. & Wilson, K.S. (1998). Validation tools: can they indicate the information content of macromolecular crystal structures? Structure 6: 685-690.

Doreleijers, J.F., Rullmann, J.A.C., & Kaptein, R. (1998). Quality assessment of NMR structures: a statistical survey. J. Mol. Biol. 281: 149-164.

Doreleijers, J.F., Raves, M.L., Rullmann, T., & Kaptein, R. (1999a). Completeness of NOEs in protein structure: a statistical analysis of NMR data. J. Biomol. NMR, 14: 123-132.

Doreleijers, J.F., Vriend, G., Raves, M.L., & Kaptein, R. (1999b). Validation of nuclear magnetic resonance structures of proteins and nucleic acids: hydrogen geometry and nomenclature. Proteins, 37: 404-416.

Jones, T.A. & Kjeldgaard, M. (1997). Electron-density map interpretation. Methods Enzymol. 277: 173-208.

Jones, T.A., Zou, J.-Y., Cowan, S.W. & Kjeldgaard, M. (1991). Improved methods for the building of protein models in electron density maps and the location of errors in these models. Acta Cryst. A47: 110-119.

Kleywegt, G.J. & Jones, T.A. (1996). Phi/Psi-chology: Ramachandran revisited. Structure 4: 1395-1400.

Kleywegt, G.J. & Jones, T.A. (1997). Model building and refinement practice. Methods Enzymol. 277: 208-230.

Markley, J.L., Bax, A., Arata, Y., Hilbers, C.W., Kaptein, R., Sykes, B.D., Wright, P.E. & Wüthrich, K. (1998). Recommendations for the Presentation of NMR Structures of Proteins and Nucleic Acids (IUPAC Recommendations 1998). Pure & Appl. Chem. 70: 117-142.

Appendix 1: Recommendations of good practice for X-ray structure determination

Satisfactory validation of a crystal structure requires data of good quality, and it is hindered by over-fitting of the data, which obscures indications of poor local fit to density. For this reason, the task force felt that it was appropriate to provide recommendations for good practice in the determination of structures for structural genomics. Application of principles of good practice will also increase the quality of the structures and the usefulness of the database.

Data collection

1. Data should be complete to the limit of resolution to which the crystals diffract. Systematic incompleteness (missing low resolution data, most intense reflections missing) is most harmful. To avoid missing the most intense reflections or underestimating their intensity, data may need to be collected in two sweeps, one with longer exposures to achieve good counting statistics at the resolution limit, and a second with shorter exposures to prevent overloading the most intense reflections. Beamlines for structural genomics should be equipped with beamstops that are sized and positioned to minimize the loss of low-resolution data, and with goniometer heads that allow reorientation of crystals for cusp data collection.

2. Data should be collected with sufficient redundancy to reject outlier observations. The suggested target is 6-fold overall redundancy. It is recognized that some crystals are too sensitive to radiation damage to achieve this goal.

3. Beamline software should be configured to store in the headers of data files experimental parameters such as: wavelength, crystal-to-detector distance, oscillation angles, detector swingout angle.

Model-building and refinement

Typically, X-ray structure determinations have a poor parameter:observation ratio, so model-building and refinement should be guided by the need to avoid over-fitting of the data. In model-building, the number of parameters adjusted can be minimized by restricting the conformations to ones previously observed in high-resolution, well-refined structures. This can be achieved, for instance, by using database peptide fragments to fit the main chain of proteins and by using preferred rotamers to fit side chains.

In refinement, parameters should be introduced only when required to fit the data, as judged by an improvement in agreement with cross-validation data. In the presence of non-crystallographic symmetry (NCS), strict NCS should be abandoned only when it is clear from difference maps or from the improvement in cross-validation agreement that it is necessary to allow the molecules to differ. Similarly, NCS restraints should only be relaxed when required by the data. Models of thermal motion should use as few parameters as required, choosing from the following models of increasing complexity: overall isotropic B-factor, group B-factors for domains, group B-factors for residues, group B-factors for main-chain atoms and side-chain atoms of residues, restrained individual isotropic B-factors, restrained anisotropic B-factors. The use of TLS (translation/libration/screw) models of thermal motion is also recommended, although more experience is required. Solvent molecules should be introduced cautiously, using protocols validated by agreement of cross-validation data.
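The principle of adding parameters only when cross-validation agreement improves can be sketched as a simple stepwise selection over the thermal-motion models listed above. The model labels, the function name, and the `min_gain` threshold are assumptions for this example, and the R_free values would come from separate refinement runs that are not shown.

```python
# Thermal-motion models in order of increasing complexity, as listed in
# the text (labels are shorthand chosen for this sketch).
B_FACTOR_MODELS = [
    "overall isotropic",
    "group per domain",
    "group per residue",
    "group per main chain / side chain",
    "restrained individual isotropic",
    "restrained anisotropic",
]


def choose_b_factor_model(r_free_by_model, min_gain=0.005):
    """Pick the simplest model that the cross-validation data justify.

    r_free_by_model: dict mapping model label to the R_free obtained after
                     refinement with that model; must include the simplest.
    min_gain:        drop in R_free required to accept a more complex model
                     (hypothetical threshold, not taken from the report).
    """
    chosen = B_FACTOR_MODELS[0]
    best = r_free_by_model[chosen]
    for name in B_FACTOR_MODELS[1:]:
        if name not in r_free_by_model:
            break  # more complex models were not tried
        if best - r_free_by_model[name] > min_gain:
            chosen, best = name, r_free_by_model[name]
        else:
            break  # extra parameters are not supported by the data
    return chosen
```

Stopping at the first step that fails to improve R_free is a deliberately conservative choice; it keeps the parameter count low, which the paragraph above identifies as the main safeguard against over-fitting.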

A useful summary of good practice for model-building and refinement can be found in a review by Kleywegt & Jones (1997).

Appendix 2: Recommendations of good practice for NMR structure determination

The methodology of protein NMR spectroscopy continues to evolve rapidly. The IUPAC-IUBMB-IUPAB task group 'Recommendations' (Markley et al., 1998) contains a snapshot summary of good practices for NMR structure determination that have passed the review of the major research laboratories in the field. A convenient source for this document (with corrections) is the BMRB web site (under 'Features'): http://www.bmrb.wisc.edu/.

The Recommendations (Markley et al., 1998) also address the scope of information to be archived for structural biological projects (Table 6). These data items have been modelled into computer-readable format at the BMRB, and development versions of the schema and data dictionary for these are available from the NMR-Star dictionary development site (URL listed above). BMRB is working with participants in structural genomics pilot projects and with others on standards for data harvesting. In addition, BMRB is making preparations to archive raw data sets from structural genomics projects.

Table 6: NMR data items to be collected for a complete protein structure deposition

Authors

Citation

Molecular system studied

Natural source for the molecular system

Experimental source of molecular system

Sample contents

Sample conditions

NMR experiments carried out

Spectrometer used

Raw spectra

Spectral peak lists

Assigned spectral peak lists

Chemical shift referencing

Assigned chemical shifts

Coupling constants

Relaxation data (T1, T1rho, heteronuclear NOE, T2)

H-exchange rates (particularly if used to suggest H-bonding for structure calculation)

Constraints listed by type:

    NOE

    ROE

    other distance constraints

    H-bond

    disulfide bond

    residual dipolar coupling

    torsion angle

    J-coupling values

    Chemical shifts (CA, CB, C', HA in particular)

    Others (paramagnetic relaxation, salt bridges, ?)

Violated constraints (violations greater than 0.05 angstrom?)

Coordinates for a representative conformer

Coordinates for a family of conformers

Description of the software used to calculate conformers

Protocol used to calculate conformers