First International Structural Genomics Meeting
sponsored by NIGMS and the Wellcome Trust

The Wellcome Trust Genome Campus, Hinxton Hall Conference Centre
Hinxton, Cambridgeshire, UK
April 4-6, 2000

Agreed Principles

Coordination of International Programs in Structural Genomics

This document reports the principles agreed at the April 4-6, 2000 meeting of representatives of the structural genomics community. Its purpose is to generate further co-operation in the structural biology and general scientific communities. The document may serve as a basis for an appropriately timed announcement to the international public on the initiation of a worldwide effort in structural genomics.

I. Introduction

The success of the genome sequencing projects and major advances in methods of protein structure determination have led the structural biology community to propose the large-scale mapping of protein structure space. This structural genomics initiative aims at the discovery, analysis and dissemination of three-dimensional structures of proteins, RNA and other biological macromolecules representing the entire range of structural diversity found in nature. Such complete knowledge will facilitate fundamental understanding and applications in biology, agriculture and medicine. The three-dimensional structures will be crucial for rational drug design, for advancing catalysis in chemistry and biotechnology, and for diagnosis and treatment of disease, as well as for advancing basic principles of biology.

This opportunity is made possible by rapid recent progress in several related key technologies. These include the construction of synchrotrons and high-field NMR instruments, the MAD method of phase determination, high-throughput cloning and recombinant expression, a flood of information from genome sequencing projects, and new bioinformatic methods for fold assignment, model building, and prediction of function.

The following document outlines issues related to achieving this expansion of knowledge. The goal is to encourage harmonious cooperation among a broad range of public and private sector institutions in the international effort to characterize macromolecular structures in living organisms on a pan-genomic scale.

II. Goals

A.  Specific goals

  1. Large-scale determination and analysis of three-dimensional structures.
    1. To determine by experimental methods a representative set of macromolecular structures, including medically important human proteins and proteins from important pathogens and model organisms.
    2. To provide models based on sequence homology to significantly extend the coverage of structure space.
    3. To derive functional information from these structures by experimental and computational methods.

  2. Development of methods for Structural Genomics.
    1. Methods of selecting representatives of protein families based on enhancement of structure space coverage, or functional significance.
    2. High-throughput methods for production of target proteins suitable for structure determination.
    3. Methods for high-throughput data collection.
    4. Methods for automated determination, validation, and analysis of 3D structures.
    5. Methods for homology-based modeling, related methods and validation of modeled structures.
    6. Informatics systems to optimize and support the process of structure determination.
    7. Bioinformatics methods for assessing biological function based on structure and other linked biological information sources.
    8. Methods for more challenging problems of production and structure determination such as those involving membrane proteins and multimolecular complexes.

B.  Programs needed

  1. Financial and organizational support for structural genomics projects.
  2. Establishment of an international coordinating network to promote efficient application of resources and rapid dissemination of methods and results; to coordinate policies, standards, and formats; and to promote access to unique resources such as synchrotron and high-field NMR facilities.
  3. Support for the collection, archiving and dissemination of atomic information, experimental data, protocols, and materials.

III. Cooperation

A.  Public funding agencies can cooperate:

  1. By agreeing and implementing uniform policies for deposition, quality standards, and formats.
  2. By providing sustainable support for public domain programs in structural genomics.
  3. By encouraging and supporting appropriate international collaborative programs.

B.  Information and Material Release in the National Structural Genomics Programs

  1. The primary impetus for structural genomics is to obtain a base of freely available structural information and tools that will support advancements in wide areas of biology and medicine. Free exchange of data and materials is essential to the success of this effort, including the timely deposition of coordinates, data, and protocols.
  2. For the public structural genomics programs, the following guidelines for release of structural data should be supported:
    1. Timely release of coordinates and associated data. Release should follow immediately upon completion of refinement. For the time being, the decision regarding 'completion' will be made by the investigator. A longer-term goal is the automatic triggering of data release using numerical criteria.
    2. Public information on progress of projects. A primary mechanism for encouraging compliance with the guideline of timely release will be openness of progress tracking for projects. Members of the programs should maintain a public web site showing the progress status of structure determination for each target. This information will be updated frequently.
    3. Short scientific papers. Ensuring high quality of released structures is a priority. To help achieve this, structures released by members of the public programs may be accompanied by a short, peer-reviewed paper. These papers could be similar in format and content to the publications of small molecule crystal structures in Acta Cryst. C. The key requirement is that reviews be rapid and that the whole process of preparing the release be completed in about three weeks. Normal full-length publication is of course also possible, but should not delay the release of data.
  3. To promote communication and prevent unintended duplication of effort, it is desirable to openly share information on targeting of proteins for structure determination.
  4. At the time of coordinate and data deposition, protocols for cloning, expression, crystallization and structure determination should also be deposited, so that any structure in the database can be re-determined when needed.
  5. Material deposition of clones, cell lines, and protein samples is also encouraged, provided that satisfactory procedures can be put in place for collection, storage, and dissemination.

C.  Relationship to industrial activities

  1. The structural genomics community should explore formation of an international consortium involving industrial partners to further the goals of structural genomics.
  2. International efforts should be made to facilitate the eventual deposition of structures determined in the private sector, and to promote harmonious cooperation and exchange between the public and private sectors.

IV. Intellectual property rights

Raw fundamental data on the shape of natural protein molecules, including 3D positional coordinates, should be made freely available to researchers everywhere. However, intellectual property protection for inventions based on these data can play an important role in stimulating the development of important new health care projects. Policies should be established to permit an appropriate balance between these goals.

V. Future Meetings

Annual meetings of representatives of the structural genomics community are anticipated for the continued discussion of these issues. The Second International Structural Genomics Meeting is being planned for April 4-6, 2001, in Virginia, USA.

These principles were supported by the participants in the First International Structural Genomics Meeting in Cambridge, UK, April 4-6, 2000.


Dr Sherin Abdel-Meguid
Suntory Pharmaceutical Research Laboratories

Dr Geoff Barton

Professor Helen Berman
Protein Data Bank
Rutgers University

Professor Ivano Bertini
Magnetic Resonance Center
University of Florence

Professor Sir Tom Blundell
University of Cambridge

Professor Stephen K Burley
The Rockefeller University

Dr Christian Cambillau

Mr. David Carr
The Wellcome Trust

Dr Marvin Cassman

Dr Cyrus Chothia
MRC Laboratory of Molecular Biology
University of Cambridge

Miss Nicky Clarkson
Hinxton Hall Conference Centre

Dr Robert Cooke
GlaxoWellcome Research & Development

Ms Anna Curson
The Wellcome Trust

Professor Christopher Dobson
University of Oxford

Professor Guy Dodson
University of York

Dr Richard Durbin
Informatics Division
The Sanger Centre

Dr Charles Edmonds

Professor Aled Edwards
University of Toronto

Professor Roger Fourme
LURE, Université Paris-Sud

Professor Paul Freemont
Imperial College of Science, Technology and Medicine

Professor Paul Gilna
National Science Foundation

Professor Udo Heinemann
Max-Delbrück-Centrum für Molekulare Medizin

Professor Wayne Hendrickson
Columbia University

Dr Nobuo Kamiya
RIKEN Harima Institute/SPring-8

Professor Robert Kaptein
University of Utrecht
The Netherlands

Professor Sung-Hou Kim
University of California

Dr Richard Kramer
Novartis Pharmaceuticals Corp

Dr Victor Lamzin

Professor Michael Levitt
Stanford Medical School

Professor Peter Lindley
European Synchrotron Radiation Facility

Dr Michal Linial
The Hebrew University

Dr Albrecht Messerschmidt
Max-Planck-Institute for Biochemistry

Dr Stefan Michalowski

Dr Colin Miles

Dr Gaetano Montelione
CABM - Rutgers University

Dr Michael Morgan
The Wellcome Trust

Professor John Moult
University of Maryland

Dr John Norvell

Dr Mark Palmer
Medical Research Council

Dr Ari Patrinos
US Department of Energy

Professor Simon Phillips
University of Leeds

Dr Debbie Poole
The Wellcome Trust Genome Campus

Professor Randy Read
Wellcome Trust/MRC Building

Professor David Rice
University of Sheffield

Dr Ajay Royyuru
IBM T J Watson Research Center

Professor Chris Sander
Millennium Predictive Medicine

Mr. David Seemungal
The Wellcome Trust

Dr Barbara Skene
The Wellcome Trust

Dr Sharon Spencer
Catalyst Biomedica Ltd.

Professor Joel Sussman
Weizmann Institute of Science

Dr William Taylor
National Institute for Medical Research

Dr Tom Terwilliger
Los Alamos National Laboratory

Dr Jean-Claude Thierry

Professor Janet Thornton
University College London

Dr Tony Wilkinson

Dr Shigeyuki Yokoyama
RIKEN Genomic Sciences Centre


Tuesday, April 4, 2000
17.00 onwards  Arrival of Delegates
18.30  Pre Dinner Drinks, served in the Hall Foyer
19.00  Dinner, served prompt in the Hall Restaurant

Wednesday, April 5, 2000
07.30 - 08.15  Breakfast, served in the Hall Restaurant
08.15  Registration, in the Conference Centre Foyer
08.40  Welcome & Overview

Session I: Definition of Structural Genomics
Leader: Chris Sander
09.00  How will Structural Genomics differ from current Structural Biology projects?
10.00  Coffee, served in the Cloisters

Session II: Goals of The Structural Genomics Project
Leader: Gaetano Montelione
10.30  What will be accomplished by an organized, international Structural Genomics effort? What are the specific goals? What are reasonable mileposts over the next 5 years? What types of programs need to be developed to meet these goals?

Session III: Policy Issues: Opportunities for Collaborative Efforts
11.30  Co-operation: How can the agencies co-operate in making Structural Genomics an international effort? What requirements are needed to ensure the openness of SG projects? Relationship to industrial activities?
Leaders: Steve Burley & Janet Thornton
12.30  Lunch, served in the Foyer
13.30  Publications and Data Release: What constitutes publication (of structures) in Structural Genomics? What structural information (in addition to coordinates) should be deposited in the PDB? How do the agencies and scientists determine when structures must be deposited? Which SG results should be subject to public release requirements?
Leaders: Wayne Hendrickson & Guy Dodson
14.30  Bioinformatics/Databases: Is there a need for an international Structural Genomics Database?
Leaders: John Moult & Geoff Barton

Randomised Groups (Tea, Coffee & Biscuits available in breakout rooms)
Group A: Leader: Gaetano Montelione
Loft Room 1
Group B: Leaders: Steve Burley & Janet Thornton
Loft Room 2
Group C: Leaders: Wayne Hendrickson & Guy Dodson
Hall Lounges
Group D: Leaders: John Moult & Geoff Barton
Green Room
(Breakout Group Leaders to draft statements overnight)
18.00  Close of Day One
18.30  Pre Dinner Drinks, served in the Hall Foyer
19.00  Dinner, served prompt in the Hall Restaurant

Thursday, April 6, 2000
Session IV: Summary
Leader: Chris Dobson
08.00 - 09.00  Breakfast, served in the Hall Restaurant
09.00  Presentation of statements and agreement of content
11.00  Close of Meeting