

Integral membrane proteins fulfil important roles in many crucial biological processes, including cell signalling, molecular transport and bioenergetic processes. Advancements in experimental techniques are revealing high-resolution structures for an increasing number of membrane proteins. Yet, these structures are rarely resolved in complex with membrane lipids. In 2015, the MemProtMD pipeline was developed to allow the automated assembly of lipid bilayers around new membrane protein structures released from the Protein Data Bank (PDB). To make these data available to the scientific community, a web database ( ) has been developed. Simulations and the results of subsequent analysis can be viewed using a web browser, including interactive 3D visualizations of the assembled bilayer and 2D visualizations of lipid contact data and membrane protein topology. In addition, ensemble analyses are performed to detail conserved lipid interaction information for individual proteins, for families and for the entire database of 3506 PDB entries. Proteins may be searched using keywords, PDB or UniProt identifier, or browsed using classification systems, such as Pfam, Gene Ontology annotation, mpstruc or the Transporter Classification Database. All files required to run further molecular simulations of proteins in the database are provided.

There are now ∼3500 structures of over 1000 unique integral membrane proteins deposited in the Protein Data Bank (PDB) (1, 2). Membrane proteins are of considerable biomedical interest, constituting ∼25% of published genomes (3) and 50% of current drug targets (4).

A near-exponential increase in the number of published membrane protein structures (5) looks set to be sustained through continuous improvements to detergent solubilization and crystallization protocols, such as Lipidic Cubic Phase (6), HiLiDe (7) and MemGold (8). Meanwhile, the enhanced resolution and variety of structures solved by cryo-electron microscopy (9) open up a wealth of new possibilities.

With rapid growth in the number of protein structures available, several database systems have been developed in order to organize and annotate these structures in a biologically meaningful way. The PDB is a universal repository for all experimentally derived membrane protein structures, holding a single accession for each deposited structure. Structures deposited in the PDB may be linked to one or more UniProt (10) protein sequences, which in turn can be grouped into larger Pfam (11) families. In addition, the Gene Ontology (GO) project (12) defines a standardized way to further annotate proteins with attributes ranging from functional notes to cellular location.
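
The annotation hierarchy described above, in which PDB entries link to one or more UniProt sequences that are in turn grouped into Pfam families, can be illustrated with a minimal Python sketch. This is not the MemProtMD implementation: the function name `group_by_family`, the two lookup tables and every identifier in them are hypothetical placeholders chosen for illustration, and no real accession or database API is used.

```python
from collections import defaultdict

# Hypothetical cross-reference tables. The identifiers are placeholders
# for illustration only, not real PDB, UniProt or Pfam accessions.
PDB_TO_UNIPROT = {
    "pdb_A": ["uniprot_1"],
    "pdb_B": ["uniprot_1", "uniprot_2"],  # a structure containing two sequences
    "pdb_C": ["uniprot_3"],
}
UNIPROT_TO_PFAM = {
    "uniprot_1": ["pfam_X"],
    "uniprot_2": ["pfam_Y"],
    "uniprot_3": ["pfam_X", "pfam_Y"],  # a sequence with two annotated domains
}


def group_by_family(pdb_to_uniprot, uniprot_to_pfam):
    """Return a mapping of family -> sorted list of PDB entries.

    A PDB entry is placed in every family to which any of its linked
    sequences belongs; sequences without a family annotation are skipped.
    """
    families = defaultdict(set)
    for pdb_id, sequences in pdb_to_uniprot.items():
        for uniprot_id in sequences:
            for family in uniprot_to_pfam.get(uniprot_id, []):
                families[family].add(pdb_id)
    return {family: sorted(entries) for family, entries in families.items()}


FAMILIES = group_by_family(PDB_TO_UNIPROT, UNIPROT_TO_PFAM)
```

Because one structure may contain several sequences, and one sequence may carry several family annotations, a single PDB entry can appear under more than one family. Sets are used during accumulation so that an entry reached through two different sequences is counted only once per family.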
