Statistics of G2S databases:

Name: Value
Last Update Date: 2020-02-13
Ensembl version: Human Release-90
Number of Nonredundant Ensembl Entries: 104817
Uniprot (Swissprot) version: UniProt Release 2017_09
Number of Nonredundant Uniprot Entries: 594927
Number of Combined Nonredundant Ensembl/Uniprot Entries: 559575
Number of PDB Entries of Chains and Segments: 139399
Number of PDB Biological Macromolecular Structures: 447403
Number of Protein-PDB alignments: 21985438

The G2S database stores alignments of protein sequences against PDB sequences.

The protein sequences come from Uniprot (Swissprot only) and human Ensembl. Sequences from both sources are combined, and redundant entries are removed.
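The combining step can be pictured with a small sketch. It assumes that "redundant" means two records carry exactly the same amino-acid sequence; the criterion G2S actually applies is not stated here, and the function name, identifiers, and sequences below are invented for illustration.

```python
from collections import defaultdict


def merge_nonredundant(*sources):
    """Merge (identifier, sequence) records from several sources.

    Assumption: two records are redundant when their sequences are
    identical (case-insensitive). Returns a dict mapping each distinct
    sequence to the list of identifiers that share it.
    """
    ids_by_sequence = defaultdict(list)
    for records in sources:
        for identifier, sequence in records:
            ids_by_sequence[sequence.upper()].append(identifier)
    return dict(ids_by_sequence)


# Toy records; identifiers and sequences are hypothetical.
uniprot = [("UP_A", "MKTAYIAK"), ("UP_B", "MSLLTEVE")]
ensembl = [("ENS_A", "MKTAYIAK"), ("ENS_C", "MGDVEKGK")]

merged = merge_nonredundant(uniprot, ensembl)
print(len(merged), "distinct sequences from", len(uniprot) + len(ensembl), "records")
```

Keeping every identifier attached to its sequence means an alignment computed once for a sequence can still be looked up by any of the original Uniprot or Ensembl identifiers.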

G2S automatically updates the set of pre-calculated alignments every week as protein structures are added to, made obsolete in, or modified in the RCSB Protein Data Bank, while keeping the set of protein sequences from the public protein sequence databases fixed. Less frequently (on the order of months), the system is re-initialized with an updated set of protein sequences from a new Uniprot/Ensembl release; during re-initialization, an all-against-all alignment is executed.
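The weekly step described above can be sketched as a comparison of two PDB snapshots. This is only an illustration under stated assumptions, not the G2S implementation: each snapshot is assumed to be a dict from a chain identifier to its sequence, a chain counts as modified when its sequence changed, and the function name and chain identifiers are hypothetical.

```python
def plan_weekly_update(previous, current):
    """Decide which PDB chains need work in a weekly update.

    previous, current: dicts mapping chain id -> sequence (assumed layout).
    Returns chains to align against the fixed protein set, and chains
    whose stored alignments should be dropped.
    """
    added = sorted(current.keys() - previous.keys())
    obsolete = sorted(previous.keys() - current.keys())
    modified = sorted(
        chain
        for chain in current.keys() & previous.keys()
        if current[chain] != previous[chain]
    )
    return {
        # New and changed chains are aligned against the fixed protein set.
        "align": added + modified,
        # Obsolete chains disappear; changed chains lose their stale alignments.
        "drop": obsolete + modified,
    }


# Toy snapshots; chain ids and sequences are hypothetical.
last_week = {"1abc_A": "MKT", "2xyz_A": "GDV", "3old_A": "AAA"}
this_week = {"1abc_A": "MKT", "2xyz_A": "GDVE", "4new_A": "SLL"}

plan = plan_weekly_update(last_week, this_week)
print(plan)
```

Because the protein sequence set stays fixed between re-initializations, only the chains listed under "align" need new alignments in a given week; a re-initialization would instead realign every protein sequence against every PDB chain.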