Statistics of G2S databases:

Name: Number of Entries
Last Update Date: 2020-05-19
Ensembl version: Human Release-90
Number of Nonredundant Ensembl Entries: 104817
Uniprot (Swissprot) version: UniProt Release 2017_09
Number of Nonredundant Uniprot Entries: 594927
Number of Combined Nonredundant Ensembl/Uniprot Entries: 559575
Number of PDB Entries of Chains and Segments: 142614
Number of PDB Biological Macromolecular Structures: 462832
Number of Protein-PDB alignments: 22769499

The G2S database stores alignments of protein sequences against PDB sequences.

The protein sequences come from UniProt (Swiss-Prot only) and human Ensembl. Sequences from both sources are combined into a single set, and redundant entries are removed.
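The combining step can be illustrated with a minimal Python sketch. It assumes that "redundant" means two entries with exactly the same amino-acid sequence, which this page does not state. The function name and all identifiers and sequences below are hypothetical.

```python
def merge_nonredundant(*sources):
    """Merge (identifier, sequence) records from several sources.

    Assumption: two entries are redundant when their sequences are
    identical after trimming whitespace and upper-casing. Returns a dict
    mapping each distinct sequence to the identifiers that share it.
    """
    merged = {}
    for records in sources:
        for identifier, sequence in records:
            key = sequence.strip().upper()
            merged.setdefault(key, []).append(identifier)
    return merged


# Hypothetical toy records, not real database entries.
swissprot_records = [("SP_0001", "MKTAYIAK"), ("SP_0002", "MSLLTEVE")]
ensembl_records = [("ENSP_0001", "MKTAYIAK"), ("ENSP_0002", "MAAAGGG")]

combined = merge_nonredundant(swissprot_records, ensembl_records)
```

Here `combined` holds three distinct sequences, and the sequence shared by `SP_0001` and `ENSP_0001` appears once with both identifiers attached.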

G2S automatically updates the set of precalculated alignments every week as protein structures are added to, made obsolete in, or modified in the RCSB Protein Data Bank. The set of protein sequences taken from the public protein sequence databases stays fixed during these weekly updates. Less frequently (on the order of months), the system is re-initialized with an updated set of protein sequences from a new UniProt/Ensembl release, and an all-against-all alignment is run during re-initialization.
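The weekly update described above can be pictured as a diff of the stored PDB sequences against the current ones, followed by re-alignment of only the changed structures. The Python sketch below is an illustration under stated assumptions, not the G2S implementation. `diff_structures`, `weekly_update`, the `align` callable, and all identifiers are hypothetical, and the real alignment tool is not named on this page.

```python
def diff_structures(stored, current):
    """Compare two dicts mapping a PDB chain id to its sequence."""
    added = sorted(current.keys() - stored.keys())
    obsolete = sorted(stored.keys() - current.keys())
    modified = sorted(
        chain for chain in stored.keys() & current.keys()
        if stored[chain] != current[chain]
    )
    return added, obsolete, modified


def weekly_update(alignments, stored, current, proteins, align):
    """Return the alignment list after an incremental update.

    alignments: list of (protein_id, chain_id, result) tuples
    proteins:   fixed dict of protein_id -> sequence
    align:      placeholder callable returning a result or None
    """
    added, obsolete, modified = diff_structures(stored, current)
    # Alignments against obsolete or modified chains are no longer valid.
    stale = set(obsolete) | set(modified)
    updated = [a for a in alignments if a[1] not in stale]
    # Only new and modified chains are aligned; the protein set is fixed.
    for chain in added + modified:
        for protein_id, sequence in proteins.items():
            result = align(sequence, current[chain])
            if result is not None:
                updated.append((protein_id, chain, result))
    return updated


def toy_align(a, b):
    """Stand-in for a real aligner: reports only exact matches."""
    return "identical" if a == b else None


proteins = {"SP_0001": "MKTAYIAK", "SP_0002": "MSLLTEVE"}
stored = {"1abc_A": "MKTAYIAK", "2xyz_B": "MSLLTEVE", "4old_A": "MSLLTEVE"}
current = {"1abc_A": "MKTAYIAK", "2xyz_B": "MKTAYIAK", "3new_A": "MSLLTEVE"}
alignments = [
    ("SP_0001", "1abc_A", "identical"),
    ("SP_0002", "2xyz_B", "identical"),
    ("SP_0002", "4old_A", "identical"),
]

updated = weekly_update(alignments, stored, current, proteins, toy_align)
```

In this toy run, `3new_A` is added, `4old_A` is obsolete, and `2xyz_B` is modified, so the alignment against `1abc_A` is carried over untouched. A re-initialization would instead rebuild `proteins` from the new release and call `align` for every protein against every chain.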