SRS databases

From Wiki CEINGE

Revision as of 20:19, 25 June 2007 by Giovanni

About 70 databases are maintained locally and available at CEINGE. SRS is the main web interface for accessing these biological databases. A purpose-built automatic procedure keeps the databases up to date: every night, the public servers are checked for new releases, and when one is found the new data files are downloaded and indexed automatically.
Through SRS, more than 60 public databases are available at CEINGE, stored as flat files on a dedicated file server and occupying over 2 terabytes of disk space in total.
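
The nightly update cycle described above (check for new releases, download, index) can be sketched roughly as follows. This is only an illustration, not the procedure actually used at CEINGE: the function names, the release labels and the `download` and `index` callbacks are all hypothetical stand-ins for the real transfer and SRS indexing steps.

```python
def plan_updates(local, remote):
    """Return the databanks whose remote release differs from the local one.

    `local` and `remote` map a databank name to a release label.
    Both are hypothetical inputs: a real procedure would read them from
    the local file server and from the public servers.
    """
    return [
        name
        for name, remote_release in sorted(remote.items())
        if local.get(name) != remote_release
    ]


def nightly_run(local, remote, download, index):
    """Download and index every databank that has a new release."""
    updated = []
    for name in plan_updates(local, remote):
        download(name)               # fetch the new flat files (assumed callback)
        index(name)                  # rebuild the indices (assumed callback)
        local[name] = remote[name]   # record the release now held locally
        updated.append(name)
    return updated
```

Passing the download and indexing steps in as callbacks keeps the release comparison separate from the site-specific commands, so the decision logic can be checked without touching the network or the file server.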

The complete list of all the available databases on SRS is available here.

Available databases

Practically all of the most widely used public databases are available, such as:

  • DNA databases
    • EMBL: The EMBL nucleotide sequence database including updates
    • REFSEQ: Database providing non-redundant curated data representing knowledge of known genes
    • FANTOMn: Database of mouse transcriptome
    • UTRnr: 5-end and 3-end Untranslated Regions Database
    • IMGT: ImMunoGeneTics database. A database containing nucleotide sequences of immune system-related genes
    • EMBLWGS: The EMBL nucleotide sequence database - whole genome shotgun sequences
    • REFSEQNEW: Database providing non-redundant curated data representing knowledge of known genes (RefSeq updates)
  • Protein databases
    • REFSEQP: Database of protein information from NCBI
    • UNIPROT: The UniProt Knowledgebase is the central database of protein sequences with accurate, consistent, and rich sequence and functional annotation
    • REMTREMBL: REM-TrEMBL (REMaining TrEMBL) contains translations of EMBL nucleotide sequences that will not be included in TrEMBL
    • UNIREF100: Non-redundant sequence database which combines identical sequences and sub-fragments from the same organism into a single UniRef entry
    • UNIREF90: A non-redundant sequence set, based on UniRef100, with each sequence representing a cluster of sequences with at least 90% sequence identity
    • UNIREF50: A non-redundant sequence set, based on UniRef100, with each sequence representing a cluster of sequences with at least 50% sequence identity
    • FANTOMp: Database of translations of mouse transcriptome
    • IMGTHLA: The IMGT/HLA Database is part of the international ImMunoGeneTics IMGT project
    • IPI: International Protein Index - a top level guide to main proteome databases
    • REFSEQPNEW: Database of protein information from RefSeq (RefSeq protein updates)
    • UNIPROT_SWISSPROT: The UniProt Knowledgebase is the central database of protein sequences with accurate, consistent, and rich sequence and functional annotation. UniProt/Swissprot contains manually-annotated records with information extracted from literature and curator-evaluated computational analysis
    • UNIPROT_TREMBL: The UniProt Knowledgebase is the central database of protein sequences with accurate, consistent, and rich sequence and functional annotation. UniProt/Trembl consists of computationally analyzed records that await full manual annotation
  • Gene-related databases
    • ENTREZGENE: NCBI's database for gene-specific information.
    • EPD: Eukaryotic Promoter Database - Philipp Bucher (1996)
    • UNIGENE: Unique gene cluster db from the NCBI
    • UNISEQ: Sub-component of the UniGene db. Contains the sequence information from UniGene.
    • UTRSITE: Sub-component of the UTRnr
    • HGBASE: Human Genic Bi-Allelic Sequences Database
    • RHDB: The RHDB Radiation Hybrid Mapping Submissions database
    • RHEXP: The RHDB Radiation Hybrid Mapping Experimental Conditions database
    • RHMAP: The RHDB Radiation Hybrid Map Information database
    • RHPANEL: The RHPANEL RH Mapping panels database
  • Protein-related databases
    • INTERPRO: Integrated Resource of Protein Domains and Functional Sites
    • IPRMATCHES: All hits to Swiss-Prot and TrEMBL entries in which the signatures are found by INTERPRO
    • PROSITE: A Dictionary of Protein Sites and Patterns - A. Bairoch
    • BLOCKS: The Blocks database of multiply aligned ungapped segments corresponding to the most highly conserved regions of proteins.
    • PRINTS: Protein Motif Fingerprint Database
    • PFAMA: The A (human-curated) division of the Pfam database. Alignments of protein domains and conserved regions.
    • PFAMB: The B (automatically clustered) division of the Pfam database. Alignments of protein domains and conserved regions.
    • SWISSPFAM: An annotated description of how Pfam domains map to (possibly multidomain) SwissProt entries.
    • PFAMHMM: PfamHmm database. Database of the Hidden Markov Models (HMMs) derived from the seed alignment in Pfam.
    • PFAMSEED: PfamSeed database. Seed alignments (hand edited) representing each domain
    • PRODOM: A comprehensive collection of protein domain families
  • Ontology databases
    • GOA: Gene Ontology Annotation of UniProtKB
    • GO: Gene Ontology database
  • 3D structures databases
    • NRL3D: PIR-NRL3D Sequence-Structure Database.
    • PDB: Protein Data Bank (PDB) - repository for the processing and distribution of 3-D biological macromolecular structure data
    • PDBFINDER: Directory for the Brookhaven Protein Data Bank. Constructed from the PDB, DSSP and HSSP databases
  • Metabolic pathway databases
    • PATHWAY: Kyoto Encyclopedia of Genes and Genomes (KEGG)
    • LENZYME: Ligand Chemical Database for Enzyme Reactions
    • LCOMPOUND: Ligand Chemical Database for Enzyme Reactions
    • ENZYME: Database of enzyme nomenclature
  • Reference databases
    • TAXONOMY: Contains names of all organisms represented in sequence databases by at least one nucleotide or protein sequence
    • GENETICCODE: NCBI database of genetic codes
    • OMIM: Online Mendelian Inheritance in Man database.
    • REBASE: Restriction Enzyme database.
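
In the wiki source of this page, each databank name links to its SRS description page (the LibInfo page) on the CEINGE server. As a minimal sketch, such a link can be built from a databank name as shown below. The URL prefix is copied verbatim from those links and is specific to this installation; the function name `libinfo_url` and the name check are illustrative assumptions, not part of SRS.

```python
# Prefix copied verbatim from the links in this page's wiki source;
# it is specific to the CEINGE SRS installation.
SRS_LIBINFO_PREFIX = (
    "http://bioinfo.ceinge.unina.it/srs7131bin/cgi-bin/wgetz"
    "?_AUTHS_-AUTHE_-page+LibInfo+-lib+"
)


def libinfo_url(databank):
    """Return the description-page URL for an SRS databank name."""
    # The databank names listed on this page use only letters, digits and "_".
    if not databank or not all(c.isalnum() or c == "_" for c in databank):
        raise ValueError(f"unexpected databank name: {databank!r}")
    return SRS_LIBINFO_PREFIX + databank
```

For example, `libinfo_url("EMBL")` yields the same address as the EMBL link in the page source.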
