Data Resources

Sequences (via BLAST, FASTA, and text search)

  • GenBank (NR, GBALL, GBMAM, GBBCT, GBENV, GBEST, GBGSS, GBHTC, GBHTG, GBPAT, GBINV, GBPHG, GBPLN, GBPRI, GBROD, GBSTS, GBSYN, GBUNA, GBVRL, GBVRT)
  • UniProt (SwissProt, TREMBL)

  • ENSEMBL
  • PDBFINDER2
  • REFSEQ: NCBI's curated collection of DNA, RNA, and protein sequences
  • Third Party Annotation: NCBI's database of inference- and experiment-based annotations
  • UniParc: UniProt's archive of all reported protein sequences
  • UniMES: UniProt's database of metagenomic and environmental samples

Profiles/Motifs

  • Blocks (via Blimps). The Blocks Database is no longer updated by FHCRC.
  • Pfam (via HMMPFAM and RPSBLAST)
  • Prints (via FingerPRINTScan)
  • Prosite (via PFSCAN, PFSEARCH, PPSEARCH, and PROSEARCH)
  • CDD (via RPSBLAST). A superset including NCBI-curated domains and data imported from Pfam, SMART, COG, and PRK.

  • COG (via RPSBLAST). Database of orthologous protein families, focusing on prokaryotes. See Cognitor for an alternative.
  • KOG (via RPSBLAST). The eukaryotic counterpart to COGs. KOGs are not included in the CDD superset.
  • PRK (via RPSBLAST). Reference Sequence proteins encoded by complete prokaryotic and chloroplast plasmids and genomes.
  • SMART (via RPSBLAST). Very short SMART peptides and SMART superfamilies are not included in this database.
  • PDBFINDER2 (via text search). Searchable database of PDB metadata and structure features.

Structures

  • PDB (via text search and the Sirius browser). Universal database of protein structure models.