Seqdb (not Seqdblite) FASTA file

Chris Mungall cjm at fruitfly.org
Mon Mar 29 17:30:52 PST 2004


Due to popular demand, here's the answer to Chris' question:

The GO Database monthly releases come in 4 distributions:

termdb    - just the ontology
assocdb   - ontology plus gene associations
seqdb     - assocdb plus UniProt sequences
seqdblite - seqdb minus associations with the IEA evidence code

The FASTA format file is generated from seqdblite (this is the same db
that AmiGO uses). We actively discourage people from making transitive
sequence similarity associations, hence the decision to build the fasta
file from seqdblite.

If you want a fasta file that includes IEA associations in the header,
download the mysql seqdb database and run this script (assuming you have
installed go-db-perl):

  get-seqs.pl -d mygo -h myhost -all -fullheader -skipnogo -withname > file

[if you omit -skipnogo you also get seqs without any GO annotation]
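
To sanity-check the resulting file, something like the sketch below can
count how many records carry at least one GO id in the header. This is not
part of go-db-perl. It assumes only that GO ids appear in the header line
in the usual GO:0000000 form; the exact header layout written by
get-seqs.pl is not assumed, and the sample records are made up for
illustration.

```python
import re
from io import StringIO

# GO ids are "GO:" followed by seven digits.
GO_ID = re.compile(r"GO:\d{7}")


def summarize_fasta(handle):
    """Return (total records, records whose header mentions a GO id)."""
    total = 0
    with_go = 0
    for line in handle:
        if line.startswith(">"):  # FASTA header line
            total += 1
            if GO_ID.search(line):
                with_go += 1
    return total, with_go


# Made-up records for illustration only; real headers may look different.
sample = StringIO(
    ">P12345 hypothetical protein GO:0005515 GO:0008150\n"
    "MKTAYIAKQR\n"
    ">Q99999 hypothetical protein\n"
    "MSLLTEVETP\n"
)
print(summarize_fasta(sample))  # prints (2, 1)
```

To run it on the real output, pass an open file handle instead of the
sample. If -skipnogo did what it should, the two counts should match.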

Neither of these fastadbs is as complete as it could be - both are
built from the swissprot proteomes directory, which has only a subset of
SPTR/UniProt, indexed by taxon id:

  ftp.ebi.ac.uk/pub/databases/SPproteomes/swissprot_files/proteomes/

I'm trying to work out a fast way of slurping in all of UniProt with GO
associations in such a way that the monthly releases won't be slowed down
further. This isn't a priority at the moment - the SP proteomes directory
is reasonably complete with respect to well annotated genomes.

For more details on the GO DB and perl tools, see

  http://www.godatabase.org/dev

Chris


On Fri, 26 Mar 2004, Chris Beck wrote:

> Is there any way I can get a seqdb go file in FASTA format?  Or get the tools that are used to generate the seqdblite FASTA file so I
> can generate my own copy?  Or does anyone have any guidance on what E-value I might use to blast against seqdblite and get the hits
> that adding the seqdb entries would give?
>
> Cheers,
> Chris
>


--
This message is from the GOFriends moderated mailing list.  A list of public
announcements and discussion of the Gene Ontology (GO) project.
Problems with the list?           E-mail: owner-gofriends at geneontology.org
Subscribing   send   "subscribe"   to   gofriends-request at geneontology.org
Unsubscribing send   "unsubscribe"  to  gofriends-request at geneontology.org
Web:          http://www.geneontology.org/
