From gene_product to database
cjm at fruitfly.org
Mon Feb 23 15:47:04 PST 2004
On Mon, 23 Feb 2004, David Martin wrote:
> There doesn't seem to be any link between the gene_product table and the db
> table without going through association.
Hi David, there is actually a mailing list specifically for the GO
database, see the newly renovated GO database website:
The "db" table is for metadata about a particular database. It isn't
particularly useful at the moment; it is intended to allow applications
such as AmiGO to construct appropriate URLs and so forth.
Most of the time in the GO database we follow a normalized design and use
explicit foreign keys. The db table is one exception to this rule; most of
the time we store the database *name* directly in the dbxref table, to
avoid the extra join.
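To make the lookup concrete, here is a minimal sketch using Python's sqlite3 module with a cut-down, hypothetical version of the two tables. The column names xref_dbname and xref_key, and every identifier and symbol in the sample rows, are assumptions for illustration, not a copy of the real schema.

```python
import sqlite3

# Hypothetical, simplified tables: only the columns needed for this lookup.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dbxref (
    id          INTEGER PRIMARY KEY,
    xref_dbname TEXT NOT NULL,   -- database name stored directly (no FK to db)
    xref_key    TEXT NOT NULL
);
CREATE TABLE gene_product (
    id        INTEGER PRIMARY KEY,
    symbol    TEXT NOT NULL,
    dbxref_id INTEGER NOT NULL REFERENCES dbxref(id)
);
-- Invented placeholder rows, not real accessions.
INSERT INTO dbxref VALUES (1, 'FB', 'FBgn0000001'), (2, 'UniProt', 'P00001');
INSERT INTO gene_product VALUES (10, 'geneA', 1), (11, 'protB', 2);
""")

def database_of(conn, symbol):
    """Return the database names of all gene products with this symbol."""
    rows = conn.execute(
        """SELECT dbxref.xref_dbname
           FROM gene_product
           JOIN dbxref ON gene_product.dbxref_id = dbxref.id
           WHERE gene_product.symbol = ?
           ORDER BY dbxref.xref_dbname""",
        (symbol,),
    ).fetchall()
    return [name for (name,) in rows]

print(database_of(conn, "geneA"))  # prints ['FB']
```

The database name comes from a single join on dbxref; the db table is not involved at all.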
The one place we have an explicit foreign key is in the association table.
This has a very particular purpose: it is to identify the source of an
association if the source is different from the database providing the
For example, SwissProt/GOA annotate protein sequences in their database,
and so their gene products all have primary UniProt identifiers. However,
SwissProt may have sourced some of their data from MGI or FlyBase in which
case this would be indicated in the association table.
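As a rough illustration of that case, the sketch below (Python with sqlite3) finds associations whose source database differs from the database of the gene product itself. The table layout is a simplified assumption: the source_db_id, xref_dbname and xref_key column names are my guesses at a minimal shape, the term side of the association is left out, and the sample rows are invented.

```python
import sqlite3

# Hypothetical, simplified tables; sample rows are invented placeholders.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE db (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE dbxref (
    id          INTEGER PRIMARY KEY,
    xref_dbname TEXT NOT NULL,
    xref_key    TEXT NOT NULL
);
CREATE TABLE gene_product (
    id        INTEGER PRIMARY KEY,
    symbol    TEXT NOT NULL,
    dbxref_id INTEGER NOT NULL REFERENCES dbxref(id)
);
CREATE TABLE association (
    id              INTEGER PRIMARY KEY,
    gene_product_id INTEGER NOT NULL REFERENCES gene_product(id),
    source_db_id    INTEGER REFERENCES db(id)  -- the explicit foreign key
);
INSERT INTO db VALUES (1, 'UniProt'), (2, 'MGI');
INSERT INTO dbxref VALUES (1, 'UniProt', 'P00001');
INSERT INTO gene_product VALUES (10, 'protB', 1);
-- Association 100 originates with UniProt; 101 was sourced from MGI.
INSERT INTO association VALUES (100, 10, 1), (101, 10, 2);
""")

def externally_sourced(conn):
    """Associations whose source db is not the gene product's own database."""
    return conn.execute(
        """SELECT a.id, x.xref_dbname, src.name
           FROM association AS a
           JOIN gene_product AS gp ON a.gene_product_id = gp.id
           JOIN dbxref AS x ON gp.dbxref_id = x.id
           JOIN db AS src ON a.source_db_id = src.id
           WHERE src.name <> x.xref_dbname
           ORDER BY a.id"""
    ).fetchall()

print(externally_sourced(conn))  # prints [(101, 'UniProt', 'MGI')]
```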
For more information on this, use the URL above and go to the database
modules documentation and click on the "associations" module. This has a
description of how each of the columns relates to the annotation files
that are fed into the GO database; the annotations file format is
Now, onto your other two questions
> Two questions:
> 1. Does GO store sequences that as yet have no association? Such storage
> would allow questions such as 'what are the proportion of gene products in X
> that are associated with a term Y.
First off, the concept of gene_product and sequence are different in the
GO database. Associations are made to gene products, which themselves have
sequences (although some groups provide associations directly to
sequences, we create an implicit gene product).
The GO database only contains what the various annotation model organism
databases and other groups contribute. Generally if a gene is of unknown
function, it will be absent from the database. Some groups do use the
"unknown" category within GO.
For example, FlyBase has annotations on over half the protein coding genes
in melanogaster. You won't get this info from the GO db at the moment; you
have to go to the source.
Even armed with this knowledge, I don't know if you could truthfully
answer your question above.
> 2. Is there not the potential for namespace collision as there is no UNIQUE
> constraint on gene_product.symbol ? (with a database id field as db_id one
> could add a constraint such as UNIQUE(symbol, db_id).
I've recently changed the unique constraint so that it is on dbxref_id.
The constraint you mention would only really work if the db_id or dbname
was in the gene_product table rather than the dbxref table.
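A small sketch of what a unique constraint on dbxref_id does and does not prevent, again using Python's sqlite3 with a hypothetical cut-down schema (the xref_dbname and xref_key column names, symbols and identifiers are invented for illustration). Two gene products may share a symbol as long as they point at different dbxref rows, but a second gene product on the same dbxref row is rejected.

```python
import sqlite3

# Hypothetical, simplified tables; sample rows are invented placeholders.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dbxref (
    id          INTEGER PRIMARY KEY,
    xref_dbname TEXT NOT NULL,
    xref_key    TEXT NOT NULL
);
CREATE TABLE gene_product (
    id        INTEGER PRIMARY KEY,
    symbol    TEXT NOT NULL,
    dbxref_id INTEGER NOT NULL UNIQUE REFERENCES dbxref(id)
);
INSERT INTO dbxref VALUES (1, 'FB', 'FBgn0000001'), (2, 'MGI', 'MGI:0000001');
""")

def try_insert(conn, symbol, dbxref_id):
    """Insert a gene product; return False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO gene_product (symbol, dbxref_id) VALUES (?, ?)",
                (symbol, dbxref_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False

results = [
    try_insert(conn, "abc", 1),  # first gene product for the FB entry
    try_insert(conn, "abc", 2),  # same symbol, different database: allowed
    try_insert(conn, "xyz", 1),  # second gene product on the same dbxref: rejected
]
print(results)  # prints [True, True, False]
```

So identical symbols from different databases can coexist, and the database name needed to tell them apart still lives in dbxref rather than in gene_product.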
> Many thanks
> This message is from the GOFriends moderated mailing list. A list of public
> announcements and discussion of the Gene Ontology (GO) project.
> Problems with the list? E-mail: owner-gofriends at geneontology.org
> Subscribing send "subscribe" to gofriends-request at geneontology.org
> Unsubscribing send "unsubscribe" to gofriends-request at geneontology.org
> Web: http://www.geneontology.org/