Gene identifier synonym table standard and/or repository?

Gabriel Berriz gberriz at hms.harvard.edu
Wed Feb 19 08:19:03 PST 2003


Thanks for all of the responses to our earlier question (Fritz's post) 
about synonym tables!

We will go with column 11 in the GO gene association tables and support 
those model organisms that make use of it.  We may also supplement with 
Ensembl (although see Ensembl NOTE below).
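
A minimal sketch of how a synonym lookup might be pulled out of column 11, 
under several assumptions: the file is tab-delimited, comment lines start 
with "!", columns 1 and 2 hold the database name and object ID, and 
multiple synonyms are separated by "|" (as Valerie describes for S. pombe 
below).  The helper names and the sample rows, including the object IDs, 
are made up for illustration.

```python
from collections import defaultdict


def make_row(db, object_id, symbol, synonyms):
    """Build a made-up 13-column association row for the demo below."""
    cols = [db, object_id, symbol] + [""] * 7 + [synonyms, "gene", "taxon:0"]
    return "\t".join(cols)


def synonym_pairs(lines):
    """Yield (synonym, (db, object_id)) pairs, one per name in column 11."""
    for line in lines:
        if line.startswith("!") or not line.strip():
            continue  # skip comment and blank lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 11:
            continue  # row is too short to have a column 11
        gene = (cols[0], cols[1])
        for name in cols[10].split("|"):  # assumed "|" separator
            name = name.strip()
            if name:
                yield name, gene


def build_lookup(lines):
    """Map each lower-cased synonym to the set of genes that carry it."""
    lookup = defaultdict(set)
    for name, gene in synonym_pairs(lines):
        lookup[name.lower()].add(gene)
    return dict(lookup)


# Hypothetical rows; the object IDs are invented for illustration.
rows = [
    "!comment line",
    make_row("TAIR", "id-0001", "ROP4", "ATROP4|ATGP3|ARAC5"),
    make_row("TAIR", "id-0002", "EMF1", "embryonic flower 1"),
    make_row("TAIR", "id-0003", "XYZ1", ""),
]
lookup = build_lookup(rows)
print(lookup["atgp3"])
```

Each synonym maps to a set of genes rather than a single gene because 
nothing guarantees that a synonym is unique within an organism.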

Obtaining synonyms from the association tables rather than from a 
stand-alone synonym file has some shortcomings:
1) No synonyms for genes that are not annotated.
2) Since synonyms are stored in a "denormalized" way, there is potential 
for inconsistencies between records for the same gene (although this is 
less of an issue if the files are automatically generated from a normalized 
database).
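
Shortcoming 2 can at least be checked for.  The sketch below groups rows 
by gene and reports any gene whose rows carry different synonym sets.  It 
assumes tab-delimited rows, "!" comment lines, database and object ID in 
columns 1 and 2, and "|" between synonyms; the function names and sample 
rows are hypothetical.

```python
from collections import defaultdict


def synonym_sets_by_gene(lines):
    """Collect every distinct column-11 synonym set seen for each gene."""
    by_gene = defaultdict(set)
    for line in lines:
        if line.startswith("!") or not line.strip():
            continue  # skip comment and blank lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 11:
            continue
        names = frozenset(n.strip() for n in cols[10].split("|") if n.strip())
        by_gene[(cols[0], cols[1])].add(names)
    return by_gene


def inconsistent_genes(lines):
    """Return the genes whose rows disagree about the synonym set."""
    return {
        gene: sets
        for gene, sets in synonym_sets_by_gene(lines).items()
        if len(sets) > 1
    }


def demo_row(object_id, synonyms):
    """Made-up 13-column row; only columns 1, 2 and 11 matter here."""
    return "\t".join(["DB", object_id, "sym"] + [""] * 7 + [synonyms, "gene", "taxon:0"])


rows = [
    demo_row("id-0001", "A|B"),
    demo_row("id-0001", "B|A"),   # same set, different order: consistent
    demo_row("id-0002", "C"),
    demo_row("id-0002", "C|D"),   # extra synonym: inconsistent
]
problems = inconsistent_genes(rows)
print(sorted(problems))
```

Comparing sets rather than raw strings means that a mere difference in 
synonym order between two rows is not flagged.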

FYI, the following association files (from 
ftp.geneontology.org/pub/go/gene_associations/) do not use column 11:
gene_association.GeneDB_Pfalciparum
gene_association.GeneDB_Tbrucei
gene_association.GeneDB_tsetse
gene_association.compugen.Genbank
gene_association.compugen.Swissprot
gene_association.fb
gene_association.gramene_oryza
gene_association.zfin

These do:
gene_association.GeneDB_Spombe      890   synonyms; 3765   genes
gene_association.goa_human          19727 synonyms; 19727  genes
gene_association.goa_sptr           28397 synonyms; 566342 genes
gene_association.mgi                10080 synonyms; 9088   genes
gene_association.rgd                522   synonyms; 1424   genes
gene_association.sgd                6573  synonyms; 6905   genes
gene_association.tair               45327 synonyms; 18771  genes
gene_association.tigr_Tbrucei_chr2  2     synonyms; 289    genes
gene_association.tigr_ath           269   synonyms; 5749   genes
gene_association.tigr_shewanella    1233  synonyms; 3767   genes
gene_association.tigr_vibrio        1415  synonyms; 2924   genes
gene_association.vida               19    synonyms; 83     genes
gene_association.wb                 1319  synonyms; 6833   genes
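
For anyone who wants to produce tallies of this kind, here is one 
plausible way to count, which is not necessarily how the figures above 
were derived: count distinct (gene, synonym) pairs and distinct object 
IDs.  It assumes tab-delimited rows, "!" comment lines, database and 
object ID in columns 1 and 2, and "|" between synonyms.  The function 
name and the sample rows are hypothetical.

```python
def count_synonyms_and_genes(lines):
    """Return (distinct gene/synonym pairs, distinct genes) for one file."""
    pairs = set()
    genes = set()
    for line in lines:
        if line.startswith("!") or not line.strip():
            continue  # skip comment and blank lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 2:
            continue
        gene = (cols[0], cols[1])
        genes.add(gene)
        if len(cols) >= 11:
            for name in cols[10].split("|"):
                name = name.strip()
                if name:
                    pairs.add((gene, name))
    return len(pairs), len(genes)


def sample_row(object_id, synonyms):
    """Made-up 13-column row; only columns 1, 2 and 11 matter here."""
    return "\t".join(["DB", object_id, "sym"] + [""] * 7 + [synonyms, "gene", "taxon:0"])


rows = [
    sample_row("id-0001", "A|B"),
    sample_row("id-0001", "A|B"),  # second annotation of the same gene
    sample_row("id-0002", ""),     # annotated gene with no synonyms
]
n_synonyms, n_genes = count_synonyms_and_genes(rows)
print(n_synonyms, "synonyms;", n_genes, "genes")
```

To run it on a real file, pass an open file handle (for example 
open("gene_association.sgd")) in place of the rows list.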

It would be a great help to have similar standardized lists of all 
"annotatable" genes for each GO organism.  In principle the association 
tables could serve as the source of all "annotatable" genes if they always 
included at least one annotation--possibly to attributes of type 
"unknown"--for each annotatable gene id (or is this the case now?).  As far 
as I know, there is no easy way to determine whether this is the case for 
any given association table.

Ensembl NOTE: Ensembl looks to be quite useful for us, and will get us a 
more normalized table of synonyms, but we did some spot-checking in fly and 
couldn't get an Ensembl list of synonyms to include full-length gene names 
(e.g., Wingless, Kruppel) in addition to gene symbols (Wg, Kr).  In both 
human and fly we never saw more than one synonym for any given gene.  Are 
we doing something wrong?

Thanks again for all of your help!

Best Regards,
Gabriel Berriz

At 10:19 AM 2/13/2003 -0800, Suzanna Lewis wrote:
>Hi,
>
>I'm double-checking here that we are getting this loaded
>into the DB as well. They don't currently appear in amigo,
>nor can they be searched, but Brad and I are talking about
>how to do that.
>
>-S
>
>On Thursday, February 13, 2003, at 09:33 AM, Valerie Wood wrote:
>
>>
>>
>>i utilize this column for S. pombe too. btw I use "|" to separate
>>multiple synonyms, is this correct?
>>
>>
>>
>>On Thu, 13 Feb 2003, Tanya Berardini wrote:
>>
>>>
>>>In the TAIR gene_association file, column 11 is populated with
>>>synonyms/aliases for the annotated object.  These may include BAC-based
>>>names from the genome sequencing phase, full names for the lettered
>>>abbreviations (e.g. EMF1 is embryonic flower 1), other aliases for that
>>>gene (e.g. ATROP4 = ROP4 = ATGP3 = ARAC5), Arabidopsis Genome Initiative
>>>(AGI) locus names (of the format ATxgXXXXX), and gene product names.
>>>
>>>Tanya
>>>
>>>
>>>On Thu, 13 Feb 2003, Suzanna Lewis wrote:
>>>
>>>>In the gene associations table the 11th column is listed
>>>>as DB_object_synonym. I believe that this column was
>>>>added especially to address this issue. It allows for
>>>>white space and has a cardinality of 0, 1, or >1. I think
>>>>this is more a problem of the organism databases not
>>>>having made the switch to providing this information
>>>>when the gene associations are submitted. Column 12
>>>>is the db object type (is it a gene, or a protein, or a .....)
>>>>and column 13 is the taxon. I think if these were being
>>>>populated it would perhaps help you.
>>>>
>>>>Any chance of this being put into practice, annotators?
>>>>
>>>>-S
>>>>
>>>>On Thursday, February 13, 2003, at 07:54 AM, Fritz Roth wrote:
>>>>
>>>>>Greetings GOphiles,
>>>>>
>>>>>We are working on some new software that uses GO annotation, and we
>>>>>would really like it to support all GO-annotated organisms.  Our chief
>>>>>barrier to doing this is the lack of gene identifier synonym tables
>>>>>for each organism (so that users can enter gene names rather than
>>>>>being restricted to MOD IDs, e.g., SGD or MGI IDs).
>>>>>
>>>>>Is there an agreed GO Consortium standard for gene identifier synonym
>>>>>tables?  (It could be as simple as tab-delimited text with a
>>>>>synonym-uniqueID pair on each line.)  If so, is there a repository for
>>>>>such files?  Or is this a GMOD question?
>>>>>
>>>>>Thanks!
>>>>>Fritz Roth
>>>>>
>>>>>-------------------------------------------------
>>>>>Frederick P. Roth, Asst. Professor
>>>>>Harvard Medical School
>>>>>Dept. of Biological Chemistry and Molecular Pharmacology
>>>>>250 Longwood Avenue, SGMB-322, Boston, MA 02115
>>>>>(617) 432-3551 phone            (617) 432-3557 FAX
>>>>>froth at hms.harvard.edu           http://llama.med.harvard.edu


--
This message is from the GOFriends moderated mailing list.  A list of public
announcements and discussion of the Gene Ontology (GO) project.
Problems with the list?           E-mail: owner-gofriends at geneontology.org
Subscribing   send   "subscribe"   to   gofriends-request at geneontology.org
Unsubscribing send   "unsubscribe"  to  gofriends-request at geneontology.org
Web:          http://www.geneontology.org/


