[Gofriends] Redundancy in go_XXXXXX-assocdb-tables/dbxref.txt
Gabriel Berriz
gberriz at hms.harvard.edu
Mon Sep 8 14:49:03 PDT 2008
Dear GO friends,
For some species, the info given in the dbxref table includes IDs from
multiple databases, which raises the possibility of "cryptic
redundancies", i.e. associations that appear distinct because they are
assigned IDs from different databases but that in fact refer to the same
underlying gene product. For example, if I compare the sets of rat dbxrefs
that have database names RGD and ENSEMBL respectively, I find that the
overlap (redundancy) of these two sets of IDs consists of about 1000
IDs, which is over 5% of all the possible rat gene products. (To compute
this overlap, I first mapped the RGD IDs to ENSEMBL IDs using the
mappings provided by Ensembl version 49.)
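The overlap computation described above could be sketched roughly as follows. This is only an illustration, not the exact procedure used: the function name, the in-memory dictionary standing in for the Ensembl mapping, and the IDs in the demo are all hypothetical placeholders.

```python
def cryptic_overlap(rgd_ids, ensembl_ids, rgd_to_ensembl):
    """Find ENSEMBL IDs that are also reachable from RGD IDs via a mapping.

    rgd_ids        -- iterable of IDs whose dbxref database name is RGD
    ensembl_ids    -- iterable of IDs whose dbxref database name is ENSEMBL
    rgd_to_ensembl -- dict mapping an RGD ID to a list of ENSEMBL IDs
                      (assumed to be built beforehand from an Ensembl release)

    Returns a dict {ensembl_id: set of RGD IDs that map to it}.
    """
    ensembl_ids = set(ensembl_ids)
    overlap = {}
    for rgd_id in set(rgd_ids):
        # An RGD ID may map to zero, one, or several ENSEMBL IDs.
        for ens_id in rgd_to_ensembl.get(rgd_id, ()):
            if ens_id in ensembl_ids:
                overlap.setdefault(ens_id, set()).add(rgd_id)
    return overlap


if __name__ == "__main__":
    # Made-up IDs, for illustration only.
    mapping = {"RGD:1": ["ENSRNOG1"], "RGD:2": ["ENSRNOG2", "ENSRNOG3"]}
    found = cryptic_overlap(
        ["RGD:1", "RGD:2"], ["ENSRNOG1", "ENSRNOG3", "ENSRNOG9"], mapping
    )
    print(len(found), "potentially redundant IDs")
```

Each key of the returned dictionary is a candidate "cryptic redundancy": a gene product that appears in the dbxref table under both an ENSEMBL ID and at least one RGD ID.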
Would it help to avoid these "cryptic redundancies" if a single
database (Ensembl, RGD, whatever) were used for each species?
Thanks for your comments,
Gabriel Berriz
=============================================================
Gabriel F. Berriz, PhD
Bioinformatics Developer
Roth Lab
Biological Chemistry and Molecular Pharmacology -- Harvard Medical
School
Seeley G. Mudd Building 322B
Boston, MA 02115-5701
Telephone: 617.432.3555
Fax: 617.432.3557