
Quantifying Specificity of GO Terms

Gavin Sherlock sherlock at genome.Stanford.EDU
Thu Apr 19 07:59:09 PDT 2007


One possibility that occurred to me this morning, based on a user  
request wanting to eliminate certain GO terms from being tested in  
GO::TermFinder, is that you could calculate the best possible p-value  
that could be generated for a GO node, based on the hypergeometric  
distribution.  E.g. if a node has 20 annotations, then the best  
possible p-value you could generate would be based on observing all  
20 of those annotations in your list of interesting genes.  Likewise  
for something that has 5,000 annotations.  I suspect, though haven't  
tested, that for the non-specific GO terms, the best possible p-value  
that could be generated would be non-significant - i.e. no matter how  
many observations of it you make in your list of interesting genes,  
that node might never achieve significance.  Such results could of  
course be easily ranked.
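
A minimal sketch of the calculation described above, not GO::TermFinder's own code. It assumes a one-sided hypergeometric test in which the best case for a node is that every possible hit is observed, i.e. min(annotations, list size) of the interesting genes fall on the node. The function names, the background size of 6,000 genes, the list size of 10 and the 0.05 cutoff are all illustrative assumptions.

```python
from math import comb


def upper_tail(k: int, population: int, annotated: int, drawn: int) -> float:
    """P(X >= k) when X ~ Hypergeometric(population, annotated, drawn)."""
    max_hits = min(annotated, drawn)
    total = comb(population, drawn)
    favourable = sum(
        comb(annotated, i) * comb(population - annotated, drawn - i)
        for i in range(k, max_hits + 1)
    )
    return favourable / total


def best_possible_pvalue(population: int, annotated: int, drawn: int) -> float:
    """Smallest p-value a node can reach: every possible hit is observed."""
    return upper_tail(min(annotated, drawn), population, annotated, drawn)


# Hypothetical numbers: 6,000 genes in the background, 10 genes of interest.
background, list_size, alpha = 6000, 10, 0.05
for node_size in (20, 5000):
    p = best_possible_pvalue(background, node_size, list_size)
    verdict = "can reach significance" if p < alpha else "can never reach significance"
    print(f"node with {node_size} annotations: best p = {p:.3g} ({verdict})")
```

Under these assumed numbers, the node with 5,000 annotations has a best possible p-value of about 0.16, so it cannot pass the cutoff however many of the 10 genes land on it, while the node with 20 annotations can. Any multiple-testing correction would only make the threshold stricter. Sorting nodes by this best-case value gives the ranking mentioned above.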

Cheers,
Gavin

On Apr 18, 2007, at 1:12 AM, Waclaw Kusnierczyk wrote:

> This is exactly the problem.  Coding is the least painful part.
> Both the length of the path from a node to the root and the count  
> of annotations say more about how well explored and interesting the  
> particular part of the GO is than about any sort of specificity of  
> the GO term.
>
> One possibility that I've been advocating for a while is to measure
> specificity in terms of a GO term's correspondence with the taxon/taxa
> in the taxonomy of species, i.e. the organisms that the term may
> sensibly be used to talk about.  This would not, of course, solve
> those of your problems that are not related to the classification
> of organisms, but might help solve others.
>
> vQ
>
>
> Sorin Draghici wrote:
>> Tobias,
>> How exactly would you define the specificity of a GO term? If we  
>> had an exact definition, we could possibly write a piece of  
>> software that could do it.
>> Sorin
>> Tobias Sayre wrote:
>>> Dear GO Friends,
>>>
>>> I am working on a project that involves curation of protein data  
>>> that includes GO terms, and it would be very helpful if I had  
>>> some numerical quantification of the specificity of each term.   
>>> It is possible to manually examine each term to determine this  
>>> specificity, but because there is a large amount of data, I would  
>>> like to automate the process.  I understand that there is no
>>> reliable way to do this simply using the level in the DAG
>>> hierarchy, but I am wondering if any of you might have a
>>> workaround.
>>>
>>> Thanks,
>>>
>>> Tobias Sayre
>
> -- 
> Wacek Kusnierczyk
>
> ------------------------------------------------------
> Department of Information and Computer Science (IDI)
> Norwegian University of Science and Technology (NTNU)
> Sem Saelandsv. 7-9
> 7027 Trondheim
> Norway
>
> tel.   0047 73591875
> fax    0047 73594466
> ------------------------------------------------------
>
> --
> This message is from the GOFriends moderated mailing list.  A list of public
> announcements and discussion of the Gene Ontology (GO) project.
> Problems with the list?           E-mail: owner-gofriends at geneontology.org
> Subscribing   send   "subscribe"   to   gofriends-request at geneontology.org
> Unsubscribing send   "unsubscribe"  to  gofriends-request at geneontology.org
> Web:          http://www.geneontology.org/




