Quantifying Specificity of GO Terms

Gavin Sherlock sherlock at genome.Stanford.EDU
Thu Apr 19 08:30:18 PDT 2007


Hi Sorin,

Indeed, it would be dependent on the annotations, which are of course  
a reflection of both our knowledge and ignorance.  In terms of  
removing nodes from consideration when finding enriched GO nodes, it  
should work fine, because that itself is dependent on annotations.   
In terms of describing a node as specific or not, it will only do  
that in terms of annotations - it is thus not perfect, but I'm not  
sure whether specificity is an intrinsic property of a node itself,  
or a property only in the context of the annotations to it and all  
other nodes.  I think I favor the latter - it simply means that our  
notion of specificity would evolve over time as we chip away at our  
ignorance.
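
The best-possible p-value idea quoted below could be sketched roughly as
follows. This is an untested-in-practice illustration in Python, not code
from GO::TermFinder: the function name, the background of 6,000 genes, the
list size of 10 and the 0.05 cutoff are all assumptions made up for the
example; only the 20 and 5,000 annotation counts come from the thread.

```python
from math import comb


def best_possible_p(population, annotated, list_size):
    """Smallest hypergeometric p-value a node could reach (hypothetical helper).

    population: number of genes in the background set
    annotated:  number of those genes annotated to the node
    list_size:  number of genes in the list of interesting genes
    """
    # Best case: the list holds as many of the node's genes as it can.
    k = min(annotated, list_size)
    # P(X >= k) reduces to P(X == k), since k is the largest possible overlap.
    return (comb(annotated, k) * comb(population - annotated, list_size - k)
            / comb(population, list_size))


# Assumed numbers: 6,000 background genes, 10 interesting genes.
POPULATION, LIST_SIZE, ALPHA = 6000, 10, 0.05
nodes = {"small node": 20, "large node": 5000, "root": 6000}

best = {name: best_possible_p(POPULATION, n, LIST_SIZE) for name, n in nodes.items()}

# Rank nodes from most to least "specific" by their best achievable p-value.
ranked = sorted(best, key=best.get)

# Nodes that can never reach significance could be dropped before testing.
testable = [name for name in ranked if best[name] <= ALPHA]
```

With these made-up numbers only the 20-annotation node can ever fall below
the cutoff. The bound depends on the list size, though: with 100 interesting
genes the 5,000-annotation node could also reach p < 0.05 in the best case,
and any multiple-testing correction would raise the bar again.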

Cheers,
Gavin

On Apr 19, 2007, at 8:20 AM, Sorin Draghici wrote:

> Gavin,
>
> This is a very interesting idea but it seems to me that the results  
> would still be dependent on the current state of annotations, not  
> on the intrinsic concepts captured by the terms. I would like  
> something that would tell me that "boiling water in a brown kettle"  
> is more specific than "boiling water in a kettle" which in turn is  
> more specific than "boiling water", even if there are no genes  
> known to boil water at this time. What do you think?
>
> Sorin
>
> Gavin Sherlock wrote:
>> One possibility that occurred to me this morning, based on a user  
>> request wanting to eliminate certain GO terms from being tested in  
>> GO::TermFinder, is that you could calculate the best possible  
>> p-value that could be generated for a GO node, based on the  
>> hypergeometric distribution.  E.g. if a node has 20 annotations,  
>> then the best possible p-value you could generate would be based  
>> on observing all 20 of those annotations in your list of  
>> interesting genes.  Likewise for something that has 5,000  
>> annotations.  I suspect, though haven't tested, that for the  
>> non-specific GO terms, the best possible p-value that could be  
>> generated would be non-significant - i.e. no matter how many  
>> observations of it you make in your list of interesting genes,  
>> that node might never achieve significance.  Such results could of  
>> course be easily ranked.
>>
>> Cheers,
>> Gavin
>>
>> On Apr 18, 2007, at 1:12 AM, Waclaw Kusnierczyk wrote:
>>
>>> This is exactly the problem.  Coding is the least painful part.
>>> Both the length of the path from a node to the root and the count  
>>> of annotations say more about how well explored and interesting  
>>> the particular part of the GO is than about any sort of  
>>> specificity of the GO term.
>>>
>>> One possibility that I've been advocating for a while is to  
>>> measure specificity in terms of a GO term's correspondence with  
>>> the taxon/taxa in the taxonomy of species, that is, the organisms  
>>> that the term may sensibly be used to talk about.  This would not, of  
>>> course, solve those of your problems that are not related to the  
>>> classification of organisms, but might help solve others.
>>>
>>> vQ
>>>
>>>
>>> Sorin Draghici wrote:
>>>> Tobias,
>>>> How exactly would you define the specificity of a GO term? If we  
>>>> had an exact definition, we could possibly write a piece of  
>>>> software that could do it.
>>>> Sorin
>>>> Tobias Sayre wrote:
>>>>> Dear GO Friends,
>>>>>
>>>>> I am working on a project that involves curation of protein  
>>>>> data that includes GO terms, and it would be very helpful if I  
>>>>> had some numerical quantification of the specificity of each  
>>>>> term.  It is possible to manually examine each term to  
>>>>> determine this specificity, but because there is a large amount  
>>>>> of data, I would like to automate the process.  I understand  
>>>>> that there is no reliable way to do this simply using the level  
>>>>> in the DAG hierarchy, but I am wondering if any of you might  
>>>>> have a work-around.
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Tobias Sayre
>>>
>>> --Wacek Kusnierczyk
>>>
>>> ------------------------------------------------------
>>> Department of Information and Computer Science (IDI)
>>> Norwegian University of Science and Technology (NTNU)
>>> Sem Saelandsv. 7-9
>>> 7027 Trondheim
>>> Norway
>>>
>>> tel.   0047 73591875
>>> fax    0047 73594466
>>> ------------------------------------------------------
>>>
>>> -- 
>>> This message is from the GOFriends moderated mailing list.  A list of public
>>> announcements and discussion of the Gene Ontology (GO) project.
>>> Problems with the list?           E-mail: owner-gofriends at geneontology.org
>>> Subscribing   send   "subscribe"   to   gofriends-request at geneontology.org
>>> Unsubscribing send   "unsubscribe"  to  gofriends-request at geneontology.org
>>> Web:          http://www.geneontology.org/
>>
>>
>
> -- 
> Sorin Draghici, Ph.D.
>
> Director of the Bioinformatics Core, Karmanos Cancer Institute
>
> Associate Professor		Tel: (313) 577-5484
> Dept. of Computer Science	Fax: (313) 577-6868
> Wayne State University
> 5143 Cass Ave, Room 431 State Hall, Detroit, MI, 48202
> WWW: http://vortex.cs.wayne.edu/Sorin/ (personal)
> WWW: http://vortex.cs.wayne.edu/Projects.html (lab)
>
>
> Check out my recent book: Data Analysis Tools for Microarrays:
> http://www.crcpress.com/shopping_cart/products/product_detail.asp?sku=C3154&parent_id=&pc=
>




